Design flow for parallel applications on a dynamically reconfigurable platform

Clément Foucher, Fabrice Muller and Alain Giulieri
Université de Nice-Sophia Antipolis (UNS), LEAT/CNRS
{Clement.Foucher ; Fabrice.Muller ; Alain.Giulieri}@unice.fr

Outline
- Parallel and reconfigurable computing today
  - Context
  - Systems limitations
- Our proposal: the SPORE system
- Implementing SPORE
  - Application design
  - The hardware platform and its implementations
- Conclusion & future work

Parallel systems
- Personal computers
  - Low number of software cores
- Manycore systems
  - Still rising
- High performance computers
  - Massively parallel software systems

Reconfigurable systems
- Generic hardware
  - Arrays of reconfigurable elements linked by a configurable network
  - Configurable into particular systems
  [Figure: a blank FPGA and a routed FPGA]
- Evolution: partial dynamic reconfiguration
  - Change only a part of the device on the fly
  [Figure: an FPGA with dynamically reconfigurable areas hosting Implementation 1 and Implementation 2]

Reconfigurable parallel systems
- High Performance Reconfigurable Computers (HPRC)
  - Introduce generic hardware
  - Accelerate specific portions of code ("application kernels") by offloading computations to hardware accelerators
  [Figure: standard HPC structure, four nodes connected by a network]
- Two kinds [1]
  - Nonuniform Node, Uniform System (NNUS): every node combines hardware (Hw) and software (Sw) resources
  - Uniform Node, Nonuniform System (UNNS): separate Hw nodes and Sw nodes

Software nature of applications
- Historically, systems were software only
  - Over time, hardware resources were added
  - Hardware is static while software is flexible
  - Applications are still mainly software, even if some particular computations are hardware accelerated
- Reconfigurable systems
  - Add more flexibility to hardware elements
  - But application design did not change: software-based, with some computations offloaded to hardware

Applications linked to execution platform
- HPRCs
  - The ability to use hardware resources depends on the ratio of hardware to software resources
    - UNNSs are more flexible than NNUSs
  - Communication performance between software and hardware depends on the underlying buses
    - A platform change can cause performance to collapse
  [Figure: NNUS (each node combines Hw and Sw) versus UNNS (separate Hw nodes and Sw nodes)]
- Applications are thus deeply linked to the underlying hardware
  - Changing the execution platform of a legacy application can force a partial rewrite of the application
    - To maintain performance
    - Or even to make the application compatible

Our proposal: applications
- Build applications as sets of kernels
  - Kernels linked by data flows
  - A kernel specifies what to do, not how to do it
[Figure: Data in (MPEG flow) -> Kernel 1 (MPEG2 decoder) -> Kernel 2 (encoder: audio encoder OR H264 encoder) -> Data out flow]
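
The idea of an application as kernels linked by data flows can be sketched as a small graph structure. This is an illustration only: the `Kernel` and `Application` classes are assumptions, and the kernel names are borrowed from the slide's MPEG example rather than from any SPORE interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Kernel:
    """What to compute; says nothing about how (hypothetical class)."""
    name: str


@dataclass
class Application:
    """A set of kernels plus the data flows linking them (hypothetical)."""
    kernels: list = field(default_factory=list)
    flows: list = field(default_factory=list)  # (producer, consumer) pairs

    def add_flow(self, producer: Kernel, consumer: Kernel) -> None:
        for k in (producer, consumer):
            if k not in self.kernels:
                self.kernels.append(k)
        self.flows.append((producer, consumer))

    def execution_order(self) -> list:
        """Return kernel names so that each producer precedes its consumers."""
        pending = {k: 0 for k in self.kernels}
        for _, consumer in self.flows:
            pending[consumer] += 1
        ready = [k for k in self.kernels if pending[k] == 0]
        order = []
        while ready:
            k = ready.pop(0)
            order.append(k.name)
            for producer, consumer in self.flows:
                if producer == k:
                    pending[consumer] -= 1
                    if pending[consumer] == 0:
                        ready.append(consumer)
        if len(order) != len(self.kernels):
            raise ValueError("data flows contain a cycle")
        return order


decoder = Kernel("MPEG2 decoder")
encoder = Kernel("H264 encoder")
app = Application()
app.add_flow(decoder, encoder)
print(app.execution_order())
```

Because the graph only names kernels, swapping the H264 encoder for an audio encoder changes one node and leaves the rest of the application description untouched.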

Our proposal: kernels
- Kernel implementation is handled independently from the application
- Each kernel can have various implementations, hardware and/or software
[Figure: a kernel with Implementation 1 and Implementation 2, each comprising a bitstream, an initial context and accessors]
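
One way to picture a kernel with several interchangeable implementations is the sketch below. The fields `bitstream`, `initial_context` and `accessors` mirror the labels on the slide; the classes, the `target` field, the file name and the selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Implementation:
    name: str
    target: str                      # "hardware" or "software" (assumed labels)
    bitstream: Optional[str] = None  # hypothetical file name, hardware only
    initial_context: dict = field(default_factory=dict)
    accessors: dict = field(default_factory=dict)


@dataclass
class KernelDescription:
    name: str
    implementations: list

    def pick(self, available_targets: set) -> Implementation:
        """Return the first implementation the platform can host."""
        for impl in self.implementations:
            if impl.target in available_targets:
                return impl
        raise LookupError(f"no implementation of {self.name} fits this platform")


aes = KernelDescription("aes", [
    Implementation("aes_hw", "hardware", bitstream="aes_hw.bit"),
    Implementation("aes_sw", "software"),
])
print(aes.pick({"software"}).name)
```

The application refers only to `aes`; which implementation runs is decided against the platform at hand.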

Our proposal: execution platform
- The platform
  - Several nodes connected through a network
- The nodes
  - A host cell, in charge of inter-node communication
  - Several computing cells
- The computing cells
  - Reconfigurable
- This is the Simple Parallel platform for Reconfigurable Environment (SPORE)

Application design
[Figure: an application made of Kernel 1, Kernel 2 and Kernel 3 with their implementations (Impl. 1, Impl. 2, Impl. 3); labels: Descriptors, Bitstream, Accessors]
- All these elements are described using XML files
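
As a rough sketch of what an XML-based description could look like, the example below parses a small descriptor with Python's standard `xml.etree.ElementTree`. The slides do not show the actual schema, so every element and attribute name here (`application`, `kernel`, `implementation`, `flow`, `bitstream`) is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: element and attribute names are invented here.
DESCRIPTOR = """
<application name="demo">
  <kernel name="kernel1">
    <implementation name="impl1" bitstream="impl1.bit"/>
    <implementation name="impl2" bitstream="impl2.bit"/>
  </kernel>
  <kernel name="kernel2">
    <implementation name="impl3" bitstream="impl3.bit"/>
  </kernel>
  <flow from="kernel1" to="kernel2"/>
</application>
"""


def load_application(xml_text: str) -> dict:
    """Turn the descriptor into plain dictionaries and lists."""
    root = ET.fromstring(xml_text.strip())
    kernels = {
        k.get("name"): [i.get("name") for i in k.findall("implementation")]
        for k in root.findall("kernel")
    }
    flows = [(f.get("from"), f.get("to")) for f in root.findall("flow")]
    return {"name": root.get("name"), "kernels": kernels, "flows": flows}


app = load_application(DESCRIPTOR)
print(app["kernels"])
print(app["flows"])
```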

Accessors
- Kernels need different actions
  - Context
  - Passing input data
  - Retrieving results
- Same kernel, various implementations
  - The same elements are needed
  - But the way to provide them can differ
  - E.g. to start a computation
    - Implementation 1 requires writing the value 0x in register 2
    - Implementation 2 requires writing the value 0x in register 4
- Accessors
  - Sets of specific interactions to execute in order to perform a particular action
  - Read / Write
  - Registers / Memory range / FIFO
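
The accessor idea, one logical action mapped to an implementation-specific sequence of reads and writes, might be modelled as follows. This is a minimal sketch covering registers only (no memory ranges or FIFOs). `Op` and `RegisterFile` are hypothetical names, and the written value `0x1` is a placeholder, not the value from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One elementary interaction: a read or a write on a register."""
    kind: str       # "read" or "write"
    register: int
    value: int = 0  # only meaningful for writes


class RegisterFile:
    """Stand-in for a computing cell's registers (no real hardware access)."""

    def __init__(self, size: int = 8):
        self.regs = [0] * size

    def run(self, accessor: list) -> list:
        """Execute the accessor's operations in order; return the values read."""
        results = []
        for op in accessor:
            if op.kind == "write":
                self.regs[op.register] = op.value
            elif op.kind == "read":
                results.append(self.regs[op.register])
            else:
                raise ValueError(f"unknown operation {op.kind!r}")
        return results


# Same action ("start"), two implementations, two different register writes.
# 0x1 is a placeholder value chosen for this example only.
START_IMPL1 = [Op("write", register=2, value=0x1)]
START_IMPL2 = [Op("write", register=4, value=0x1)]

cell = RegisterFile()
cell.run(START_IMPL1)
print(cell.regs[2], cell.regs[4])
```

The caller asks for the "start" action and never needs to know which register a given implementation expects.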

SPORE: 1st implementation [2]
- Purpose
  - Propose an HPC-like platform that can evolve toward reconfigurable architectures
  - Proof of concept on an actual implementation
- Architecture based on HPCs
  - Globally distributed, locally shared memory architecture
  - MPI for communication
  - Software-only execution units
- Based on the Xilinx ML507 board
  - Virtex-5 FX70T FPGA
    - Contains a PowerPC
  - MiB DDR2
  - Ethernet interface
  - CompactFlash reader

SPORE: 1st implementation [2]
[Figure: node diagram; recoverable labels: Node, Linux]

SPORE: 1st implementation results [2]
[Plot: communication time versus number of jobs per board]

SPORE: 2nd implementation
- Proof of concept for the application development flow
- Includes reconfigurable hardware
- Dynamic scheduling of kernels
- Bus-based communication
  - Tries to reduce memory issues
- No MPI communication

SPORE: 2nd implementation
[Figure: platform architecture. A data server and a global scheduler reach the nodes over an Ethernet network. Each node has a host cell (OS, local scheduler, storage manager with local storage, cell manager, and reconfiguration manager using FARM, a superset of Xilinx's ICAP [3]) connected by a bus to kernel controllers in the computing cells, which host kernels and threads.]

SPORE: 2nd implementation, control
- Linux-based control
  - Scheduler
  - Data server communication
- Dynamic reconfiguration
  - Linux driver for FARM
- Cells management (accessors)
  - Linux driver for cells
  - XML parsing
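
To make the interplay between scheduling and dynamic reconfiguration concrete, here is a deliberately naive sketch of a local scheduler placing kernels on cells. It does not describe the SPORE scheduler: the `Cell` and `LocalScheduler` classes, the bitstream file names and the policy (reuse an idle cell that already holds the bitstream, otherwise reconfigure the first idle cell) are all assumptions.

```python
class Cell:
    """A reconfigurable computing cell (hypothetical model)."""

    def __init__(self, cell_id: int):
        self.cell_id = cell_id
        self.loaded = None   # name of the bitstream currently configured
        self.busy = False


class LocalScheduler:
    def __init__(self, cells: list):
        self.cells = cells
        self.reconfigurations = 0

    def dispatch(self, bitstream: str) -> Cell:
        """Place a kernel on an idle cell, reconfiguring only when needed."""
        idle = [c for c in self.cells if not c.busy]
        if not idle:
            raise RuntimeError("no idle cell")
        # Prefer an idle cell that already holds the right bitstream.
        cell = next((c for c in idle if c.loaded == bitstream), idle[0])
        if cell.loaded != bitstream:
            cell.loaded = bitstream   # stands in for a partial reconfiguration
            self.reconfigurations += 1
        cell.busy = True
        return cell

    def release(self, cell: Cell) -> None:
        cell.busy = False


sched = LocalScheduler([Cell(0), Cell(1)])
a = sched.dispatch("aes_encrypt.bit")
b = sched.dispatch("aes_decrypt.bit")
sched.release(a)
c = sched.dispatch("aes_encrypt.bit")
print(c.cell_id, sched.reconfigurations)
```

The third dispatch lands on the cell that is already configured, so only two reconfigurations are counted; avoiding needless reconfiguration is the kind of concern a real scheduling policy would have to weigh.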

SPORE: 2nd implementation results
- Only basic tests performed so far
  - Simple AES encrypt-then-decrypt application
  - 2 channels in parallel
- Fully functional for this basic test
  - XML-based application description
  - Hardware kernel reconfiguration
  - Kernel configuration and other accessors

Conclusion
- SPORE is a platform virtualization tool
  - Can be adapted to any underlying reconfigurable hardware
- Preliminary SPORE implementations are working
  - General application flow validated
- A complete SPORE implementation is still needed
  - Flow and HPC
  - Software and hardware

Future work
- 2nd platform
  - Perform further tests
  - Application containing more kernels
  - Video (H264)
- 3rd platform
  - Include elements from both previous platforms
  - Software AND reconfigurable hardware
  - MPI communication
- Improvements
  - NoC-based communication
  - Scheduling
    - No real algorithm for now

And why not Xilinx's Zynq?
[Figure: Linux alongside several cells]

Questions

Bibliography
[1] Tarek El-Ghazawi, Esam El-Araby, Miaoqing Huang, Kris Gaj, Volodymyr Kindratenko, and Duncan Buell. The promise of High-Performance Reconfigurable Computing. Computer, 41:69-76, February 2008.
[2] Clément Foucher, Fabrice Muller, and Alain Giulieri. Exploring FPGAs capability to host a HPC design. 28th Norchip Conference (Norchip 2010), pages 1-4, Tampere, Finland, November 2010.
[3] François Duhem, Fabrice Muller, and Philippe Lorenzini. FaRM: Fast reconfiguration manager for reducing reconfiguration time overhead on FPGA. 7th International Symposium on Applied Reconfigurable Computing (ARC 2011), Belfast, United Kingdom, March 2011.


More information

Energiatehokas laskenta Ubi-sovelluksissa

Energiatehokas laskenta Ubi-sovelluksissa Energiatehokas laskenta Ubi-sovelluksissa Jarmo Takala Tampereen teknillinen yliopisto Tietokonetekniikan laitos email: jarmo.takala@tut.fi Energy-Efficiency Comparison: VGA 30 frames/s, 512kbit/s Software

More information

Energy-aware job scheduler for highperformance

Energy-aware job scheduler for highperformance Energy-aware job scheduler for highperformance computing 7.9.2011 Olli Mämmelä (VTT), Mikko Majanen (VTT), Robert Basmadjian (University of Passau), Hermann De Meer (University of Passau), André Giesler

More information

Review from last time. CS 537 Lecture 3 OS Structure. OS structure. What you should learn from this lecture

Review from last time. CS 537 Lecture 3 OS Structure. OS structure. What you should learn from this lecture Review from last time CS 537 Lecture 3 OS Structure What HW structures are used by the OS? What is a system call? Michael Swift Remzi Arpaci-Dussea, Michael Swift 1 Remzi Arpaci-Dussea, Michael Swift 2

More information

Linux Process Scheduling Policy

Linux Process Scheduling Policy Lecture Overview Introduction to Linux process scheduling Policy versus algorithm Linux overall process scheduling objectives Timesharing Dynamic priority Favor I/O-bound process Linux scheduling algorithm

More information

Solid State Storage in Massive Data Environments Erik Eyberg

Solid State Storage in Massive Data Environments Erik Eyberg Solid State Storage in Massive Data Environments Erik Eyberg Senior Analyst Texas Memory Systems, Inc. Agenda Taxonomy Performance Considerations Reliability Considerations Q&A Solid State Storage Taxonomy

More information

COS 318: Operating Systems. Virtual Machine Monitors

COS 318: Operating Systems. Virtual Machine Monitors COS 318: Operating Systems Virtual Machine Monitors Kai Li and Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall13/cos318/ Introduction u Have

More information

HP StorageWorks MPX200 Simplified Cost-Effective Virtualization Deployment

HP StorageWorks MPX200 Simplified Cost-Effective Virtualization Deployment HP StorageWorks MPX200 Simplified Cost-Effective Virtualization Deployment Executive Summary... 2 HP StorageWorks MPX200 Architecture... 2 Server Virtualization and SAN based Storage... 3 VMware Architecture...

More information

Hitachi Virtage Embedded Virtualization Hitachi BladeSymphony 10U

Hitachi Virtage Embedded Virtualization Hitachi BladeSymphony 10U Hitachi Virtage Embedded Virtualization Hitachi BladeSymphony 10U Datasheet Brings the performance and reliability of mainframe virtualization to blade computing BladeSymphony is the first true enterprise-class

More information

Programming and Scheduling Model for Supporting Heterogeneous Architectures in Linux

Programming and Scheduling Model for Supporting Heterogeneous Architectures in Linux Programming and Scheduling Model for Supporting Heterogeneous Architectures in Linux Third Workshop on Computer Architecture and Operating System co-design Paris, 25.01.2012 Tobias Beisel, Tobias Wiersema,

More information

How Solace Message Routers Reduce the Cost of IT Infrastructure

How Solace Message Routers Reduce the Cost of IT Infrastructure How Message Routers Reduce the Cost of IT Infrastructure This paper explains how s innovative solution can significantly reduce the total cost of ownership of your messaging middleware platform and IT

More information

Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC. Wenhao Wu Program Manager Windows HPC team

Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC. Wenhao Wu Program Manager Windows HPC team Accelerating From Cluster to Cloud: Overview of RDMA on Windows HPC Wenhao Wu Program Manager Windows HPC team Agenda Microsoft s Commitments to HPC RDMA for HPC Server RDMA for Storage in Windows 8 Microsoft

More information

Developing reliable Multi-Core Embedded-Systems with NI Linux Real-Time

Developing reliable Multi-Core Embedded-Systems with NI Linux Real-Time Developing reliable Multi-Core Embedded-Systems with NI Linux Real-Time Oliver Bruder National Instruments Switzerland oliver.bruder@ Embedded Product Design Surveys 66% Product designs complete over budget

More information

EDUCATION. PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation

EDUCATION. PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation PCI Express, InfiniBand and Storage Ron Emerick, Sun Microsystems Paul Millard, Xyratex Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies

More information

Mellanox Academy Online Training (E-learning)

Mellanox Academy Online Training (E-learning) Mellanox Academy Online Training (E-learning) 2013-2014 30 P age Mellanox offers a variety of training methods and learning solutions for instructor-led training classes and remote online learning (e-learning),

More information

EMC ISILON AND ELEMENTAL SERVER

EMC ISILON AND ELEMENTAL SERVER Configuration Guide EMC ISILON AND ELEMENTAL SERVER Configuration Guide for EMC Isilon Scale-Out NAS and Elemental Server v1.9 EMC Solutions Group Abstract EMC Isilon and Elemental provide best-in-class,

More information

Cray DVS: Data Virtualization Service

Cray DVS: Data Virtualization Service Cray : Data Virtualization Service Stephen Sugiyama and David Wallace, Cray Inc. ABSTRACT: Cray, the Cray Data Virtualization Service, is a new capability being added to the XT software environment with

More information

Virtual Machine Monitors. Dr. Marc E. Fiuczynski Research Scholar Princeton University

Virtual Machine Monitors. Dr. Marc E. Fiuczynski Research Scholar Princeton University Virtual Machine Monitors Dr. Marc E. Fiuczynski Research Scholar Princeton University Introduction Have been around since 1960 s on mainframes used for multitasking Good example VM/370 Have resurfaced

More information

GATEWAY TRAFFIC COMPRESSION

GATEWAY TRAFFIC COMPRESSION GATEWAY TRAFFIC COMPRESSION Name: Devaraju. R Guide Name: Dr. C. Puttamadappa Research Centre: S.J.B. Institute of Technology Year of Registration: May 2009 Devaraju R 1 1. ABSTRACT: In recent years with

More information

Managing Variability in Software Architectures 1 Felix Bachmann*

Managing Variability in Software Architectures 1 Felix Bachmann* Managing Variability in Software Architectures Felix Bachmann* Carnegie Bosch Institute Carnegie Mellon University Pittsburgh, Pa 523, USA fb@sei.cmu.edu Len Bass Software Engineering Institute Carnegie

More information

A Distributed Render Farm System for Animation Production

A Distributed Render Farm System for Animation Production A Distributed Render Farm System for Animation Production Jiali Yao, Zhigeng Pan *, Hongxin Zhang State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, China {yaojiali, zgpan, zhx}@cad.zju.edu.cn

More information

The Design and Implementation of Content Switch On IXP12EB

The Design and Implementation of Content Switch On IXP12EB The Design and Implementation of Content Switch On IXP12EB Thesis Proposal by Longhua Li Computer Science Department University of Colorado at Colorado Springs 5/15/2001 Approved by: Dr. Edward Chow (Advisor)

More information

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Whitepaper December 2012 Anita Banerjee Contents Introduction... 3 Sorenson Squeeze... 4 Intel QSV H.264... 5 Power Performance...

More information

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces White Paper Introduction The DDR3 SDRAM memory architectures support higher bandwidths with bus rates of 600 Mbps to 1.6 Gbps (300 to 800 MHz), 1.5V operation for lower power, and higher densities of 2

More information

Recent Advances in Circuits, Communications and Signal Processing

Recent Advances in Circuits, Communications and Signal Processing Tarek FRIKHA, Nader BEN AMOR, Mohamed Ramzi Ben Yemna, Jean-Philippe DIGUET*, Mohamed ABID CES-Laboratory,Lab-STICC* Sfax University, National Engineering School of Sfax, Sfax, TUNISIE Université de Bretagne

More information

How A V3 Appliance Employs Superior VDI Architecture to Reduce Latency and Increase Performance

How A V3 Appliance Employs Superior VDI Architecture to Reduce Latency and Increase Performance How A V3 Appliance Employs Superior VDI Architecture to Reduce Latency and Increase Performance www. ipro-com.com/i t Contents Overview...3 Introduction...3 Understanding Latency...3 Network Latency...3

More information

Switch Fabric Implementation Using Shared Memory

Switch Fabric Implementation Using Shared Memory Order this document by /D Switch Fabric Implementation Using Shared Memory Prepared by: Lakshmi Mandyam and B. Kinney INTRODUCTION Whether it be for the World Wide Web or for an intra office network, today

More information

The virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers.

The virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers. White Paper Virtualized SAP: Optimize Performance with Cisco Data Center Virtual Machine Fabric Extender and Red Hat Enterprise Linux and Kernel-Based Virtual Machine What You Will Learn The virtualization

More information

COS 318: Operating Systems. Virtual Machine Monitors

COS 318: Operating Systems. Virtual Machine Monitors COS 318: Operating Systems Virtual Machine Monitors Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall10/cos318/ Introduction Have been around

More information

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture. Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture. Chirag Gupta,Sumod Mohan K cgupta@clemson.edu, sumodm@clemson.edu Abstract In this project we propose a method to improve

More information

Cryptanalysis with a cost-optimized FPGA cluster

Cryptanalysis with a cost-optimized FPGA cluster Cryptanalysis with a cost-optimized FPGA cluster Jan Pelzl, Horst Görtz Institute for IT-Security, Germany UCLA IPAM Workshop IV Special Purpose Hardware for Cryptography: Attacks and Applications December

More information

Improving Scalability for Citrix Presentation Server

Improving Scalability for Citrix Presentation Server VMWARE PERFORMANCE STUDY VMware ESX Server 3. Improving Scalability for Citrix Presentation Server Citrix Presentation Server administrators have often opted for many small servers (with one or two CPUs)

More information