Single-chip Cloud Computer IA Tera-scale Research Processor

1 Single-chip Cloud Computer: IA Tera-scale Research Processor. Jim Held, Intel Fellow & Director, Tera-scale Computing Research, Intel Labs. August 31,

2 Agenda: Tera-scale Research; SCC Architecture; Software environment; Co-travelers Program; Summary

3 Performance Scaling Challenges Energy Efficiency Design Complexity Programming Models Emerging Applications

4 Tera-scale Research
Applications: Identify, characterize & optimize
Programming: Empower the mainstream
System Software: Scalable services
Memory Hierarchy: Feed the compute engine
Interconnects: High bandwidth, low latency
Cores: power efficient, general & special function

5 Teraflops Research Processor
Die: 21.72mm x 12.64mm; single tile: 1.5mm x 2.0mm (die photo labels: PLL, I/O Area, TAP)
Technology: 65nm, 1 poly, 8 metal (Cu)
Transistors: 100 Million (full-chip), 1.2 Million (tile)
Die Area: 275mm² (full-chip), 3mm² (tile)
C4 bumps #: 8390
Goals:
Deliver Tera-scale performance: Single precision TFLOP at desktop power; Frequency target 5GHz; Bi-section B/W order of Terabits/s; Link bandwidth in hundreds of GB/s
Prototype two key technologies: On-die interconnect fabric; 3D stacked memory
Develop a scalable design methodology: Tiled design approach; Mesochronous clocking; Power-aware capability

6 Within-Die Variation-Aware DVFS and scheduling
Max frequency variation per core: 28% at 1.2V, 62% at 0.8V
No correlation die to die: individual characterization required
Improved performance or energy efficiency with: Multiple frequency islands; Dynamic scheduling of processing to core
Dighe, S., et al., Within-Die Variation-Aware Dynamic Voltage-Frequency Scaling, Core Mapping and Thread Hopping for an 80-Core Processor, in Proceedings of ISSCC 2010 (IEEE International Solid-State Circuits Conference), Feb. 2010

7 Cloud Computing Today
Cloud datacenters: 1000s of networked computers; Millions of threads & petabytes of data
Opportunity: Lower power, higher density via integration; Greater efficiency and better programmability
Example: Intel's Open Cirrus testbed, Intel Labs Pittsburgh
[Network diagram: 45 Mb/s T3 to Internet; per-link multiplicity labels not reproduced]

8 Motivations for SCC
Many-core processor research: High-performance power-efficient fabric; Fine-grain power management; Message-based programming support
Parallel Programming research: Better support for scale-out server model (Operating system, communication architecture); Scale-out programming model for client (Programming languages, runtimes)
An experimental processor, not a product!

9 Single-chip Cloud Computer Experimental Processor
Die: 26.5mm x 21.4mm; tile: 5.2mm x 3.6mm
[Die photo labels: TILE, L2$0, Core0, L2$1, Core1, Router, MPB, DDR3 MC (x4), PLL, JTAG, VRC, System Interface + I/O]
Technology: 45nm Hi-K CMOS
Interconnect: 9 Metal (Cu)
Transistors: Die: 1.3B, Tile: 48M
Tile Area: 18.7mm²
Die Area: 567.1mm²

10 Architectural Overview
2nd Generation Intel Labs experimental processor
IA-based software research vehicle
Cluster-on-die architecture
48 Pentium Processor cores (P54C - x87fp only)
[Block diagram labels: Memory Controller (x4); tile with L2$0, Core 0, L2$1, Core 1, Router, MPB; System I/F]
Howard, J., et al., A 48-Core IA-32 Message-Passing Processor with DVFS in 45nm CMOS, in Proceedings of ISSCC 2010 (IEEE International Solid-State Circuits Conference), Feb. 2010

11 On-die Interconnect Architecture
6x4 2D Mesh NOC: 16B wide data links + 2B sideband; 8 Virtual Channels in 2 classes; Fixed (X-Y) routing
Performance: Target freq: 1.1V; Link Bandwidth 64GB/s; 4 cycle latency
Power Management: Independent Frequency & Voltage control; Sleep mode, clock gating, low power
[Plot: Freq (GHz) vs. Supply (V) for Router and Core, 50°C; point labels: 60MHz, 0.73V 300MHz, 0.94V 0.9GHz, 0.94V 1.4GHz, 1.32V 1.3GHz, 1.34V 2.6GHz]
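
The slide states only that the 6x4 mesh uses fixed (X-Y) routing. As a minimal sketch of what dimension-ordered routing means, the Python below walks a packet along X first and then along Y. The (x, y) coordinate convention and the function name `xy_route` are assumptions made for illustration; they are not taken from the SCC documentation.

```python
def xy_route(src, dst, cols=6, rows=4):
    """Return the (x, y) routers visited from src to dst, X dimension first.

    The 6x4 default matches the mesh size on the slide; the coordinate
    numbering is a hypothetical convention chosen for this sketch.
    """
    for x, y in (src, dst):
        if not (0 <= x < cols and 0 <= y < rows):
            raise ValueError(f"({x}, {y}) is outside the {cols}x{rows} mesh")

    x, y = src
    dx, dy = dst
    path = [(x, y)]

    # Travel along the X dimension until the destination column is reached.
    step = 1 if dx > x else -1
    while x != dx:
        x += step
        path.append((x, y))

    # Then travel along the Y dimension to the destination row.
    step = 1 if dy > y else -1
    while y != dy:
        y += step
        path.append((x, y))

    return path


if __name__ == "__main__":
    route = xy_route((0, 0), (5, 3))
    print(route, "hops:", len(route) - 1)
```

Because the route depends only on source and destination, every packet between the same pair of routers follows the same path, which is what "fixed" routing implies.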

12 Memory Architecture
Memory: Up to 64GB DDR3 via 4 memory 21.3GB/s; 16KB SRAM in each tile as Message Passing Buffer (MPB)
Caching: 32KB L1 per core (16KB I,D), 12MB L2 cache (256KB/core); No HW cache-coherent shared memory
Addressing: Core physical to system physical addresses in 16MB sections; Memory mapped configuration & control registers
[Diagram: each Core Physical Address Space passes through a Physical-Physical Mapping into the System Physical Address Space]
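
The slide says core physical addresses are mapped to system physical addresses in 16MB sections. A minimal sketch of such a section-based translation follows. It assumes a 32-bit core physical address (the P54C is a 32-bit core), which gives 256 sections of 16MB each. The table contents, the dict representation, and the name `translate` are hypothetical and only illustrate the idea.

```python
SECTION_BITS = 24                      # 16MB = 2**24 bytes
SECTION_SIZE = 1 << SECTION_BITS
CORE_ADDR_BITS = 32                    # assumption: 32-bit core physical addresses
NUM_SECTIONS = (1 << CORE_ADDR_BITS) // SECTION_SIZE   # 256 sections


def translate(core_addr, table):
    """Map a core physical address to a system physical address.

    `table` is a hypothetical per-core mapping from section index to the
    system physical base address of that 16MB section.
    """
    if not 0 <= core_addr < (1 << CORE_ADDR_BITS):
        raise ValueError("core address out of range")
    section = core_addr >> SECTION_BITS          # which 16MB section
    offset = core_addr & (SECTION_SIZE - 1)      # position inside the section
    if section not in table:
        raise KeyError(f"section {section} is not mapped")
    return table[section] + offset


if __name__ == "__main__":
    # Illustrative table: section 0 of this core lands at 16GB in system memory.
    example_table = {0: 0x4_0000_0000}
    print(hex(translate(0x0012_3456, example_table)))
```

A per-core table of this kind lets different cores see different 16MB windows of the same system memory, which fits a design without hardware cache coherence.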

13 Power Management
Configurable MC, Mesh, SIF Voltage & Frequency
Software-controlled DVFS* of cores: Fine-grain voltage control at 4 tile cluster level (6.25mV); Frequency control at tile level (16bit divider)
Closed loop: thermal sensors per tile, current through BMC
DVFS gives wide operating range: 1.14V 1GHz to 0.7V 125MHz
[Diagram labels: Memory Controller (x4), System I/F, voltage (V) and frequency (F) domains]
*Dynamic voltage and frequency scaling
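
To give a feel for the operating range quoted on this slide, the sketch below does two small calculations. First, it rounds a requested voltage up to a multiple of 6.25mV, reading the slide's "(6.25mV)" as the voltage step size, which is an assumption. Second, it estimates relative dynamic power with the textbook approximation P ∝ V²·f; that model is not from the slide and ignores leakage. Both function names are hypothetical.

```python
STEP_UV = 6250  # 6.25 mV in microvolts; assumed to be the voltage step size


def quantize_voltage_uv(requested_uv):
    """Round a requested voltage (in microvolts) up to the next 6.25 mV step."""
    return -(-requested_uv // STEP_UV) * STEP_UV   # integer ceiling division


def relative_dynamic_power(v, f, v_ref, f_ref):
    """Dynamic power relative to a reference point, assuming P ~ V^2 * f."""
    return (v / v_ref) ** 2 * (f / f_ref)


if __name__ == "__main__":
    # The two ends of the range on the slide: 1.14V at 1GHz and 0.7V at 125MHz.
    ratio = relative_dynamic_power(0.7, 125e6, 1.14, 1e9)
    print(f"low point draws about {ratio:.1%} of the high point's dynamic power")
    print(quantize_voltage_uv(1_140_000), "uV is the next step at or above 1.14V")
```

Under this simple model, lowering both voltage and frequency compounds: the frequency drop alone gives 8x, and the voltage drop multiplies that by roughly a further 2.7x.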

14 Measured full chip power

15 Power breakdown
Full Power Breakdown (Cores-1GHz, Mesh-2GHz, 1.14V, 50°C), Total W: Cores 69%; Routers & 2D mesh 10%; Global Clocking 2%; MC & DDR %
Clocking: 1.9W; Routers: 12.1W; Cores: 87.7W; MCs: 23.6W
Low Power Breakdown (Cores-125MHz, Mesh-250MHz, 0.7V, 50°C), Total W: Cores 21%; Routers & 2D mesh 5%; Global Clocking 5%; MC & DDR %
Clocking: 1.2W; Routers: 1.2W; Cores: 5.1W; MCs: 17.2W
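
The per-component wattages on this slide can be summed and turned into shares directly. The sketch below does that arithmetic using only the four listed components for each operating point; the variable and function names are illustrative. Because the slide's watt values are rounded, the computed shares may differ by about a point from the percentages printed on the slide.

```python
# Component power in watts, as listed on the slide.
full_power = {"Clocking": 1.9, "Routers": 12.1, "Cores": 87.7, "MCs": 23.6}
low_power = {"Clocking": 1.2, "Routers": 1.2, "Cores": 5.1, "MCs": 17.2}


def breakdown(components):
    """Return (total watts, {component: percent of total})."""
    total = sum(components.values())
    shares = {name: 100.0 * watts / total for name, watts in components.items()}
    return total, shares


if __name__ == "__main__":
    for label, data in (("full", full_power), ("low", low_power)):
        total, shares = breakdown(data)
        print(f"{label}: total {total:.1f} W")
        for name, pct in shares.items():
            print(f"  {name}: {pct:.1f}%")
```

The comparison shows the cores dominating at the full-power point, while at the low-power point the memory controllers account for most of the remaining draw.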

16 Rocky Lake SCC platform
Replacement for evaluation board
100 boards with more I/O, more robust, less expensive
BIOS/Firmware in definition

17 SCC Chipset
System Interface FPGA: Connects to SCC Mesh interconnect; IO capabilities like PCIe, Ethernet & SATA; Bitstream is part of scckit distribution
Board Management Controller (BMC): JTAG interface for Clocking, Power etc.; USB Stick with FPGA bitstream; Network interface for User interaction via Telnet; Status monitoring; Firmware is part of scckit distribution

18 SCC Software
Software Environment: Bare Metal; Customized Linux; RCCE communication & power management API
Tools: Selected Intel tools (e.g., icc, ifort, ...); Microsoft research release of SCC extensions to Visual Studio
Management Console PC Software: PCIe driver with integrated TCP/IP driver; Programming API for communication with SCC platform; GUI for interaction with SCC platform; Command line tools for interaction with SCC platform

19 RCCE Communication API
A compact, lightweight communication environment.
SCC and RCCE were designed together side by side: a true HW/SW co-design project.
A research vehicle to understand how message passing APIs map onto many-core chips.
For experienced parallel programmers willing to work close to the hardware.
Static SPMD Execution Model: identical UEs created together when a program starts (this is a standard approach familiar to message passing programmers).
UE: Unit of Execution, a software entity that advances a program counter (e.g., process or thread).
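
The static SPMD model described on this slide can be illustrated with a small, self-contained sketch. It is not the RCCE API: the names `run_spmd`, `send`, `recv`, and `ring_body` are hypothetical, Python threads stand in for units of execution, and `queue.Queue` objects stand in for per-UE message buffers. What it shows is the shape of the model: every UE runs the same function, all UEs are created together, and they differ only by rank.

```python
import queue
import threading


def run_spmd(num_ues, body):
    """Start num_ues identical UEs together and collect their return values."""
    inboxes = [queue.Queue() for _ in range(num_ues)]   # one mailbox per UE
    results = [None] * num_ues

    def ue(rank):
        def send(dest, payload):
            inboxes[dest].put((rank, payload))

        def recv():
            # Blocks until a message arrives; the timeout avoids hanging forever.
            return inboxes[rank].get(timeout=5)

        results[rank] = body(rank, num_ues, send, recv)

    threads = [threading.Thread(target=ue, args=(r,)) for r in range(num_ues)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def ring_body(rank, size, send, recv):
    """Every UE sends rank squared to its right neighbour, then receives once."""
    send((rank + 1) % size, rank * rank)
    sender, value = recv()
    return sender, value


if __name__ == "__main__":
    print(run_spmd(4, ring_body))
```

The number of UEs is fixed at launch and never changes during the run, which is the "static" part of the model; the rank is the only thing that distinguishes one UE's behaviour from another's.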

20 SCC Disclosure Demos: Financial Analytics w/ shared virtual memory; Microsoft Visual Studio; Advanced Power Management; JavaScript Physics Modeling; HPC Parallel Workloads; Hadoop Web Search

21 SCC Co-Travelers Program
Currently building SCC software research community
100 systems total, with 40 in Oregon Datacenter
Research partners for 2010 have been selected
SCC community website available today: Communities.intel.com/community/marc
To share ideas, HowTos, code, tools

22 Summary
SCC provides a unique experimental platform for many-core research: Better support for Cloud data center servers; Scale-out programming model for client
We are sharing SCC with selected researchers in academia and industry
Documentation and presentations

23 SCC Team: Jason Howard, Saurabh Dighe, Yatin Hoskote, Sriram Vangal, David Finan, Gregory Ruhl, David Jenkins, Howard Wilson, Nitin Borkar, Gerhard Schrom, Fabrice Pailet, Shailendra Jain, Tiju Jacob, Satish Yada, Sraven Marella, Praveen Salihundam, Vasantha Erraguntla, Michael Konow, Michael Riepen, Guido Droege, Joerg Lindemann, Matthias Gries, Thomas Apel, Kersten Henriss, Tor Lund-Larsen, Sebastian Steibl, Shekhar Borkar, Vivek De, Rob Van Der Wijngaart, Timothy Mattson

24 Questions?


Single-chip Cloud Computer A many-core research platform from Intel Labs

Single-chip Cloud Computer A many-core research platform from Intel Labs Single-chip Cloud Computer A many-core research platform from Intel Labs Compute evolving to Tera- Scale Entertainment, Learning Performance TIPS GIPS MIPS KIPS Multimedia Model- Based Apps 3D and Video

More information

How To Build A Cloud Computer

How To Build A Cloud Computer Introducing the Singlechip Cloud Computer Exploring the Future of Many-core Processors White Paper Intel Labs Jim Held Intel Fellow, Intel Labs Director, Tera-scale Computing Research Sean Koehl Technology

More information

Early experience with the Barrelfish OS and the Single-Chip Cloud Computer

Early experience with the Barrelfish OS and the Single-Chip Cloud Computer Early experience with the Barrelfish OS and the Single-Chip Cloud Computer Simon Peter, Adrian Schüpbach, Dominik Menzi and Timothy Roscoe Systems Group, Department of Computer Science, ETH Zurich Abstract

More information

DEPLOYING AND MONITORING HADOOP MAP-REDUCE ANALYTICS ON SINGLE-CHIP CLOUD COMPUTER

DEPLOYING AND MONITORING HADOOP MAP-REDUCE ANALYTICS ON SINGLE-CHIP CLOUD COMPUTER DEPLOYING AND MONITORING HADOOP MAP-REDUCE ANALYTICS ON SINGLE-CHIP CLOUD COMPUTER ANDREAS-LAZAROS GEORGIADIS, SOTIRIOS XYDIS, DIMITRIOS SOUDRIS MICROPROCESSOR AND MICROSYSTEMS LABORATORY ELECTRICAL AND

More information

Comparing the Power and Performance of Intel s SCC to State-of-the-Art CPUs and GPUs

Comparing the Power and Performance of Intel s SCC to State-of-the-Art CPUs and GPUs Comparing the Power and Performance of Intel s SCC to State-of-the-Art CPUs and GPUs Ehsan Totoni, Babak Behzad, Swapnil Ghike, Josep Torrellas Department of Computer Science, University of Illinois at

More information

The 48-core SCC processor: the programmer s view

The 48-core SCC processor: the programmer s view 1 The 48-core SCC processor: the programmer s view Timothy G. Mattson, Rob F. Van der Wijngaart, Michael Riepen, Thomas Lehnig, Paul Brett, Werner Haas, Patrick Kennedy, Jason Howard, Sriram Vangal, Nitin

More information

Fast Fluid Dynamics on the Single-chip Cloud Computer

Fast Fluid Dynamics on the Single-chip Cloud Computer Fast Fluid Dynamics on the Single-chip Cloud Computer Marco Fais, Francesco Iorio High-Performance Computing Group Autodesk Research Toronto, Canada francesco.iorio@autodesk.com Abstract Fast simulation

More information

- Nishad Nerurkar. - Aniket Mhatre

- Nishad Nerurkar. - Aniket Mhatre - Nishad Nerurkar - Aniket Mhatre Single Chip Cloud Computer is a project developed by Intel. It was developed by Intel Lab Bangalore, Intel Lab America and Intel Lab Germany. It is part of a larger project,

More information

Seeking Opportunities for Hardware Acceleration in Big Data Analytics

Seeking Opportunities for Hardware Acceleration in Big Data Analytics Seeking Opportunities for Hardware Acceleration in Big Data Analytics Paul Chow High-Performance Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Toronto Who

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

Efficient Big Data Analytics Computing: A Research Challenge

Efficient Big Data Analytics Computing: A Research Challenge Efficient Big Data Analytics Computing: A Research Challenge Wilfred Pinfold Director, Extreme Scale Programs 1 Agenda Intel Big Data Context Overview Key Research Areas Challenges Partnerships 2 Meeting

More information

Intel Labs at ISSCC 2012. Copyright Intel Corporation 2012

Intel Labs at ISSCC 2012. Copyright Intel Corporation 2012 Intel Labs at ISSCC 2012 Copyright Intel Corporation 2012 Intel Labs ISSCC 2012 Highlights 1. Efficient Computing Research: Making the most of every milliwatt to make computing greener and more scalable

More information

Parallel sorting on Intel Single-Chip Cloud computer

Parallel sorting on Intel Single-Chip Cloud computer Parallel sorting on Intel Single-Chip Cloud computer Kenan Avdic, Nicolas Melot, Jörg Keller 2, and Christoph Kessler Linköpings Universitet, Dept. of Computer and Inf. Science, 5883 Linköping, Sweden

More information

Enabling Technologies for Distributed Computing

Enabling Technologies for Distributed Computing Enabling Technologies for Distributed Computing Dr. Sanjay P. Ahuja, Ph.D. Fidelity National Financial Distinguished Professor of CIS School of Computing, UNF Multi-core CPUs and Multithreading Technologies

More information

Digital Design for Low Power Systems

Digital Design for Low Power Systems Digital Design for Low Power Systems Shekhar Borkar Intel Corp. Outline Low Power Outlook & Challenges Circuit solutions for leakage avoidance, control, & tolerance Microarchitecture for Low Power System

More information

Accelerating the Data Plane With the TILE-Mx Manycore Processor

Accelerating the Data Plane With the TILE-Mx Manycore Processor Accelerating the Data Plane With the TILE-Mx Manycore Processor Bob Doud Director of Marketing EZchip Linley Data Center Conference February 25 26, 2015 1 Announcing the World s First 100-Core A 64-Bit

More information

A Scalable VISC Processor Platform for Modern Client and Cloud Workloads

A Scalable VISC Processor Platform for Modern Client and Cloud Workloads A Scalable VISC Processor Platform for Modern Client and Cloud Workloads Mohammad Abdallah Founder, President and CTO Soft Machines Linley Processor Conference October 7, 2015 Agenda Soft Machines Background

More information

Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Copyright 2013, Oracle and/or its affiliates. All rights reserved. 1 Oracle SPARC Server for Enterprise Computing Dr. Heiner Bauch Senior Account Architect 19. April 2013 2 The following is intended to outline our general product direction. It is intended for information

More information

Exploring the Intel Single-Chip Cloud Computer and its possibilities for SVP

Exploring the Intel Single-Chip Cloud Computer and its possibilities for SVP Master Computer Science Computer Science University of Amsterdam Exploring the Intel Single-Chip Cloud Computer and its possibilities for SVP Roy Bakker 0583650 bakkerr@science.uva.nl October 7, 2011 Supervisors:

More information

Enabling Technologies for Distributed and Cloud Computing

Enabling Technologies for Distributed and Cloud Computing Enabling Technologies for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Multi-core CPUs and Multithreading

More information

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM

A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM A Fast Inter-Kernel Communication and Synchronization Layer for MetalSVM Pablo Reble, Stefan Lankes, Carsten Clauss, Thomas Bemmerl Chair for Operating Systems, RWTH Aachen University Kopernikusstr. 16,

More information

New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC

New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC Alan Gara Intel Fellow Exascale Chief Architect Legal Disclaimer Today s presentations contain forward-looking

More information

Exascale Challenges and General Purpose Processors. Avinash Sodani, Ph.D. Chief Architect, Knights Landing Processor Intel Corporation

Exascale Challenges and General Purpose Processors. Avinash Sodani, Ph.D. Chief Architect, Knights Landing Processor Intel Corporation Exascale Challenges and General Purpose Processors Avinash Sodani, Ph.D. Chief Architect, Knights Landing Processor Intel Corporation Jun-93 Aug-94 Oct-95 Dec-96 Feb-98 Apr-99 Jun-00 Aug-01 Oct-02 Dec-03

More information

Parallel Programming Survey

Parallel Programming Survey Christian Terboven 02.09.2014 / Aachen, Germany Stand: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory

More information

Performance Evaluation of 2D-Mesh, Ring, and Crossbar Interconnects for Chip Multi- Processors. NoCArc 09

Performance Evaluation of 2D-Mesh, Ring, and Crossbar Interconnects for Chip Multi- Processors. NoCArc 09 Performance Evaluation of 2D-Mesh, Ring, and Crossbar Interconnects for Chip Multi- Processors NoCArc 09 Jesús Camacho Villanueva, José Flich, José Duato Universidad Politécnica de Valencia December 12,

More information

SeaMicro SM10000-64 Server

SeaMicro SM10000-64 Server SeaMicro SM10000-64 Server Building Datacenter Servers Using Cell Phone Chips Ashutosh Dhodapkar, Gary Lauterbach, Sean Lie, Ashutosh Dhodapkar, Gary Lauterbach, Sean Lie, Dhiraj Mallick, Jim Bauman, Sundar

More information

The search engine you can see. Connects people to information and services

The search engine you can see. Connects people to information and services The search engine you can see Connects people to information and services The search engine you cannot see Total data: ~1EB Processing data : ~100PB/day Total web pages: ~1000 Billion Web pages updated:

More information

Intel Xeon Processor E5-2600

Intel Xeon Processor E5-2600 Intel Xeon Processor E5-2600 Best combination of performance, power efficiency, and cost. Platform Microarchitecture Processor Socket Chipset Intel Xeon E5 Series Processors and the Intel C600 Chipset

More information

Invasive MPI on Intel s Single-Chip Cloud Computer

Invasive MPI on Intel s Single-Chip Cloud Computer Invasive MPI on Intel s Single-Chip Cloud Computer Isaías A. Comprés Ureña 1, Michael Riepen 2, Michael Konow 2, and Michael Gerndt 1 1 Technical University of Munich (TUM), Institute of Informatics, Boltzmannstr.

More information

The Transition to PCI Express* for Client SSDs

The Transition to PCI Express* for Client SSDs The Transition to PCI Express* for Client SSDs Amber Huffman Senior Principal Engineer Intel Santa Clara, CA 1 *Other names and brands may be claimed as the property of others. Legal Notices and Disclaimers

More information

Operating System Support for Multiprocessor Systems-on-Chip

Operating System Support for Multiprocessor Systems-on-Chip Operating System Support for Multiprocessor Systems-on-Chip Dr. Gabriel marchesan almeida Agenda. Introduction. Adaptive System + Shop Architecture. Preliminary Results. Perspectives & Conclusions Dr.

More information

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC Driving industry innovation The goal of the OpenPOWER Foundation is to create an open ecosystem, using the POWER Architecture to share expertise,

More information

760 Veterans Circle, Warminster, PA 18974 215-956-1200. Technical Proposal. Submitted by: ACT/Technico 760 Veterans Circle Warminster, PA 18974.

760 Veterans Circle, Warminster, PA 18974 215-956-1200. Technical Proposal. Submitted by: ACT/Technico 760 Veterans Circle Warminster, PA 18974. 760 Veterans Circle, Warminster, PA 18974 215-956-1200 Technical Proposal Submitted by: ACT/Technico 760 Veterans Circle Warminster, PA 18974 for Conduction Cooled NAS Revision 4/3/07 CC/RAIDStor: Conduction

More information

OpenSoC Fabric: On-Chip Network Generator

OpenSoC Fabric: On-Chip Network Generator OpenSoC Fabric: On-Chip Network Generator Using Chisel to Generate a Parameterizable On-Chip Interconnect Fabric Farzad Fatollahi-Fard, David Donofrio, George Michelogiannakis, John Shalf MODSIM 2014 Presentation

More information

Power-Aware High-Performance Scientific Computing

Power-Aware High-Performance Scientific Computing Power-Aware High-Performance Scientific Computing Padma Raghavan Scalable Computing Laboratory Department of Computer Science Engineering The Pennsylvania State University http://www.cse.psu.edu/~raghavan

More information

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

GPU System Architecture. Alan Gray EPCC The University of Edinburgh GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems

More information

ECLIPSE Performance Benchmarks and Profiling. January 2009

ECLIPSE Performance Benchmarks and Profiling. January 2009 ECLIPSE Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox, Schlumberger HPC Advisory Council Cluster

More information

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.

More information

Scaling from Datacenter to Client

Scaling from Datacenter to Client Scaling from Datacenter to Client KeunSoo Jo Sr. Manager Memory Product Planning Samsung Semiconductor Audio-Visual Sponsor Outline SSD Market Overview & Trends - Enterprise What brought us to NVMe Technology

More information

InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair

InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures Brian Sparks IBTA Marketing Working Group Co-Chair Page 1 IBTA & OFA Update IBTA today has over 50 members; OFA

More information

CFD Implementation with In-Socket FPGA Accelerators

CFD Implementation with In-Socket FPGA Accelerators CFD Implementation with In-Socket FPGA Accelerators Ivan Gonzalez UAM Team at DOVRES FuSim-E Programme Symposium: CFD on Future Architectures C 2 A 2 S 2 E DLR Braunschweig 14 th -15 th October 2009 Outline

More information

HP Z Turbo Drive PCIe SSD

HP Z Turbo Drive PCIe SSD Performance Evaluation of HP Z Turbo Drive PCIe SSD Powered by Samsung XP941 technology Evaluation Conducted Independently by: Hamid Taghavi Senior Technical Consultant June 2014 Sponsored by: P a g e

More information

A Multi-Level Routing Scheme and Router Architecture to support Hierarchical Routing in Large Network on Chip Platforms

A Multi-Level Routing Scheme and Router Architecture to support Hierarchical Routing in Large Network on Chip Platforms A Multi-Level Routing Scheme and Router Architecture to support Hierarchical Routing in Large Network on Chip Platforms Rickard Holsmark, Shashi Kumar and Maurizio Palesi 2 School of Engineering, Jönköping

More information

ICRI-CI Retreat Architecture track

ICRI-CI Retreat Architecture track ICRI-CI Retreat Architecture track Uri Weiser June 5 th 2015 - Funnel: Memory Traffic Reduction for Big Data & Machine Learning (Uri) - Accelerators for Big Data & Machine Learning (Ran) - Machine Learning

More information

Xeon+FPGA Platform for the Data Center

Xeon+FPGA Platform for the Data Center Xeon+FPGA Platform for the Data Center ISCA/CARL 2015 PK Gupta, Director of Cloud Platform Technology, DCG/CPG Overview Data Center and Workloads Xeon+FPGA Accelerator Platform Applications and Eco-system

More information

Stovepipes to Clouds. Rick Reid Principal Engineer SGI Federal. 2013 by SGI Federal. Published by The Aerospace Corporation with permission.

Stovepipes to Clouds. Rick Reid Principal Engineer SGI Federal. 2013 by SGI Federal. Published by The Aerospace Corporation with permission. Stovepipes to Clouds Rick Reid Principal Engineer SGI Federal 2013 by SGI Federal. Published by The Aerospace Corporation with permission. Agenda Stovepipe Characteristics Why we Built Stovepipes Cluster

More information

Sentinel-SSO: Full DDR-Bank Power and Signal Integrity. Design Automation Conference 2014

Sentinel-SSO: Full DDR-Bank Power and Signal Integrity. Design Automation Conference 2014 Sentinel-SSO: Full DDR-Bank Power and Signal Integrity Design Automation Conference 2014 1 Requirements for I/O DDR SSO Analysis Modeling Package and board I/O circuit and layout PI + SI feedback Tool

More information

FUJITSU Enterprise Product & Solution Facts

FUJITSU Enterprise Product & Solution Facts FUJITSU Enterprise Product & Solution Facts shaping tomorrow with you Business-Centric Data Center The way ICT delivers value is fundamentally changing. Mobile, Big Data, cloud and social media are driving

More information

Infrastructure Matters: POWER8 vs. Xeon x86

Infrastructure Matters: POWER8 vs. Xeon x86 Advisory Infrastructure Matters: POWER8 vs. Xeon x86 Executive Summary This report compares IBM s new POWER8-based scale-out Power System to Intel E5 v2 x86- based scale-out systems. A follow-on report

More information

DESIGN CHALLENGES OF TECHNOLOGY SCALING

DESIGN CHALLENGES OF TECHNOLOGY SCALING DESIGN CHALLENGES OF TECHNOLOGY SCALING IS PROCESS TECHNOLOGY MEETING THE GOALS PREDICTED BY SCALING THEORY? AN ANALYSIS OF MICROPROCESSOR PERFORMANCE, TRANSISTOR DENSITY, AND POWER TRENDS THROUGH SUCCESSIVE

More information

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture? This Unit: Putting It All Together CIS 501 Computer Architecture Unit 11: Putting It All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Amir Roth with contributions by Milo

More information

GPU Architecture. Michael Doggett ATI

GPU Architecture. Michael Doggett ATI GPU Architecture Michael Doggett ATI GPU Architecture RADEON X1800/X1900 Microsoft s XBOX360 Xenos GPU GPU research areas ATI - Driving the Visual Experience Everywhere Products from cell phones to super

More information

SERVER CLUSTERING TECHNOLOGY & CONCEPT

SERVER CLUSTERING TECHNOLOGY & CONCEPT SERVER CLUSTERING TECHNOLOGY & CONCEPT M00383937, Computer Network, Middlesex University, E mail: vaibhav.mathur2007@gmail.com Abstract Server Cluster is one of the clustering technologies; it is use for

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Cloud Computing II (Qloud) 15 319, spring 2010 3 rd Lecture, Jan 19 th Majd F. Sakr Lecture Motivation Introduction to a Data center Understand the Cloud hardware in CMUQ

More information

Scalability and Classifications

Scalability and Classifications Scalability and Classifications 1 Types of Parallel Computers MIMD and SIMD classifications shared and distributed memory multicomputers distributed shared memory computers 2 Network Topologies static

More information

Intel Itanium Architecture

Intel Itanium Architecture Intel Itanium Architecture Roadmap and Technology Update Dr. Gernot Hoyler Technical Marketing EMEA Intel Itanium Architecture Growth MARKET Over 3x revenue growth Y/Y* More than 10x growth* in shipments

More information

Fastboot Techniques for x86 Architectures. Marcus Bortel Field Application Engineer QNX Software Systems

Fastboot Techniques for x86 Architectures. Marcus Bortel Field Application Engineer QNX Software Systems Fastboot Techniques for x86 Architectures Marcus Bortel Field Application Engineer QNX Software Systems Agenda Introduction BIOS and BIOS boot time Fastboot versus BIOS? Fastboot time Customizing the boot

More information

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers The Orca Chip... Heart of IBM s RISC System/6000 Value Servers Ravi Arimilli IBM RISC System/6000 Division 1 Agenda. Server Background. Cache Heirarchy Performance Study. RS/6000 Value Server System Structure.

More information

Sun Constellation System: The Open Petascale Computing Architecture

Sun Constellation System: The Open Petascale Computing Architecture CAS2K7 13 September, 2007 Sun Constellation System: The Open Petascale Computing Architecture John Fragalla Senior HPC Technical Specialist Global Systems Practice Sun Microsystems, Inc. 25 Years of Technical

More information

LS DYNA Performance Benchmarks and Profiling. January 2009

LS DYNA Performance Benchmarks and Profiling. January 2009 LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The

More information

Achieving Performance Isolation with Lightweight Co-Kernels

Achieving Performance Isolation with Lightweight Co-Kernels Achieving Performance Isolation with Lightweight Co-Kernels Jiannan Ouyang, Brian Kocoloski, John Lange The Prognostic Lab @ University of Pittsburgh Kevin Pedretti Sandia National Laboratories HPDC 2015

More information

Open Cirrus: Towards an Open Source Cloud Stack

Open Cirrus: Towards an Open Source Cloud Stack Open Cirrus: Towards an Open Source Cloud Stack Karlsruhe Institute of Technology (KIT) HPC2010, Cetraro, June 2010 Marcel Kunze KIT University of the State of Baden-Württemberg and National Laboratory

More information

Kalray MPPA Massively Parallel Processing Array

Kalray MPPA Massively Parallel Processing Array Kalray MPPA Massively Parallel Processing Array Next-Generation Accelerated Computing February 2015 2015 Kalray, Inc. All Rights Reserved February 2015 1 Accelerated Computing 2015 Kalray, Inc. All Rights

More information
