HETEROGENEOUS SYSTEM COHERENCE FOR INTEGRATED CPU-GPU SYSTEMS

Size: px
Start display at page:

Download "HETEROGENEOUS SYSTEM COHERENCE FOR INTEGRATED CPU-GPU SYSTEMS"

Transcription

1 HETEROGENEOUS SYSTEM COHERENCE FOR INTEGRATED CPU-GPU SYSTEMS JASON POWER*, ARKAPRAVA BASU*, JUNLI GU, SOORAJ PUTHOOR, BRADFORD M BECKMANN, MARK D HILL*, STEVEN K REINHARDT, DAVID A WOOD* *University of Wisconsin-Madison Advanced Micro Devices, Inc.

2 Powerpoint version available on: 2 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

3 ABSTRACT Hardware coherence can increase the utility of heterogeneous systems Major bottlenecks in current coherence implementations High bandwidth difficult to support at directory Extreme resource requirements We propose Heterogeneous System Coherence Leverages spatial locality and region coherence Reduces bandwidth by 94% Reduces resource requirements by 95% 4 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

4 PHYSICAL INTEGRATION 5 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

5 PHYSICAL INTEGRATION 6 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

6 PHYSICAL INTEGRATION 7 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

7 PHYSICAL INTEGRATION Stacked High-bandwidth DRAM GPU CPU Credit: IBM Cores 8 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

8 LOGICAL INTEGRATION General-purpose GPU computing OpenCL CUDA Heterogeneous Uniform Memory Access (huma) Shared virtual address space Cache coherence Allows new heterogeneous apps 9 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

9 OUTLINE Motivation Background System overview Cache architecture reminder Heterogeneous System Bottlenecks Heterogeneous System Coherence Details Results Conclusions 10 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

10 SYSTEM OVERVIEW SYSTEM LEVEL Highbandwidth interconnect Accelerated Processing Unit (APU) DRAM Channels 11 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

11 SYSTEM OVERVIEW APU GPU compute accesses must stay coherent Direct-access bus (used for graphics) GPU Cluster APU CPU Cluster Directory Arrow thickness bandwidth Invalidation traffic To DRAM 12 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

12 SYSTEM OVERVIEW GPU CU L1 CU L1 CU L1 I-Fetch / Decode GPU Cluster Very high bandwidth: CU CU CU CU CU CU CU CU CU CU CU CU L2 has high Local miss Scratchpad rate L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 Register File CU Memory CU L1 Ex Ex Ex Ex Ex Ex Ex Ex Ex Ex Ex Ex GPU L2 Cache L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 L1 CU CU CU CU CU CU CU CU CU CU CU CU CU CU CU CU Ex Ex Ex Ex Coalescer To L1 13 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

13 SYSTEM OVERVIEW CPU Cluster CPU Core Low bandwidth: Low L2 miss rate L1 L2 CPU Core L1 To Dir L1 L1 CPU Core CPU Core 14 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

14 CACHE ARCHITECTURE REMINDER CPU/GPU L2 CACHE Demand Requests Core Data Responses Demand requests Cache Tag Arrays Searches cache tags from L1 Allocates cache an for MSHR a tag match Tag hit on probe: send MSHRs entry On a directory data to other core On a miss, send probe, check On request a hit, return to directory data to the L1 MSHR Entries Miss Requests Data MSHRs Hit and tags Responses Probe Requests Coherent Network Interface 15 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

15 DIRECTORY ARCHITECTURE REMINDER DIRECTORY Coherent Block Requests Demand Block requests Directory Searches Tag Array cache Block tags Probe Requests/ from L2 Allocates cache an for MSHR a tag match Responses On a miss, the entry data comes Allocate and send MSHRs from DRAM Probe Request RAM probes to L2 caches Hit MSHR Entries Miss PR Entries To DRAM 16 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

16 BACKGROUND SUMMARY System under investigation Heterogeneous CPU-GPU on chip High-bandwidth DRAM Directory pipeline complex MSHR array is associative Difficult to pipeline with more than 1 request per cycle Important resources: MSHR entries 17 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

17 OUTLINE Motivation Background Heterogeneous System Bottlenecks Simulation overview Directory bandwidth MSHRs Performance is significantly affected Heterogeneous System Coherence Details Results Conclusions 18 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

18 SIMULATION DETAILS gem5 simulator Simple CPU GPU simulator based on AMD GCN All memory requests through gem5 CPU Clock 2 GHz CPU Cores 2 CPU Shared L2 2 MB (16-way banked) GPU Clock 1 GHz Compute Units 32 GPU Shared L2 4 MB (64-way banked) L3 (Memory-side) 16 MB (16-way banked) DRAM DDR3, 16 channels Peak Bandwidth 700 GB/s Baseline Directory 256k entries (8-way banked) Workloads Modified to use huma Rodinia & AMD APP SDK 19 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

19 GPGPU BENCHMARKS Rodinia benchmarks bp trains the connection weights on a neural network bfs breadth-first search hs performs a transient 2D thermal simulation (5-point stencil) lud matrix decomposition nw performs a global optimization for DNA sequence alignment km does k-means clustering sd speckle-reducing anisotropic diffusion AMD SDK bn bitonic sort dct discrete cosine transform hg histogram mm matrix multiplication 20 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

20 SYSTEM BOTTLENECKS Difficult to scale directory bandwidth Difficult to multi-port GPU Complicated pipeline Cluster High resource usage APU CPU Cluster Must allocate MSHR for entire duration of request MSHR array difficult to scale Directory High bandwidth Designed to support CPU bandwidth To DRAM 21 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

21 DIRECTORY TRAFFIC 4.5 Directory accesses per GPU cycle Difficult to support >1 request per cycle bp bfs hs lud nw km sd bn dct hg mm 22 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

22 RESOURCE USAGE Maximum MSHRs Very difficult to scale MSHR array Steady state at 700 GB/s Causes significant back-pressure on L2s bp bfs hs lud nw km sd bn dct hg mm 23 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

23 PERFORMANCE OF BASELINE COMPARED TO UNCONSTRAINED RESOURCES Slow down Back-pressure from limited MSHRs and bandwidth bp bfs hs lud nw km sd bn dct hg mm 24 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

24 BOTTLENECKS SUMMARY Directory bandwidth Must support up to 4 requests per cycle Difficult to construct pipeline Resource usage MSHRs are a constraining resource Need more than 10,000 Without resource constraints, up to 4x better performance 25 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

25 OUTLINE Motivation Background Heterogeneous System Bottlenecks Heterogeneous System Coherence Details Overall system design Region buffer design Region directory design Example Hardware complexity Results Conclusions 26 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

26 BASELINE DIRECTORY COHERENCE GPU Cluster APU CPU Cluster Initialization Kernel Launch Directory Read result To DRAM 27 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

27 HETEROGENEOUS SYSTEM COHERENCE (HSC) GPU Cluster APU CPU Cluster Initialization Kernel Launch Directory To DRAM 28 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

28 HETEROGENEOUS SYSTEM COHERENCE (HSC) Direct-access bus GPU Cluster Region Buffer APU CPU Region Cluster Buffer Region buffers coordinate with region directory Region Directory Directory To DRAM 29 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

29 HSC: EXAMPLE MEMORY REQUEST GPU L2 Cache APU GPU Region Buffer GPU Region Cluster Buffer CPU Region Cluster Buffer Region Directory Region Directory To DRAM 32 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

30 HSC: L2 CACHE & REGION BUFFER Demand Requests MSHRs Core Data Responses Core Data Responses Region tags and Cache Tag Arrays Cache Tag Arrays permissions Only region-level MSHRs permission traffic Interface for Miss direct-access bus Hit MSHR Entries MSHR Entries Hit Miss Region Buffer Hit Probe Requests Direct Access Bus Interface Miss Requests Miss Hit Probe Data Requests Responses Coherent Coherent Network Network Interface 33 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

31 HSC: REGION DIRECTORY Region Permission Requests MSHRs Coherent Block Requests MSHRs MSHR Entries MSHR Entries Block Directory Tag Array Region Directory Tag Array Region tags, sharers, and permissions Miss Miss Hit Hit Block Probe Requests/ Responses Block Probe Requests/Responses Probe Request RAM Probe Request RAM PR Entries PR Entries To DRAM 34 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

32 HSC: HARDWARE COMPLEXITY Region protocols reduce directory size Region directory: 8x fewer entries Region buffers At each L2 cache 1-KB region (16 64-B blocks) 16-K region entries Overprovisioned for low-locality workloads (a) Region Directory Entry Region Tag 18 bits (b) Region Buffer Entry State CPU GPU 2 bits 1 valid bit per cluster Region Tag State B 0 B 1 B 2... B bits 2 bits 1 valid bit per block in the region 35 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

33 HSC SUMMARY Key insight GPU-CPU applications exhibit high spatial locality Use direct-access bus present in systems Offload bandwidth onto direct-access bus Use coherence network only for permission Add region buffer to track region information At each L2 cache Bypass coherence network and directory Replace directory with region directory Significantly reduces total size needed 36 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

34 OUTLINE Motivation Background Heterogeneous System Bottlenecks Heterogeneous System Coherence Details Results Speed-up Latency of loads Bandwidth MSHR usage Conclusions 37 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

35 THREE CACHE-COHERENCE PROTOCOLS Broadcast: Null-directory that broadcasts on all requests Baseline: Block-based, mostly inclusive, directory HSC: Region-based directory with 1-KB region size 38 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

36 HSC PERFORMANCE Normalized speed-up Largest slow-downs slowdowns from constrained resources Broadcast Baseline HSC bp bfs hs lud nw km sd bn dct hg mm 39 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

37 DIRECTORY TRAFFIC REDUCTION Normalized directory bandwidth broadcast baseline HSC Average bandwidth significantly reduced Theoretical reduction from 16 block regions bp bfs hs lud nw km sd bn dct hg mm 40 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

38 HSC RESOURCE USAGE Normalized directory MSHRs required Maximum MSHRs significantly reduced bp bfs hs lud nw km sd bn dct hg mm 41 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

39 RESULTS SUMMARY Used a detailed timing simulator for CPU and GPU HSC significantly improves performance Reduces the average load latency Decreases bandwidth requirement of directory HSC reduces the required MSHRs at the directory 42 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

40 RELATED WORK Coarse-grained coherence Region coherence Applied to snooping systems [Cantin, ISCA 2005] [Moshovos, ISCA 2005] [Zebchuk, MICRO 2007] Extended to directories [Fang, PACT 2013] [Zebchuk, MICRO 2013] Spatiotemporal coherence [Alisafaee, MICRO 2012] Dual-grain directory coherence [Basu, UW-TR 2013] Primarily focused on directory size GPU coherence [Singh et al. HPCA 2013] Intra-GPU coherence 43 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

41 CONCLUSIONS Hardware coherence can increase the utility of heterogeneous systems Major bottlenecks in current coherence implementations High bandwidth difficult to support at directory Extreme resource requirements We propose Heterogeneous System Coherence Leverages spatial locality and region coherence Reduces bandwidth by 94% Reduces resource requirements by 95% 44 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

42 Questions? Contact: 45 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

43 DISCLAIMER & ATTRIBUTION The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. ATTRIBUTION 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners. 46 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

44 Backup Slides

45 LOAD LATENCY Normalized load latency Average load time significantly reduced broadcast baseline HSC bp bfs hs lud nw km sd bn dct hg mm 48 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

46 EXECUTION TIME BREAKDOWN GPU CPU Execution time (%) bp bfs hs lud nw km sd bn dct hg mm 49 HETEROGENEOUS SYSTEM COHERENCE DECEMBER 11, 2013 MICRO-46

A Vision for Tomorrow s Hosting Data Center

A Vision for Tomorrow s Hosting Data Center A Vision for Tomorrow s Hosting Data Center JOHN WILLIAMS CORPORATE VICE PRESIDENT, SERVER MARKETING MARCH 2013 THE EVOLVING HOSTING MARKET NEW OPPORTUNITIES: HOSTING IN THE CLOUD Hosted Server shipments

More information

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group ATI Radeon 4800 series Graphics Michael Doggett Graphics Architecture Group Graphics Product Group Graphics Processing Units ATI Radeon HD 4870 AMD Stream Computing Next Generation GPUs 2 Radeon 4800 series

More information

GPU ACCELERATED DATABASES Database Driven OpenCL Programming. Tim Child 3DMashUp CEO

GPU ACCELERATED DATABASES Database Driven OpenCL Programming. Tim Child 3DMashUp CEO GPU ACCELERATED DATABASES Database Driven OpenCL Programming Tim Child 3DMashUp CEO SPEAKERS BIO Tim Child 35 years experience of software development Formerly VP Engineering, Oracle Corporation VP Engineering,

More information

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 AGENDA The Kaveri Accelerated Processing Unit (APU) The Graphics Core Next Architecture and its Floating-Point Arithmetic

More information

"JAGUAR AMD s Next Generation Low Power x86 Core. Jeff Rupley, AMD Fellow Chief Architect / Jaguar Core August 28, 2012

JAGUAR AMD s Next Generation Low Power x86 Core. Jeff Rupley, AMD Fellow Chief Architect / Jaguar Core August 28, 2012 "JAGUAR AMD s Next Generation Low Power x86 Core Jeff Rupley, AMD Fellow Chief Architect / Jaguar Core August 28, 2012 TWO X86 CORES TUNED FOR TARGET MARKETS Mainstream Client and Server Markets Bulldozer

More information

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008 Radeon GPU Architecture and the series Michael Doggett Graphics Architecture Group June 27, 2008 Graphics Processing Units Introduction GPU research 2 GPU Evolution GPU started as a triangle rasterizer

More information

AMD Product and Technology Roadmaps

AMD Product and Technology Roadmaps AMD Product and Technology Roadmaps AMD 2013 DESKTOP OEM GRAPHICS ROADMAP 2012: AMD RADEON 7000/6000 SERIES 2013: AMD RADEON HD 8000 SERIES Enthusiast AMD Radeon 7900 Series GPU AMD Radeon 8990 Series

More information

Optimizing SQL Server AlwaysOn Implementations with OCZ s ZD-XL SQL Accelerator

Optimizing SQL Server AlwaysOn Implementations with OCZ s ZD-XL SQL Accelerator White Paper Optimizing SQL Server AlwaysOn Implementations with OCZ s ZD-XL SQL Accelerator Delivering Accelerated Application Performance, Microsoft AlwaysOn High Availability and Fast Data Replication

More information

PHYSICAL CORES V. ENHANCED THREADING SOFTWARE: PERFORMANCE EVALUATION WHITEPAPER

PHYSICAL CORES V. ENHANCED THREADING SOFTWARE: PERFORMANCE EVALUATION WHITEPAPER PHYSICAL CORES V. ENHANCED THREADING SOFTWARE: PERFORMANCE EVALUATION WHITEPAPER Preface Today s world is ripe with computing technology. Computing technology is all around us and it s often difficult

More information

MS Exchange Server Acceleration

MS Exchange Server Acceleration White Paper MS Exchange Server Acceleration Using virtualization to dramatically maximize user experience for Microsoft Exchange Server Allon Cohen, PhD Scott Harlin OCZ Storage Solutions, Inc. A Toshiba

More information

Answering the Requirements of Flash-Based SSDs in the Virtualized Data Center

Answering the Requirements of Flash-Based SSDs in the Virtualized Data Center White Paper Answering the Requirements of Flash-Based SSDs in the Virtualized Data Center Provide accelerated data access and an immediate performance boost of businesscritical applications with caching

More information

Delivering Accelerated SQL Server Performance with OCZ s ZD-XL SQL Accelerator

Delivering Accelerated SQL Server Performance with OCZ s ZD-XL SQL Accelerator enterprise White Paper Delivering Accelerated SQL Server Performance with OCZ s ZD-XL SQL Accelerator Performance Test Results for Analytical (OLAP) and Transactional (OLTP) SQL Server 212 Loads Allon

More information

Accelerating Database Applications on Linux Servers

Accelerating Database Applications on Linux Servers White Paper Accelerating Database Applications on Linux Servers Introducing OCZ s LXL Software - Delivering a Data-Path Optimized Solution for Flash Acceleration Allon Cohen, PhD Yaron Klein Eli Ben Namer

More information

ECLIPSE Performance Benchmarks and Profiling. January 2009

ECLIPSE Performance Benchmarks and Profiling. January 2009 ECLIPSE Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox, Schlumberger HPC Advisory Council Cluster

More information

Accelerating Server Storage Performance on Lenovo ThinkServer

Accelerating Server Storage Performance on Lenovo ThinkServer Accelerating Server Storage Performance on Lenovo ThinkServer Lenovo Enterprise Product Group April 214 Copyright Lenovo 214 LENOVO PROVIDES THIS PUBLICATION AS IS WITHOUT WARRANTY OF ANY KIND, EITHER

More information

White Paper AMD PROJECT FREESYNC

White Paper AMD PROJECT FREESYNC White Paper AMD PROJECT FREESYNC TABLE OF CONTENTS INTRODUCTION 3 PROJECT FREESYNC USE CASES 4 Gaming 4 Video Playback 5 System Power Savings 5 PROJECT FREESYNC IMPLEMENTATION 6 Implementation Overview

More information

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011 Graphics Cards and Graphics Processing Units Ben Johnstone Russ Martin November 15, 2011 Contents Graphics Processing Units (GPUs) Graphics Pipeline Architectures 8800-GTX200 Fermi Cayman Performance Analysis

More information

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

GPU System Architecture. Alan Gray EPCC The University of Edinburgh GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

Supreme Court of Italy Improves Oracle Database Performance and I/O Access to Court Proceedings with OCZ s PCIe-based Virtualized Solution

Supreme Court of Italy Improves Oracle Database Performance and I/O Access to Court Proceedings with OCZ s PCIe-based Virtualized Solution enterprise Case Study Supreme Court of Italy Improves Oracle Database Performance and I/O Access to Court Proceedings with OCZ s PCIe-based Virtualized Solution Combination of Z-Drive R4 PCIe SSDs and

More information

Redefining Flash Storage Solution

Redefining Flash Storage Solution Redefining Flash Storage Solution Through Capacity + Efficiency + Performance + Form PRODUCT GUIDE Holistic Approach to Redefine Flash Storage Novachips is a leading provider of a broad range of Flash

More information

New!! - Higher performance for Windows and UNIX environments

New!! - Higher performance for Windows and UNIX environments New!! - Higher performance for Windows and UNIX environments The IBM TotalStorage Network Attached Storage Gateway 300 (NAS Gateway 300) is designed to act as a gateway between a storage area network (SAN)

More information

LS DYNA Performance Benchmarks and Profiling. January 2009

LS DYNA Performance Benchmarks and Profiling. January 2009 LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The

More information

Accelerating MS SQL Server 2012

Accelerating MS SQL Server 2012 White Paper Accelerating MS SQL Server 2012 Unleashing the Full Power of SQL Server 2012 in Virtualized Data Centers Allon Cohen, PhD Scott Harlin OCZ Storage Solutions, Inc. A Toshiba Group Company 1

More information

Family 12h AMD Athlon II Processor Product Data Sheet

Family 12h AMD Athlon II Processor Product Data Sheet Family 12h AMD Athlon II Processor Publication # 50322 Revision: 3.00 Issue Date: December 2011 Advanced Micro Devices 2011 Advanced Micro Devices, Inc. All rights reserved. The contents of this document

More information

Optimizing GPU-based application performance for the HP for the HP ProLiant SL390s G7 server

Optimizing GPU-based application performance for the HP for the HP ProLiant SL390s G7 server Optimizing GPU-based application performance for the HP for the HP ProLiant SL390s G7 server Technology brief Introduction... 2 GPU-based computing... 2 ProLiant SL390s GPU-enabled architecture... 2 Optimizing

More information

CUDA Optimization with NVIDIA Tools. Julien Demouth, NVIDIA

CUDA Optimization with NVIDIA Tools. Julien Demouth, NVIDIA CUDA Optimization with NVIDIA Tools Julien Demouth, NVIDIA What Will You Learn? An iterative method to optimize your GPU code A way to conduct that method with Nvidia Tools 2 What Does the Application

More information

AMD Processor Performance. AMD Phenom II Processors Discrete Platform Benchmarks December 2008

AMD Processor Performance. AMD Phenom II Processors Discrete Platform Benchmarks December 2008 AMD Processor Performance AMD Phenom II Processors Discrete Platform Benchmarks December 2008 AMD Phenom II Performance Overall Performance of Office Productivity + Digital Media + Games AMD Phenom II

More information

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child Introducing A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child Bio Tim Child 35 years experience of software development Formerly VP Oracle Corporation VP BEA Systems Inc.

More information

Accelerating Microsoft Exchange Servers with I/O Caching

Accelerating Microsoft Exchange Servers with I/O Caching Accelerating Microsoft Exchange Servers with I/O Caching QLogic FabricCache Caching Technology Designed for High-Performance Microsoft Exchange Servers Key Findings The QLogic FabricCache 10000 Series

More information

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS Perhaad Mistry, Yash Ukidave, Dana Schaa, David Kaeli Department of Electrical and Computer Engineering Northeastern University,

More information

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers

The Orca Chip... Heart of IBM s RISC System/6000 Value Servers The Orca Chip... Heart of IBM s RISC System/6000 Value Servers Ravi Arimilli IBM RISC System/6000 Division 1 Agenda. Server Background. Cache Heirarchy Performance Study. RS/6000 Value Server System Structure.

More information

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?

This Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture? This Unit: Putting It All Together CIS 501 Computer Architecture Unit 11: Putting It All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Amir Roth with contributions by Milo

More information

HOW MANY USERS CAN I GET ON A SERVER? This is a typical conversation we have with customers considering NVIDIA GRID vgpu:

HOW MANY USERS CAN I GET ON A SERVER? This is a typical conversation we have with customers considering NVIDIA GRID vgpu: THE QUESTION HOW MANY USERS CAN I GET ON A SERVER? This is a typical conversation we have with customers considering NVIDIA GRID vgpu: How many users can I get on a server? NVIDIA: What is their primary

More information

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION QlikView Scalability Center Technical Brief Series September 2012 qlikview.com Introduction This technical brief provides a discussion at a fundamental

More information

Measuring Cache and Memory Latency and CPU to Memory Bandwidth

Measuring Cache and Memory Latency and CPU to Memory Bandwidth White Paper Joshua Ruggiero Computer Systems Engineer Intel Corporation Measuring Cache and Memory Latency and CPU to Memory Bandwidth For use with Intel Architecture December 2008 1 321074 Executive Summary

More information

EUCIP IT Administrator - Module 1 PC Hardware Syllabus Version 3.0

EUCIP IT Administrator - Module 1 PC Hardware Syllabus Version 3.0 EUCIP IT Administrator - Module 1 PC Hardware Syllabus Version 3.0 Copyright 2011 ECDL Foundation All rights reserved. No part of this publication may be reproduced in any form except as permitted by ECDL

More information

Motivation: Smartphone Market

Motivation: Smartphone Market Motivation: Smartphone Market Smartphone Systems External Display Device Display Smartphone Systems Smartphone-like system Main Camera Front-facing Camera Central Processing Unit Device Display Graphics

More information

Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting

Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting Emerging storage and HPC technologies to accelerate big data analytics Jerome Gaysse JG Consulting Introduction Big Data Analytics needs: Low latency data access Fast computing Power efficiency Latest

More information

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC Driving industry innovation The goal of the OpenPOWER Foundation is to create an open ecosystem, using the POWER Architecture to share expertise,

More information

Parallel Firewalls on General-Purpose Graphics Processing Units

Parallel Firewalls on General-Purpose Graphics Processing Units Parallel Firewalls on General-Purpose Graphics Processing Units Manoj Singh Gaur and Vijay Laxmi Kamal Chandra Reddy, Ankit Tharwani, Ch.Vamshi Krishna, Lakshminarayanan.V Department of Computer Engineering

More information

GPU File System Encryption Kartik Kulkarni and Eugene Linkov

GPU File System Encryption Kartik Kulkarni and Eugene Linkov GPU File System Encryption Kartik Kulkarni and Eugene Linkov 5/10/2012 SUMMARY. We implemented a file system that encrypts and decrypts files. The implementation uses the AES algorithm computed through

More information

By Citrix Consulting Services. Citrix Systems, Inc.

By Citrix Consulting Services. Citrix Systems, Inc. By Citrix Consulting Services Citrix Systems, Inc. Disclaimer The objective of this white paper is to provide recommendations for ICA Client settings based on network environment configuration. The testing

More information

Configuring Memory on the HP Business Desktop dx5150

Configuring Memory on the HP Business Desktop dx5150 Configuring Memory on the HP Business Desktop dx5150 Abstract... 2 Glossary of Terms... 2 Introduction... 2 Main Memory Configuration... 3 Single-channel vs. Dual-channel... 3 Memory Type and Speed...

More information

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering DELL Virtual Desktop Infrastructure Study END-TO-END COMPUTING Dell Enterprise Solutions Engineering 1 THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL

More information

Virtualizing a Virtual Machine

Virtualizing a Virtual Machine Virtualizing a Virtual Machine Azeem Jiva Shrinivas Joshi AMD Java Labs TS-5227 Learn best practices for deploying Java EE applications in virtualized environment 2008 JavaOne SM Conference java.com.sun/javaone

More information

Summary. Key results at a glance:

Summary. Key results at a glance: An evaluation of blade server power efficiency for the, Dell PowerEdge M600, and IBM BladeCenter HS21 using the SPECjbb2005 Benchmark The HP Difference The ProLiant BL260c G5 is a new class of server blade

More information

solution brief September 2011 Can You Effectively Plan For The Migration And Management of Systems And Applications on Vblock Platforms?

solution brief September 2011 Can You Effectively Plan For The Migration And Management of Systems And Applications on Vblock Platforms? solution brief September 2011 Can You Effectively Plan For The Migration And Management of Systems And Applications on Vblock Platforms? CA Capacity Management and Reporting Suite for Vblock Platforms

More information

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging In some markets and scenarios where competitive advantage is all about speed, speed is measured in micro- and even nano-seconds.

More information

Leveraging Aparapi to Help Improve Financial Java Application Performance

Leveraging Aparapi to Help Improve Financial Java Application Performance Leveraging Aparapi to Help Improve Financial Java Application Performance Shrinivas Joshi, Software Performance Engineer Abstract Graphics Processing Unit (GPU) and Accelerated Processing Unit (APU) offload

More information

NVIDIA GeForce GTX 580 GPU Datasheet

NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet 3D Graphics Full Microsoft DirectX 11 Shader Model 5.0 support: o NVIDIA PolyMorph Engine with distributed HW tessellation engines

More information

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics GPU Architectures A CPU Perspective Derek Hower AMD Research 5/21/2013 Goals Data Parallelism: What is it, and how to exploit it? Workload characteristics Execution Models / GPU Architectures MIMD (SPMD),

More information

NVIDIA GRID DASSAULT CATIA V5/V6 SCALABILITY GUIDE. NVIDIA Performance Engineering Labs PerfEngDoc-SG-DSC01v1 March 2016

NVIDIA GRID DASSAULT CATIA V5/V6 SCALABILITY GUIDE. NVIDIA Performance Engineering Labs PerfEngDoc-SG-DSC01v1 March 2016 NVIDIA GRID DASSAULT V5/V6 SCALABILITY GUIDE NVIDIA Performance Engineering Labs PerfEngDoc-SG-DSC01v1 March 2016 HOW MANY USERS CAN I GET ON A SERVER? The purpose of this guide is to give a detailed analysis

More information

can you effectively plan for the migration and management of systems and applications on Vblock Platforms?

can you effectively plan for the migration and management of systems and applications on Vblock Platforms? SOLUTION BRIEF CA Capacity Management and Reporting Suite for Vblock Platforms can you effectively plan for the migration and management of systems and applications on Vblock Platforms? agility made possible

More information

Intel Data Direct I/O Technology (Intel DDIO): A Primer >

Intel Data Direct I/O Technology (Intel DDIO): A Primer > Intel Data Direct I/O Technology (Intel DDIO): A Primer > Technical Brief February 2012 Revision 1.0 Legal Statements INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information

Hardware Level IO Benchmarking of PCI Express*

Hardware Level IO Benchmarking of PCI Express* White Paper James Coleman Performance Engineer Perry Taylor Performance Engineer Intel Corporation Hardware Level IO Benchmarking of PCI Express* December, 2008 321071 Executive Summary Understanding the

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

THE AMD MISSION 2 AN INTRODUCTION TO AMD NOVEMBER 2014

THE AMD MISSION 2 AN INTRODUCTION TO AMD NOVEMBER 2014 THE AMD MISSION To be the leading designer and integrator of innovative, tailored technology solutions that empower people to push the boundaries of what is possible 2 AN INTRODUCTION TO AMD NOVEMBER 2014

More information

Parallel Programming Survey

Parallel Programming Survey Christian Terboven 02.09.2014 / Aachen, Germany Stand: 26.08.2014 Version 2.3 IT Center der RWTH Aachen University Agenda Overview: Processor Microarchitecture Shared-Memory

More information

Rambus Smart Data Acceleration

Rambus Smart Data Acceleration Rambus Smart Data Acceleration Back to the Future Memory and Data Access: The Final Frontier As an industry, if real progress is to be made towards the level of computing that the future mandates, then

More information

Virtualization Performance Analysis November 2010 Effect of SR-IOV Support in Red Hat KVM on Network Performance in Virtualized Environments

Virtualization Performance Analysis November 2010 Effect of SR-IOV Support in Red Hat KVM on Network Performance in Virtualized Environments Virtualization Performance Analysis November 2010 Effect of SR-IOV Support in Red Hat KVM on Network Performance in Steve Worley System x Performance Analysis and Benchmarking IBM Systems and Technology

More information

Memory Architecture and Management in a NoC Platform

Memory Architecture and Management in a NoC Platform Architecture and Management in a NoC Platform Axel Jantsch Xiaowen Chen Zhonghai Lu Chaochao Feng Abdul Nameed Yuang Zhang Ahmed Hemani DATE 2011 Overview Motivation State of the Art Data Management Engine

More information

Leading Virtualization Performance and Energy Efficiency in a Multi-processor Server

Leading Virtualization Performance and Energy Efficiency in a Multi-processor Server Leading Virtualization Performance and Energy Efficiency in a Multi-processor Server Product Brief Intel Xeon processor 7400 series Fewer servers. More performance. With the architecture that s specifically

More information

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database WHITE PAPER Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive

More information

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

Accelerating High-Speed Networking with Intel I/O Acceleration Technology White Paper Intel I/O Acceleration Technology Accelerating High-Speed Networking with Intel I/O Acceleration Technology The emergence of multi-gigabit Ethernet allows data centers to adapt to the increasing

More information

Enhancing Cloud-based Servers by GPU/CPU Virtualization Management

Enhancing Cloud-based Servers by GPU/CPU Virtualization Management Enhancing Cloud-based Servers by GPU/CPU Virtualiz Management Tin-Yu Wu 1, Wei-Tsong Lee 2, Chien-Yu Duan 2 Department of Computer Science and Inform Engineering, Nal Ilan University, Taiwan, ROC 1 Department

More information

Infor Web UI Sizing and Deployment for a Thin Client Solution

Infor Web UI Sizing and Deployment for a Thin Client Solution Infor Web UI Sizing and Deployment for a Thin Client Solution Copyright 2012 Infor Important Notices The material contained in this publication (including any supplementary information) constitutes and

More information

APU/GPGPU-BASED SECURITY SOLUTIONS. Vikenty Frantsev ALTELL CEO

APU/GPGPU-BASED SECURITY SOLUTIONS. Vikenty Frantsev ALTELL CEO APU/GPGPU-BASED SECURITY SOLUTIONS Vikenty Frantsev ALTELL CEO ALTELL: KEY FACTS Core business: IT security, software development, network appliances design & manufacturing Founded: Year 2006 Vertical

More information

HP Z Turbo Drive PCIe SSD

HP Z Turbo Drive PCIe SSD Performance Evaluation of HP Z Turbo Drive PCIe SSD Powered by Samsung XP941 technology Evaluation Conducted Independently by: Hamid Taghavi Senior Technical Consultant June 2014 Sponsored by: P a g e

More information

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 Introduction to GP-GPUs Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 GPU Architectures: How do we reach here? NVIDIA Fermi, 512 Processing Elements (PEs) 2 What Can It Do?

More information

Performance Analysis and Software Optimization on Systems Using the LAN91C111

Performance Analysis and Software Optimization on Systems Using the LAN91C111 AN 10.12 Performance Analysis and Software Optimization on Systems Using the LAN91C111 1 Introduction This application note describes one approach to analyzing the performance of a LAN91C111 implementation

More information

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25

FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25 FPGA Acceleration using OpenCL & PCIe Accelerators MEW 25 December 2014 FPGAs in the news» Catapult» Accelerate BING» 2x search acceleration:» ½ the number of servers»

More information

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays Red Hat Performance Engineering Version 1.0 August 2013 1801 Varsity Drive Raleigh NC

More information

Cloud Data Center Acceleration 2015

Cloud Data Center Acceleration 2015 Cloud Data Center Acceleration 2015 Agenda! Computer & Storage Trends! Server and Storage System - Memory and Homogenous Architecture - Direct Attachment! Memory Trends! Acceleration Introduction! FPGA

More information

Moving Beyond CPUs in the Cloud: Will FPGAs Sink or Swim?

Moving Beyond CPUs in the Cloud: Will FPGAs Sink or Swim? Moving Beyond CPUs in the Cloud: Will FPGAs Sink or Swim? Successful FPGA datacenter usage at scale will require differentiated capability, programming ease, and scalable implementation models Executive

More information

big.little Technology Moves Towards Fully Heterogeneous Global Task Scheduling Improving Energy Efficiency and Performance in Mobile Devices

big.little Technology Moves Towards Fully Heterogeneous Global Task Scheduling Improving Energy Efficiency and Performance in Mobile Devices big.little Technology Moves Towards Fully Heterogeneous Global Task Scheduling Improving Energy Efficiency and Performance in Mobile Devices Brian Jeff November, 2013 Abstract ARM big.little processing

More information

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008

Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Best Practices Best Practices for Installing and Configuring the Hyper-V Role on the LSI CTS2600 Storage System for Windows 2008 Installation and Configuration Guide 2010 LSI Corporation August 13, 2010

More information

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:

More information

How To Compare Two Servers For A Test On A Poweredge R710 And Poweredge G5P (Poweredge) (Power Edge) (Dell) Poweredge Poweredge And Powerpowerpoweredge (Powerpower) G5I (

How To Compare Two Servers For A Test On A Poweredge R710 And Poweredge G5P (Poweredge) (Power Edge) (Dell) Poweredge Poweredge And Powerpowerpoweredge (Powerpower) G5I ( TEST REPORT MARCH 2009 Server management solution comparison on Dell PowerEdge R710 and HP Executive summary Dell Inc. (Dell) commissioned Principled Technologies (PT) to compare server management solutions

More information

Compatibility Matrix BES12. September 16, 2015

Compatibility Matrix BES12. September 16, 2015 Compatibility Matrix BES12 September 16, 2015 Published: 2015-09-16 SWD-20150916153710116 Contents Introduction... 4 Legend...5 BES12 server... 6 Operating system...6 Database server...6 Browser... 8 Mobile

More information

Tips and Best Practices for Managing a Private Cloud

Tips and Best Practices for Managing a Private Cloud Deploying and Managing Private Clouds The Essentials Series Tips and Best Practices for Managing a Private Cloud sponsored by Tip s and Best Practices for Managing a Private Cloud... 1 Es tablishing Policies

More information

Linux VM Infrastructure for memory power management

Linux VM Infrastructure for memory power management Linux VM Infrastructure for memory power management Ankita Garg Vaidyanathan Srinivasan IBM Linux Technology Center Agenda - Saving Power Motivation - Why Save Power Benefits How can it be achieved Role

More information

HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief

HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief Technical white paper HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief Scale-up your Microsoft SQL Server environment to new heights Table of contents Executive summary... 2 Introduction...

More information

XID ERRORS. vr352 May 2015. XID Errors

XID ERRORS. vr352 May 2015. XID Errors ID ERRORS vr352 May 2015 ID Errors Introduction... 1 1.1. What Is an id Message... 1 1.2. How to Use id Messages... 1 Working with id Errors... 2 2.1. Viewing id Error Messages... 2 2.2. Tools That Provide

More information

The Transition to PCI Express* for Client SSDs

The Transition to PCI Express* for Client SSDs The Transition to PCI Express* for Client SSDs Amber Huffman Senior Principal Engineer Intel Santa Clara, CA 1 *Other names and brands may be claimed as the property of others. Legal Notices and Disclaimers

More information

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o [email protected]

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o igiro7o@ictp.it Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o [email protected] Informa(on & Communica(on Technology Sec(on (ICTS) Interna(onal Centre for Theore(cal Physics (ICTP) Mul(ple Socket

More information

LOAD BALANCING IN THE MODERN DATA CENTER WITH BARRACUDA LOAD BALANCER FDC T740

LOAD BALANCING IN THE MODERN DATA CENTER WITH BARRACUDA LOAD BALANCER FDC T740 LOAD BALANCING IN THE MODERN DATA CENTER WITH BARRACUDA LOAD BALANCER FDC T740 Balancing Web traffic across your front-end servers is key to a successful enterprise Web presence. Web traffic is much like

More information

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011 A Close Look at PCI Express SSDs Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011 Macro Datacenter Trends Key driver: Information Processing Data Footprint (PB) CAGR: 100%

More information

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip

Outline. Introduction. Multiprocessor Systems on Chip. A MPSoC Example: Nexperia DVP. A New Paradigm: Network on Chip Outline Modeling, simulation and optimization of Multi-Processor SoCs (MPSoCs) Università of Verona Dipartimento di Informatica MPSoCs: Multi-Processor Systems on Chip A simulation platform for a MPSoC

More information

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano

Intel Itanium Quad-Core Architecture for the Enterprise. Lambert Schaelicke Eric DeLano Intel Itanium Quad-Core Architecture for the Enterprise Lambert Schaelicke Eric DeLano Agenda Introduction Intel Itanium Roadmap Intel Itanium Processor 9300 Series Overview Key Features Pipeline Overview

More information

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION A DIABLO WHITE PAPER AUGUST 2014 Ricky Trigalo Director of Business Development Virtualization, Diablo Technologies

More information

Compatibility Matrix BES10. April 27, 2016. Version 10.2 and later

Compatibility Matrix BES10. April 27, 2016. Version 10.2 and later Compatibility Matrix BES10 April 27, 2016 Version 10.2 and later Published: 2016-04-28 SWD-20160428152359812 Contents Enterprise Service 10 Compatibility Matrix... 4 Introduction...4 Legend... 4 Operating

More information

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Sockets vs. RDMA Interface over 1-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Pavan Balaji Hemal V. Shah D. K. Panda Network Based Computing Lab Computer Science and Engineering

More information