AMD heterogeneous Uniform Memory Access HANJIN CHU, DIRECTOR, GAME, SOFTWARE & HETEROGENEOUS SOLUTIONS

Similar documents
A Vision for Tomorrow s Hosting Data Center

AMD Product and Technology Roadmaps

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

ATI Radeon 4800 series Graphics. Michael Doggett Graphics Architecture Group Graphics Product Group

HETEROGENEOUS SYSTEM COHERENCE FOR INTEGRATED CPU-GPU SYSTEMS

THE AMD MISSION 2 AN INTRODUCTION TO AMD NOVEMBER 2014

GPU ACCELERATED DATABASES Database Driven OpenCL Programming. Tim Child 3DMashUp CEO

White Paper AMD PROJECT FREESYNC

Radeon GPU Architecture and the Radeon 4800 series. Michael Doggett Graphics Architecture Group June 27, 2008

AMD Processor Performance. AMD Phenom II Processors Discrete Platform Benchmarks December 2008

PHYSICAL CORES V. ENHANCED THREADING SOFTWARE: PERFORMANCE EVALUATION WHITEPAPER

Delivering Accelerated SQL Server Performance with OCZ s ZD-XL SQL Accelerator

Answering the Requirements of Flash-Based SSDs in the Virtualized Data Center

Optimizing SQL Server AlwaysOn Implementations with OCZ s ZD-XL SQL Accelerator

White Paper EMBEDDED GPUS ARE NO GAMBLE

MS Exchange Server Acceleration

"JAGUAR AMD s Next Generation Low Power x86 Core. Jeff Rupley, AMD Fellow Chief Architect / Jaguar Core August 28, 2012

SURROUNDVIEW Installation and Setup User s Guide

ORACLE VIRTUAL DESKTOP INFRASTRUCTURE

Autodesk Revit 2016 Product Line System Requirements and Recommendations

Configuring Memory on the HP Business Desktop dx5150

Transforming Data Center Economics and Performance via Flash and Server Virtualization

Supreme Court of Italy Improves Oracle Database Performance and I/O Access to Court Proceedings with OCZ s PCIe-based Virtualized Solution

Whitepaper. NVIDIA Miracast Wireless Display Architecture

Enriched customer experience at airports

Media File Player. Version Release Note 1 st Edition

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

Intelligent Monitoring Configuration Tool

Computer Setup User Guide

Windows Embedded Compact 7: RemoteFX and Remote Experience Thin Client Integration

Revit products will use multiple cores for many tasks, using up to 16 cores for nearphotorealistic

SUN ORACLE DATABASE MACHINE

Cartal-Rijsbergen Automotive Improves SQL Server Performance and I/O Database Access with OCZ s PCIe-based ZD-XL SQL Accelerator

ORACLE INFRASTRUCTURE AS A SERVICE PRIVATE CLOUD WITH CAPACITY ON DEMAND

HARDWARE-ENABLED GPU VIRTUALIZATION FOR A TRUE WORKSTATION EXPERIENCE. By Tonny Wong, Product Manager, Radeon Technology Group

Windows 7 XP Mode for HP Business PCs

APU/GPGPU-BASED SECURITY SOLUTIONS. Vikenty Frantsev ALTELL CEO

Innovation in Software Quality

THE FUTURE OF THE APU BRAIDED PARALLELISM Session 2901

Autodesk 3ds Max 2010 Boot Camp FAQ

Printer Driver Installation Manual

USER S MANUAL. AXIS Media Control

Recording Server Monitoring Tool

Frequently Asked Questions: EMC UnityVSA

How To Compare Two Servers For A Test On A Poweredge R710 And Poweredge G5P (Poweredge) (Power Edge) (Dell) Poweredge Poweredge And Powerpowerpoweredge (Powerpower) G5I (

Driving Big Data with OCZ Enterprise SSDs

Accelerating MS SQL Server 2012

Running Oracle s PeopleSoft Human Capital Management on Oracle SuperCluster T5-8 O R A C L E W H I T E P A P E R L A S T U P D A T E D J U N E

Windows Server 2012 R2 VDI - Virtual Desktop Infrastructure. Ori Husyt Agile IT Consulting Team Manager orih@agileit.co.il

G Cloud 7 Pricing Document

White Paper DisplayPort 1.2 Technology AMD FirePro V7900 and V5900 Professional Graphics. Table of Contents

A White Paper By: Dr. Gaurav Banga SVP, Engineering & CTO, Phoenix Technologies. Bridging BIOS to UEFI

BlackBerry Web Desktop Manager. Version: 5.0 Service Pack: 4. User Guide

Family 12h AMD Athlon II Processor Product Data Sheet

Accelerating Database Applications on Linux Servers

HP Client Manager 6.1

Contents Firewall Monitor Overview Getting Started Setting Up Firewall Monitor Attack Alerts Viewing Firewall Monitor Attack Alerts

Quest vworkspace Virtual Desktop Extensions for Linux

Oracle Sales Cloud Configuration, Customization and Integrations

Data Management for Portable Media Players

Citrix MetaFrame Presentation Server 3.0 and Microsoft Windows Server 2003 Value Add Feature Guide

June, 2015 Oracle s Siebel CRM Statement of Direction Client Platform Support

Greenplum Database (software-only environments): Greenplum Database (4.0 and higher supported, or higher recommended)

System Requirements and Platform Support Guide

Security Slots on Polycom SoundPoint IP, SoundStation IP, SoundStation Duo and VVX Series Phones

QuickSpecs HP Remote Graphics Software 7.2

Host OS Compatibility Guide

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays

Learn about each tool in parental controls and find out how you can use them to secure you and your family.

SUN ORACLE EXADATA STORAGE SERVER

INSTALLATION GUIDE. AXIS Camera Station

Technical Brief. DualNet with Teaming Advanced Networking. October 2006 TB _v02

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS

Cisco WebEx Meetings Server System Requirements

Cisco TelePresence VCR MSE 8220

CX Series. Video Recording Server. Quick Start Guide CX784 / CX788 / CX7816. Version

A Powerful solution for next generation Pcs

Self Help Guides. Setup Exchange with Outlook

AMD PhenomII. Architecture for Multimedia System Prof. Cristina Silvano. Group Member: Nazanin Vahabi Kosar Tayebani

Atmel AVR4921: ASF - USB Device Stack Differences between ASF V1 and V2. 8-bit Atmel Microcontrollers. Application Note. Features.

ORACLE FUSION ACCOUNTING HUB

Installing Act! for New Users

LOOKING FOR AN AMAZING PROCESSOR. Product Brief 6th Gen Intel Core Processors for Desktops: S-series

Intel Desktop Board DG965RY

Migration Best Practices for OpenSSO 8 and SAM 7.1 deployments O R A C L E W H I T E P A P E R M A R C H 2015

Installing Sage ACT! 2013 for New Users

StarWind iscsi SAN: Configuring Global Deduplication May 2012

The Shortcut Guide to Balancing Storage Costs and Performance with Hybrid Storage

Controlling and Managing Security with Performance Tools

EMC PERFORMANCE OPTIMIZATION FOR MICROSOFT FAST SEARCH SERVER 2010 FOR SHAREPOINT

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual

HP Client Manager 6.2

HP Intelligent Management Center Basic WLAN Manager Software Platform

NetIQ Privileged User Manager

High Performance GPGPU Computer for Embedded Systems

ShadowProtect Recovery CD Restore Wizard Options

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

A M D DA S 1. 0 For the Manageability, Virtualization and Security of Embedded Solutions

Improving IT Operational Efficiency with a VMware vsphere Private Cloud on Lenovo Servers and Lenovo Storage SAN S3200

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION

Transcription:

AMD heterogeneous Uniform Memory Access HANJIN CHU, DIRECTOR, GAME, SOFTWARE & HETEROGENEOUS SOLUTIONS

ABOUT HSA

GPU COMPUTE CAPABILITY IS MORE THAN 10X THAT OF THE CPU 4500 HOW DO WE UNLOCK THIS PERFORMANCE? 4000 3500 3000 2500 2000 1500 1000 500 0 2002 2003 2004 CPU GFLOPS 2005 2006 GPU GFLOPS 2007 2008 2009 2010 2011 2012 See slide 24 for details 3

GFLOPS Year CPU CPU GFLOPS GPU (RADEON) GPU GFLOPS 2002 Pentium 4 (Northwood) 12.24 9700 Pro 31.2 2003 Pentium 4 (Northwood) 12.8 9800 XT 36.48 2004 Pentium 4 (Prescott 15.2 X850 XT 103.68 2005 15.2 X1800 XT 134.4 2006 Core 2 Duo 23.44 X1950 375 2007 Core 2 Quad 48 HD 2900 XT 473.6 2008 Q9650 96 HD 4870 1200 2009 Core i7 960 102.4 HD 5870 2720 2010 Core i7 970 153.6 HD 6970 2703 2011 Core i7 3960X 316.8 HD7970 3789 2012 Core i7 3970X 336 HD 7970 GHz Edition 4301 4

WHAT IS NEXT CPU AND GPU FUSION HAS A REAL HETEROGENEOUS ARCH An intelligent computing architecture that enables CPU, GPU and other processors to work in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element huma (MEMORY) PARALLEL WORKLOADS SERIAL WORKLOADS APU ACCELERATED PROCESSING UNIT 5

Shared Memory Coherency, User Mode Queues HSA ARCHITECTURE Full Programming language support User Mode Queueing Heterogeneous Unified Memory Access (huma) CPU Audio Processor Video Hardware Encode Decode Pageable Memory Bidirectional coherency GPU Fixed Function Acctr DSP Image Signal Processing Compute context switch and preemption 6

HSA EVOLUTION Benefits Unified power efficiency Improved compute efficiency Simplified data sharing Capabilities Integrate CPU and GPU in silicon GPU can access CPU memory Uniform memory access for CPU and GPU 7

WHAT IS huma? heterogeneous UNIFORM MEMORY ACCESS 8

UNDERSTANDING UMA Original meaning of UMA is Uniform Memory Access Refers to how processing cores in a system view and access memory All processing cores in a true UMA system share a single memory address space Introduction of GPU compute created systems with Non-Uniform Memory Access (NUMA) Require data to be managed across multiple heaps with different address spaces Add programming complexity due to frequent copies, synchronization, and address translation HSA restores the GPU to Uniform memory Access Heterogeneous computing replaces GPU Computing 9

INTRODUCING huma UMA CPU Memory CPU CPU CPU CPU APU APU with HSA NUMA CPU Memory CPU CPU CPU CPU huma Memory CPU CPU CPU CPU GPU Memory GPU GPU GPU GPU GPU GPU GPU GPU 10

huma KEY FEATURES BI-DIRECTIONAL COHERENT MEMORY Any updates made by one processing element will be seen by all other processing elements - GPU or CPU PAGEABLE MEMORY GPU can take page faults, and is no longer restricted to page locked memory ENTIRE MEMORY SPACE CPU and GPU processes can dynamically allocate memory from the entire memory space 11

huma KEY FEATURES Coherent Memory: Ensures CPU and GPU caches both see an up-to-date view of data CPU Cache HW Coherency GPU Cache Physical Memory Pageable memory: The GPU can seamlessly access virtual memory addresses that are not (yet) present in physical memory Virtual Memory Entire memory space: Both CPU and GPU can access and allocate any location in the system s virtual memory space 12

WITHOUT POINTERS* AND DATA SHARING Without huma: CPU explicitly copies data to GPU memory GPU completes computation CPU explicitly copies result back to CPU memory CPU GPU CPU Memory GPU Memory Only the data array can be copied since GPU cannot follow embedded data-structure links *A Pointer is a named variable that holds a memory address. It makes it easy to reference data or code segments by a name and eliminates the need for the developer to know the actual address in memory. Pointers can be manipulated by the same expressions used to operate on any other variable 13

WITH POINTERS* AND DATA SHARING With huma: CPU simply passes a pointer to GPU GPU completes computation CPU can read the result directly no copying needed! CPU GPU CPU can pass a pointer to entire data structure since the GPU can now follow embedded links CPU / GPU Uniform Memory *A Pointer is a named variable that holds a memory address. It makes it easy to reference data or code segments by a name and eliminates the need for the developer to know the actual address in memory. Pointers can be manipulated by the same expressions used to operate on any other variable 14

huma FEATURES Access to Entire Memory Space Pageable memory Bi-directional Coherency Fast GPU access to system memory Dynamic Memory Allocation 15

huma BENEFITS

BENEFITS OF HSA 17

UNIFORM MEMORY BENEFITS TO DEVELOPERS EASE AND SIMPLICITY OF PROGRAMMING Single, standard computing environments SUPPORT FOR MAINSTREAM PROGRAMING LANGUAGES Python, C++, Java LOWER DEVELOPMENT COST More efficient architecture enables less people to do the same work 18

BENEFITS TO CONSUMERS BETTER EXPERIENCES Radically different user experiences MORE PERFORMANCE Getting more performance from the same form factor LONGER BATTERY LIFE Less power at the same performance 19

SUPPORT FROM MAJOR INDUSTRY PLAYERS For more information go to: http://hsafoundation.com/ Source http://pinterest.com/pin/193021534001931884/ 20

POTENTIAL MARKET IS HUGE Game Consoles Notebooks Tablets Desktops Embedded Servers 21

HSA Nov 11 14, 2013 San Jose McEnery Convention Center 14 Different Tracks with over 140 Individual Presentations 22

THANK YOU

DISCLAIMER The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. ATTRIBUTION 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names and logos are used for informational purposes only and may be trademarks of their respective owners. 24