Cycles for Competitiveness: A View of the Future HPC Landscape



Similar documents
Innovativste XEON Prozessortechnik für Cisco UCS

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Vendor Update Intel 49 th IDC HPC User Forum. Mike Lafferty HPC Marketing Intel Americas Corp.

Intel Core i Processor (3M Cache, 3.30 GHz)

Intel Xeon Processor E5-2600

Intel Core i3-2310m Processor (3M Cache, 2.10 GHz)

Itanium 2 Platform and Technologies. Alexander Grudinski Business Solution Specialist Intel Corporation

New Dimensions in Configurable Computing at runtime simultaneously allows Big Data and fine Grain HPC

Haswell Cryptographic Performance

New Developments in Processor and Cluster. Technology for CAE Applications

CPU Session 1. Praktikum Parallele Rechnerarchtitekturen. Praktikum Parallele Rechnerarchitekturen / Johannes Hofmann April 14,

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture

The Foundation for Better Business Intelligence

Evaluating Intel Virtualization Technology FlexMigration with Multi-generation Intel Multi-core and Intel Dual-core Xeon Processors.

Intel Cloud Builder Guide: Cloud Design and Deployment on Intel Platforms

Stovepipes to Clouds. Rick Reid Principal Engineer SGI Federal by SGI Federal. Published by The Aerospace Corporation with permission.

Intel Data Direct I/O Technology (Intel DDIO): A Primer >

Intel Virtualization Technology

A Superior Hardware Platform for Server Virtualization

2015 Global Technology conference. Diane Bryant Senior Vice President & General Manager Data Center Group Intel Corporation

Processore Intel Xeon 5600 (Westmere EP) Carlo Grilli Business Developer Manager Intel Corporation Italia

AMD PhenomII. Architecture for Multimedia System Prof. Cristina Silvano. Group Member: Nazanin Vahabi Kosar Tayebani

Cloud based Holdfast Electronic Sports Game Platform

HPC & Big Data THE TIME HAS COME FOR A SCALABLE FRAMEWORK

IMAGE SIGNAL PROCESSING PERFORMANCE ON 2 ND GENERATION INTEL CORE MICROARCHITECTURE PRESENTATION PETER CARLSTON, EMBEDDED & COMMUNICATIONS GROUP

A Quantum Leap in Enterprise Computing

Intel Cloud Builder Guide to Cloud Design and Deployment on Intel Xeon Processor-based Platforms

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

Intel Pentium 4 Processor on 90nm Technology

Exascale Challenges and General Purpose Processors. Avinash Sodani, Ph.D. Chief Architect, Knights Landing Processor Intel Corporation

Intel Cloud Builders Guide to Cloud Design and Deployment on Intel Platforms

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

新 一 代 軟 體 定 義 的 網 路 架 構 Software Defined Networking (SDN) and Network Function Virtualization (NFV)

Infrastructure Matters: POWER8 vs. Xeon x86

Data Center and Cloud Computing Market Landscape and Challenges

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Legal Notices and Important Information

Intel Itanium Architecture

LS DYNA Performance Benchmarks and Profiling. January 2009

Virtualization with the Intel Xeon Processor 5500 Series: A Proof of Concept

Leading Virtualization 2.0

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual

The ROI from Optimizing Software Performance with Intel Parallel Studio XE

Next Generation GPU Architecture Code-named Fermi

Intel Xeon Processor 5500 Series. An Intelligent Approach to IT Challenges

A Survey on ARM Cortex A Processors. Wei Wang Tanima Dey

White Paper. Intel Sandy Bridge Brings Many Benefits to the PC/104 Form Factor

Desktop Reinvented. Lisa Graff Vice President and General Manager Desktop Client Platforms Group

Big Data. Value, use cases and architectures. Petar Torre Lead Architect Service Provider Group. Dubrovnik, Croatia, South East Europe May, 2013

Oracle Database Reliability, Performance and scalability on Intel Xeon platforms Mitch Shults, Intel Corporation October 2011

Intel Media SDK Library Distribution and Dispatching Process

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

Xeon+FPGA Platform for the Data Center

The Engine for Digital Transformation in the Data Center

Accelerating Innovation in the Desktop. Rob Crooke VP and General Manager Business Client Group

Large-Data Software Defined Visualization on CPUs

What is in Your Workstation?

Parallel Programming Survey

Intel Desktop public roadmap

Introducing the First Datacenter Atom SOC

Intel Xeon Processor E Product Family

Isaku Yamahata CloudOpen Japan May 22, 2014

What are the differences between Intel Core i3, i5 and i7

ECLIPSE Performance Benchmarks and Profiling. January 2009

Table Of Contents. Page 2 of 26. *Other brands and names may be claimed as property of others.

Design Cycle for Microprocessors

Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1

Intel 64 and IA-32 Architectures Software Developer s Manual

The Open Cloud Near-Term Infrastructure Trends in Cloud Computing

Next Generation Intel Microarchitecture Nehalem Paul G. Howard, Ph.D. Chief Scientist, Microway, Inc. Copyright 2009 by Microway, Inc.

can you effectively plan for the migration and management of systems and applications on Vblock Platforms?

Life With Big Data and the Internet of Things

Intel Virtualization Technology FlexMigration Application Note

Benchmarking the OpenCloud SIP Application Server on Intel -Based Modular Communications Platforms

Leading Virtualization Performance and Energy Efficiency in a Multi-processor Server

evm Virtualization Platform for Windows

Breakthrough AES Performance with. Intel AES New Instructions

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

CSE 6040 Computing for Data Analytics: Methods and Tools

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis

The Benefits of POWER7+ and PowerVM over Intel and an x86 Hypervisor

Intel Virtualization and Server Technology Update

System Requirements Table of contents

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure

Intel Xeon Processor 3400 Series-based Platforms

Benchmarking Large Scale Cloud Computing in Asia Pacific

Multi-Threading Performance on Commodity Multi-Core Processors

Benefits of multi-core, time-critical, high volume, real-time data analysis for trading and risk management

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Transcription:

Cycles for Competitiveness: A View of the Future HPC Landscape October 6, 2010 Stephen R. Wheat, Ph.D. Sr. Director, HPC WW Business Operations Intel, Data Center Group

Legal Disclaimer Intel may make changes to specifications and product descriptions at any time, without notice. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See www.intel.com/products/processor_number for details. Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel Virtualization Technology requires a computer system with a processor, chipset, BIOS, virtual machine monitor (VMM) and applications enabled for virtualization technology. Functionality, performance or other virtualization technology benefits will vary depending on hardware and software configurations. Virtualization technology-enabled BIOS and VMM applications are currently in development. 64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS, operating system, device drivers and applications enabled for Intel 64 architecture. Performance will vary depending on your hardware and software configurations. Consult with your system vendor for more information. Lead-free: 45nm product is manufactured on a lead-free process. Lead is below 1000 PPM per EU RoHS directive (2002/95/EC, Annex A). Some EU RoHS exemptions for lead may apply to other components used in the product package. Halogen-free: Applies only to halogenated flame retardants and PVC in components. Halogens are below 900 PPM bromine and 900 PPM chlorine. Intel, Intel Xeon, Intel Core microarchitecture, and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. 2009 Standard Performance Evaluation Corporation (SPEC) logo is reprinted with permission 2 Copyright 2010, Intel Corporation. All rights reserved

High Performance Micro- Architecture for HPC Tick Tock Tick Tock Tick Tock Tick Tock 65nm 45nm 32nm 22nm Core Harpertown Penryn Nehalem Westmere Sandy Bridge Ivy Bridge Future SSSE3 SSE4.1 New instructions: SSE4.2 AES AVX Future - FMA 3 Copyright 2010, Intel Corporation. All rights reserved

Moore s Law: Alive and Well at Intel 65nm 2005 45nm 2007 MANUFACTURING 32nm 2009 22nm 15nm 11nm 8nm 2013 * 2015 * 2011 * DEVELOPMENT RESEARCH 2017 * 2019+ Intel Innovation-Enabled Technology Pipeline is Full 4 Copyright 2010, Intel Corporation. All rights reserved

Intel Xeon 5600 Energy Efficiency Building on Xeon 5500 Leadership Capabilities Lower Power CPUs Better performance/watt Lower power consumption 130W 95W 80W 60W (6C) 40W (4C) Intelligent Power Technology Integrated Power Gates and Automated Low Power States with Six Cores Intel Xeon 5600 Intel Xeon 5600 CPU Power Management Optimized power consumption through more efficient Turbo Boost and memory power management Lower Power DDR3 Memory Up to 10% reduction in memory power 1 Intel Xeon 5600 delivers greater platform Energy Efficiency 1 Based on voltage reduction from 1.50V to 1.35V, using Power (Watts) = Current x Voltage Lower power CPU SKU options for Xeon 5600 5 Copyright 2010, Intel Corporation. All rights reserved

Usage Advanced 6.4 GT/s QPI 8MB / 12MB DDR3 1333 Turbo Boost HT Standard 5.86 GT/s QPI 8MB / 12MB DDR3 1066 Turbo Boost HT Xeon 5500 Xeon 5600 Xeon 5500 X5570 2.93 GHz X5560 2.80 GHz X5550 2.66 GHz E5540 2.53 GHz E5530 2.40 GHz E5520 2.26 GHz 95W 95W 95W 80W 80W 80W Maximum Performance 6 Cores 4 Cores (freq optimized) Best Price Performance Higher Freq More Cache Xeon 5600 X5680 6C 3.33 GHz X5670 6C 2.93 GHz X5660 6C 2.80 GHz X5650 6C 2.66 GHz E5640 4C 2.66 GHz E5630 4C 2.53 GHz E5620 4C 2.40 GHz 130W 95W 95W 95W 80W 80W 80W 80W X5677 4C 3.46 GHz X5667 4C 3.06 GHz L5640 6C 2.26 GHz L5630 4C 2.13 GHz L5609 4C 1.86 GHz 130W 95W 60W 40W 40W Freq-Optimized Low Power Options Basic 4.8 GT/s QPI 4M cache DDR3 800 E5506 2.13 GHz E5504 2.00 GHz 80W 80W E5502 1.86 (2C) 80W Cost-Optimized Higher Frequency Xeon E5500 SKUs E5507 2.26 GHz (4C) E5506 2.13GHz (4C) 80W 80W E5503 2.00 GHz (2C) 80W Xeon 5600 (Westmere-EP) SKUs Xeon 5500 SKUs 6 Copyright 2010, Intel Corporation. All rights reserved

Performance / core Instruction Set Innovation Continues in Sandy Bridge CPUs Intel Advanced Vector Extensions (Intel AVX) AVX The old trend 15% gain/year (due to frequency and uarch) A new trajectory Vector FP offers scalable FLOPS and impressive FLOPS/W Intel AVX* 2X throughput - vector FP 2X load throughput 3-operand instructions Future Extensions Hardware FMA (Fused Multiply Add) Many more options Software development tools to help exploit Intel AVX instructions are available at http://www.intel.com/software/avx Next generation Intel microarchitecture (Nehalem) Superior Memory latency+bw Fast unaligned support Intel microarchitecure (Westmere) Cryptography acceleration instructions YMM0 XMM0 256 bits (2010) 128 bits (1999) Vectors increase from 128 bit to 256 bit The new state extends/overlays SSE The lower part (bits 0-127) of the YMM registers is mapped onto XMM registers 2009 2010 2011 > 2011 Intel AVX is a general purpose architecture building upon SSE

Intel Advanced Vector Extensions Overview New instructions to boost FP performance Extend existing FP vector instructions to 256-bits Full details at: http://www.intel.com/software/avx Requires at least a recompile of code KEY FEATURES Wider Vectors Increased from 128 bit to 256 bit Enhanced Data Rearrangement Use the new 256 bit primitives to broadcast, mask loads and permute data Three and four Operands, Non Destructive Syntax Designed for efficiency and future extensibility Flexible unaligned memory access support BENEFITS Up to 2x peak FLOPs output with good power efficiency Organize, access and pull only necessary data more quickly and efficiently Fewer register copies, better register use for both vector and scalar code More opportunities to fuse load and compute operations 8 Intel Advanced Vector Extensions is the start of the compute density increase 8 Copyright 2010, Intel Corporation. All rights reserved

FIXED FUNCTION LOGIC MEMORY and I/O INTERFACES Intel Many Integrated Core (Intel MIC) Architecture Up to 32 Intel coherent Intel processor cores on 1 silicon die Implements all four salient architectural features of Intel CPUs x86 Cores, Coherent caches, SIMD, SMT threads Enables developers to scale applications forward to future Intel MIC products VECTOR IA CORE VECTOR IA CORE VECTOR IA CORE VECTOR IA CORE INTERPROCESSOR NETWORK COHERENT CACHE COHERENT CACHE COHERENT CACHE COHERENT CACHE INTERPROCESSOR NETWORK VECTOR IA CORE COHERENT CACHE COHERENT CACHE VECTOR IA CORE VECTOR IA CORE COHERENT CACHE COHERENT CACHE VECTOR IA CORE 9 9 Copyright 2010, Intel Corporation. All rights reserved

Intel MIC Architecture Co-Processor Instruction Decode Scalar Vector Unit Unit Scalar Vector Registers Registers L1 Icache & Dcache 256K L2 Cache Local Subset Interprocessor Ring Network The co-processor core features include: Scalar pipeline derived from the dual-issue Intel Pentium processor Short execution pipeline Fully coherent cache structure Significant modern enhancements such as multi-threading, 64-bit extensions, and sophisticated pre-fetching 4 execution threads per core Separate register sets per thread Supports IEEE standards for floating point arithmetic Fast access to its 256KB local subset of a coherent L2 cache 32KB instruction cache per core 32KB data cache for each core Enhanced x86 instructions set with: Over 100 new instructions, Wide vector processing operations Some specialized scalar instructions 3-operand, 16-wide vector processing unit (VPU) VPU executes integer, single-precision float, and double precision float instructions Inter-processor Network with: 1024 bits wide, bi-directional (512 bits in each direction) 10 Copyright 2010, Intel Corporation. All rights reserved 10

Program for Intel Architecture Today and be Intel Many Integrated Core Architecture Ready Single Source Multicore CPU Compilers Libraries, Parallel Models Multi-core Multicore CPU Manycore Intel MIC architecture co-processor Full Intel C / C++ / Fortran compilers and Intel Math Kernel Library & Intel Integrated Performance Primitives libraries Flexibility of an Intel architecture design allows tools, choice of programming models, and familiar languages Programming models that span multi-core Intel architecture and Intel MIC Architecture processors Performance acceleration Intel architecture ecosystem support 11 Copyright 2010, Intel Corporation. All rights reserved 11 Eliminate Need for Dual Programming Architecture

System Growth Moore s Law 12 Copyright 2010, Intel Corporation. All rights reserved

Looking for the Missing Middle 13 Copyright 2010, Intel Corporation. All rights reserved

Internal apps ISV apps Risk Tolerant Risk Averse HPC Arenas and Differentiators Entry Level/ Mid-Range (ELMR, < $250K) is ~50% revenue 2 Key Customer Values: maximize performance early adopters, innovators minimize power & cost manageability @ scale open solutions Key Customer Values: time to productivity maximize productivity application availability robustness of total solution true cost of ownership ease of acquisition, deployment, mgmt. HPC is bifurcated into two areas w/several sub-segments High End HPC and Volume HPC have different requirements 14 Copyright 2010, Intel Corporation. All rights reserved 1 Source: IDC HPC Qview Q409 2 Source: InterSect360 Research, Traditional HPC Total Market Model and Forecast, 2009

Implied Perspectives High End > $500K Volume < $500K ELMR < $250K Large Enterprise Small Medium Enterprise 15 Copyright 2010, Intel Corporation. All rights reserved

Reality? About two-thirds of ELMR-sized (<$250K) systems are upgrades or add-ons to larger systems 1 InterSect360 measures that: Of true ELMR systems, 20-25% go to users who also have larger (high-end) systems. so, only 10-15% of said systems go to ELMR users 2 IDC sees something similar, with 70% 3 of the <$500K going to the Workgroup, Department, Divisional segments. Needs further visibility/corroboration Large Enterprises SMEs 16 Copyright 2010, Intel Corporation. All rights reserved 1 Source: InterSect360 Research, HPC User Site Census: Lifecycles, 2009. 2 Source: InterSect360 Research, custom user study, 2009. 3 Source: IDC, personal comms, 2010

Institution Size A Notional Perspective Today s Market Segment Exceptions: National Assets All other considerations: Power Space Budget/ROI stakeholder buy-in 17 Copyright 2010, Intel Corporation. All rights reserved Systems Capacity

USERS Defining the Missing Middle Small Home to midsize and National office business labs use Product Internet/email Universities design Graphic CAD/CAM design NASA USGS/NWS Prediction Gaming Advanced Video production simulation DOD Low-end New CAD/CAM materials Traditional Computer Users National Opportunity: High-End the Missing Middle HPC Users TASK COMPLEXITY 18 Copyright 2010, Intel Corporation. All rights reserved From ncms.org

Identifying the Missing Middle In a few words, the missing middle is comprised of those institutions that do not use HPC and yet HPC would result in a net-positive ROI to them. Also includes those that use some HPC but not as much as they could 19 Copyright 2010, Intel Corporation. All rights reserved

Key Barriers The COC/IDC Reveal 1 report concluded that there are three major system barriers stalling HPC adoption: Lack of Application Software Lack of Sufficient Talent Cost constraints They noted that these were the same constraints identified four years prior 1,2 InterSect360 3 had a similar perspective; that cost is not the top barrier. You could give companies free HW and SW, and it wouldn t solve these problems: Political will to change a workflow and to build faith in simulation to supplement physical testing. Expertise and knowledge for using scalable systems, and Creation of digital models. 20 Copyright 2010, Intel Corporation. All rights reserved 1 Source: CoC/IDC Reveal report, 2008. 2 Source: CoC Study of US Industrial HPC Users, July 2004 3 Source: Addison Snell, InterSect360

21 Copyright 2010, Intel Corporation. All rights reserved Barriers Summarized

This is the Right Time Several key events are transpiring, thus making this the first time in history to successfully tackle the missing middle problem Parallel everywhere, and becoming more so each year Recent launch of the Intel Xeon Processor 5600 Series And the launch of Nehalem-EX Tick-Tock Everyone in DC focused on jobs A 20 th Century work force needs 21 st Century job skills International competitiveness increasingly defined by innovation as enabled by modeling and simulation Large-scale industry and USG are motivated to solve the problem The environment is aligned for action; but what action? 22 Copyright 2010, Intel Corporation. All rights reserved

Action on the Right Issues Processors and programming paradigms are not the right issues. We have more than enough performance now and on the roadmap for the foreseeable future Basic software tools are sufficient Application Vacuum It s a miss! Processor Performance 23 Copyright 2010, Intel Corporation. All rights reserved

Catching Their Attention Must make computational modeling and simulation: Easy to use Application Frameworks End-user specific infrastructure Deliver computational continuity Scaled use Seamless compatibility Affordable access models Easy to see ROI 24 Copyright 2010, Intel Corporation. All rights reserved

Putting the Pieces Together Need a modeling and simulation supply chain The missing middle is comprised of a few communities The actual business community needing that competitive edge The liaison community facilitating the tech transfer The application framework community developing relevant software environments The speculative innovation community aka academic research teams focused on local industry mod/sim needs 25 Copyright 2010, Intel Corporation. All rights reserved

Industry Resources R&D Resources National Digital Manufacturing Strategy Industry PIC National Labs TAA Centers Universities Digital Manufacturing R&D Ctrs DoD ManTech Ctrs HPC Centers Universities National Labs Universities Digital Manufacturing GAP TAA Centers Industry PIC Digital Manufacturing R&D Ctrs DoD ManTech Ctrs TAA Centers... Industry PIC... DoD ManTech Ctrs TAA Centers Manufacturers & Industry HPC Centers Digital Manufacturing R&D Ctrs Industry PIC TAA Centers MEP MEP MEP MEP MEP MEP MEP MEP MEP MEP Existing R&D Expertise Universities National Labs DoE Labs HPC Centers (i.e. OSC, NCSA, etc.) Proposed National Manufacturing Innovation Network Digital Manufacturing R&D Centers (academic focus) Industry Predictive Innovation Collaboration Centers (non-profit e.g. NCMS) Trade Adjustment Assistance Centers (TAAC) Approx. 14 National Centers Expand mission beyond trade impacted companies MEP s (NIST) 60+ National Centers New focus on Digital Manufacturing Focused Digital Manufacturing Training Community colleges, NAM, Manufacturing web portals 26 Copyright 2010, Intel Corporation. All rights reserved

Requires an Industry Effort This isn t something Intel can or should do alone It will require a concerted and determined effort on everyone s part It may require a USG managed agenda It won t happen over night History is in the making! 27 Copyright 2010, Intel Corporation. All rights reserved

Established to pursue solutions to the barriers facing the Missing Middle in US Manufacturing Transforming American Manufacturing for Economic Growth Comprised of more than 35 entities, from: Computer OEMs ISVs Academia Manufacturing National Labs Recent results: America COMPETES Renewal language for IAWG Further analysis: results released via NCMS on 9/30/2010 Industry Recognition Initiative launched at IDC HPC User Forum on 9/14/2010 28 Copyright 2010, Intel Corporation. All rights reserved

Institution Size Is there more? Today s Market Segment Exceptions: National Assets The Cloud 29 Copyright 2010, Intel Corporation. All rights reserved Systems Capacity

Definition of Success When the middle isn t missing 30 Copyright 2010, Intel Corporation. All rights reserved

Thank you! HPC @ Intel