Maximize Application Performance On the Go and In the Cloud with OpenCL* on Intel Architecture

1 Maximize Application Performance On the Go and In the Cloud with OpenCL* on Intel Architecture
Presenters: Arnon Peleg (Intel), Ben Ashbaugh (Intel), Dave Helmly (Adobe)

2 Legal INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel processor numbers are not a measure of performance. 
Processor numbers differentiate features within each processor family, not across different processor families: Go to: Learn About Intel Processor Numbers Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps. Performance claims: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to Iris graphics is available on select systems. Consult your system manufacturer. Copyright 2013 Intel Corporation. All rights reserved. Intel, Intel Inside, the Intel logo, Centrino, Intel Core, Intel Atom, Pentium, VTune, and Ultrabook are trademarks of Intel Corporation in the United States and other countries. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. *Other names and brands may be claimed as the property of others.

3 Today: Intel and OpenCL* sessions
- 12:15 PM - 1:45 PM: Maximize Application Performance On the Go and in the Cloud with OpenCL* on Intel Architecture (Intel, Adobe Systems Inc.)
- 2:00 PM - 3:00 PM: Faster Video Creation with Higher Productivity Using Intel Developer Tools and OpenCL* (Intel)
- 3:15 PM - 4:15 PM: Journey of Pixels in Adobe Photoshop from CPU, GPU to Cloud on Intel HD Graphics (Intel, Adobe Systems Inc.)

4 This Session: Agenda
- Introduction to the Intel SDK for OpenCL* Applications. Presenter: Arnon Peleg (Intel), Product Manager, Intel SDK for OpenCL Applications, Intel Corporation
- Optimize OpenCL* applications for Intel Iris Graphics. Presenter: Ben Ashbaugh (Intel), Sr. Graphics Compiler Engineer, Intel Corporation
- Accelerating video production with OpenCL* and Intel Iris Graphics. Presenter: Dave Helmly, Sr. Manager, Solutions Consulting Pro Video/Audio Americas, Adobe* Systems Inc.

5 Introduction to the Intel SDK for OpenCL* Applications
Arnon Peleg, Intel Corporation

6 The Question Is: How to get maximum performance out of the platform?
Better Together:
- Use all resources (CPU, Graphics, Media)
- Target the right task on the right device
Multicore CPU: task-parallel or irregular workloads are better suited to a CPU, e.g. complex game and graphics engine graphs, variable bitrate compression.
Programmable Graphics: highly data-parallel tasks are better suited to Intel Processor Graphics, e.g. film and image post-processing filters, graphics, video analytics.
Be the conductor of our orchestra: use a dynamic set of instruments, whose ranges overlap and complement each other. When these instruments come together, your composition has even more power to move your audience.

7 What is Programmability With OpenCL*?
A standards-based environment to write portable parallel code for a diverse mix of multi-core CPUs, Processor Graphics, and coprocessors.
Unified code base (CPU and Processor Graphics):
    kernel void dp_mul(global const float *a, global const float *b, global float *c)
    {
        int id = get_global_id(0);
        c[id] = a[id] * b[id];
    } // execute over n work-items
Maximize platform performance with Intel Iris Graphics and Intel HD Graphics for Visual Computing usages: Image Processing, Video Editing & Playback, Games, Perceptual Computing.
Standards-based, by Khronos*, with Intel participation.

8 The BIG Idea Behind OpenCL*
OpenCL* execution model:
- Define an N-dimensional computation domain
- Choose a target device
- Execute a kernel at each point in the computation domain
Traditional loops:
    void trad_mul(int n, const float *a, const float *b, float *c)
    {
        int i;
        for (i = 0; i < n; i++)
            c[i] = a[i] * b[i];
    }
Data Parallel OpenCL:
    kernel void dp_mul(global const float *a, global const float *b, global float *c)
    {
        int id = get_global_id(0);
        c[id] = a[id] * b[id];
    } // execute over n work-items
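The relationship between the two forms can be sketched in plain C, with no OpenCL runtime involved. The loop below plays the role of the runtime: it visits every point of a 1-D computation domain and calls the kernel body once per point. The names `dp_mul_item` and `run_dp_mul` are hypothetical and exist only for this illustration; a real device runs many of these calls in parallel and in no particular order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel body: one call handles one work-item.
 * The id parameter plays the role of get_global_id(0). */
static void dp_mul_item(size_t id, const float *a, const float *b, float *c)
{
    c[id] = a[id] * b[id];
}

/* Emulates a 1-D launch over n work-items by calling the body once per
 * point of the computation domain. A device would run these in parallel. */
void run_dp_mul(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t id = 0; id < n; id++)
        dp_mul_item(id, a, b, c);
}
```

Because no work-item depends on another, the order in which the calls happen does not change the result, which is what makes the kernel form safe to run in parallel.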

9 Intel SDK for OpenCL* Applications 2013
A comprehensive software development environment for OpenCL* applications, supporting the OpenCL 1.2 Full Profile on 3rd and 4th Generation Intel Core Processors with the Intel Iris Graphics and Intel HD Graphics product family, Intel Xeon Processors, and Intel Xeon Phi Coprocessors.
FREE download at intel.com/software/opencl

10 Intel SDK for OpenCL* Applications XE 2013: What is it?
- OpenCL* runtime and compiler for Intel Processors (CPU) and Intel Xeon Phi Coprocessors
- Linux OS support: Red Hat EL 6.x, SUSE SLES 11.x
- OpenCL headers and libs
- Source code samples
- Product documentation: User Guide, Optimization Guide
- Development tools: offline build and analysis tools, debugger for OpenCL kernels on CPU, profiling with Intel VTune Amplifier XE
Use the Intel SDK for OpenCL* Applications XE 2013 for Intel Xeon Phi support on Linux*.

11 Intel SDK for OpenCL* Applications 2013: What is it?
- OpenCL* 1.2 support for 3rd and future 4th Generation Intel Core Processors
- Accelerated performance with the Intel Iris Graphics and Intel HD Graphics family
- Enhanced graphics and media API interoperability: DirectX*, OpenGL*, and the Intel Media SDK
- Support for Microsoft Windows 7* and Windows 8* operating systems
- Developer tools to build, debug, and tune OpenCL applications
Use the Intel SDK for OpenCL* Applications 2013 to maximize the performance of Intel Core Processors.

12 The Intel SDK for OpenCL* Applications 2013 Software Stack
For Intel Core Processors with Intel Iris Graphics and Intel HD Graphics
[Stack diagram, top to bottom]
- Applications
- Develop (SDK components): Microsoft* Visual Studio* integration; SDK tools (Kernel Debugger, Kernel Builder); developer environment (libs and headers); profiling tools integration
- Online resources: code samples, Optimization Guide, tech articles and videos
- Run: Intel HD Graphics Driver with OpenCL* 1.2 support for the CPU and the Processor Graphics
- 3rd and 4th Generation Intel Core Processors with Intel Iris Graphics and Intel HD Graphics

13 Intel SDK for OpenCL* Applications Support Matrix
Know the processors and operating systems you use, and download the SDK you need.
Domains: Visual Computing Domain (Version 2013), Data Center Domain (Version XE 2013)
Supported processors: Intel HD & Iris Graphics, Intel Core Processor (CPU), Intel Xeon Processor, Intel Xeon Phi Coprocessor
Operating systems: Windows 7*, Windows 8*, Red Hat* Linux*, SUSE* Linux*

14 The Intel SDK for OpenCL* Applications Online Resource
The SDK section of the Intel Developer Zone is a one-stop shop for resources, support, and information for OpenCL* developers: free downloads, code samples, tech articles, case studies, forums and support, beta programs.

15 Optimize OpenCL* applications for Intel Iris Graphics
Ben Ashbaugh (Intel)

16 Agenda
- Understanding Occupancy: How Intel Iris Graphics executes OpenCL* Kernels
- Memory Matters: Host to Device, Device Access
- Compute Characteristics: Maximizing GFlops
- Summary / Questions

17 Agenda: Understanding Occupancy. How Intel Iris Graphics executes OpenCL* Kernels

18 Intel Iris Graphics Architecture Overview
[Block diagram: Command Streamer (CS), Vertex Fetch (VF), Vertex Shader (VS), Hull Shader (HS), Tessellator, Domain Shader (DS), Geometry Shader (GS), Stream Out (SOL), Clip/Setup, Video Front End (VFE), Video Quality Engine, Multi-Format CODEC, Blitter, Display, and Thread Dispatch; Slice 0 and Slice 1, each with Rasterizer / Depth, L3$, Pixel Ops, Render$, and Depth$; Sub Slice 0 to Sub Slice 3, each with L1 IC$, 3D Sampler, Media Sampler, Data Port, and Tex$; Ring Bus / LLC / Memory.]

19 Intel Iris Graphics Architecture Overview: Global Assets
[Same diagram, highlighting the global assets: Command Streamer, Video Quality Engine, Multi-Format CODEC, Blitter, Display, Thread Dispatch.]

20 Intel Iris Graphics Architecture Overview: Slice Common
[Same diagram, highlighting the slice common resources: L3 Cache, Shared Local Memory.]

21 Intel Iris Graphics Architecture Overview: Sub Slice
[Same diagram, highlighting a sub slice: Execution Units, Samplers and Data Port, Instruction and Texture Caches.]

22 Intel Iris Graphics Architecture Building Blocks
OpenCL* Kernels run on an Execution Unit (EU):
- Each EU is a multi-threaded SIMD processor
- Up to 7 threads per EU
- 128 x 8 x 32-bit registers per thread
- Up to 8, 16, or 32 OpenCL* work items per thread (compiler-controlled): SIMD8, SIMD16, SIMD32
- SIMD8 leaves more registers per work item; SIMD16 and SIMD32 give better efficiency
[Diagram: one EU holding Thread 0 through Thread 6]

23 Intel Iris Graphics Architecture Building Blocks
OpenCL* Work Groups run on a Sub Slice:
- 10 EUs per Sub Slice
- Texture Sampler (Images)
- Data Port (Buffers)
- Instruction and Texture Caches
OpenCL* Work Groups may run on multiple EU threads, on multiple EUs!

24 Intel Iris Graphics Architecture Building Blocks
Two Sub Slices make a Slice.
Shared resources (Slice Common): L3 Cache + Shared Local Memory, Barriers
Intel Iris Graphics has two Slices:
- 2 x 2 = 4 Sub Slices
- 4 x 10 = 40 EUs
- Up to 40 x 7 = 280 EU threads
- Up to 8960 OpenCL* work items in flight!
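The figure of 8960 work items follows from the counts on the preceding slides when every EU thread is compiled SIMD32. A small sketch of that arithmetic, using only numbers quoted in the deck (the function and constant names are made up for this example):

```c
#include <assert.h>

/* Figures quoted on the slides for Intel Iris Graphics. */
enum {
    SLICES = 2,
    SUB_SLICES_PER_SLICE = 2,
    EUS_PER_SUB_SLICE = 10,
    THREADS_PER_EU = 7
};

/* Total hardware threads across all EUs. */
int eu_thread_count(void)
{
    return SLICES * SUB_SLICES_PER_SLICE * EUS_PER_SUB_SLICE * THREADS_PER_EU;
}

/* simd_width is the number of work items per EU thread: 8, 16, or 32. */
int work_items_in_flight(int simd_width)
{
    assert(simd_width == 8 || simd_width == 16 || simd_width == 32);
    return eu_thread_count() * simd_width;
}
```

With SIMD8 the same machine holds 280 x 8 = 2240 work items, so the compiled SIMD width directly scales how much work is needed to fill the device.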

25 How Intel Iris Graphics Runs OpenCL*
1. Divide Into Work Groups

26 How Intel Iris Graphics Runs OpenCL*
2. Divide Each Work Group Into EU Threads

27 How Intel Iris Graphics Runs OpenCL*
3. Launch EU Threads for the Work Group Onto a Sub Slice
- Repeat for each Work Group
- There must be enough room in the Sub Slice for all EU threads of the Work Group
- If there is not enough room in any Sub Slice, the EU threads must wait

28 Occupancy
Goal: Use All Machine Resources. This is harder than it sounds! Many factors to consider:
1. Launch Enough Work
- One thread is sufficient to prevent an EU from going idle
- Too few EU threads can result in an EU being stalled
- More EU threads give better latency coverage, which keeps an EU active

29 Occupancy Intel VTune Amplifier XE

30 Occupancy (continued)
2. Don't Waste SIMD Lanes: use an optimal Local Work Size
- Good: Query for the compiled SIMD size: CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE
- Occasionally helpful: Compile for a specific local work size (8, 16, or 32): __attribute__((reqd_work_group_size(X, Y, Z)))
- Best: Let the driver pick (Local Work Size == NULL). Ideal for kernels with no barriers or shared local memory.
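To see why the local work size matters, the sketch below counts the SIMD lanes that sit idle when a work group is split into EU threads. It is a simplified model, not driver code: `simd_width` stands in for the value a program would query with CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds n up to the next multiple of m (m > 0). */
size_t round_up(size_t n, size_t m)
{
    assert(m > 0);
    return (n + m - 1) / m * m;
}

/* SIMD lanes left idle when a work group of local_size work items is
 * split into EU threads of simd_width work items each. */
size_t wasted_simd_lanes(size_t local_size, size_t simd_width)
{
    return round_up(local_size, simd_width) - local_size;
}
```

For example, a local work size of 20 on a SIMD16 kernel needs two EU threads and leaves 12 of the 32 lanes idle, while a local work size of 32 wastes none. The same `round_up` helper can pad a global work size to a multiple of the chosen local work size, which OpenCL 1.2 requires when the local size is given explicitly.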

31 Occupancy (continued)
More subtle factors:
3. Barriers
- 16 barriers per sub slice
- Can be a limiting factor for very small local work groups
4. Shared Local Memory
- 64KB shared local memory per sub slice
- Can be a limiting factor for kernels that use lots of shared local memory
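The three per-sub-slice limits (EU threads, barriers, shared local memory) can be combined into a rough estimate of how many work groups fit on one sub slice at a time. The model below is an illustration built from the slide figures only. It assumes each work group that uses barriers holds exactly one of the 16 barriers, and it ignores other limits such as register pressure; the function name and parameters are hypothetical.

```c
#include <assert.h>

enum {
    EU_THREADS_PER_SUB_SLICE = 10 * 7,    /* 10 EUs x 7 threads each */
    BARRIERS_PER_SUB_SLICE   = 16,
    SLM_BYTES_PER_SUB_SLICE  = 64 * 1024
};

static int min_int(int a, int b) { return a < b ? a : b; }

/* Simplified model: number of work groups resident on one sub slice.
 * Assumes one barrier per work group when uses_barrier is nonzero. */
int work_groups_per_sub_slice(int local_size, int simd_width,
                              int slm_bytes_per_group, int uses_barrier)
{
    assert(local_size > 0 && simd_width > 0 && slm_bytes_per_group >= 0);
    int threads_per_group = (local_size + simd_width - 1) / simd_width;
    int limit = EU_THREADS_PER_SUB_SLICE / threads_per_group;
    if (uses_barrier)
        limit = min_int(limit, BARRIERS_PER_SUB_SLICE);
    if (slm_bytes_per_group > 0)
        limit = min_int(limit, SLM_BYTES_PER_SUB_SLICE / slm_bytes_per_group);
    return limit;
}
```

Under this model a 16-item work group with a barrier (one SIMD16 thread) is capped at 16 groups, so only 16 of the 70 EU threads are used, which matches the slide's warning about very small local work groups. A 64-item group that uses 16KB of shared local memory is capped at 4 groups by memory rather than by threads.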

32 Agenda: Memory Matters. Host to Device

33 Optimizing Host to Device Transfers
Host (CPU) and Device (GPU) share the same physical memory.
For OpenCL* buffers: no transfer needed (zero copy)!
- Allocate system memory aligned to a cache line (64 bytes)
- Create the buffer with the system memory pointer and CL_MEM_USE_HOST_PTR
- Use clEnqueueMapBuffer() to access the data
For OpenCL* images: currently tiled in device memory, so a transfer is required.
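The first step, allocating cache-line-aligned system memory, can be sketched in portable C as below. This is one possible approach, not the SDK's own helper: it over-allocates with `malloc` and rounds the pointer up to a 64-byte boundary. The `host_block` type and both function names are hypothetical, and padding the size to a multiple of 64 bytes is an extra assumption that the slide does not state.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64u

/* Keeps the pointer returned by malloc so the block can be freed later. */
typedef struct {
    void  *base;     /* pass this to free() */
    void  *aligned;  /* 64-byte aligned pointer for the OpenCL runtime */
    size_t size;     /* usable bytes at 'aligned', a multiple of 64 */
} host_block;

/* Allocates at least 'bytes' bytes starting on a 64-byte boundary.
 * On failure, all fields of the returned block are NULL / 0. */
host_block alloc_cache_aligned(size_t bytes)
{
    host_block blk = { NULL, NULL, 0 };
    size_t padded = (bytes + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    blk.base = malloc(padded + CACHE_LINE - 1);
    if (blk.base != NULL) {
        uintptr_t p = ((uintptr_t)blk.base + CACHE_LINE - 1)
                      & ~(uintptr_t)(CACHE_LINE - 1);
        blk.aligned = (void *)p;
        blk.size = padded;
    }
    return blk;
}

void free_cache_aligned(host_block *blk)
{
    assert(blk != NULL);
    free(blk->base);
    blk->base = blk->aligned = NULL;
    blk->size = 0;
}
```

In a real program, `blk.aligned` and `blk.size` would then be passed to clCreateBuffer() together with CL_MEM_USE_HOST_PTR, and the host would read or write the data between clEnqueueMapBuffer() and clEnqueueUnmapMemObject() calls. The host memory must stay allocated for as long as the buffer object exists.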

34 Operating on Buffers as Images
Intel Iris Graphics supports cl_khr_image2d_from_buffer, a new OpenCL* 1.2 extension:
- Treat data as a buffer for some kernels, as an image for others
- Some restrictions for zero copy: buffer size, image pitch
[Diagram: the same data (0x123, 0x456, 0x789) viewed as a Buffer and as an Image]

35 Interop with Graphics and Media APIs / SDKs
Intel Iris Graphics supports many Graphics and Media interop extensions:
- cl_khr_dx9_media_sharing (includes DXVA for Intel Media SDK)
- cl_khr_d3d10_sharing
- cl_khr_d3d11_sharing
- cl_khr_gl_sharing
- cl_khr_gl_depth_images
- cl_khr_gl_event
- cl_khr_gl_msaa_sharing
Use Graphics API / SDK assets in OpenCL* with no copies!

36 Agenda: Memory Matters. Device Access

37 Intel Iris Graphics Cache Hierarchy
[Diagram: EU, Sampler L1, Sampler L2, L3, LLC, EDRAM, DRAM. Images are read through Sampler L1 and Sampler L2; buffers go through L3.]
- L3: 256KB/slice
- LLC: 2-8MB/package (shared w/ CPU)
- EDRAM (non-inclusive victim cache): 128MB/package (Intel Iris Pro 5200)

38 global and constant Memory
Global Memory accesses go through the L3 Cache:
- An L3 cache line is 64 bytes
- EU thread accesses to the same L3 cache line are collapsed
- The order of data within a cache line does not matter
- Bandwidth is determined by the number of cache lines accessed
- Maximum bandwidth: 64 bytes / clock / sub slice
Good: Load at least 32 bits of data at a time, starting from a 32-bit aligned address.
Best: Load 4 x 32 bits of data at a time, starting from a cache line aligned address.
Loading more than 4 x 32 bits of data is not beneficial.

39 Global and Constant Memory Access Examples
1. x = data[ get_global_id(0) ]: one cache line, full bandwidth
2. x = data[ n - get_global_id(0) ]: reverse order, full bandwidth
3. x = data[ get_global_id(0) + 1 ]: offset, two cache lines, half bandwidth
4. x = data[ get_global_id(0) * 2 ]: strided, half bandwidth
5. x = data[ get_global_id(0) * 16 ]: very strided, worst case
[Diagrams: mapping of Global IDs to Cache Line n - 1, Cache Line n, and Cache Line n + 1 for each example]
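The bandwidth figures in these examples come from counting how many 64-byte cache lines one EU thread touches. The sketch below reproduces that count on the host. It assumes a SIMD16 thread (work items 0 to 15), 4-byte float elements, and an array that starts on a cache line boundary; the `idx_*` functions and `cache_lines_touched` are hypothetical names, and `idx_reverse` uses n = 15.

```c
#include <assert.h>
#include <stddef.h>

#define SIMD_WIDTH 16   /* assumed: one SIMD16 EU thread */
#define LINE_BYTES 64   /* L3 cache line size from the slide */
#define ELEM_BYTES 4    /* one 32-bit float per work item */

typedef size_t (*index_fn)(size_t id);

size_t idx_linear(size_t id)   { return id; }
size_t idx_reverse(size_t id)  { return 15 - id; }
size_t idx_offset(size_t id)   { return id + 1; }
size_t idx_stride2(size_t id)  { return id * 2; }
size_t idx_stride16(size_t id) { return id * 16; }

/* Counts the distinct cache lines touched when work items 0..15 each
 * load one element at data[pattern(id)]. */
int cache_lines_touched(index_fn pattern)
{
    size_t lines[SIMD_WIDTH];
    int count = 0;
    assert(pattern != NULL);
    for (size_t id = 0; id < SIMD_WIDTH; id++) {
        size_t line = pattern(id) * ELEM_BYTES / LINE_BYTES;
        int seen = 0;
        for (int k = 0; k < count; k++)
            if (lines[k] == line)
                seen = 1;
        if (!seen)
            lines[count++] = line;
    }
    return count;
}
```

Under these assumptions the linear and reversed patterns touch 1 line, the offset and stride-2 patterns touch 2 lines (half bandwidth), and the stride-16 pattern touches 16 lines, one per work item.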

40 local Memory Accesses
Local Memory accesses also go through the L3 Cache!
Key difference: Local Memory is banked
- Banked at a DWORD granularity, 16 banks
- Bandwidth is determined by the number of bank conflicts
- Maximum bandwidth: still 64 bytes / clock / sub slice
- Supports more access patterns with full bandwidth than Global Memory
- No bank conflicts: full bandwidth
- Reading from the same address in a bank: full bandwidth

41 Local Memory Access Examples
1. x = data[ get_global_id(0) + 1 ]: unique banks, full bandwidth
2. x = data[ get_global_id(0) & ~1 ]: same address read, full bandwidth
3. x = data[ get_global_id(0) * 2 ]: strided, half bandwidth
4. x = data[ get_global_id(0) * 16 ]: very strided, worst case
5. x = data[ get_global_id(0) * 17 ]: full bandwidth!
[Diagrams: mapping of Global IDs to Banks for each example]
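The five cases above can be checked with a small host-side model of the banking rule. It assumes a SIMD16 thread, that consecutive DWORDs map to consecutive banks, and that the array starts at bank 0; none of these details beyond "16 banks at DWORD granularity" come from the slides. Reads of the same address count once, since the slide says they do not conflict. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SIMD_WIDTH 16   /* assumed: one SIMD16 EU thread */
#define NUM_BANKS  16   /* banks at DWORD granularity, from the slide */

typedef size_t (*index_fn)(size_t id);

size_t idx_offset(size_t id)   { return id + 1; }
size_t idx_pairs(size_t id)    { return id & ~(size_t)1; }
size_t idx_stride2(size_t id)  { return id * 2; }
size_t idx_stride16(size_t id) { return id * 16; }
size_t idx_stride17(size_t id) { return id * 17; }

/* Returns the largest number of distinct DWORD addresses that fall into
 * a single bank; 1 means the access is conflict-free. */
int worst_bank_conflict(index_fn pattern)
{
    size_t seen[NUM_BANKS][SIMD_WIDTH];
    int distinct[NUM_BANKS] = { 0 };
    int worst = 0;
    assert(pattern != NULL);
    for (size_t id = 0; id < SIMD_WIDTH; id++) {
        size_t addr = pattern(id);        /* DWORD index into the array */
        size_t bank = addr % NUM_BANKS;
        int dup = 0;
        for (int k = 0; k < distinct[bank]; k++)
            if (seen[bank][k] == addr)
                dup = 1;
        if (!dup)
            seen[bank][distinct[bank]++] = addr;
        if (distinct[bank] > worst)
            worst = distinct[bank];
    }
    return worst;
}
```

The stride-17 case is the interesting one: 17 mod 16 is 1, so each work item lands in its own bank. This is why padding a 16-wide row in local memory to 17 elements is a common way to avoid conflicts on column accesses.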

42 private Memory
The compiler can usually allocate Private Memory in the Register File, even if Private Memory is dynamically indexed: good performance.
[Diagram: private int a[100] and private int b[100] for Work Item 0, Work Item 1, ..., Work Item n]
Fallback: Private Memory is allocated in Global Memory. Accesses are very strided: bad performance.
[Diagram: private int c[200] for Work Item 0, Work Item 1, ..., Work Item n, across EU Thread n-1, EU Thread n, and EU Thread n+1]

43 Agenda: Compute Characteristics. Maximizing GFlops

44 ISA
SIMT ISA with Predication and Branching
- Divergent code executes both branches
- Reduced SIMD efficiency
Example code:
    this();
    if ( x )
        that();
    else
        another();
    finish();
[Diagrams: SIMD lane activity over time for "x sometimes true" and "x never true"]
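The cost of divergence in the example can be sketched with a simple timing model: an EU thread has to execute a branch whenever at least one of its lanes takes it, with the other lanes masked off. The model below assumes a SIMD8 thread and made-up time units for each call; `thread_time` is a hypothetical helper, not a measurement of real hardware.

```c
#include <assert.h>
#include <stddef.h>

#define SIMD_WIDTH 8    /* assumed: one SIMD8 EU thread */

/* Time (in arbitrary units) for one EU thread to run
 *   this(); if (x) that(); else another(); finish();
 * A branch is executed by the whole thread if any lane takes it. */
int thread_time(const int x[SIMD_WIDTH],
                int t_this, int t_that, int t_another, int t_finish)
{
    int any_true = 0, any_false = 0;
    assert(x != NULL);
    for (int lane = 0; lane < SIMD_WIDTH; lane++) {
        if (x[lane])
            any_true = 1;
        else
            any_false = 1;
    }
    return t_this + (any_true ? t_that : 0)
                  + (any_false ? t_another : 0) + t_finish;
}
```

With both branches costing 4 units and the other calls 1 unit each, a thread where x is never true takes 6 units, while a thread where x is true in even a single lane takes 10, because both branches run back to back.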

45 Compute GFlops
EUs have 2 x 4-wide vector ALUs. The second ALU has limitations:
- Subset of instructions: add, mov, mad, mul, cmp
- The instruction must come from another EU thread
- Only float operands!
Peak GFlops: #EUs x ( 2 x 4-wide ALUs ) x ( MUL + ADD ) x Clock Rate
For Intel Iris Pro 5200: 40 x 8 x 2 x 1.3 = 832 GFlops!
4th Generation Intel Core Processor (CPU + Intel Iris Graphics): >1 TFlop!
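The peak figure is a straight product of the terms in the formula. A small sketch, with the clock rate expressed in GHz so that the result comes out in GFlops (the function name is made up for this example):

```c
#include <assert.h>

/* Peak GFlops = #EUs x (2 x 4-wide ALUs) x (MUL + ADD) x clock rate in GHz. */
double peak_gflops(int eu_count, double clock_ghz)
{
    const int lanes_per_eu = 2 * 4;   /* two 4-wide vector ALUs */
    const int flops_per_mad = 2;      /* one multiply plus one add */
    assert(eu_count > 0 && clock_ghz > 0.0);
    return eu_count * lanes_per_eu * flops_per_mad * clock_ghz;
}
```

Calling `peak_gflops(40, 1.3)` gives approximately 832, the slide's number for Intel Iris Pro 5200. Reaching it requires every lane of both ALUs to issue a mad on every clock, which is why the following advice centers on mad / fma and float operands.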

46 Maximizing Compute Performance
- Use mad() / fma(): either explicitly with built-ins, or via -cl-mad-enable
- Use floats wherever possible to maximize co-issue
- Avoid long and size_t data types
- Prefer float over int, if possible
- Using short data types may improve performance
- Trade accuracy for speed: native built-ins, -cl-fast-relaxed-math. Often good enough for graphics.

47 Agenda: Summary / Questions

48 Summary
Maximize Occupancy
- Choose a good Local Work Size
- Or, let the driver choose (Local Work Size == NULL)
Avoid Host-to-Device Transfers
- Create buffers with CL_MEM_USE_HOST_PTR
Access Device Memory Efficiently
- Minimize cache lines for global memory
- Minimize bank conflicts for local memory
Maximize Compute
- Avoid divergent branches
- Use mad / fma and float data when possible

49 Questions / Acknowledgements
This presentation would not have been possible without material and review comments from many people. Thank you!
Murali Sundaresan, Sushma Rao, Aaron Kunze, Tom Craver, Brijender Bharti, Rami Jiossy, Michal Mrozek, Jay Rao, Pavan Lanka, Adam Lake, Arnon Peleg, Raun Krisch, Berna Adalier

50 Additional Resources
- Intel SDK for OpenCL* Applications 2013
- Intel OpenCL* Optimization Guide
- Intel VTune Amplifier XE 2013
- Intel Graphics Performance Analyzers

51 Accelerating video production with OpenCL* and Intel Iris Graphics
Dave Helmly, Sr. Manager, Solutions Consulting Pro Video/Audio Americas, Adobe* Systems Inc.

52 Intel is Hiring!
We want to work with you! Head to our booth (201) to hear about our exciting opportunities!

53 Coming up next
2:00 p.m. - 3:00 p.m.: Faster Content Creation with Higher Productivity using Intel Developer Tools and OpenCL*
Presenters: Arnon Peleg (Intel), Raghu Muthyalampalli (Intel)
For more information:

54

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual

Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual Intel Media Server Studio - Metrics Monitor (v1.1.0) Reference Manual Overview Metrics Monitor is part of Intel Media Server Studio 2015 for Linux Server. Metrics Monitor is a user space shared library

More information

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze

Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Power Benefits Using Intel Quick Sync Video H.264 Codec With Sorenson Squeeze Whitepaper December 2012 Anita Banerjee Contents Introduction... 3 Sorenson Squeeze... 4 Intel QSV H.264... 5 Power Performance...

More information

Intel Media SDK Library Distribution and Dispatching Process

Intel Media SDK Library Distribution and Dispatching Process Intel Media SDK Library Distribution and Dispatching Process Overview Dispatching Procedure Software Libraries Platform-Specific Libraries Legal Information Overview This document describes the Intel Media

More information

The Transition to PCI Express* for Client SSDs

The Transition to PCI Express* for Client SSDs The Transition to PCI Express* for Client SSDs Amber Huffman Senior Principal Engineer Intel Santa Clara, CA 1 *Other names and brands may be claimed as the property of others. Legal Notices and Disclaimers

More information

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,

More information

2013 Intel Corporation

2013 Intel Corporation 2013 Intel Corporation Intel Open Source Graphics Programmer s Reference Manual (PRM) for the 2013 Intel Core Processor Family, including Intel HD Graphics, Intel Iris Graphics and Intel Iris Pro Graphics

More information

Vendor Update Intel 49 th IDC HPC User Forum. Mike Lafferty HPC Marketing Intel Americas Corp.

Vendor Update Intel 49 th IDC HPC User Forum. Mike Lafferty HPC Marketing Intel Americas Corp. Vendor Update Intel 49 th IDC HPC User Forum Mike Lafferty HPC Marketing Intel Americas Corp. Legal Information Today s presentations contain forward-looking statements. All statements made that are not

More information

Specification Update. January 2014

Specification Update. January 2014 Intel Embedded Media and Graphics Driver v36.15.0 (32-bit) & v3.15.0 (64-bit) for Intel Processor E3800 Product Family/Intel Celeron Processor * Release Specification Update January 2014 Notice: The Intel

More information

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture

Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture White Paper Intel Xeon processor E5 v3 family Intel Xeon Phi coprocessor family Digital Design and Engineering Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture Executive

More information

Intel X38 Express Chipset Memory Technology and Configuration Guide

Intel X38 Express Chipset Memory Technology and Configuration Guide Intel X38 Express Chipset Memory Technology and Configuration Guide White Paper January 2008 Document Number: 318469-002 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information

AMD GPU Architecture. OpenCL Tutorial, PPAM 2009. Dominik Behr September 13th, 2009

AMD GPU Architecture. OpenCL Tutorial, PPAM 2009. Dominik Behr September 13th, 2009 AMD GPU Architecture OpenCL Tutorial, PPAM 2009 Dominik Behr September 13th, 2009 Overview AMD GPU architecture How OpenCL maps on GPU and CPU How to optimize for AMD GPUs and CPUs in OpenCL 2 AMD GPU

More information

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015

FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 FLOATING-POINT ARITHMETIC IN AMD PROCESSORS MICHAEL SCHULTE AMD RESEARCH JUNE 2015 AGENDA The Kaveri Accelerated Processing Unit (APU) The Graphics Core Next Architecture and its Floating-Point Arithmetic

More information

Next Generation GPU Architecture Code-named Fermi

Next Generation GPU Architecture Code-named Fermi Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time

More information

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics GPU Architectures A CPU Perspective Derek Hower AMD Research 5/21/2013 Goals Data Parallelism: What is it, and how to exploit it? Workload characteristics Execution Models / GPU Architectures MIMD (SPMD),

More information

Intel Core TM i3 Processor Series Embedded Application Power Guideline Addendum

Intel Core TM i3 Processor Series Embedded Application Power Guideline Addendum Intel Core TM i3 Processor Series Embedded Application Power Guideline Addendum July 2012 Document Number: 327705-001 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information

Intel Media Server Studio Professional Edition for Windows* Server

Intel Media Server Studio Professional Edition for Windows* Server Intel Media Server Studio 2015 R3 Professional Edition for Windows* Server Release Notes Overview What's New System Requirements Installation Installation Folders Known Limitations Legal Information Overview

More information

Monte Carlo Method for Stock Options Pricing Sample

Monte Carlo Method for Stock Options Pricing Sample Monte Carlo Method for Stock Options Pricing Sample User's Guide Copyright 2013 Intel Corporation All Rights Reserved Document Number: 325264-003US Revision: 1.0 Document Number: 325264-003US Intel SDK

More information

Intel Retail Client Manager

Intel Retail Client Manager Intel Retail Client Manager Frequently Asked Questions June 2014 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO

More information

Radeon HD 2900 and Geometry Generation. Michael Doggett

Radeon HD 2900 and Geometry Generation. Michael Doggett Radeon HD 2900 and Geometry Generation Michael Doggett September 11, 2007 Overview Introduction to 3D Graphics Radeon 2900 Starting Point Requirements Top level Pipeline Blocks from top to bottom Command

More information

Intel HTML5 Development Environment. Tutorial Test & Submit a Microsoft Windows Phone 8* App (BETA)

Intel HTML5 Development Environment. Tutorial Test & Submit a Microsoft Windows Phone 8* App (BETA) Intel HTML5 Development Environment Tutorial Test & Submit a Microsoft Windows Phone 8* App v1.00 : 04.09.2013 Legal Information INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS.

More information

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child Introducing A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child Bio Tim Child 35 years experience of software development Formerly VP Oracle Corporation VP BEA Systems Inc.

More information

Intel 965 Express Chipset Family Memory Technology and Configuration Guide

Intel 965 Express Chipset Family Memory Technology and Configuration Guide Intel 965 Express Chipset Family Memory Technology and Configuration Guide White Paper - For the Intel 82Q965, 82Q963, 82G965 Graphics and Memory Controller Hub (GMCH) and Intel 82P965 Memory Controller

More information

Intel Data Direct I/O Technology (Intel DDIO): A Primer >

Intel Data Direct I/O Technology (Intel DDIO): A Primer > Intel Data Direct I/O Technology (Intel DDIO): A Primer > Technical Brief February 2012 Revision 1.0 Legal Statements INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information
