FASTER PASSWORD RECOVERY WITH MODERN GPUs
2 FASTER PASSWORD RECOVERY WITH MODERN GPUs
Andrey Belenko, Security Researcher, ElcomSoft Co. Ltd.
3 WHO ARE WE
- Founded in 1990
- Privately owned
- Doing password recovery (software) since 1998
- HQ and development in Moscow, Russia
- Brought GPUs to password recovery in
- US patents issued, more in queue
  - 2 are about GPU-accelerated password recovery
4 WHO NEEDS PASSWORD RECOVERY?
- Ordinary users: passwords of their own
- IT departments: passwords of the employees
- Security auditors, consultants and penetration testers: customer/contractor passwords
- Law enforcement & government agencies: passwords of suspects
- Hackers usually don't!
5 WHY SPEED COUNTS?
- Users and IT departments: «We needed those passwords yesterday»
- Auditors, consultants and pentesters: «Time is Money»
- Law enforcement and investigators: legal time limits
6 PASSWORD RECOVERY: The Loop
- Generate trial password
- Transform password (compute hash or key encryption key): the slow part
- Validate hash/key
  - Success
  - Failure: try next password
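The loop on this slide can be sketched in a few lines of Python. This is only an illustration: `transform` uses a single SHA-1 as a stand-in for the slow part, and the names `transform` and `recover` are hypothetical, not taken from the presentation.

```python
import hashlib
from typing import Iterable, Optional


def transform(password: str) -> bytes:
    """Stand-in for the slow part: one SHA-1, purely for illustration."""
    return hashlib.sha1(password.encode("utf-8")).digest()


def recover(target: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate until its transformed value matches the target."""
    for trial in candidates:             # generate trial password
        if transform(trial) == target:   # transform, then validate
            return trial                 # success
    return None                          # every candidate failed
```

A real product would replace `transform` with the key derivation of the protected format; that step dominates the run time.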
7 PASSWORD RECOVERY: The Slow Part
- Designed to be slow
  - 50 ms verification time has no impact on usability but HUGE impact on password recovery performance
- Usually designed around well-known hash functions
  - MD5 (old days)
  - SHA-1 (most popular so far)
  - SHA-2 (still exotic)
- Thousands to millions of hash computations per password
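To make "thousands of hash computations per password" concrete, here is a sketch of the first output block of PBKDF2-HMAC-SHA1 with the iteration loop written out. The function name `pbkdf2_sha1_block` is hypothetical; the standard library's `hashlib.pbkdf2_hmac` does the same work and is used here only as a cross-check.

```python
import hashlib
import hmac


def pbkdf2_sha1_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 20-byte block of PBKDF2-HMAC-SHA1, with the loop made explicit."""
    # First iteration: HMAC over the salt followed by the block index (1).
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha1).digest()
    result = int.from_bytes(u, "big")
    # Each further iteration costs one more HMAC, i.e. several SHA-1 calls.
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha1).digest()
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(20, "big")
```

Every trial password pays for the whole loop, which is why the iteration count translates directly into recovery time.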
8 FAST PASSWORD RECOVERY: The CPU Way
- Before the GPGPU era most optimizations focused on:
  - SIMD (MMX, SSE, AVX)
  - Multi-core
  - Distributed computing (think distributed.net)
    - Communication overhead
    - Difficult to manage
    - Not power-efficient
11 FAST PASSWORD RECOVERY: The GPU Way
- Password recovery constitutes an embarrassingly parallel workload
- Each processing unit verifies its own password, independently of other processing units
- Linear scalability in practice
[Diagram: Generate trial passwords -> Transform password (many in parallel, done by GPU) -> Validate hashes/keys -> Success, or Failure -> Try next password]
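The independence of the per-password work can be illustrated on the CPU with a thread pool standing in for the GPU's processing units. This is a structural sketch only, not a performance claim; `SALT`, `ITERATIONS` and the function names are assumptions made for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

SALT = b"example-salt"   # hypothetical salt, for illustration only
ITERATIONS = 1000        # hypothetical iteration count


def transform(password: str) -> bytes:
    """Per-password work; it needs nothing from any other password."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), SALT, ITERATIONS)


def transform_batch(passwords: List[str], workers: int = 4) -> List[bytes]:
    """Transform a whole batch; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, passwords))
```

Because no worker waits on another, adding processing units shortens the batch time without changing the code.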
12 FAST PASSWORD RECOVERY: The GPU Way
[Diagram: CPU: generate trial passwords -> Passwords[] -> (PCIe) -> GPU: Passwords[] -> compute keys from passwords -> Keys[] -> (PCIe) -> CPU: Keys[] -> validate keys]
13 LIMITATIONS
- Works well for slow algorithms
- For fast algorithms PCIe becomes the bottleneck
  - e.g. for SHA-1 the theoretical limit is 8 GB/s / (20 bytes in + 20 bytes out) ≈ 214 million passwords per second
- Need to offload everything to the GPU
  - password generation and key validation on GPU are bigger challenges than the crypto itself
  - especially so without OpenCL
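The 214 million figure can be reproduced with a short calculation, assuming the link bandwidth is read as 8 × 2^30 bytes per second (an assumption; the slide does not spell out the unit convention). The function name `pcie_limit` is hypothetical.

```python
def pcie_limit(bandwidth_bytes_per_s: float, bytes_in: int, bytes_out: int) -> float:
    """Upper bound on passwords per second when every trial crosses the bus."""
    return bandwidth_bytes_per_s / (bytes_in + bytes_out)


# Assumed bandwidth: 8 * 2**30 bytes per second.
# 20 bytes in and 20 bytes out per trial, as on the slide.
sha1_limit = pcie_limit(8 * 2**30, 20, 20)
print(f"{sha1_limit / 1e6:.0f} million passwords per second")
```

Any algorithm whose compute rate on the GPU exceeds this bound is limited by the transfers rather than by the hashing.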
14 ALTERNATIVE WAY
[Diagram: CPU: initial password -> (PCIe) -> GPU: generate trial passwords -> Passwords[] -> compute keys from passwords -> Keys[] -> validate keys -> Result -> (PCIe) -> CPU]
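The slides do not say how passwords are generated on the device. One common approach, sketched here as an assumption, is to send only a starting index and let each work-item derive its own candidate by treating the index as a number written in base `len(charset)`. The names `index_to_password` and `device_batch` are hypothetical.

```python
from typing import List


def index_to_password(index: int, charset: str, length: int) -> str:
    """Map an index in [0, len(charset)**length) to a fixed-length password."""
    chars = []
    for _ in range(length):
        index, digit = divmod(index, len(charset))
        chars.append(charset[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def device_batch(start: int, count: int, charset: str, length: int) -> List[str]:
    """What `count` work-items would generate from one transferred start index."""
    return [index_to_password(start + i, charset, length) for i in range(count)]
```

With this scheme the host transfers a single integer per launch instead of a full `Passwords[]` array.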
15 PASSWORD RECOVERY
[Diagram: CPU: generate trial passwords -> Passwords[] -> (PCIe) -> GPU: Passwords[] -> compute keys from passwords -> Keys[] -> (PCIe) -> CPU: Keys[] -> validate keys]
16 OVERLAPPING CPU AND GPU
- In a straightforward implementation it may look like this:
  CPU: Gen, Vfy, Gen, Vfy, Gen, Vfy
  GPU: Compute, Compute, Compute
- But CPU and GPU can work simultaneously, so overlap their operations:
  CPU: Gen, Gen, Vfy, Gen, Vfy, Vfy
  GPU: Compute, Compute, Compute
- Profit!
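The overlapped schedule can be sketched with a single-worker executor standing in for the GPU: while it computes batch n, the main thread generates batch n+1 and verifies batch n-1. All names below are hypothetical, and `compute` uses plain SHA-1 only as a placeholder for the real kernel.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def compute(batch):
    """Stand-in for the GPU kernel: one key per password."""
    return [hashlib.sha1(p.encode("utf-8")).digest() for p in batch]


def batches(candidates, size):
    """Gen step: lazily cut the candidate stream into batches."""
    it = iter(candidates)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def verify(target, batch, keys):
    """Vfy step: return the matching password, if any."""
    for password, key in zip(batch, keys):
        if key == target:
            return password
    return None


def recover_overlapped(target, candidates, batch_size=2):
    with ThreadPoolExecutor(max_workers=1) as gpu:
        in_flight = None                       # (batch, future) on the "GPU"
        for batch in batches(candidates, batch_size):
            future = gpu.submit(compute, batch)
            if in_flight is not None:          # verify the previous batch
                found = verify(target, in_flight[0], in_flight[1].result())
                if found is not None:
                    return found
            in_flight = (batch, future)
        if in_flight is not None:              # drain the last batch
            return verify(target, in_flight[0], in_flight[1].result())
    return None
```

The same double-buffering idea applies with real asynchronous kernel launches; only the submit and wait calls change.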
17 PERFORMANCE: PBKDF2-SHA1
[Bar chart: computations per second (axis ticks up to 60K) for an Intel CPU, an NVIDIA GTX GPU and an AMD HD GPU; model numbers and bar values not recoverable]
18 HEY, WHY NO 100X SPEEDUP?
- Be fair!
- CPUs are not single-core any more
  - Even Atoms are not
- Extended instruction sets were introduced for performance reasons
  - So why ignore them?
- Will usually get ~10x on comparable hardware for well-suited compute-bound tasks
19 CPU LAYOUT
- 1.2 billion transistors
- Most are L3/L2 caches
- Less than 10% are in execution and/or ALU units
[Die diagram labels: IO & QPI, Queue, L3 Cache, six Cores, Memory Controller]
20 GPU LAYOUT
- 3 billion transistors (2.5x)
- About 30% are execution and/or ALU units (3x)
  - 7.5x more transistors dedicated to execution units
- Core frequency is lower (~0.4x)
  - 3x estimated speedup
- In a fair real-world comparison this GPU is 4x faster than the CPU on a compute-bound task
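The back-of-the-envelope estimate on the CPU and GPU layout slides can be written out step by step. The CPU execution-unit share is stated only as "less than 10%"; the sketch takes 10% as an assumed upper bound, which is what reproduces the 3x ratio quoted on the slide. Variable names are hypothetical.

```python
cpu_transistors = 1.2e9
gpu_transistors = 3.0e9
cpu_exec_fraction = 0.10   # assumption: upper bound of "less than 10%"
gpu_exec_fraction = 0.30
clock_ratio = 0.4          # GPU clock relative to CPU clock

transistor_ratio = gpu_transistors / cpu_transistors       # 2.5x
fraction_ratio = gpu_exec_fraction / cpu_exec_fraction     # about 3x
exec_unit_ratio = transistor_ratio * fraction_ratio        # about 7.5x
estimated_speedup = exec_unit_ratio * clock_ratio          # about 3x

print(f"execution-unit transistors: {exec_unit_ratio:.1f}x")
print(f"estimated speedup: {estimated_speedup:.1f}x")
```

The estimate is deliberately crude: it ignores memory behaviour and instruction mix, which is why the measured figure differs from it.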
21 HEY, WHY NO 100X SPEEDUP?
- Be fair!
- CPUs are not single-core any more
  - Even Atoms are not
- Extended instruction sets were introduced for performance reasons
  - So why ignore them?
- Will usually get ~10x on comparable hardware for well-suited tasks
- In our case:
  - SSE2 code + processor-specific compiler optimizations
  - 12 threads to fully utilize 6 cores + HT
  - 16x over high-end CPU
22 PERFORMANCE: PBKDF2-SHA1
[Bar chart: computations per second (axis ticks up to 60K) for an Intel CPU, an NVIDIA GTX GPU and an AMD HD GPU; model numbers and bar values not recoverable]
23 WHY AMD IS SO FAST?
- Most password transformations are bounded by integer performance
  - AMD cards exhibit awesome integer performance
- Many password transformations (= crypto) make heavy use of bit rotations (= cyclic shifts)
  - There is a special instruction for this!
  - Cyclic shift in 1 instruction instead of 3, up to 30% overall speedup in practice
- GPU code written in IL
  - Utilize all GPU devices under Windows
  - (Recent APP SDK versions allow this with OpenCL)
24 PERFORMANCE: bitalign
- AMD IL Specification, section 7.13: "Aligns bit data for video. This is a special instruction for multi-media video."
- bitalign dst, src0, src1, src2
  dst = (src0 >> src2.x) | (src1 << (32 - src2.x))
- Can be used to implement cyclic bit shift in 1 instruction
  - VERY useful for many crypto algorithms
- Introduced in Evergreen
- Exposed at the IL level
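A small emulation shows why this instruction doubles as a rotate: passing the same register as both sources turns the two shifted halves into one cyclic shift. The sketch follows the formula quoted on the slide for 32-bit values; masking the shift amount to 5 bits is an assumption of the sketch, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bitalign(src0: int, src1: int, shift: int) -> int:
    """Emulate dst = (src0 >> shift) | (src1 << (32 - shift)) on 32-bit words."""
    shift &= 31   # assumption: only the low 5 bits of the shift are used
    return ((src0 >> shift) | (src1 << (32 - shift))) & MASK32


def rotr32(x: int, n: int) -> int:
    """Rotate right: both sources are the same word."""
    return bitalign(x, x, n)


def rotl32(x: int, n: int) -> int:
    """Rotate left by n equals rotate right by 32 - n."""
    return bitalign(x, x, (32 - n) & 31)
```

Without the instruction, the same rotate costs two shifts and an OR, which matches the "1 instruction instead of 3" claim.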
25 PERFORMANCE: Bitfield Insert
- AMD Evergreen ISA Reference, page 9-61:
  BFI_INT dst, src0, src1, src2
  dst = (src1 & src0) | (src2 & ~src0)
- This is vector bit select:
  dst_i = (mask_i != 0) ? arg1_i : arg2_i
- Very useful for accelerating various crypto algorithms
  - And especially for breaking them
- Introduced in Evergreen
- NOT exposed at the IL level
  - OpenCL bitselect() is not using it either
  - No documented way to emit this instruction directly
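The usefulness for crypto can be illustrated by emulating the bit select and expressing two familiar round functions with it. The slides do not name specific functions; the choice and majority functions used in SHA-1 and SHA-2 are picked here as examples, and the names `bfi`, `ch` and `maj` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bfi(mask: int, a: int, b: int) -> int:
    """Bit select: take bits of a where mask is 1, bits of b where mask is 0."""
    return ((a & mask) | (b & ~mask)) & MASK32


def ch(x: int, y: int, z: int) -> int:
    """Choice: x selects between y and z, a single bit select."""
    return bfi(x, y, z)


def maj(x: int, y: int, z: int) -> int:
    """Majority: where x and z differ take y, otherwise take x."""
    return bfi(x ^ z, y, x)
```

Written with plain AND, OR and NOT, each of these functions needs several instructions; with a bit select, `ch` is one and `maj` is two.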
26 WHY INTERMEDIATE LANGUAGE
- We chose IL over Brook+
  - OpenCL did not exist yet
  - Brook+ programming model was not quite suited for password recovery
  - ISA provided no significant benefit over IL
- Early OpenCL support couldn't compete with IL either
  - Limited support for binary (pre-compiled) kernels
  - Limited support for multi-GPU in OpenCL
  - (Those issues seem to be fixed in APP SDK 2.4)
- AMD is going to deprecate CAL in the next SDK (2.5)
  - IL will almost certainly be deprecated altogether
  - This is very bad news for us
  - Need to decide whether to go up (OpenCL) or down (ISA)
- Morning keynote mentioned FSAIL, which seems like a great alternative!
27 WRITING IN INTERMEDIATE LANGUAGE
- IL doesn't seem to be designed to be human-friendly
- Use scripting languages to generate IL code
  - And handle platform-specific optimizations (e.g. emulate bitalign on older GPUs)
- Compile kernels at program build time
  - Avoids runtime compilation
  - Solves (partially) the IP problem: no source code needs to be distributed
  - Need to provide new binaries for new devices
- Use CAL at runtime to load, configure and launch the pre-compiled kernel
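A generator script of this kind might look roughly like the sketch below, which emits either a single bitalign or a three-instruction fallback for a rotate. The output is pseudo-assembly: the bitalign line follows the operand order quoted on the bitalign slide, while the `shr`/`shl`/`or` mnemonics, the `t0`/`t1` temporaries and the function name `emit_rotr` are placeholders invented for illustration, not checked against the IL specification.

```python
from typing import List


def emit_rotr(dst: str, src: str, n: int, has_bitalign: bool) -> List[str]:
    """Return pseudo-IL lines that rotate `src` right by `n` bits into `dst`."""
    if not 0 < n < 32:
        raise ValueError("rotation must be between 1 and 31")
    if has_bitalign:
        # Newer GPUs: one instruction, same register as both sources.
        return [f"bitalign {dst}, {src}, {src}, {n}"]
    # Older GPUs: placeholder mnemonics and temporaries, not real IL.
    return [
        f"shr t0, {src}, {n}",
        f"shl t1, {src}, {32 - n}",
        f"or {dst}, t0, t1",
    ]


# Build-time use: pick the variant per target device family.
for line in emit_rotr("r1", "r0", 7, has_bitalign=False):
    print(line)
```

Keeping the per-device choice in the generator means the hand-written algorithm description stays the same for every target.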
28 SCALABILITY
- Not all GPUs are equally powerful
- Program should scale nicely with the number of processing cores in the installed GPU
  - Query number of processors at runtime
  - Partition task proportionally to number of processors
- Helps to reduce UI update freezes
- Also helps to avoid TDR (Windows Timeout Detection and Recovery)
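One way to read "partition task proportionally" is to size each kernel launch by the processor count, so that a small GPU gets short launches and stays responsive. The sketch below assumes that reading; `launch_chunks` and the `per_processor` tuning parameter are hypothetical.

```python
from typing import Iterator, Tuple


def launch_chunks(total: int, num_processors: int,
                  per_processor: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, count) launches whose size scales with the GPU."""
    chunk = num_processors * per_processor
    offset = 0
    while offset < total:
        count = min(chunk, total - offset)
        yield offset, count
        offset += count
```

For example, `list(launch_chunks(10, 2, 2))` covers ten passwords in three launches. Keeping the work per processor constant keeps the time per launch roughly constant across devices, which is what matters for the display watchdog.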
29 SCALABILITY
- 8 GPUs are not uncommon today
- Program should scale nicely with number of GPUs
  - Query number of devices in system
  - Spawn thread for each device
  - Partition task as appropriate
- Speedup should be linear unless you hit PCIe bandwidth limits
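The thread-per-device pattern can be sketched as follows, with the task split in proportion to each device's processor count. Device discovery is left out; the processor counts are passed in as a plain list, and `split_range`, `run_on_devices` and the `work` callback are hypothetical names.

```python
import threading
from typing import Callable, List, Tuple


def split_range(total: int, weights: List[int]) -> List[Tuple[int, int]]:
    """Split `total` items into contiguous (start, count) parts by weight."""
    ranges, start, acc = [], 0, 0
    weight_sum = sum(weights)
    for w in weights:
        acc += w
        end = total * acc // weight_sum   # cumulative split avoids rounding gaps
        ranges.append((start, end - start))
        start = end
    return ranges


def run_on_devices(total: int, processors_per_device: List[int],
                   work: Callable[[int, int, int], object]) -> List[object]:
    """Run work(device_index, start, count) on one thread per device."""
    ranges = split_range(total, processors_per_device)
    results = [None] * len(ranges)

    def worker(i: int, start: int, count: int) -> None:
        results[i] = work(i, start, count)

    threads = [threading.Thread(target=worker, args=(i, s, c))
               for i, (s, c) in enumerate(ranges)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread would own one device context in a real program, so a slow device never blocks the launches of a fast one.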
31 QUESTIONS?
More informationGPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
More informationWhite Paper AMD PROJECT FREESYNC
White Paper AMD PROJECT FREESYNC TABLE OF CONTENTS INTRODUCTION 3 PROJECT FREESYNC USE CASES 4 Gaming 4 Video Playback 5 System Power Savings 5 PROJECT FREESYNC IMPLEMENTATION 6 Implementation Overview
More informationIntel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family
Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family White Paper June, 2008 Legal INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL
More informationOverview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming
Overview Lecture 1: an introduction to CUDA Mike Giles mike.giles@maths.ox.ac.uk hardware view software view Oxford University Mathematical Institute Oxford e-research Centre Lecture 1 p. 1 Lecture 1 p.
More informationKeys to node-level performance analysis and threading in HPC applications
Keys to node-level performance analysis and threading in HPC applications Thomas GUILLET (Intel; Exascale Computing Research) IFERC seminar, 18 March 2015 Legal Disclaimer & Optimization Notice INFORMATION
More informationNext Generation GPU Architecture Code-named Fermi
Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time
More informationIntelligent Heuristic Construction with Active Learning
Intelligent Heuristic Construction with Active Learning William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather E H U N I V E R S I T Y T O H F G R E D I N B U Space is BIG! Hubble Ultra-Deep Field
More informationVALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS
VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS Perhaad Mistry, Yash Ukidave, Dana Schaa, David Kaeli Department of Electrical and Computer Engineering Northeastern University,
More informationHetero Streams Library 1.0
Release Notes for release of Copyright 2013-2016 Intel Corporation All Rights Reserved US Revision: 1.0 World Wide Web: http://www.intel.com Legal Disclaimer Legal Disclaimer You may not use or facilitate
More informationGraphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data
Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data Amanda O Connor, Bryan Justice, and A. Thomas Harris IN52A. Big Data in the Geosciences:
More informationOptimizing GPU-based application performance for the HP for the HP ProLiant SL390s G7 server
Optimizing GPU-based application performance for the HP for the HP ProLiant SL390s G7 server Technology brief Introduction... 2 GPU-based computing... 2 ProLiant SL390s GPU-enabled architecture... 2 Optimizing
More informationIntel Media SDK Library Distribution and Dispatching Process
Intel Media SDK Library Distribution and Dispatching Process Overview Dispatching Procedure Software Libraries Platform-Specific Libraries Legal Information Overview This document describes the Intel Media
More informationAMD Processor Performance. AMD Phenom II Processors Discrete Platform Benchmarks December 2008
AMD Processor Performance AMD Phenom II Processors Discrete Platform Benchmarks December 2008 AMD Phenom II Performance Overall Performance of Office Productivity + Digital Media + Games AMD Phenom II
More informationOperating System for the K computer
Operating System for the K computer Jun Moroo Masahiko Yamada Takeharu Kato For the K computer to achieve the world s highest performance, Fujitsu has worked on the following three performance improvements
More informationIntel 965 Express Chipset Family Memory Technology and Configuration Guide
Intel 965 Express Chipset Family Memory Technology and Configuration Guide White Paper - For the Intel 82Q965, 82Q963, 82G965 Graphics and Memory Controller Hub (GMCH) and Intel 82P965 Memory Controller
More information