Fast Implementations of AES on Various Platforms
|
|
- Shonda Underwood
- 8 years ago
- Views:
Transcription
1 Fast Implementations of AES on Various Platforms Joppe W. Bos 1 Dag Arne Osvik 1 Deian Stefan 2 1 EPFL IC IIF LACAL, Station 14, CH-1015 Lausanne, Switzerland {joppe.bos, dagarne.osvik}@epfl.ch 2 Dept. of Electrical Engineering, The Cooper Union, NY 10003, New York, USA stefan@cooper.edu 1 / 19
2 Outline Introduction Motivation Previous Work Contributions The Advanced Encryption Standard Target Platforms The 8-bit AVR Microcontroller The Cell Broadband Engine Architecture The NVIDIA Graphics Processing Unit Conclusions 2 / 19
3 Motivation Advanced Encryption Standard Rijndael announced in 2001 as the AES. One of the most widely used cryptographic primitives. IP Security, Secure Shell, Truecrypt RFID and low-power authentication methods Key tokens, RF-based Remote Access Control Many intensive efforts to speed up AES in both hard- and software. Related work E. Käsper and P. Schwabe. Faster and Timing-Attack Resistant AES-GCM. CHES P. Bulens, et al. Implementation of the AES-128 on Virtex-5 FPGAs AFRICACRYPT O. Harrison and J. Waldron. Practical Symmetric Key Cryptography on Modern Graphics Hardware. USENIX Sec. Symp S. Rinne, et al. Performance Analysis of Contemporary Light-Weight Block Ciphers on 8-bit Microcontrollers. SPEED K. Shimizu, et al. Cell Broadband Engine Support for Privacy, Security, and Digital Rights Management Applications / 19
4 Contributions New software speed records for various architectures 8-bit AVR microcontrollers compact, efficient single stream AES version Synergistic processing elements of the Cell broadband engine widely available in the PS3 video game console single instruction multiple data (SIMD) architecture process 16 streams in parallel (bytesliced) NVIDIA graphics processing unit first AES decryption implementation single instruction multiple threads (SIMT) architecture process thousand of streams in parallel (T -table based) 4 / 19
5 The Advanced Encryption Standard Fixed block length version of the Rijndael block cipher Key-iterated block cipher with 128-bit state and block length Support for 128-, 192-, and 256-bit keys Strong security properties no attacks on full AES-128 Very efficient in hardware and software 5 / 19
6 AES-128 Block Cipher Algorithm consists of 5 steps: 1 Key expansion: 128-bit N r + 1 = bit round keys 2 State initialization: initial state plaintext block 128-bit key 3 Round transformation: apply round function on state N r 1 times 4 Final round transformation: apply the modified round function Core of AES, the round function, consists of the following steps: SubBytes, ShiftRows, MixColumns, and AddRoundKey. 6 / 19
7 AES-128 Block Cipher Algorithm consists of 5 steps: 1 Key expansion: 128-bit N r + 1 = bit round keys 2 State initialization: initial state plaintext block 128-bit key 3 Round transformation: apply round function on state N r 1 times 4 Final round transformation: apply the modified round function Core of AES, the round function, consists of the following steps: SubBytes, ShiftRows, MixColumns, and AddRoundKey. Decryption follows the same procedure - round function steps are the inverse and run in reverse order 6 / 19
8 Round Function Steps 1 SubBytes: S-box a 00 a 01 a 02 a 03 b 00 b 01 b 02 b 03 a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 7 / 19
9 Round Function Steps 1 SubBytes: S-box a 00 a 01 a 02 a 03 b 00 b 01 b 02 b 03 a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 2 ShiftRows: a 00 a 01 a 02 a 03 a 00 a 01 a 02 a 03 a 10 a 11 a 12 a 13 a 11 a 12 a 13 a 10 a 20 a 21 a 22 a 23 a 22 a 23 a 20 a 21 a 30 a 31 a 32 a 33 a 33 a 30 a 31 a 32 7 / 19
10 Round Function Steps 1 SubBytes: 1 MixColumns: a 00 a 01 a 02 a 03 S-box b 00 b 01 b 02 b a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 00 a 01 a 02 a 03 b 00 b 01 b 02 b 03 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 2 ShiftRows: a 00 a 01 a 02 a 03 a 00 a 01 a 02 a 03 a 10 a 11 a 12 a 13 a 11 a 12 a 13 a 10 a 20 a 21 a 22 a 23 a 22 a 23 a 20 a 21 a 30 a 31 a 32 a 33 a 33 a 30 a 31 a 32 7 / 19
11 Round Function Steps 1 SubBytes: 1 MixColumns: a 00 a 01 a 02 a 03 S-box b 00 b 01 b 02 b a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 00 a 01 a 02 a 03 b 00 b 01 b 02 b 03 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 10 a 11 a 12 a 13 b 10 b 11 b 12 b 13 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 a 20 a 21 a 22 a 23 b 20 b 21 b 22 b 23 a 30 a 31 a 32 a 33 b 30 b 31 b 32 b 33 2 ShiftRows: 2 AddRoundKey: a 00 a 01 a 02 a 03 a 00 a 01 a 02 a 03 a 00 a 01 a 02 a 03 k 00 a 01 k 02 k 03 b 00 b 01 b 02 b 03 a 10 a 11 a 12 a 13 a 11 a 12 a 13 a 10 a 10 a 11 a 12 a 13 k 10 a 11 k 12 k 13 b 10 b 11 b 12 b 13 a 20 a 21 a 22 a 23 a 22 a 23 a 20 a 21 a 20 a 21 a 22 a 23 k 20 a 21 k 22 k 23 b 20 b 21 b 22 b 23 a 30 a 31 a 32 a 33 a 33 a 30 a 31 a 32 a 30 a 31 a 32 a 33 k 30 k 31 k 32 k 33 b 30 b 31 ab 32 b 33 7 / 19
12 Target Platforms NVIDIA GPUs Cell B.E Atmel AVRs IBM Systems 8 / 19
13 Advanced Virtual RISC Architecture Modified Harvard architecture 32 8-bit registers 16-bit pointer registers Registers are addressable Mostly single-cycle execution 1 2KB to 384KB flash memory 0 to 32KB SRAM 0 to 4KB EEPROM 9 / 19
14 Encryption Comparison AVR Code Size in Bytes Cycles for Encryption Rinne et al. Poettering Otte New 10 / 19
15 Decryption Comparison AVR Code Size in Bytes Cycles for Decryption Rinne et al. Poettering Otte New 11 / 19
16 Cell Broadband Engine Architecture Use the Synergistic Processing Elements runs at 3.2 GHz 128-bit wide SIMD-architecture two instructions per clock cycle (dual pipeline) in-order processor rich instruction set: i.e. all distinct binary operations f : {0, 1} 2 {0, 1} are present. Expensive QS22 Blade Servers (2 8 SPEs) Cheap PS3 video game console (6 SPEs) 12 / 19
17 SPU Results Comparison Cycles/byte K. Shimizu et al This article 0 Encryption Decryption Throughput per PS3: 13.2 (encryption) and 10.8 Gbps (decryption) Work-in-progress, fill both pipelines Current version: 1752 odd and 2764 even instructions for encryption. 13 / 19
18 NVIDIA Graphic Processing Units Contain simultaneous multiprocessors (SMs): 8 streaming processors (SPs) 16KB 16-way banked fast shared memory 8192/ bit registers 8KB constant memory cache 6KB-8KB texture cache 2 special function units instruction fetch and scheduling unit GeForce 8800GTX: GHz GTX 295: GHz 16KB Shared Mem 16KB Shared Mem 16KB Shared Mem Scheduler SP Scheduler Scheduler SP SP SP SP SP SP SP SP SP SP SP SP SP SP SP SP SP SP SP Register File Register File SP SFU SP SFU SP SP SFUConst $ SFU Text $ SFU SFU Const $ Text $ Const $ Text $ Register File 14 / 19
19 AES GPU Implementation Combine SubBytes, ShiftRows, MixColumns using the standard T -table approach. Update each column (0 j 3): [s j0, s j1, s j2, s j3 ] T = T 0 [a c0 0] T 1 [a c1 1] T 2 [a c2 2] T 3 [a c3 3] k j, where each T i is 1KB and k j is the jth column of the round key. 15 / 19
20 AES GPU Implementation Combine SubBytes, ShiftRows, MixColumns using the standard T -table approach. Update each column (0 j 3): [s j0, s j1, s j2, s j3 ] T = T 0 [a c0 0] T 1 [a c1 1] T 2 [a c2 2] T 3 [a c3 3] k j, where each T i is 1KB and k j is the jth column of the round key. Example (j = 0): a 00 a 01 a 02 a 03 T 0 b 00 k 00 s 00 s 01 s 02 s 03 a 10 a 11 a 12 a 13 T 1 b 10 k 10 s 10 s 11 s 12 s 13 a 20 a 21 a 22 a 23 T 2 b 20 k 20 s 20 s 21 s 22 s 23 a 30 a 31 a 32 a 33 T 3 b 30 k 30 s 30 s 31 s 32 s / 19
21 AES GPU Implementation Combine SubBytes, ShiftRows, MixColumns using the standard T -table approach. Update each column (0 j 3): [s j0, s j1, s j2, s j3 ] T = T 0 [a c0 0] T 1 [a c1 1] T 2 [a c2 2] T 3 [a c3 3] k j, where each T i is 1KB and k j is the jth column of the round key. Example (j = 0): a 00 a 01 a 02 a 03 T 0 b 00 k 00 s 00 s 01 s 02 s 03 a 10 a 11 a 12 a 13 T 1 b 10 k 10 s 10 s 11 s 12 s 13 a 20 a 21 a 22 a 23 T 2 b 20 k 20 s 20 s 21 s 22 s 23 a 30 a 31 a 32 a 33 T 3 b 30 k 30 s 30 s 31 s 32 s 33 Optimization approach: launch thread blocks containing multiple independent groups of 16 (1/2-warp) streams. 15 / 19
22 AES GPU Implementation (cont.) Key expansion: 1 On-the-fly: allows thousands of independent streams speed dependent on T -access speed multi-block speed improvement: cache few round keys / stream in shared memory; 16-streams/group no bank conflicts! 16 / 19
23 AES GPU Implementation (cont.) Key expansion: 1 On-the-fly: allows thousands of independent streams speed dependent on T -access speed multi-block speed improvement: cache few round keys / stream in shared memory; 16-streams/group no bank conflicts! 2 Texture memory: keys alive between kernel launches: multi-block encryption is faster than on-the-fly! thread count limited by texture cache size 16 / 19
24 AES GPU Implementation (cont.) Key expansion: 1 On-the-fly: allows thousands of independent streams speed dependent on T -access speed multi-block speed improvement: cache few round keys / stream in shared memory; 16-streams/group no bank conflicts! 2 Texture memory: keys alive between kernel launches: multi-block encryption is faster than on-the-fly! thread count limited by texture cache size 3 Shared memory: 16 round key column reads with no bank conflicts single kernel multi-block encryption is the fastest! thread count limited by shared memory size 16 / 19
25 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 17 / 19
26 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. 17 / 19
27 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. All shared memory used by 1 thread block very low device utilization. 17 / 19
28 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. All shared memory used by 1 thread block very low device utilization. Lazy approach Place T -tables in order. 17 / 19
29 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. All shared memory used by 1 thread block very low device utilization. Lazy approach Place T -tables in order. On average: 6/16 collisions, so remaining reads are parallel. Allows for multiple blocks/sm higher device occupancy. 17 / 19
30 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. All shared memory used by 1 thread block very low device utilization. Lazy approach Place T -tables in order. On average: 6/16 collisions, so remaining reads are parallel. Allows for multiple blocks/sm higher device occupancy. combine with on-the-fly key scheduling or key expansion in texture memory. 17 / 19
31 AES GPU Implementation (cont.) Placement of T -tables: 1 Constant memory: simple and very quick approach unless encrypting same block with same key: almost all T -accesses are serialized combine with any key scheduling algorithm 2 Shared memory: Collision-free approach T i, i > 0 are rotations of T 0: place 1KB T 0 in each bank. All shared memory used by 1 thread block very low device utilization. Lazy approach Place T -tables in order. On average: 6/16 collisions, so remaining reads are parallel. Allows for multiple blocks/sm higher device occupancy. combine with on-the-fly key scheduling or key expansion in texture memory. 3 Texture memory: ongoing work, but estimates are lower than lazy shared memory approach. 17 / 19
32 GPU Results Comparison Cycles/byte S. A. Manavski 2007 (8800GTX) J. Yang et al (ATI HD 2900 XT) O. Harrison et al (8800GTX) This article (8800GTX) This article (GTX295) 0 Encryption Decryption Encryption: 59.6 and 14.6 Gbps on the GTX 295 and 8800GTX, respectively. Decryption: 52.4 and 14.3 Gbps on the GTX 295 and 8800GTX, respectively. 18 / 19
33 Conclusions AES-128 software speed records for encryption and decryption 8-bit AVR 1.24 encryption 1.10 decryption smaller code size Cell Broadband Engine (SPE) 1.06 encryption 1.18 decryption NVIDIA GPU 1.75 encryption First decryption implementation All numbers subject to further improvements 19 / 19
34 Conclusions AES-128 software speed records for encryption and decryption 8-bit AVR 1.24 encryption 1.10 decryption smaller code size Cell Broadband Engine (SPE) 1.06 encryption 1.18 decryption NVIDIA GPU 1.75 encryption First decryption implementation All numbers subject to further improvements To be continued / 19
Fast Software AES Encryption
Calhoun: The NPS Institutional Archive Faculty and Researcher Publications Faculty and Researcher Publications 2010 Fast Software AES Encryption Osvik, Dag Arne Proceedings FSE'10 Proceedings of the 17th
More informationFast Software AES Encryption
Fast Software AES Encryption Dag Arne Osvik 1, Joppe W. Bos 1, Deian Stefan 2, and David Canright 3 1 Laboratory for Cryptologic Algorithms, EPFL, CH-1015 Lausanne, Switzerland 2 Dept. of Electrical Engineering,
More informationHigh-Performance Modular Multiplication on the Cell Processor
High-Performance Modular Multiplication on the Cell Processor Joppe W. Bos Laboratory for Cryptologic Algorithms EPFL, Lausanne, Switzerland joppe.bos@epfl.ch 1 / 19 Outline Motivation and previous work
More informationImplementation of Full -Parallelism AES Encryption and Decryption
Implementation of Full -Parallelism AES Encryption and Decryption M.Anto Merline M.E-Commuication Systems, ECE Department K.Ramakrishnan College of Engineering-Samayapuram, Trichy. Abstract-Advanced Encryption
More informationIntroduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1
Introduction to GP-GPUs Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 GPU Architectures: How do we reach here? NVIDIA Fermi, 512 Processing Elements (PEs) 2 What Can It Do?
More informationSeChat: An AES Encrypted Chat
Name: Luis Miguel Cortés Peña GTID: 901 67 6476 GTG: gtg683t SeChat: An AES Encrypted Chat Abstract With the advancement in computer technology, it is now possible to break DES 56 bit key in a meaningful
More informationGPU File System Encryption Kartik Kulkarni and Eugene Linkov
GPU File System Encryption Kartik Kulkarni and Eugene Linkov 5/10/2012 SUMMARY. We implemented a file system that encrypts and decrypts files. The implementation uses the AES algorithm computed through
More informationEfficient Software Implementation of AES on 32-bit Platforms
Efficient Software Implementation of AES on 32-bit Platforms Guido Bertoni, Luca Breveglieri Politecnico di Milano, Milano - Italy Pasqualina Lilli Lilli Fragneto AST-LAB of ST Microelectronics, Agrate
More informationThe Advanced Encryption Standard: Four Years On
The Advanced Encryption Standard: Four Years On Matt Robshaw Reader in Information Security Information Security Group Royal Holloway University of London September 21, 2004 The State of the AES 1 The
More informationIBM CELL CELL INTRODUCTION. Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 IBM CELL. Politecnico di Milano Como Campus
Project made by: Origgi Alessandro matr. 682197 Teruzzi Roberto matr. 682552 CELL INTRODUCTION 2 1 CELL SYNERGY Cell is not a collection of different processors, but a synergistic whole Operation paradigms,
More informationIJESRT. [Padama, 2(5): May, 2013] ISSN: 2277-9655
IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY Design and Verification of VLSI Based AES Crypto Core Processor Using Verilog HDL Dr.K.Padama Priya *1, N. Deepthi Priya 2 *1,2
More informationGraphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011
Graphics Cards and Graphics Processing Units Ben Johnstone Russ Martin November 15, 2011 Contents Graphics Processing Units (GPUs) Graphics Pipeline Architectures 8800-GTX200 Fermi Cayman Performance Analysis
More informationSpeeding up GPU-based password cracking
Speeding up GPU-based password cracking SHARCS 2012 Martijn Sprengers 1,2 Lejla Batina 2,3 Sprengers.Martijn@kpmg.nl KPMG IT Advisory 1 Radboud University Nijmegen 2 K.U. Leuven 3 March 17-18, 2012 Who
More informationHardware Implementation of AES Encryption and Decryption System Based on FPGA
Send Orders for Reprints to reprints@benthamscience.ae The Open Cybernetics & Systemics Journal, 2015, 9, 1373-1377 1373 Open Access Hardware Implementation of AES Encryption and Decryption System Based
More informationCSCE 465 Computer & Network Security
CSCE 465 Computer & Network Security Instructor: Dr. Guofei Gu http://courses.cse.tamu.edu/guofei/csce465/ Secret Key Cryptography (I) 1 Introductory Remarks Roadmap Feistel Cipher DES AES Introduction
More informationResearch Article. ISSN 2347-9523 (Print) *Corresponding author Shi-hai Zhu Email:
Scholars Journal of Engineering and Technology (SJET) Sch. J. Eng. Tech., 2014; 2(3A):352-357 Scholars Academic and Scientific Publisher (An International Publisher for Academic and Scientific Resources)
More informationANALYSIS OF RSA ALGORITHM USING GPU PROGRAMMING
ANALYSIS OF RSA ALGORITHM USING GPU PROGRAMMING Sonam Mahajan 1 and Maninder Singh 2 1 Department of Computer Science Engineering, Thapar University, Patiala, India 2 Department of Computer Science Engineering,
More informationDesign and Verification of Area-Optimized AES Based on FPGA Using Verilog HDL
Design and Verification of Area-Optimized AES Based on FPGA Using Verilog HDL 1 N. Radhika, 2 Obili Ramesh, 3 Priyadarshini, 3 Asst.Profosser, 1,2 M.Tech ( Digital Systems & Computer Electronics), 1,2,3,
More informationHigh Speed Software Driven AES Algorithm on IC Smartcards
SCIS 2004 The 2004 Symposium on Cryptography and Information Security Sendai, Japan, Jan.27-30, 2004 The Institute of Electronics, Information and Communication Engineers High Speed Software Driven AES
More informationParallel AES Encryption with Modified Mix-columns For Many Core Processor Arrays M.S.Arun, V.Saminathan
Parallel AES Encryption with Modified Mix-columns For Many Core Processor Arrays M.S.Arun, V.Saminathan Abstract AES is an encryption algorithm which can be easily implemented on fine grain many core systems.
More informationLecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com
CSCI-GA.3033-012 Graphics Processing Units (GPUs): Architecture and Programming Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Modern GPU
More informationAutomata Designs for Data Encryption with AES using the Micron Automata Processor
IJCSNS International Journal of Computer Science and Network Security, VOL.15 No.7, July 2015 1 Automata Designs for Data Encryption with AES using the Micron Automata Processor Angkul Kongmunvattana School
More informationThe Advanced Encryption Standard (AES)
The Advanced Encryption Standard (AES) Conception - Why A New Cipher? Conception - Why A New Cipher? DES had outlived its usefulness Vulnerabilities were becoming known 56-bit key was too small Too slow
More informationAn Instruction Set Extension for Fast and Memory-Efficient AES Implementation
An Instruction Set Extension for Fast and Memory-Efficient AES Implementation Stefan Tillich, Johann Großschädl, and Alexander Szekely Graz University of Technology Institute for Applied Information Processing
More informationProcessor Accelerator for AES
1 Processor Accelerator for AES Ruby B. Lee and Yu-Yuan Chen Department of Electrical Engineering Princeton University rblee, yctwo@princeton.edu Abstract Software AES cipher performance is not fast enough
More informationGPGPU Computing. Yong Cao
GPGPU Computing Yong Cao Why Graphics Card? It s powerful! A quiet trend Copyright 2009 by Yong Cao Why Graphics Card? It s powerful! Processor Processing Units FLOPs per Unit Clock Speed Processing Power
More informationCombining Mifare Card and agsxmpp to Construct a Secure Instant Messaging Software
Combining Mifare Card and agsxmpp to Construct a Secure Instant Messaging Software Ya Ling Huang, Chung Huang Yang Graduate Institute of Information & Computer Education, National Kaohsiung Normal University
More informationEnhancing Advanced Encryption Standard S-Box Generation Based on Round Key
Enhancing Advanced Encryption Standard S-Box Generation Based on Round Key Julia Juremi Ramlan Mahmod Salasiah Sulaiman Jazrin Ramli Faculty of Computer Science and Information Technology, Universiti Putra
More informationNext Generation GPU Architecture Code-named Fermi
Next Generation GPU Architecture Code-named Fermi The Soul of a Supercomputer in the Body of a GPU Why is NVIDIA at Super Computing? Graphics is a throughput problem paint every pixel within frame time
More informationCS 758: Cryptography / Network Security
CS 758: Cryptography / Network Security offered in the Fall Semester, 2003, by Doug Stinson my office: DC 3122 my email address: dstinson@uwaterloo.ca my web page: http://cacr.math.uwaterloo.ca/~dstinson/index.html
More informationThe implementation and performance/cost/power analysis of the network security accelerator on SoC applications
The implementation and performance/cost/power analysis of the network security accelerator on SoC applications Ruei-Ting Gu grating@eslab.cse.nsysu.edu.tw Kuo-Huang Chung khchung@eslab.cse.nsysu.edu.tw
More informationE6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices
E6895 Advanced Big Data Analytics Lecture 14: NVIDIA GPU Examples and GPU on ios devices Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist,
More informationOptimizing Code for Accelerators: The Long Road to High Performance
Optimizing Code for Accelerators: The Long Road to High Performance Hans Vandierendonck Mons GPU Day November 9 th, 2010 The Age of Accelerators 2 Accelerators in Real Life 3 Latency (ps/inst) Why Accelerators?
More informationSurvey on Enhancing Cloud Data Security using EAP with Rijndael Encryption Algorithm
Global Journal of Computer Science and Technology Software & Data Engineering Volume 13 Issue 5 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
More informationAES Power Attack Based on Induced Cache Miss and Countermeasure
AES Power Attack Based on Induced Cache Miss and Countermeasure Guido Bertoni, Vittorio Zaccaria STMicroelectronics, Advanced System Technology Agrate Brianza - Milano, Italy, {guido.bertoni, vittorio.zaccaria}@st.com
More informationRijndael Encryption implementation on different platforms, with emphasis on performance
Rijndael Encryption implementation on different platforms, with emphasis on performance KAFUUMA JOHN SSENYONJO Bsc (Hons) Computer Software Theory University of Bath May 2005 Rijndael Encryption implementation
More informationOpenCL Optimization. San Jose 10/2/2009 Peng Wang, NVIDIA
OpenCL Optimization San Jose 10/2/2009 Peng Wang, NVIDIA Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary Overall Optimization
More informationSecret File Sharing Techniques using AES algorithm. C. Navya Latha 200201066 Garima Agarwal 200305032 Anila Kumar GVN 200305002
Secret File Sharing Techniques using AES algorithm C. Navya Latha 200201066 Garima Agarwal 200305032 Anila Kumar GVN 200305002 1. Feature Overview The Advanced Encryption Standard (AES) feature adds support
More informationPerformance Analysis of the SHA-3 Candidates on Exotic Multi-Core Architectures
Performance Analysis of the SHA-3 Candidates on Exotic Multi-Core Architectures Joppe W. Bos 1 and Deian Stefan 2 1 Laboratory for Cryptologic Algorithms, EPFL, CH-1015 Lausanne, Switzerland 2 Dept. of
More informationImproving Performance of Secure Data Transmission in Communication Networks Using Physical Implementation of AES
Improving Performance of Secure Data Transmission in Communication Networks Using Physical Implementation of AES K Anjaneyulu M.Tech Student, Y.Chalapathi Rao, M.Tech, Ph.D Associate Professor, Mr.M Basha,
More informationInternational Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Print): 2279-0020 ISSN (Online): 2279-0039 International
More informationAn Energy Efficient ATM System Using AES Processor
www.seipub.org/eer Electrical Engineering Research (EER) Volume 1 Issue 2, April 2013 An Energy Efficient ATM System Using AES Processor Ali Nawaz *1, Fakir Sharif Hossain 2, Khan Md. Grihan 3 1 Department
More informationNetwork Security. Omer Rana
Network Security Omer Rana CM0255 Material from: Cryptography Components Sender Receiver Plaintext Encryption Ciphertext Decryption Plaintext Encryption algorithm: Plaintext Ciphertext Cipher: encryption
More informationFPGA IMPLEMENTATION OF AN AES PROCESSOR
FPGA IMPLEMENTATION OF AN AES PROCESSOR Kazi Shabbir Ahmed, Md. Liakot Ali, Mohammad Bozlul Karim and S.M. Tofayel Ahmad Institute of Information and Communication Technology Bangladesh University of Engineering
More informationNVIDIA GeForce GTX 580 GPU Datasheet
NVIDIA GeForce GTX 580 GPU Datasheet NVIDIA GeForce GTX 580 GPU Datasheet 3D Graphics Full Microsoft DirectX 11 Shader Model 5.0 support: o NVIDIA PolyMorph Engine with distributed HW tessellation engines
More informationEfficient Software Implementation of AES on 32-Bit Platforms
Efficient Software Implementation of AES on 32-Bit Platforms Guido Bertoni 1, Luca Breveglieri 1, Pasqualina Fragneto 2, Marco Macchetti 3, and Stefano Marchesin 3 1 Politecnico di Milano, Milano, Italy
More informationCryptography and Network Security
Cryptography and Network Security Spring 2012 http://users.abo.fi/ipetre/crypto/ Lecture 3: Block ciphers and DES Ion Petre Department of IT, Åbo Akademi University January 17, 2012 1 Data Encryption Standard
More informationDesign and Implementation of Asymmetric Cryptography Using AES Algorithm
Design and Implementation of Asymmetric Cryptography Using AES Algorithm Madhuri B. Shinde Student, Electronics & Telecommunication Department, Matoshri College of Engineering and Research Centre, Nashik,
More informationAES1. Ultra-Compact Advanced Encryption Standard Core. General Description. Base Core Features. Symbol. Applications
General Description The AES core implements Rijndael encoding and decoding in compliance with the NIST Advanced Encryption Standard. Basic core is very small (start at 800 Actel tiles). Enhanced versions
More informationGPU Computing with CUDA Lecture 2 - CUDA Memories. Christopher Cooper Boston University August, 2011 UTFSM, Valparaíso, Chile
GPU Computing with CUDA Lecture 2 - CUDA Memories Christopher Cooper Boston University August, 2011 UTFSM, Valparaíso, Chile 1 Outline of lecture Recap of Lecture 1 Warp scheduling CUDA Memory hierarchy
More informationA PPENDIX H RITERIA FOR AES E VALUATION C RITERIA FOR
A PPENDIX H RITERIA FOR AES E VALUATION C RITERIA FOR William Stallings Copyright 20010 H.1 THE ORIGINS OF AES...2 H.2 AES EVALUATION...3 Supplement to Cryptography and Network Security, Fifth Edition
More informationModern Block Cipher Standards (AES) Debdeep Mukhopadhyay
Modern Block Cipher Standards (AES) Debdeep Mukhopadhyay Assistant Professor Department of Computer Science and Engineering Indian Institute of Technology Kharagpur INDIA -721302 Objectives Introduction
More informationultra fast SOM using CUDA
ultra fast SOM using CUDA SOM (Self-Organizing Map) is one of the most popular artificial neural network algorithms in the unsupervised learning category. Sijo Mathew Preetha Joy Sibi Rajendra Manoj A
More informationGPUs for Scientific Computing
GPUs for Scientific Computing p. 1/16 GPUs for Scientific Computing Mike Giles mike.giles@maths.ox.ac.uk Oxford-Man Institute of Quantitative Finance Oxford University Mathematical Institute Oxford e-research
More informationCryptography and Network Security: Summary
Cryptography and Network Security: Summary Timo Karvi 12.2013 Timo Karvi () Cryptography and Network Security: Summary 12.2013 1 / 17 Summary of the Requirements for the exam The advices are valid for
More informationImplementation and Design of AES S-Box on FPGA
International Journal of Research in Engineering and Science (IJRES) ISSN (Online): 232-9364, ISSN (Print): 232-9356 Volume 3 Issue ǁ Jan. 25 ǁ PP.9-4 Implementation and Design of AES S-Box on FPGA Chandrasekhar
More informationDesign and Analysis of Parallel AES Encryption and Decryption Algorithm for Multi Processor Arrays
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 5, Issue, Ver. III (Jan - Feb. 205), PP 0- e-issn: 239 4200, p-issn No. : 239 497 www.iosrjournals.org Design and Analysis of Parallel AES
More informationCAESAR candidate PiCipher
CAESAR candidate PiCipher Danilo Gligoroski, ITEM, NTNU, Norway Hristina Mihajloska, FCSE, UKIM, Macedonia Simona Samardjiska, ITEM, NTNU, Norway and FCSE, UKIM, Macedonia Håkon Jacobsen, ITEM, NTNU, Norway
More informationIntroduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
More informationIntroduction to GPU Architecture
Introduction to GPU Architecture Ofer Rosenberg, PMTS SW, OpenCL Dev. Team AMD Based on From Shader Code to a Teraflop: How GPU Shader Cores Work, By Kayvon Fatahalian, Stanford University Content 1. Three
More informationSpeeding Up RSA Encryption Using GPU Parallelization
2014 Fifth International Conference on Intelligent Systems, Modelling and Simulation Speeding Up RSA Encryption Using GPU Parallelization Chu-Hsing Lin, Jung-Chun Liu, and Cheng-Chieh Li Department of
More informationHow To Encrypt With A 64 Bit Block Cipher
The Data Encryption Standard (DES) As mentioned earlier there are two main types of cryptography in use today - symmetric or secret key cryptography and asymmetric or public key cryptography. Symmetric
More informationAccelerating Wavelet-Based Video Coding on Graphics Hardware
Wladimir J. van der Laan, Andrei C. Jalba, and Jos B.T.M. Roerdink. Accelerating Wavelet-Based Video Coding on Graphics Hardware using CUDA. In Proc. 6th International Symposium on Image and Signal Processing
More informationArea Optimized and Pipelined FPGA Implementation of AES Encryption and Decryption
Area Optimized and Pipelined FPGA Implementation of AES Encryption and Decryption 1, Mg Suresh, 2, Dr.Nataraj.K.R 1, Asst Professor Rgit, Bangalore, 2, Professor 1,2, Department Of Electronics And Communication
More informationLBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR
LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR Frédéric Kuznik, frederic.kuznik@insa lyon.fr 1 Framework Introduction Hardware architecture CUDA overview Implementation details A simple case:
More informationThe Advanced Encryption Standard (AES)
The Advanced Encryption Standard (AES) All of the cryptographic algorithms we have looked at so far have some problem. The earlier ciphers can be broken with ease on modern computation systems. The DES
More informationPurpose... 3. Computer Hardware Configurations... 6 Single Computer Configuration... 6 Multiple Server Configurations... 7. Data Encryption...
Contents Purpose... 3 Background on Keyscan Software... 3 Client... 4 Communication Service... 4 SQL Server 2012 Express... 4 Aurora Optional Software Modules... 5 Computer Hardware Configurations... 6
More informationGPU Parallel Computing Architecture and CUDA Programming Model
GPU Parallel Computing Architecture and CUDA Programming Model John Nickolls Outline Why GPU Computing? GPU Computing Architecture Multithreading and Arrays Data Parallel Problem Decomposition Parallel
More informationThis Unit: Putting It All Together. CIS 501 Computer Architecture. Sources. What is Computer Architecture?
This Unit: Putting It All Together CIS 501 Computer Architecture Unit 11: Putting It All Together: Anatomy of the XBox 360 Game Console Slides originally developed by Amir Roth with contributions by Milo
More informationKeywords Web Service, security, DES, cryptography.
Volume 3, Issue 10, October 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Provide the
More informationGPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
More informationSeparable & Secure Data Hiding & Image Encryption Using Hybrid Cryptography
502 Separable & Secure Data Hiding & Image Encryption Using Hybrid Cryptography 1 Vinay Wadekar, 2 Ajinkya Jadhavrao, 3 Sharad Ghule, 4 Akshay Kapse 1,2,3,4 Computer Engineering, University Of Pune, Pune,
More informationsynthesizer called C Compatible Architecture Prototyper(CCAP).
Speed Improvement of AES Encryption using hardware accelerators synthesized by C Compatible Architecture Prototyper(CCAP) Hiroyuki KANBARA Takayuki NAKATANI Naoto UMEHARA Nagisa ISHIURA Hiroyuki TOMIYAMA
More informationA PERFORMANCE EVALUATION OF COMMON ENCRYPTION TECHNIQUES WITH SECURE WATERMARK SYSTEM (SWS)
A PERFORMANCE EVALUATION OF COMMON ENCRYPTION TECHNIQUES WITH SECURE WATERMARK SYSTEM (SWS) Ashraf Odeh 1, Shadi R.Masadeh 2, Ahmad Azzazi 3 1 Computer Information Systems Department, Isra University,
More informationCryptography and Network Security Chapter 3
Cryptography and Network Security Chapter 3 Fifth Edition by William Stallings Lecture slides by Lawrie Brown (with edits by RHB) Chapter 3 Block Ciphers and the Data Encryption Standard All the afternoon
More informationDFW Backup Software. Whitepaper Data Security
Version 6 Jan 2012 Table of Content 1 Introduction... 3 2 DFW Backup Offsite Backup Server Secure, Robust and Reliable... 4 2.1 Secure 128-bit SSL communication... 4 2.2 Backup data are securely encrypted...
More informationNetwork Security. Chapter 3 Symmetric Cryptography. Symmetric Encryption. Modes of Encryption. Symmetric Block Ciphers - Modes of Encryption ECB (1)
Chair for Network Architectures and Services Department of Informatics TU München Prof. Carle Network Security Chapter 3 Symmetric Cryptography General Description Modes of ion Data ion Standard (DES)
More informationIntroduction to GPU Programming Languages
CSC 391/691: GPU Programming Fall 2011 Introduction to GPU Programming Languages Copyright 2011 Samuel S. Cho http://www.umiacs.umd.edu/ research/gpu/facilities.html Maryland CPU/GPU Cluster Infrastructure
More informationTHÈSE DE DOCTORAT DE L UNIVERSITÉ PIERRE ET MARIE CURIE. Spécialité: Informatique, Télécommunication et Électronique. Karim Moussa Ali Abdellatif
THÈSE DE DOCTORAT DE L UNIVERSITÉ PIERRE ET MARIE CURIE Spécialité: Informatique, Télécommunication et Électronique École Doctorale Informatique, Télécommunication et Électronique Présentée par: Karim
More informationComputer Architecture TDTS10
why parallelism? Performance gain from increasing clock frequency is no longer an option. Outline Computer Architecture TDTS10 Superscalar Processors Very Long Instruction Word Processors Parallel computers
More informationProgramming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga
Programming models for heterogeneous computing Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga Talk outline [30 slides] 1. Introduction [5 slides] 2.
More informationComputer Graphics Hardware An Overview
Computer Graphics Hardware An Overview Graphics System Monitor Input devices CPU/Memory GPU Raster Graphics System Raster: An array of picture elements Based on raster-scan TV technology The screen (and
More informationCUDA SKILLS. Yu-Hang Tang. June 23-26, 2015 CSRC, Beijing
CUDA SKILLS Yu-Hang Tang June 23-26, 2015 CSRC, Beijing day1.pdf at /home/ytang/slides Referece solutions coming soon Online CUDA API documentation http://docs.nvidia.com/cuda/index.html Yu-Hang Tang @
More informationDynamic Profiling and Load-balancing of N-body Computational Kernels on Heterogeneous Architectures
Dynamic Profiling and Load-balancing of N-body Computational Kernels on Heterogeneous Architectures A THESIS SUBMITTED TO THE UNIVERSITY OF MANCHESTER FOR THE DEGREE OF MASTER OF SCIENCE IN THE FACULTY
More informationHardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui
Hardware-Aware Analysis and Optimization of Stable Fluids Presentation Date: Sep 15 th 2009 Chrissie C. Cui Outline Introduction Highlights Flop and Bandwidth Analysis Mehrstellen Schemes Advection Caching
More informationThanks, But No Thanks
Thanks, But No Thanks Current Cryptographic Standards Are Sufficient for Software Dan Shumow MSR Security and Cryptography Group Microsoft Research Introduction Disclaimer: I am a Software Developer, so
More informationWhite Paper. Shay Gueron Intel Architecture Group, Israel Development Center Intel Corporation
White Paper Shay Gueron Intel Architecture Group, Israel Development Center Intel Corporation Intel Advanced Encryption Standard (AES) New Instructions Set Intel AES New Instructions are a set of instructions
More informationCache based Timing Attacks on Embedded Systems
Cache based Timing Attacks on Embedded Systems Malte Wienecke Monday 20 th July, 2009 Master Thesis Ruhr-Universität Bochum Chair for Embedded Security Prof. Dr.-Ing. Christof Paar Advisor: Dipl.-Ing.
More informationGPU Programming Strategies and Trends in GPU Computing
GPU Programming Strategies and Trends in GPU Computing André R. Brodtkorb 1 Trond R. Hagen 1,2 Martin L. Sætra 2 1 SINTEF, Dept. Appl. Math., P.O. Box 124, Blindern, NO-0314 Oslo, Norway 2 Center of Mathematics
More informationKarsten Nohl, karsten@srlabs.de. Breaking GSM phone privacy
arsten Nohl, karsten@srlabs.de Breaking GSM phone privacy GSM is global, omnipresent and wants to be hacked 80% of mobile phone market 200+ countries 5 billion users! GSM encryption introduced in 1987
More informationNetwork Security. Security. Security Services. Crytographic algorithms. privacy authenticity Message integrity. Public key (RSA) Message digest (MD5)
Network Security Security Crytographic algorithms Security Services Secret key (DES) Public key (RSA) Message digest (MD5) privacy authenticity Message integrity Secret Key Encryption Plain text Plain
More informationPerformance Evaluation of AES using Hardware and Software Codesign
Performance Evaluation of AES using Hardware and Software Codesign Vilas V Deotare 1, Dinesh V Padole 2 Ashok S. Wakode 3 Research Scholar,Professor, GHRCE, Nagpur, India vilasdeotare@gmail.com 1, dvpadole@gmail.com
More informationApplications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61
F# Applications to Computational Financial and GPU Computing May 16th Dr. Daniel Egloff +41 44 520 01 17 +41 79 430 03 61 Today! Why care about F#? Just another fashion?! Three success stories! How Alea.cuBase
More informationDesign and Implementation of Low-area and Low-power AES Encryption Hardware Core
Design and Implemtation of Low-area and Low-power AES Encryption Hardware Core Panu Hämäläin, Timo Alho, arko Hännikäin, and Timo D Hämäläin Tampere University of Technology / Institute of Digital and
More informationReal-time Visual Tracker by Stream Processing
Real-time Visual Tracker by Stream Processing Simultaneous and Fast 3D Tracking of Multiple Faces in Video Sequences by Using a Particle Filter Oscar Mateo Lozano & Kuzahiro Otsuka presented by Piotr Rudol
More informationFPGA-based Multithreading for In-Memory Hash Joins
FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded
More informationDataTrust Backup Software. Whitepaper Data Security. Version 6.8
Version 6.8 Table of Contents 1 Introduction... 3 2 DataTrust Offsite Backup Server Secure, Robust and Reliable... 4 2.1 Secure 128-bit SSL communication... 4 2.2 Backup data are securely encrypted...
More informationAStudyofEncryptionAlgorithmsAESDESandRSAforSecurity
Global Journal of Computer Science and Technology Network, Web & Security Volume 13 Issue 15 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
More informationComputer Architecture. Secure communication and encryption.
Computer Architecture. Secure communication and encryption. Eugeniy E. Mikhailov The College of William & Mary Lecture 28 Eugeniy Mikhailov (W&M) Practical Computing Lecture 28 1 / 13 Computer architecture
More informationSPINS: Security Protocols for Sensor Networks
SPINS: Security Protocols for Sensor Networks Adrian Perrig, Robert Szewczyk, J.D. Tygar, Victor Wen, and David Culler Department of Electrical Engineering & Computer Sciences, University of California
More information