Hardware Implementation of the Compression Function for Selected SHA-3 Candidates
A. H. Namin and M. A. Hasan
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
{anamin,

Abstract. Hardware implementations of the main building block (the compression function) of five SHA-3 candidates are presented. The five candidates, namely Blue Midnight Wish, Luffa, Skein, Shabal, and Blake, were chosen because they report faster software implementation results than the rest of the SHA-3 proposals. The compression functions realized in hardware produce a message digest of 256 bits. We report both ASIC and FPGA implementations. The results allow an easy comparison of the hardware performance of the candidates.

Key words: Hash functions, SHA-3, Hardware, Blue Midnight Wish, Luffa, Skein, Shabal, Blake.

1 Introduction

Hash functions have many applications in cryptography, mainly in digital signatures, message authentication codes (MACs), and other forms of authentication [1]. A public competition organized by the National Institute of Standards and Technology (NIST) is underway to select the new Secure Hash Algorithm (SHA-3) [2]. The competition is a response to a weakness found in the SHA family of hash functions [6]. Fifty-one submitted algorithms have advanced to the first round of the competition. The C source code of these algorithms is available through the NIST SHA-3 webpage [2], and it can easily be compiled for different platforms to measure the performance of each proposal. For a complete list of software performance figures, see the ECRYPT Benchmarking of All Submitted Hashes (EBASH) [4] or the SHA-3 zoo website [3]. Unfortunately, only a small percentage of the proposals include hardware implementation results. Moreover, the few proposals that were implemented in hardware used different technologies and platforms (ASIC or FPGA).

This makes a fair comparison of the implementation results almost impossible. The main motivation behind this work is to address this issue by implementing a group of SHA-3 candidates using the same technology and the same design approach. We then compare the results of our implementations of the SHA-3 candidates with those of SHA-2.
The SHA-3 candidates we consider here are Blue Midnight Wish, Luffa, Skein, Shabal, and Blake [7]-[11]. These five algorithms were selected because they report faster software implementation results than the rest of the proposals and are suitable for hardware implementation [5].

2 Design Approach

In general, most hash algorithms use an iterative process to hash an input of arbitrary size by successively processing fixed-size blocks of the input [1]. This is achieved through a two-stage architecture. The first stage, which we refer to as the Preprocessing stage, pads the input to an appropriate size (usually a multiple of 256 or 512 bits) and then breaks the message into blocks of a smaller fixed size (256 or 512 bits). The message blocks are then processed by the second stage, referred to as the Hash Computation stage. In this stage, a special function called the Compression Function is applied iteratively over a number of rounds to create the message digest. The actual number of rounds (or iterations) depends on the number of message blocks. The first-round compression function uses the first block of the message along with a predefined initial value called the Initialization Vector. The initialization vector is usually created in the preprocessing stage and is used only for the first round of the compression function. For the remaining rounds, the output of each round's compression function is used as the initialization vector of the next round.

As NIST has requested, all candidates support message digest sizes of 224, 256, 384, and 512 bits. In this work, however, we focus only on the 256-bit versions. If the authors of a candidate proposed a hardware architecture for their design, we use it for our hardware implementation (Skein, Shabal, and Blake). For Luffa and Blue Midnight Wish, the authors did not present any hardware architecture in their proposals, so these algorithms are implemented as pure combinational circuits. Input/output registers are added at the input and output ports to take into account the loading effect of the inputs/outputs and to create register transfer level modules. The inputs to the compression function block are a message block of fixed size (256 or 512 bits, depending on the algorithm) along with the initialization vector (if needed). The final output of the compression function is a 256-bit message digest.

We present both ASIC and FPGA implementations. For the ASIC implementations, VHDL is used to describe the modules in hardware. The code is then simulated using Cadence's NCSim to verify its functionality. Afterwards, the VHDL code is synthesized to a gate-level netlist using Synopsys Design Compiler. Compilation parameters are always chosen to maximize the operating frequency of the design. The FPGA implementations use the same VHDL files as the ASIC implementations. The only difference is the addition of serial-in parallel-out and parallel-in serial-out modules to transfer the data into and out of the FPGA. Note that the FPGA implementations take into account the I/O speed limits of the FPGA, whereas this limitation is not considered for the ASIC implementations. We have used the 90 nm CMOS technology from STMicroelectronics for our ASIC implementations. For the FPGA implementations, we use the Stratix III FPGA from Altera.

We base the comparison of our hardware implementations on the area and time requirements of each design. For the ASIC implementations, we report the total area in µm², which includes the area due to combinational circuits as well as I/O registers. We also give the total number of gates used for each design. Note that this parameter varies considerably according to the size and types of gates available in the library. For the FPGA implementations, since we are using Altera products, the area requirements are given in terms of the number of Adaptive Look-Up Tables (ALUTs) and dedicated logic registers. For the time requirements, we report the total delay in nanoseconds. For fully unrolled designs (Luffa, Skein, BMW, and SHA-2), the total delay is equal to the critical path delay. For sequential designs (Skein-1c, Shabal, Blake, and SHA-2-1c), the total delay is the product of the critical path delay and the number of clock cycles required for each iteration. For the FPGA designs, the clock signal of the combinational circuit that implements the compression function is typically different from the clock signal of the I/O modules.

3 Blue Midnight Wish Hash Function

3.1 Algorithm Description

The Blue Midnight Wish (BMW) algorithm has been proposed by Svein Johan Knapskog et al. in [7]. We refer to the variant that creates the 256-bit message digest as BMW-256. The basic data block used is called a word, which is 32 bits long. BMW makes use of four different operations in the hash computation stage: bit-wise logical word XOR, word addition and subtraction (modulo $2^{32}$), shift operations (left or right), and the rotate left (circular shift left) operation.
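The word operations listed above map directly onto integer arithmetic with a 32-bit mask. The following is a minimal Python sketch for illustration only; the helper names (`add32`, `sub32`, `shl`, `shr`, `rotl`) are hypothetical and are not taken from the BMW submission or from the VHDL code used in this work.

```python
MASK32 = 0xFFFFFFFF  # keeps every result inside one 32-bit word


def add32(x, y):
    """Word addition modulo 2^32 (the carry out of bit 31 is dropped)."""
    return (x + y) & MASK32


def sub32(x, y):
    """Word subtraction modulo 2^32 (the borrow wraps around)."""
    return (x - y) & MASK32


def shl(x, r):
    """Logical shift left by r bits; bits shifted past bit 31 are lost."""
    return (x << r) & MASK32


def shr(x, r):
    """Logical shift right by r bits; zeros enter from the left."""
    return (x & MASK32) >> r


def rotl(x, r):
    """Rotate left (circular shift left) by r bits within a 32-bit word."""
    r %= 32
    x &= MASK32
    return ((x << r) | (x >> (32 - r))) & MASK32


if __name__ == "__main__":
    print(hex(rotl(0x80000001, 1)))  # the top bit wraps around to bit 0
```

In hardware, the shifts and rotations are pure rewiring, so only the XOR gates and the adders/subtractors contribute logic.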
BMW uses a Double Pipe design to increase the resistance against generic multicollision attacks and length extension attacks [12, 13]. In the double pipe design, the size of the input to the compression function is twice the message digest size. The inputs to the compression function are the message block $M^{(i)}$ of size 512 bits, along with the initialization vector (the previous output of the compression function), which is called the old double pipe $H^{(i-1)}$ and is also 512 bits wide. The final message digest consists of the least significant 256 bits of the current double pipe $H^{(i)}$. The compression function uses three separate functions to generate the message digest: $f_0$, $f_1$, and $f_2$. The general block diagram of the BMW-256 compression function is shown in Fig. 1. Note that each of the signals in the figure is 512 bits wide.

Fig. 1. General block diagram of the compression function in BMW-256

The first function $f_0 : \{0,1\}^{1024} \rightarrow \{0,1\}^{512}$ takes two arguments, a message block $M^{(i)}$ and the previous value of the double pipe $H^{(i-1)}$, and generates $Q_a^{(i)} = Q^{(i)}_{(0,1,\ldots,15)}$, the first part of the quadruple pipe, of size 512 bits. The mathematical definition of $f_0$ is given in Eqs. (1)-(3). In these equations, a temporary signal $W^{(i)}_{(0,1,\ldots,15)}$ is first generated through a word-by-word XOR of the two inputs followed by addition/subtraction of the resulting words. Note that all additions and subtractions are done modulo $2^{32}$, hence carry propagation exists only inside the words. Signal $W$ is then used to create the $Q_a$ signal using the s-transforms $s_0, s_1, s_2, s_3, s_4$ according to Eq. (2). Each transformation accepts a 32-bit word and generates another 32-bit word. Details of the s-transforms are given in Eq. (3). Note that in this equation $SHR^r(x)$/$SHL^r(x)$ represents the shift right/left operation by $r$ bits, while $ROTL^r(x)$ represents the rotate left operation by $r$ bits.
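$$
\begin{aligned}
W^{(i)}_{0} &= (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{1} &= (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) + (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) - (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{2} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) + (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{3} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) \\
W^{(i)}_{4} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) - (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{5} &= (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{6} &= (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) \\
W^{(i)}_{7} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) - (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{8} &= (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13}) - (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{9} &= (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) + (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{14} \oplus H^{(i-1)}_{14}) \\
W^{(i)}_{10} &= (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{15} \oplus H^{(i-1)}_{15}) \\
W^{(i)}_{11} &= (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{0} \oplus H^{(i-1)}_{0}) - (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) + (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) \\
W^{(i)}_{12} &= (M^{(i)}_{1} \oplus H^{(i-1)}_{1}) + (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) \\
W^{(i)}_{13} &= (M^{(i)}_{2} \oplus H^{(i-1)}_{2}) + (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) + (M^{(i)}_{7} \oplus H^{(i-1)}_{7}) + (M^{(i)}_{10} \oplus H^{(i-1)}_{10}) + (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) \\
W^{(i)}_{14} &= (M^{(i)}_{3} \oplus H^{(i-1)}_{3}) - (M^{(i)}_{5} \oplus H^{(i-1)}_{5}) + (M^{(i)}_{8} \oplus H^{(i-1)}_{8}) - (M^{(i)}_{11} \oplus H^{(i-1)}_{11}) - (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) \\
W^{(i)}_{15} &= (M^{(i)}_{12} \oplus H^{(i-1)}_{12}) - (M^{(i)}_{4} \oplus H^{(i-1)}_{4}) - (M^{(i)}_{6} \oplus H^{(i-1)}_{6}) - (M^{(i)}_{9} \oplus H^{(i-1)}_{9}) + (M^{(i)}_{13} \oplus H^{(i-1)}_{13})
\end{aligned}
\tag{1}
$$

$$
\begin{aligned}
Q^{(i)}_{0} &= s_0(W^{(i)}_{0}), & Q^{(i)}_{1} &= s_1(W^{(i)}_{1}), & Q^{(i)}_{2} &= s_2(W^{(i)}_{2}), & Q^{(i)}_{3} &= s_3(W^{(i)}_{3}), \\
Q^{(i)}_{4} &= s_4(W^{(i)}_{4}), & Q^{(i)}_{5} &= s_0(W^{(i)}_{5}), & Q^{(i)}_{6} &= s_1(W^{(i)}_{6}), & Q^{(i)}_{7} &= s_2(W^{(i)}_{7}), \\
Q^{(i)}_{8} &= s_3(W^{(i)}_{8}), & Q^{(i)}_{9} &= s_4(W^{(i)}_{9}), & Q^{(i)}_{10} &= s_0(W^{(i)}_{10}), & Q^{(i)}_{11} &= s_1(W^{(i)}_{11}), \\
Q^{(i)}_{12} &= s_2(W^{(i)}_{12}), & Q^{(i)}_{13} &= s_3(W^{(i)}_{13}), & Q^{(i)}_{14} &= s_4(W^{(i)}_{14}), & Q^{(i)}_{15} &= s_0(W^{(i)}_{15})
\end{aligned}
\tag{2}
$$

$$
\begin{aligned}
s_0(x) &= SHR^{1}(x) \oplus SHL^{3}(x) \oplus ROTL^{4}(x) \oplus ROTL^{19}(x) \\
s_1(x) &= SHR^{1}(x) \oplus SHL^{2}(x) \oplus ROTL^{8}(x) \oplus ROTL^{23}(x) \\
s_2(x) &= SHR^{2}(x) \oplus SHL^{1}(x) \oplus ROTL^{12}(x) \oplus ROTL^{25}(x) \\
s_3(x) &= SHR^{2}(x) \oplus SHL^{2}(x) \oplus ROTL^{15}(x) \oplus ROTL^{29}(x) \\
s_4(x) &= SHR^{1}(x) \oplus x \\
s_5(x) &= SHR^{2}(x) \oplus x
\end{aligned}
\tag{3}
$$

The second function $f_1 : \{0,1\}^{1024} \rightarrow \{0,1\}^{512}$ takes the same message block $M^{(i)}$ as the first function, together with the output of the first function $Q_a^{(i)} = Q^{(i)}_{(0,1,\ldots,15)}$, and generates the second part of the quadruple pipe, $Q_b^{(i)} = Q^{(i)}_{(16,17,\ldots,31)}$, through the EXPAND1 and EXPAND2 expansion functions as shown in Eq. (4). The expansion functions make use of the s-transforms along with the r-transforms, which are defined in Eq. (5).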
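$$
\begin{aligned}
&\text{For } ii = 0, 1: \quad Q^{(i)}_{ii+16} = \mathrm{Expand1}(ii + 16) \\
&\text{For } ii = 2 \text{ to } 15: \quad Q^{(i)}_{ii+16} = \mathrm{Expand2}(ii + 16) \\[4pt]
\mathrm{Expand1}(j) ={} & s_1(Q^{(i)}_{j-16}) + s_2(Q^{(i)}_{j-15}) + s_3(Q^{(i)}_{j-14}) + s_0(Q^{(i)}_{j-13}) \\
&+ s_1(Q^{(i)}_{j-12}) + s_2(Q^{(i)}_{j-11}) + s_3(Q^{(i)}_{j-10}) + s_0(Q^{(i)}_{j-9}) \\
&+ s_1(Q^{(i)}_{j-8}) + s_2(Q^{(i)}_{j-7}) + s_3(Q^{(i)}_{j-6}) + s_0(Q^{(i)}_{j-5}) \\
&+ s_1(Q^{(i)}_{j-4}) + s_2(Q^{(i)}_{j-3}) + s_3(Q^{(i)}_{j-2}) + s_0(Q^{(i)}_{j-1}) \\
&+ M^{(i)}_{(j-16) \bmod 16} + M^{(i)}_{(j-13) \bmod 16} - M^{(i)}_{(j-6) \bmod 16} + j \times \mathtt{0x05555555} \\[4pt]
\mathrm{Expand2}(j) ={} & Q^{(i)}_{j-16} + r_1(Q^{(i)}_{j-15}) + Q^{(i)}_{j-14} + r_2(Q^{(i)}_{j-13}) \\
&+ Q^{(i)}_{j-12} + r_3(Q^{(i)}_{j-11}) + Q^{(i)}_{j-10} + r_4(Q^{(i)}_{j-9}) \\
&+ Q^{(i)}_{j-8} + r_5(Q^{(i)}_{j-7}) + Q^{(i)}_{j-6} + r_6(Q^{(i)}_{j-5}) \\
&+ Q^{(i)}_{j-4} + r_7(Q^{(i)}_{j-3}) + Q^{(i)}_{j-2} + r_4(Q^{(i)}_{j-1}) \\
&+ M^{(i)}_{(j-16) \bmod 16} + M^{(i)}_{(j-13) \bmod 16} - M^{(i)}_{(j-6) \bmod 16} + j \times \mathtt{0x05555555}
\end{aligned}
\tag{4}
$$

$$
\begin{aligned}
r_1(x) &= ROTL^{3}(x), & r_2(x) &= ROTL^{7}(x), & r_3(x) &= ROTL^{13}(x), \\
r_4(x) &= ROTL^{16}(x), & r_5(x) &= ROTL^{19}(x), & r_6(x) &= ROTL^{23}(x), \\
r_7(x) &= ROTL^{27}(x)
\end{aligned}
\tag{5}
$$

Finally, the last function $f_2 : \{0,1\}^{1536} \rightarrow \{0,1\}^{512}$ takes two arguments, the message block $M^{(i)}$ and the quadruple pipe $Q^{(i)}_{(0,1,\ldots,31)} = Q_a^{(i)} \,\|\, Q_b^{(i)}$ (where $\|$ represents the concatenation operation), and generates the current value of the double pipe $H^{(i)}$. The mathematical definition of $f_2$ is given in Eq. (6).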
$$
\begin{aligned}
XL &= Q^{(i)}_{16} \oplus Q^{(i)}_{17} \oplus Q^{(i)}_{18} \oplus Q^{(i)}_{19} \oplus Q^{(i)}_{20} \oplus Q^{(i)}_{21} \oplus Q^{(i)}_{22} \oplus Q^{(i)}_{23} \\
XH &= XL \oplus Q^{(i)}_{24} \oplus Q^{(i)}_{25} \oplus Q^{(i)}_{26} \oplus Q^{(i)}_{27} \oplus Q^{(i)}_{28} \oplus Q^{(i)}_{29} \oplus Q^{(i)}_{30} \oplus Q^{(i)}_{31} \\[4pt]
H^{(i)}_{0} &= (SHL^{5}(XH) \oplus SHR^{5}(Q^{(i)}_{16}) \oplus M^{(i)}_{0}) + (XL \oplus Q^{(i)}_{24} \oplus Q^{(i)}_{0}) \\
H^{(i)}_{1} &= (SHR^{7}(XH) \oplus SHL^{8}(Q^{(i)}_{17}) \oplus M^{(i)}_{1}) + (XL \oplus Q^{(i)}_{25} \oplus Q^{(i)}_{1}) \\
H^{(i)}_{2} &= (SHR^{5}(XH) \oplus SHL^{5}(Q^{(i)}_{18}) \oplus M^{(i)}_{2}) + (XL \oplus Q^{(i)}_{26} \oplus Q^{(i)}_{2}) \\
H^{(i)}_{3} &= (SHR^{1}(XH) \oplus SHL^{5}(Q^{(i)}_{19}) \oplus M^{(i)}_{3}) + (XL \oplus Q^{(i)}_{27} \oplus Q^{(i)}_{3}) \\
H^{(i)}_{4} &= (SHR^{3}(XH) \oplus Q^{(i)}_{20} \oplus M^{(i)}_{4}) + (XL \oplus Q^{(i)}_{28} \oplus Q^{(i)}_{4}) \\
H^{(i)}_{5} &= (SHL^{6}(XH) \oplus SHR^{6}(Q^{(i)}_{21}) \oplus M^{(i)}_{5}) + (XL \oplus Q^{(i)}_{29} \oplus Q^{(i)}_{5}) \\
H^{(i)}_{6} &= (SHR^{4}(XH) \oplus SHL^{6}(Q^{(i)}_{22}) \oplus M^{(i)}_{6}) + (XL \oplus Q^{(i)}_{30} \oplus Q^{(i)}_{6}) \\
H^{(i)}_{7} &= (SHR^{11}(XH) \oplus SHL^{2}(Q^{(i)}_{23}) \oplus M^{(i)}_{7}) + (XL \oplus Q^{(i)}_{31} \oplus Q^{(i)}_{7}) \\
H^{(i)}_{8} &= ROTL^{9}(H^{(i)}_{4}) + (XH \oplus Q^{(i)}_{24} \oplus M^{(i)}_{8}) + (SHL^{8}(XL) \oplus Q^{(i)}_{23} \oplus Q^{(i)}_{8}) \\
H^{(i)}_{9} &= ROTL^{10}(H^{(i)}_{5}) + (XH \oplus Q^{(i)}_{25} \oplus M^{(i)}_{9}) + (SHR^{6}(XL) \oplus Q^{(i)}_{16} \oplus Q^{(i)}_{9}) \\
H^{(i)}_{10} &= ROTL^{11}(H^{(i)}_{6}) + (XH \oplus Q^{(i)}_{26} \oplus M^{(i)}_{10}) + (SHL^{6}(XL) \oplus Q^{(i)}_{17} \oplus Q^{(i)}_{10}) \\
H^{(i)}_{11} &= ROTL^{12}(H^{(i)}_{7}) + (XH \oplus Q^{(i)}_{27} \oplus M^{(i)}_{11}) + (SHL^{4}(XL) \oplus Q^{(i)}_{18} \oplus Q^{(i)}_{11}) \\
H^{(i)}_{12} &= ROTL^{13}(H^{(i)}_{0}) + (XH \oplus Q^{(i)}_{28} \oplus M^{(i)}_{12}) + (SHR^{3}(XL) \oplus Q^{(i)}_{19} \oplus Q^{(i)}_{12}) \\
H^{(i)}_{13} &= ROTL^{14}(H^{(i)}_{1}) + (XH \oplus Q^{(i)}_{29} \oplus M^{(i)}_{13}) + (SHR^{4}(XL) \oplus Q^{(i)}_{20} \oplus Q^{(i)}_{13}) \\
H^{(i)}_{14} &= ROTL^{15}(H^{(i)}_{2}) + (XH \oplus Q^{(i)}_{30} \oplus M^{(i)}_{14}) + (SHR^{7}(XL) \oplus Q^{(i)}_{21} \oplus Q^{(i)}_{14}) \\
H^{(i)}_{15} &= ROTL^{16}(H^{(i)}_{3}) + (XH \oplus Q^{(i)}_{31} \oplus M^{(i)}_{15}) + (SHR^{2}(XL) \oplus Q^{(i)}_{22} \oplus Q^{(i)}_{15})
\end{aligned}
\tag{6}
$$

3.2 ASIC Implementation

The block diagram of the architecture used for the ASIC implementation of BMW-256 is shown in Fig. 2. In this figure, each line represents a 512-bit wide signal. Inputs $M^{(i)}$ and $H^{(i-1)}$ are stored inside the 512-bit input registers (In_Reg). Functions $f_0$, $f_1$, and $f_2$ are realized completely as combinational circuits. Output $H^{(i)}$ can be read from the output register (Out_Reg) after one clock cycle.

Fig. 2. Hardware architecture used for the compression function of BMW-256

Synthesis results for the BMW-256 compression function are summarized in Table 1. In this table, Compression Function Delay is the maximum delay between the input and output registers. Combinational Area is the area used by functions $f_0$, $f_1$, and $f_2$. I/O Register Area is the area used by the In_Reg and Out_Reg registers. The number of gates may be considered an alternative measure of the area requirements of the implementation.

Table 1. ASIC implementation summary of the BMW-256 compression function

| Input Size | Compression Function Delay | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|
| 512 | 19.20 ns | 670, µm² | 48, µm² | 60,033 |

4 Luffa Hash Function

4.1 Algorithm Description

The Luffa algorithm is due to Christophe De Cannière et al. [8]. It makes use of s-boxes (non-linear permutations) along with shift and XOR operations to create the output message digest. The basic data block used is a word, which is 32 bits long. Luffa's compression function is called the round function. The structure of the round function for Luffa-256 is shown in Fig. 3. It is made of two separate functions: Message Injection (MI) and Permutation (P).

Fig. 3. General block diagram of the compression function (round function) for Luffa-256

The details of the message injection module are shown in Fig. 4. In this figure, the circled plus sign ($\oplus$) represents the bit-by-bit XOR operation of the variables (256 bits each). The circled cross sign ($\otimes$) represents the multiplication operation in the ring $GF(2^8)^{32}$ using the polynomial $\varphi(x) = x^8 + x^4 + x^3 + x + 1$.
The multiplication operation can be defined as follows. Assume that $A$ and $B$ are each a vector of 256 bits, with $A = a[7] \,\|\, \cdots \,\|\, a[0]$ and $B = b[7] \,\|\, \cdots \,\|\, b[0]$, where $a[i]$ and $b[i]$, $i = 0, \ldots, 7$, are vectors of 32 bits each. Assume that $B = A \otimes 2$; then we have

$$
\begin{aligned}
b[7] &= a[6], & b[6] &= a[5], & b[5] &= a[4], \\
b[4] &= a[3] \oplus a[7], & b[3] &= a[2] \oplus a[7], & b[2] &= a[1], \\
b[1] &= a[0] \oplus a[7], & b[0] &= a[7]
\end{aligned}
\tag{7}
$$

The permutation stage (P) for Luffa-256 is made of three similar permutation blocks $Q_j$, $j = 0, 1, 2$, which work in parallel, as shown in Fig. 3. We refer to these blocks as Permute blocks; their input and output sizes are 256 bits each. Each can be modelled as a composition of an input tweak and eight iterations of a Step function. The tweak module is just a reordering of the wires and does not contain any gates. The block diagram of the Step function is shown in Fig. 5. In this figure, the 256-bit input to the step function is stored in eight 32-bit registers denoted by $a_k$, $0 \le k < 8$, where $k$ is the register number. The step function has three main submodules: SubCrumb, MixWord, and AddConstant. The SubCrumb module is a nonlinear permutation which uses 32 identical s-boxes (4-bit input, 4-bit output) in parallel. The s-box input/output relationship is as follows.

$$
s[16] = \{7, 13, 11, 10, 12, 4, 8, 3, 5, 15, 6, 0, 9, 1, 2, 14\}
\tag{8}
$$

MixWord is a linear permutation module which mixes two words together. The structure of MixWord is shown in Fig. 6. In Fig. 5, the AddConstant module XORs constants, selected according to the parameters $k$ and $r$ (the Step number), into the first and the fourth words. Note that since the values of the constants are predetermined (they form one input of the XOR gates), the synthesizer will optimize the circuit and replace the XOR gates by inverters or wires as required.

Fig. 4. Block diagram of the Message Injection module for Luffa-256
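The multiplication by 2 in Eq. (7) needs only word moves and three XORs, which is why it is cheap in hardware. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the helper `lane` assumes that bit $l$ of word $a[k]$ is the coefficient of $x^k$ of the $l$-th ring element. Under that assumption, Eq. (7) can be checked against an ordinary multiply-by-$x$ modulo $\varphi(x)$.

```python
def luffa_mul2(a):
    """Apply Eq. (7): a is [a[0], ..., a[7]], eight 32-bit words; returns b."""
    if len(a) != 8:
        raise ValueError("expected eight 32-bit words")
    t = a[7]  # the word that 'overflows' and is folded back in by phi(x)
    return [
        t,         # b[0] = a[7]
        a[0] ^ t,  # b[1] = a[0] xor a[7]
        a[1],      # b[2] = a[1]
        a[2] ^ t,  # b[3] = a[2] xor a[7]
        a[3] ^ t,  # b[4] = a[3] xor a[7]
        a[4],      # b[5] = a[4]
        a[5],      # b[6] = a[5]
        a[6],      # b[7] = a[6]
    ]


def xtime(e):
    """Reference: multiply one 8-bit element by x modulo x^8 + x^4 + x^3 + x + 1."""
    e <<= 1
    if e & 0x100:
        e ^= 0x11B  # 0x11B encodes phi(x)
    return e


def lane(words, l):
    """Assumed layout: gather bit l of word k as the coefficient of x^k."""
    return sum(((w >> l) & 1) << k for k, w in enumerate(words))


if __name__ == "__main__":
    a = [0x01234567, 0x89ABCDEF, 0xDEADBEEF, 0x0BADF00D,
         0xCAFEBABE, 0x13579BDF, 0x2468ACE0, 0xF0F0F0F0]
    b = luffa_mul2(a)
    # Every one of the 32 bit lanes should agree with the reference xtime.
    print(all(lane(b, l) == xtime(lane(a, l)) for l in range(32)))
```

Because the operation is the same for all 32 bit positions, the hardware cost is three 32-bit XOR stages and no carry chains.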
Fig. 5. Block diagram of the Step function module for Luffa-256

4.2 ASIC Implementation

The block diagram of the architecture used for the ASIC implementation of Luffa-256 is shown in Fig. 7. In this figure, each line represents a 256-bit wide signal. Inputs $M^{(i)}$ and $H^{(i-1)}$ are stored inside the 256-bit input registers (In_Reg). The MI and P blocks (including the s-boxes) are realized completely as combinational circuits. Output $H^{(i)}$ can be read from the output register (Out_Reg) after one clock cycle. Synthesis results for the Luffa-256 compression function are summarized in Table 2. In this table, the Combinational Area is the area used by the message injection and the permutation (three permute blocks). Each permute block contains a tweak module and eight instances of the step function.

Fig. 6. Block diagram of the MixWord function module for Luffa-256
Fig. 7. Hardware architecture used for the compression function of Luffa-256

Table 2. ASIC implementation summary of the Luffa-256 compression function

| Input Size | Compression Function Delay | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|
| 256 | 9.96 ns | 473, µm² | 60, µm² | 68,884 |

5 Skein Hash Function

5.1 Algorithm Description

Ferguson et al. have proposed the Skein algorithm [9]. Skein makes use of 64-bit adders along with shift and XOR operations to create the output message digest. The basic data block used is a word, which is 64 bits long. Skein's compression function is based on Threefish, which is a large tweakable block cipher [14]. The Skein-256 compression function is made of 72 consecutive rounds of Threefish, with a subkey addition module after every four rounds. Subkeys are generated from an input key through the key schedule module. Note that Skein does not make use of the initialization vector the way Luffa and Blue Midnight Wish do. Skein is built on multiple invocations of the Unique Block Iteration (UBI), which is a variant of the Matyas-Meyer-Oseas hash mode [15]. The basic building block of UBI is shown in Fig. 8. The structure of the Skein compression function (including the subkey modules) is shown in Fig. 9. Note that the key is not an optional parameter for Threefish and cannot be removed from the Skein structure. Subkeys consist of three contributions: key words, tweak words, and a counter value. Each round of the Threefish block cipher (256-bit version) is made of two instances of a Mix function along with a permutation module. The structure of the Mix function for Threefish-256 is shown in Fig. 10. In this figure, the boxed plus sign ($\boxplus$) represents a 64-bit addition (modulo $2^{64}$), and the circled plus sign ($\oplus$) represents a bit-by-bit XOR operation. The parameters $R_{d,i}$ are constants that set the rotation amounts according to the round number.

Fig. 8. Unique Block Iteration (UBI) building block

Fig. 9. Block diagram of compression function module for Skein-256, 4 out of 72 rounds for Skein-256 including the key

5.2 ASIC Implementation

Fully Unrolled/Parallel Design

The block diagram of the architecture used for the hardware implementation of Skein-256 is shown in Fig. 11. In this figure, each line represents a 64-bit wide signal, except for the input/output wires, which are 256 bits wide, and the tweak and key input wires, which are 128 and 256 bits wide respectively.
Table 3. ASIC implementation summary of the Skein-256 compression function

| Input Size | Compression Function Delay | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|
| 256 | ns | 1,605, µm² | 13, µm² | 252,725 |

Input message $M^{(i)}$ is stored inside the 256-bit input registers (In_Reg). The Mix functions are realized completely as combinational circuits. After 72 rounds of Mix and Permute functions (including 18 rounds of key addition), one level of XOR gates creates the UBI structure of the algorithm. Output $H^{(i)}$ can be read from the output register (Out_Reg) after one clock cycle. Synthesis results for the Skein-256 compression function are summarized in Table 3. In this table, the Combinational Area is the area used by the 144 Mix functions and 18 subkey adder modules that make up the 72 rounds of Threefish, along with the 256 XOR gates in Skein-256.

Sequential Design

The Skein compression function architecture is highly regular, as shown in Fig. 9. This property can be exploited to build a complete Skein compression function by implementing just one round of the Threefish cipher and a subkey adder module. This can play a major role when the hashing algorithm is implemented for area-constrained applications. To this end, we have implemented a second version of the Skein algorithm (Skein-1c) using just one round of the Threefish algorithm. Note that some modifications to the original round function of Threefish are needed to make Skein-1c capable of creating the message digest. The main modification is making the Mix module programmable, since the rotation amount is a function of the round number. In addition, some extra modules (a counter, two multiplexers, and a key schedule module) are used to create the correct inputs for the round function, as shown in Fig. 12.
The key schedule module uses circular shift registers along with three 64-bit adders to generate the subkeys.

Fig. 10. Block diagram of the Mix function module for Skein-256
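The Mix function of Fig. 10 combines one 64-bit addition, one rotation, and one XOR. The Python sketch below illustrates that structure as it is commonly stated for Threefish: the first output is the modular sum of the two inputs, and the second output is the rotated second input XORed with that sum. The function names and the rotation amounts used in the example are placeholders, not the $R_{d,i}$ constants of the Skein specification.

```python
MASK64 = (1 << 64) - 1  # keeps every result inside one 64-bit word


def rotl64(x, r):
    """Rotate a 64-bit word left by r bits."""
    r %= 64
    x &= MASK64
    return ((x << r) | (x >> (64 - r))) & MASK64


def mix(x0, x1, rot):
    """One Mix: y0 = (x0 + x1) mod 2^64, y1 = ROTL(x1, rot) xor y0."""
    y0 = (x0 + x1) & MASK64
    y1 = rotl64(x1, rot) ^ y0
    return y0, y1


def unmix(y0, y1, rot):
    """Inverse of mix, included only to show that the step is a permutation."""
    x1 = rotl64(y1 ^ y0, (64 - rot) % 64)  # rotate right by rot
    x0 = (y0 - x1) & MASK64
    return x0, x1


if __name__ == "__main__":
    # rot=14 is an arbitrary placeholder rotation amount.
    y0, y1 = mix(0x0123456789ABCDEF, 0xFEDCBA9876543210, 14)
    print(hex(y0), hex(y1))
```

In the sequential Skein-1c design described above, `rot` corresponds to the input that has to be made programmable, because a single Mix instance is reused with a different rotation amount in each round.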
Fig. 11. Hardware architecture used for the compression function of Skein-256

Fig. 12. Hardware architecture used for the compression function of Skein-1c

Fig. 13. Hardware architecture used for the key schedule of Skein-1c

The block diagram of the architecture used for the key schedule is shown in Fig. 13. It should be noted that the clock used for the key module is four times slower than the clock used for the rest of the Skein-1c design. We have also assumed that the two parameters $k_4$ and $t_2$ are available at the start of the circuit operation and are loaded into the circular shift registers. The design is referred to as Skein-1c since it uses one round of the Threefish compression function and also includes an extra counter. Synthesis results for the Skein-1c compression function are summarized in Table 4.

Table 4. ASIC implementation summary of the Skein-1c compression function

| Input Size | Critical Path Delay | Number of Cycles | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|---|
| 256 | ns | 72 | 64, µm² | 27, µm² | 11,634 |

From Tables 3 and 4 the following can be deduced: the area requirement of the Skein compression function is reduced from approximately 1.62 million µm² in Skein-256 to approximately 92 thousand µm² in Skein-1c. This reduction comes at the price of a larger compression function delay in Skein-1c than in Skein-256, since 72 clock cycles are needed for each compression.

6 Shabal Hash Function

6.1 Algorithm Description

The Shabal algorithm has been proposed by Bresson et al. in [10]. The Shabal architecture is based on a Linear Feedback Shift Register (LFSR), meaning that the inputs to the current state of the registers are a linear function of the previous state of the registers. Shabal makes use of 32-bit adders/subtractors along with shift and XOR operations to create the output message digest. The basic data block used is a word, which is 32 bits long. The simplified structure of the compression function block is shown in Fig. 14. In this figure, the boxed plus sign ($\boxplus$) represents a 32-bit addition (modulo $2^{32}$), and the boxed minus sign ($\boxminus$) represents a 32-bit subtraction. The circled plus sign ($\oplus$) represents a bit-by-bit XOR operation. $A$, $B$, and $C$ are initialization vectors, while $W$ is a 64-bit counter used to number the message blocks. The P block represents a keyed permutation module, and its details are shown in Fig. 15.

In Fig. 15, inputs $B$, $C$, and $M$ are represented by sixteen words each. The 32-bit registers are shown as one block in the figure. Note that input $A$ contains only twelve words, and that during each clock cycle one word (32 bits) is shifted towards the left or right according to the architecture. In this figure, modules $U$ and $V$ are non-linear functions defined by $U : x \mapsto 3 \times x \bmod 2^{32}$ and $V : x \mapsto 5 \times x \bmod 2^{32}$. In hardware, they can be implemented as small constant multipliers or as two additions and a shift for $U$ and three additions and a shift for $V$. The symbol $\oplus$ again represents a bit-by-bit XOR operation. Note that the module having 0xFFF...F as one of its inputs represents an inversion operation.

6.2 ASIC Implementation

We have used the block diagrams of Fig. 14 and Fig. 15 for the hardware implementation of Shabal-256. The only modification applied is the addition of a way to load inputs $A$, $B$, $C$, and $M$ into the keyed permutation module. This is achieved by adding a small multiplexer (with one select line and two inputs) to each register storing the inputs. This way all input bits can be loaded into the module, similar to the other architectures used in our hardware implementations. Synthesis results for the Shabal-256 compression function are summarized in Table 5.

Fig. 14. Block diagram of compression function for Shabal-256

Fig. 15. Structure of the keyed permutation module P
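The functions $U$ and $V$ are multiplications by the small constants 3 and 5, so they reduce to shifts and additions. The Python sketch below shows one possible shift-and-add decomposition and checks it against a plain multiplication modulo $2^{32}$. The function names are hypothetical, and the sketch makes no claim about how many adders a synthesizer actually allocates for these modules.

```python
MASK32 = 0xFFFFFFFF  # keeps every result inside one 32-bit word


def shabal_u(x):
    """U: x -> 3*x mod 2^32, written as x + (x << 1)."""
    return (x + (x << 1)) & MASK32


def shabal_v(x):
    """V: x -> 5*x mod 2^32, written as x + (x << 2)."""
    return (x + (x << 2)) & MASK32


if __name__ == "__main__":
    samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678]
    # Compare the shift-and-add forms with ordinary modular multiplication.
    print(all(shabal_u(x) == (3 * x) % 2**32 for x in samples))
    print(all(shabal_v(x) == (5 * x) % 2**32 for x in samples))
```

The shifts cost no gates, so the area of each function is dominated by the 32-bit adder logic.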
Table 5. ASIC implementation summary of the Shabal-256 compression function

| Input Size | Critical Path Delay | Number of Cycles | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|---|
| 256 | ns | 16 | 38, µm² | 49, µm² | 8,065 |

In this table, the Combinational Area is the area used by the $U$ and $V$ modules along with the area used by the XOR gates and multiplexers. The major part of the area is taken by the five 32-bit adders realizing modules $U$ and $V$.

7 Blake Hash Function

7.1 Algorithm Description

The Blake algorithm has been proposed by Aumasson et al. in [11]. The basic data block used is called a word, which is 32 bits long. Blake makes use of bit-wise logical word XOR operations, word addition (modulo $2^{32}$), rotate operations (left or right), and permutations in its structure. Blake is mainly built on previously designed components. Its iteration mode is HAIFA, which is an improved version of Merkle-Damgård [16]. Its compression function is a modified version of the ChaCha stream cipher [17]. It also supports an optional salt (a parameter that can be either public or secret) in creating the message digest. Blake uses a Wide Pipe design to increase its resistance against collision attacks [18, 19]. The local wide pipe structure of Blake's compression function is shown in Fig. 16.

Fig. 16. Block diagram of compression function for Blake-256

As can be seen from the figure, the compression function is made of three main stages: initialization, rounds, and finalization. In the initialization stage, a large inner state is initialized from the previous chain value ($h$), the salt ($s$), and the counter ($t$). The state value is then updated by message-dependent rounds. It is finally compressed to create the next chain value. A sixteen-word state $v_0, v_1, \ldots, v_{15}$ is generated in the initialization stage.
The input/output relationship of the initialization step can be mathematically represented as follows.
$$
\begin{pmatrix}
v_0 & v_1 & v_2 & v_3 \\
v_4 & v_5 & v_6 & v_7 \\
v_8 & v_9 & v_{10} & v_{11} \\
v_{12} & v_{13} & v_{14} & v_{15}
\end{pmatrix}
\leftarrow
\begin{pmatrix}
h_0 & h_1 & h_2 & h_3 \\
h_4 & h_5 & h_6 & h_7 \\
s_0 \oplus c_0 & s_1 \oplus c_1 & s_2 \oplus c_2 & s_3 \oplus c_3 \\
t_0 \oplus c_4 & t_0 \oplus c_5 & t_1 \oplus c_6 & t_1 \oplus c_7
\end{pmatrix}
\tag{9}
$$

Ten rounds of transformation are applied in the rounds stage, after the state $v$ is initialized. Each round of transformation is made of eight operations using the $G_i$ functions, defined as follows.

$$
\begin{aligned}
&G_0(v_0, v_4, v_8, v_{12}) & &G_1(v_1, v_5, v_9, v_{13}) & &G_2(v_2, v_6, v_{10}, v_{14}) & &G_3(v_3, v_7, v_{11}, v_{15}) \\
&G_4(v_0, v_5, v_{10}, v_{15}) & &G_5(v_1, v_6, v_{11}, v_{12}) & &G_6(v_2, v_7, v_8, v_{13}) & &G_7(v_3, v_4, v_9, v_{14})
\end{aligned}
\tag{10}
$$

Note that the first four G functions ($G_0, G_1, G_2, G_3$) can be computed in parallel since they apply to disjoint sets of inputs; for example, $G_0$ applies to $v_0, v_4, v_8, v_{12}$. Once the first four G functions have been applied, the last four ($G_4, G_5, G_6, G_7$) can be applied. If desired, the last four G functions can be computed in parallel as well. The mathematical definition of the function $G_i(a, b, c, d)$ is given below.

$$
\begin{aligned}
a &\leftarrow a + b + (m_{\sigma_r(2i)} \oplus c_{\sigma_r(2i+1)}) \\
d &\leftarrow (d \oplus a) \ggg 16 \\
c &\leftarrow c + d \\
b &\leftarrow (b \oplus c) \ggg 12 \\
a &\leftarrow a + b + (m_{\sigma_r(2i+1)} \oplus c_{\sigma_r(2i)}) \\
d &\leftarrow (d \oplus a) \ggg 8 \\
c &\leftarrow c + d \\
b &\leftarrow (b \oplus c) \ggg 7
\end{aligned}
\tag{11}
$$

In this equation, the $c_i$ are 32-bit word constants and the $\sigma_r$ are permutations of $\{0, 1, \ldots, 15\}$, both defined in the Blake submission documents [11]. The architecture used for the hardware implementation of the $G_i$ function is shown in Fig. 17.

Fig. 17. Block diagram of the $G_i$ function for Blake-256

After 10 rounds of transformation, the next chain value ($h'_0, h'_1, \ldots, h'_7$) is extracted from the state values ($v_0, v_1, \ldots, v_{15}$) in the finalization stage. The input/output relationship of the finalization step can be mathematically represented as follows.

$$
\begin{aligned}
h'_0 &\leftarrow h_0 \oplus s_0 \oplus v_0 \oplus v_8 \\
h'_1 &\leftarrow h_1 \oplus s_1 \oplus v_1 \oplus v_9 \\
h'_2 &\leftarrow h_2 \oplus s_2 \oplus v_2 \oplus v_{10} \\
h'_3 &\leftarrow h_3 \oplus s_3 \oplus v_3 \oplus v_{11} \\
h'_4 &\leftarrow h_4 \oplus s_0 \oplus v_4 \oplus v_{12} \\
h'_5 &\leftarrow h_5 \oplus s_1 \oplus v_5 \oplus v_{13} \\
h'_6 &\leftarrow h_6 \oplus s_2 \oplus v_6 \oplus v_{14} \\
h'_7 &\leftarrow h_7 \oplus s_3 \oplus v_7 \oplus v_{15}
\end{aligned}
\tag{12}
$$

7.2 ASIC Implementation

For our hardware implementation we have used the VHDL code available with the initial submission of Blake. The only modification applied is the addition of input/output registers to take into account the loading effect on the inputs/outputs. Synthesis results for the Blake-256 compression function are summarized in Table 6.

Table 6. ASIC implementation summary of the Blake-256 compression function

| Input Size | Critical Path Delay | Number of Cycles | Combinational Area | I/O Register Area | Number of Gates |
|---|---|---|---|---|---|
| 512 | ns | | 191, µm² | 42, µm² | 30,262 |

8 Comparison of ASIC Implementation Results

As mentioned earlier, we have used the 90 nm CMOS technology from STMicroelectronics for our implementations, and the synthesis parameters have always been chosen to maximize the operating frequency of the designs. The area/delay complexities of the ASIC implementations of all the SHA-3 candidates considered in this work are listed in Table 7. For ease of comparison, the table also includes our two implementations of SHA-2, as described in Appendix A. At the bottom of the table, SHA-2 is a high-speed 64-round implementation of the SHA-256 architecture, while SHA-2-1c is a low-area 1-round implementation. Note that even though all listed algorithms create a message digest of 256 bits, their input block sizes differ. BMW, Blake, SHA-2, and SHA-2-1c use input blocks of 512 bits, while Luffa, Skein, Skein-1c, and Shabal each accept input blocks of 256 bits. This makes a difference in delay comparisons if the input message size is small (< 256) or very large (>> 512).
From Table 7, we can see that Luffa-256 presents the fastest architecture (9.96 ns), followed by Blue Midnight Wish (19.20 ns) and Shabal (38.72 ns). Note that for very large input message sizes (>> 512 bits) BMW performs slightly better than Luffa, because it absorbs 512 bits per compression call instead of 256, and Shabal still ranks third. Luffa's high speed comes from the fact that it does not use any adders (32-bit or 64-bit) in the algorithm. The most expensive building block in the Luffa algorithm is the SubCrumb module (S-box), which can be implemented with 5-6 levels of logic gates. The major part of the delay in Blue Midnight Wish and Shabal comes from the 32-bit adder/subtractor modules used in their architectures.

The Skein-1c and Shabal architectures present the smallest critical path delays in our comparison table; however, these two designs need a number of clock cycles to finish one round of compression (72 and 16 clock cycles, respectively). This results in architectures with low performance with respect to the total compression time. However, these two architectures present the smallest area utilization, approximately mm² for Skein-1c and mm² for Shabal. Compared to SHA-2-1c, both designs require a larger area, while Skein-1c presents a slower design and Shabal a faster one.

Fully unrolled Skein requires the largest area among all proposals listed in the table and is slightly larger than SHA-2. The area requirement is mainly due to the multiple 64-bit adders used inside the design (in total it uses bit adders). BMW ranks second in area requirements after Skein; the area is again mainly due to the multiple adder/subtractor modules in the design. It is worth mentioning that the high regularity of Skein gives the designer the ability to trade area against speed; for example, Skein with 8 or 16 rounds of Threefish presents a good area/speed trade-off for today's applications. In Fig. 18, the area and time delays of each of the SHA-3 designs considered in this work are compared with respect to those of SHA-2.

Hash | Input Size | Output Size | Compression Function Delay | Number of Cycles | Total Delay | Combinational Area | I/O Register Area | Total Area
BMW | | | ns | | ns | 670, µm² | 48, µm² | 719, µm²
Luffa | | | ns | | ns | 473, µm² | 60, µm² | 533, µm²
Skein | | | ns | | ns | 1,605, µm² | 13, µm² | 1,619, µm²
Skein-1c | | | ns | | ns | 64, µm² | 27, µm² | 92, µm²
Shabal | | | ns | | ns | 38, µm² | 49, µm² | 87, µm²
Blake | | | ns | | ns | 191, µm² | 42, µm² | 233, µm²
SHA-2 | | | ns | | ns | 1,587, µm² | 29, µm² | 1,616, µm²
SHA-2-1c | | | ns | | ns | 30, µm² | 27, µm² | 58, µm²

Table 7. ASIC implementation summary of the different compression functions
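The effect of the different block sizes on the ranking can be checked with a few lines of Python. The sketch below uses only the three total delays quoted in this section (9.96 ns, 19.20 ns and 38.72 ns) together with the stated block sizes. It assumes that hashing time is simply the number of compression calls times the per-block delay, ignoring padding, finalization and I/O; the function names are illustrative only.

```python
import math

# (input block size in bits, total delay per compression call in ns),
# taken from the prose of Section 8.
DESIGNS = {
    "Luffa": (256, 9.96),
    "BMW": (512, 19.20),
    "Shabal": (256, 38.72),
}


def hashing_delay_ns(message_bits, block_bits, block_delay_ns):
    """Delay to absorb a message: number of blocks times per-block delay.

    Assumption: padding and finalization are ignored, and at least one
    compression call is always needed.
    """
    blocks = max(1, math.ceil(message_bits / block_bits))
    return blocks * block_delay_ns


def ranking(message_bits):
    """Return design names ordered from fastest to slowest for this message size."""
    delays = {
        name: hashing_delay_ns(message_bits, block_bits, delay)
        for name, (block_bits, delay) in DESIGNS.items()
    }
    return sorted(delays, key=delays.get)


if __name__ == "__main__":
    for bits in (256, 2 ** 20):
        print(bits, ranking(bits))
```

Under these assumptions Luffa leads for a single 256-bit block, while for a message of 2^20 bits BMW needs half as many compression calls and moves slightly ahead, in line with the remark above.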
Fig. 18. Area delay complexities for ASIC implementation

9 FPGA Implementation

For our FPGA implementations we have selected the Stratix III FPGA family from Altera. The exact device used is EP3SL340F1760C3, a 1120 I/O pin FPGA built in 65 nm technology. The Quartus II software package from Altera under a UNIX environment has been used to implement the designs. FPGA implementations have been carried out using the same VHDL files used for the ASIC implementations. The only difference is the addition of serial-in parallel-out (SIPO) and parallel-in serial-out (PISO) modules to transfer the data into and out of the FPGA. With these modules, the restricted number of I/O pins and the I/O speed limits of the FPGA are taken into account, which represents a more practical situation. The details of the SIPO and PISO modules are shown in Fig. 19.

Fig. 19. SIPO and PISO modules used in FPGA implementations

The SIPO module receives the input data as 32-bit words and stores each word at the position selected by the input select lines. In this way the input bits are loaded into the SIPO module 32 bits at a time, and the data can then be transferred to the compression function in parallel. The PISO module uses a similar idea: it loads the data from the compression function in parallel and then transfers them to the output 32 bits at a time. FPGA implementation results are summarized in Table 8. Quartus was not able to fit the fully unrolled versions of Skein and SHA-2 in the selected FPGA, even though the selected device is the largest FPGA available in the Stratix III family.

Hash | C.F. Clk Frequency | C.F. Clk Cycles | I/O Clk Frequency | I/O Clk Cycles | Total Delay | Combinational ALUTs | Dedicated Logic Registers | I/O Pins
BMW | 9.55 MHz | | MHz | | ns | | |
Luffa | MHz | | MHz | | ns | | |
Skein | | | | | | | |
Skein-1c | MHz | | MHz | | ns | | |
Shabal | MHz | | MHz | | ns | | |
Blake | MHz | | MHz | | ns | | |
SHA-2 | | | | | | | |
SHA-2-1c | MHz | | MHz | | ns | | |

Table 8. FPGA implementation summary of the different compression functions

In Table 8, C.F. Clk Frequency is the maximum frequency at which the compression function can be clocked. C.F. Clk Cycles is the number of clock cycles the compression function requires to process the input. I/O Clk Frequency is the maximum speed at which the SIPO and PISO modules can be clocked to move data into and out of the FPGA. The I/O speed is limited by the FPGA itself; in our case the limit is a clock period of 2.5 ns [20]. Total Delay is the compression function delay plus the I/O read and write delay needed to process one block of data. The area usage of the FPGA is shown in two separate columns, Combinational ALUTs and Dedicated Logic Registers. Combinational ALUTs is the number of Adaptive Look-Up Tables (ALUTs) used inside the FPGA and can be taken as a measure of the area used by the combinational logic. Dedicated Logic Registers is the number of memory cells used inside the FPGA and can be taken as a measure of memory usage.

As can be seen from the table, Luffa presents the fastest architecture (61 ns), followed by Shabal (162 ns) and Blue Midnight Wish (184 ns). Unlike in our ASIC implementations, Blue Midnight Wish drops from second place to third and Shabal moves up. This might be the result of the simple, small architecture of Shabal versus the extensive use of multiple-input 32-bit registers in Blue Midnight Wish. Skein-1c and Blake are ranked as the slowest designs, the same way they are ranked in our ASIC implementations.
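The role of the SIPO and PISO modules, and the way Total Delay is composed, can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the VHDL used in this work: words are assumed to arrive most significant first, I/O transfers are assumed not to overlap with the compression itself, and the function names and the 50 MHz compression clock in the usage note are hypothetical.

```python
WORD_BITS = 32  # width of the SIPO/PISO interface, as described in the text
WORD_MASK = (1 << WORD_BITS) - 1


def sipo_load(words):
    """Assemble 32-bit input words into one wide parallel block.

    Assumption: the first word received is the most significant one.
    """
    block = 0
    for w in words:
        block = (block << WORD_BITS) | (w & WORD_MASK)
    return block


def piso_unload(block, block_bits):
    """Split a parallel block into 32-bit words, most significant word first."""
    n = block_bits // WORD_BITS
    return [(block >> (WORD_BITS * (n - 1 - i))) & WORD_MASK for i in range(n)]


def total_delay_ns(in_bits, out_bits, cf_cycles, cf_clk_mhz, io_clk_mhz):
    """Compression function delay plus I/O read and write delay, in ns.

    Assumption: one 32-bit word is moved per I/O clock cycle and the
    I/O phases do not overlap with the compression phase.
    """
    io_cycles = in_bits // WORD_BITS + out_bits // WORD_BITS
    cf_delay = cf_cycles * 1e3 / cf_clk_mhz
    io_delay = io_cycles * 1e3 / io_clk_mhz
    return cf_delay + io_delay
```

As a usage example, a 2.5 ns I/O limit corresponds to a 400 MHz I/O clock, so moving a 256-bit block in and a 256-bit digest out costs 16 I/O cycles, or 40 ns. For a hypothetical single-cycle compression function clocked at 50 MHz, `total_delay_ns(256, 256, 1, 50.0, 400.0)` then evaluates to 60 ns.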
Regarding area usage, Skein-1c requires the smallest area, followed by Shabal and Blake. Note that the area requirements of Skein-1c and Shabal are still larger than that of SHA-2-1c, while their delays are smaller. In Fig. 20, the area and time delays of each of the SHA-3 designs on FPGA are compared with respect to those of SHA-2-1c.

Fig. 20. Area delay complexities for FPGA implementation

10 Conclusions

In this work we have presented hardware implementation results of the compression function block for five SHA-3 candidates (Luffa, Blue Midnight Wish, Skein, Shabal, and Blake). Hardware implementations have been carried out using the 90 nm CMOS technology from STMicroelectronics for ASIC and Stratix III from Altera for FPGA. The compression function blocks implemented create a message digest of size 256 bits. The implementation results allow an easy comparison of the hardware performance of the selected candidates.

Among our fully unrolled ASIC designs of SHA-3 candidates, namely BMW, Luffa, Skein and Blake, Luffa takes the least amount of area and is about three times smaller than SHA-2. For this type of design, Luffa also has the least delay and is about ten times faster than SHA-2. For fully unrolled FPGA designs, Luffa also ranks at the top for area and delay.

Among our sequential ASIC designs of SHA-3 candidates, namely Skein-1c, Blake and Shabal, the latter outperforms the other two in terms of area as well as delay. For the same sequential type of designs on FPGA, Shabal and Skein-1c rank at the top for delay and area, respectively. Each of these three SHA-3 candidates requires more area but less delay than the sequential SHA-2-1c.

For a given SHA-3 candidate, we are likely to see an improved throughput per unit area in the sequential version of a hardware implementation over the fully unrolled version. For example, the throughput per unit area for fully unrolled Skein is $\frac{\text{Input Size}}{\text{Delay} \times \text{Area}}$ = 1, bits/(sec·µm²), while that for Skein-1c is 10, bits/(sec·µm²). Other hash algorithms, such as Blake and Shabal, are also likely to show such improved utilization of hardware in their sequential designs.

Acknowledgments

This work was supported in part by an NSERC strategic project grant awarded to Dr. Hasan. The authors would also like to thank Olga Nam for her assistance with debugging and testing of the VHDL codes.
References

1. Menezes, A.J., Van Oorschot, P.C., Vanstone, S.A.: Handbook of Applied Cryptography. The CRC Press Series on Discrete Mathematics and Its Applications.
2. National Institute of Standards and Technology (NIST): Cryptographic Hash Algorithm Competition. Website.
3. The SHA-3 Zoo - The ECRYPT hash function website. Website: SHA-3 Zoo.
4. European Network of Excellence for Cryptology II (ECRYPT II): ECRYPT Benchmarking of All Submitted Hashes (EBASH).
5. Fleischmann, E., Forler, C., Gorski, M.: Classification of the SHA-3 Candidates. International Association for Cryptologic Research (IACR) ePrint archive.
6. Wang, X., Yao, A., Yao, F.: New Collision Search for SHA-1. Crypto 2005 rump session.
7. Gligoroski, D., Klima, V., Knapskog, S.J., El-Hadedy, M., Amundsen, J., Mjølsnes, S.T.: Cryptographic Hash Function BLUE MIDNIGHT WISH. Submission to NIST.
8. Canniere, C.D., Sato, H., Watanabe, D.: Hash Function Luffa: Supporting Document. Submission to NIST.
9. Ferguson, N., Lucks, S., Schneier, B., Whiting, D., Bellare, M., Kohno, T., Callas, J., Walker, J.: The Skein Hash Function Family. Submission to NIST.
10. Bresson, E., Canteaut, A., Chevallier-Mames, B., Clavier, C., Fuhr, T., Gouget, A., Icart, T., Misarsky, J.F., Naya-Plasencia, M., Paillier, P., Pornin, T., Reinhard, J.R., Thuillet, C., Videau, M.: Shabal, a Submission to NIST's Cryptographic Hash Algorithm Competition. Submission to NIST.
11. Aumasson, J.P., Henzen, L., Meier, W., Phan, R.C.-W.: SHA-3 proposal BLAKE. Submission to NIST.
12. Joux, A.: Multicollisions in iterated hash functions. Application to cascaded constructions. In: Proceedings of CRYPTO. LNCS, vol. 3152.
13. Lucks, S.: A failure-friendly design principle for hash functions. In: ASIACRYPT.
14. Liskov, M., Rivest, R., Wagner, D.: Tweakable Block Ciphers. In: Advances in Cryptology - CRYPTO 2002 Proceedings, Springer-Verlag, pp. 31-46.
15. Matyas, S.M., Meyer, C.H., Oseas, J.: Generating strong one-way functions with cryptographic algorithms. IBM Technical Disclosure Bulletin, vol. 27, no. 10A.
16. Biham, E., Dunkelman, O.: A framework for iterative hash functions - HAIFA. ePrint report 2007/278.
17. Bernstein, D.J.: ChaCha, a variant of Salsa20. Web.
18. Aumasson, J.P., Meier, W., Phan, R.C.-W.: The hash function family LAKE. In: 15th International Workshop on Fast Software Encryption, Lecture Notes in Computer Science.
19. Lucks, S.: A failure-friendly design principle for hash functions. In: ASIACRYPT.
20. ALTERA: Stratix III Device Handbook, Volume 1, May.
21. Federal Information Processing Standards Publication (FIPS PUB) 180-2: Secure Hash Standard (SHS), pp. 1-79, Aug.
A Appendix - SHA-2 Hash Function

A.1 Algorithm Description

The basic data block used in SHA-2 (SHA-256) is a word, which is 32 bits long. SHA-2 makes use of bit-wise logical word XOR and AND operations along with word addition (modulo $2^{32}$), rotate right and shift right operations in its structure. SHA-2's compression function is made of 64 consecutive round functions. The structure of a round function is shown in Fig. 21.

Fig. 21. Block diagram of the round function for SHA-2

The inputs to each round are eight working variables $(a, b, c, d, e, f, g, h)$ of 32 bits each, along with the two external variables $K_t$ and $W_t$. The working variables $(a, b, c, d, e, f, g, h)$ are initialized at the start of the first round calculation with the previous hash value as follows:
\[
\begin{aligned}
a &= H_0^{(i-1)}, & b &= H_1^{(i-1)}, & c &= H_2^{(i-1)}, & d &= H_3^{(i-1)}, \\
e &= H_4^{(i-1)}, & f &= H_5^{(i-1)}, & g &= H_6^{(i-1)}, & h &= H_7^{(i-1)}
\end{aligned}
\tag{13}
\]

The $K_t$ variables are 32-bit constants and the $W_t$ variables are created through the message schedule process as follows:

\[
W_t =
\begin{cases}
M_t & 0 \le t \le 15 \\
\sigma_1(W_{t-2}) + W_{t-7} + \sigma_0(W_{t-15}) + W_{t-16} & 16 \le t \le 63
\end{cases}
\tag{14}
\]

The logical functions $\sigma_0$ and $\sigma_1$, along with the functions used inside the round function ($Ch$, $Maj$, $\Sigma_0$, $\Sigma_1$), are defined as follows:

\[
\begin{aligned}
\sigma_0(x) &= ROTR^{7}(x) \oplus ROTR^{18}(x) \oplus SHR^{3}(x) \\
\sigma_1(x) &= ROTR^{17}(x) \oplus ROTR^{19}(x) \oplus SHR^{10}(x) \\
\Sigma_0(x) &= ROTR^{2}(x) \oplus ROTR^{13}(x) \oplus ROTR^{22}(x) \\
\Sigma_1(x) &= ROTR^{6}(x) \oplus ROTR^{11}(x) \oplus ROTR^{25}(x) \\
Ch(x, y, z) &= (x \wedge y) \oplus (\neg x \wedge z) \\
Maj(x, y, z) &= (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z)
\end{aligned}
\tag{15}
\]

In (15), $ROTR$ represents a rotate right operation and $SHR$ represents a shift right operation. Also, the symbol $\wedge$ represents a logical AND operation, $\oplus$ represents a logical XOR operation and $\neg$ represents a logical complement operation. After sixty-four rounds of calculations, the output hash is created by the addition (modulo $2^{32}$) of the last set of working variables $(a, b, c, d, e, f, g, h)$ to the previous hash values as follows:
\[
\begin{aligned}
H_0^{(i)} &= a + H_0^{(i-1)}, & H_1^{(i)} &= b + H_1^{(i-1)}, & H_2^{(i)} &= c + H_2^{(i-1)}, & H_3^{(i)} &= d + H_3^{(i-1)}, \\
H_4^{(i)} &= e + H_4^{(i-1)}, & H_5^{(i)} &= f + H_5^{(i-1)}, & H_6^{(i)} &= g + H_6^{(i-1)}, & H_7^{(i)} &= h + H_7^{(i-1)}
\end{aligned}
\tag{16}
\]

A.2 ASIC Implementation

Fully Unrolled/Parallel Design

The architecture used for the ASIC implementation of SHA-2 is shown in Fig. 22. The previous hash value $H^{(i-1)}$ is stored inside the 256-bit input registers (In Reg). All the round functions have been realized completely as combinational circuits. After the sixty-four rounds, one level of adders (eight adders in parallel) creates the output hash value according to Eq. (16). The output hash values $H^{(i)}$ can be read from the output register (Out Reg) after one clock cycle.

Fig. 22. Hardware architecture used for the compression function of SHA-2

Synthesis results of the SHA-2 compression function are summarized in Table 9. In this table, the Combinational Area is mainly the area used by the 32-bit adders present in the architecture. A small part of the area is used by the sixty-four instances of the $\Sigma_0$, $\Sigma_1$, $Ch$, and $Maj$ logical functions.
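The logical functions of Eq. (15) and a single round of the kind shown in Fig. 21 translate almost directly into code. The sketch below is a software model only, not the VHDL that was synthesized; the round update follows the standard SHA-256 definition in FIPS 180-2 [21], and the Python names are illustrative choices rather than identifiers from the design.

```python
MASK32 = 0xFFFFFFFF  # all words are 32 bits; additions are taken modulo 2**32


def rotr(x, n):
    """Rotate the 32-bit word x right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def big_sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(x, y, z):
    """Choose: each bit of x selects the corresponding bit of y (1) or z (0)."""
    return (x & y) ^ (~x & z & MASK32)


def maj(x, y, z):
    """Majority: each output bit is the majority of the three input bits."""
    return (x & y) ^ (x & z) ^ (y & z)


def sha256_round(state, k_t, w_t):
    """Apply one SHA-256 round to the working variables (a, ..., h)."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + big_sigma1(e) + ch(e, f, g) + k_t + w_t) & MASK32
    t2 = (big_sigma0(a) + maj(a, b, c)) & MASK32
    return ((t1 + t2) & MASK32, a, b, c, (d + t1) & MASK32, e, f, g)
```

In the fully unrolled design, sixty-four copies of such a round are chained combinationally, which is why the five additions per round dominate the combinational area while the XOR/AND networks of the logical functions stay comparatively small.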
Input Size | Compression Function Delay | Combinational Area | I/O Register Area | Number of Gates
 | ns | 1,587, µm² | 29, µm² | 168,193

Table 9. ASIC implementation summary of the SHA-2 compression function

Sequential Design

Similar to Skein, the SHA-2 compression function architecture presents a regular design, as shown in Fig. 22. This property can be exploited to build a complete SHA-2 compression function by implementing just one of the sixty-four rounds and reusing it in successive clock cycles. We have implemented a second version of the SHA-2 algorithm (SHA-2-1c) using just one round, which is suitable for area-constrained applications. The hardware architecture used for this purpose is shown in Fig. 23.

Fig. 23. Hardware architecture used for the compression function of SHA-2-1c

Note that some modifications have been made to the original round function of SHA-2 to make the design capable of creating the message digest. The main modification is the addition of the message schedule module, which is in charge of generating the $W_t$ parameters from the input message blocks according to Eq. (14). This has to be done through a recursive process in order to keep the delay/area requirements of the design small. The hardware architecture used for this purpose is shown in Fig. 24. The message schedule module operates as follows. Initially all input message blocks $(M_0, \ldots, M_{15})$ are loaded into the shift register. The shift register itself is made of sixteen registers, each made of 32 flip-flops working in parallel. An addition module along with two other modules ($\sigma_0$, $\sigma_1$) creates the $W_t$ parameters for $t \ge 16$ and loads them back into the circular shift register.
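The recursive message schedule described above can be modeled with a sixteen-word register that shifts by one word per cycle. The following Python sketch is a behavioral model under our own naming, not the VHDL module of Fig. 24; it assumes that the word at the head of the register is the $W_t$ consumed by the round in the current cycle.

```python
MASK32 = 0xFFFFFFFF


def rotr(x, n):
    """Rotate the 32-bit word x right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def message_schedule(block_words):
    """Yield W_0 .. W_63 from sixteen 32-bit message words M_0 .. M_15.

    Only sixteen words are stored at any time: reg[0] holds W_t and
    reg[15] holds W_{t+15}, mirroring the sixteen 32-bit registers.
    """
    reg = list(block_words)
    if len(reg) != 16:
        raise ValueError("expected sixteen 32-bit words")
    for _ in range(64):
        yield reg[0]
        # Eq. (14) with t replaced by t + 16:
        # W_{t+16} = sigma1(W_{t+14}) + W_{t+9} + sigma0(W_{t+1}) + W_t
        new_word = (small_sigma1(reg[14]) + reg[9]
                    + small_sigma0(reg[1]) + reg[0]) & MASK32
        reg = reg[1:] + [new_word]  # shift by one word, feed the new word back
```

Compared with storing all sixty-four $W_t$ values, this keeps the storage at 16 x 32 flip-flops and needs only one three-operand addition chain per cycle, which is the point of the recursive formulation.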
Fig. 24. Hardware architecture of the message schedule module inside the compression function of SHA-2-1c

In addition, the design contains a multiplexer, a counter and a constant module in charge of generating the constants for each round, as shown in Fig. 23. Synthesis results for the SHA-2-1c compression function are summarized in Table 10.

Input Size | Critical Path Delay | Number of Cycles | Combinational Area | I/O Register Area | Number of Gates
 | ns | 64 | 30, µm² | 27, µm² | 4,207

Table 10. ASIC implementation summary of the SHA-2-1c compression function

From Tables 9 and 10 the following can be deduced: the area requirement for the SHA-2 compression function has been reduced from 1,616, µm² in SHA-2 to 58, µm² in SHA-2-1c. This comes at the price of increasing the delay from 105.1 ns in SHA-2 to ns in SHA-2-1c.
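The one-round-per-cycle organization of SHA-2-1c can be sanity-checked with a short software model: a loop variable stands in for the counter, a table lookup for the constant module, and a sixteen-word list for the message schedule register. This is a sketch under our own naming, not the synthesized VHDL. The constants are derived as specified in FIPS 180-2 [21], from the fractional parts of the cube and square roots of the first primes, and the result is compared against Python's `hashlib` for a single-block message.

```python
import hashlib
import math

MASK32 = 0xFFFFFFFF


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK32


def icbrt(n):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = [p for p in range(2, 312) if all(p % q for q in range(2, p))][:64]
K = [icbrt(p << 96) & MASK32 for p in PRIMES]            # round constants K_t
H0 = [math.isqrt(p << 64) & MASK32 for p in PRIMES[:8]]  # initial hash value


def compress_1c(h_prev, block):
    """One SHA-256 compression: a single round reused for 64 'clock cycles'."""
    reg = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    a, b, c, d, e, f, g, h = h_prev
    for t in range(64):              # t models the round counter
        w_t, k_t = reg[0], K[t]      # schedule register head, constant module
        s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        t1 = (h + s1 + ((e & f) ^ (~e & g)) + k_t + w_t) & MASK32
        t2 = (s0 + ((a & b) ^ (a & c) ^ (b & c))) & MASK32
        a, b, c, d, e, f, g, h = ((t1 + t2) & MASK32, a, b, c,
                                  (d + t1) & MASK32, e, f, g)
        sig0 = rotr(reg[1], 7) ^ rotr(reg[1], 18) ^ (reg[1] >> 3)
        sig1 = rotr(reg[14], 17) ^ rotr(reg[14], 19) ^ (reg[14] >> 10)
        reg = reg[1:] + [(sig1 + reg[9] + sig0 + reg[0]) & MASK32]
    return [(x + y) & MASK32 for x, y in zip(h_prev, (a, b, c, d, e, f, g, h))]


def single_block(msg):
    """Pad a message shorter than 56 bytes into one 512-bit block."""
    if len(msg) >= 56:
        raise ValueError("message does not fit in a single block")
    return (msg + b"\x80" + b"\x00" * (55 - len(msg))
            + (8 * len(msg)).to_bytes(8, "big"))


def sha256_single_block(msg):
    words = compress_1c(H0, single_block(msg))
    return b"".join(w.to_bytes(4, "big") for w in words)


if __name__ == "__main__":
    assert sha256_single_block(b"abc") == hashlib.sha256(b"abc").digest()
```

The loop body corresponds to the logic that is active in one clock cycle, so the 64 iterations match the 64 cycles listed in Table 10; only the registers for the working variables and the sixteen schedule words persist between iterations.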
International Journal of Electronics and Computer Science Engineering 1482
International Journal of Electronics and Computer Science Engineering 1482 Available Online at www.ijecse.org ISSN- 2277-1956 Behavioral Analysis of Different ALU Architectures G.V.V.S.R.Krishna Assistant
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Print): 2279-0020 ISSN (Online): 2279-0039 International
Counters & Shift Registers Chapter 8 of R.P Jain
Chapter 3 Counters & Shift Registers Chapter 8 of R.P Jain Counters & Shift Registers Counters, Syllabus Design of Modulo-N ripple counter, Up-Down counter, design of synchronous counters with and without
Automata Designs for Data Encryption with AES using the Micron Automata Processor
IJCSNS International Journal of Computer Science and Network Security, VOL.15 No.7, July 2015 1 Automata Designs for Data Encryption with AES using the Micron Automata Processor Angkul Kongmunvattana School
Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems
Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College [email protected] Based on EE271 developed by Mark Horowitz, Stanford University MAH
Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001
Agenda Introduzione Il mercato Dal circuito integrato al System on a Chip (SoC) La progettazione di un SoC La tecnologia Una fabbrica di circuiti integrati 28 How to handle complexity G The engineering
Hardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner
Hardware Implementations of RSA Using Fast Montgomery Multiplications ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner Overview Introduction Functional Specifications Implemented Design and Optimizations
RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition
RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition A Tutorial Approach James O. Hamblen Georgia Institute of Technology Michael D. Furman Georgia Institute of Technology KLUWER ACADEMIC PUBLISHERS Boston
Parallel AES Encryption with Modified Mix-columns For Many Core Processor Arrays M.S.Arun, V.Saminathan
Parallel AES Encryption with Modified Mix-columns For Many Core Processor Arrays M.S.Arun, V.Saminathan Abstract AES is an encryption algorithm which can be easily implemented on fine grain many core systems.
How To Design A Chip Layout
Spezielle Anwendungen des VLSI Entwurfs Applied VLSI design (IEF170) Course and contest Intermediate meeting 3 Prof. Dirk Timmermann, Claas Cornelius, Hagen Sämrow, Andreas Tockhorn, Philipp Gorski, Martin
Recommendation for Applications Using Approved Hash Algorithms
NIST Special Publication 800-107 Recommendation for Applications Using Approved Hash Algorithms Quynh Dang Computer Security Division Information Technology Laboratory C O M P U T E R S E C U R I T Y February
On the Influence of the Algebraic Degree of the Algebraic Degree of
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 59, NO. 1, JANUARY 2013 691 On the Influence of the Algebraic Degree of the Algebraic Degree of Christina Boura and Anne Canteaut on Abstract We present a
A NOVEL STRATEGY TO PROVIDE SECURE CHANNEL OVER WIRELESS TO WIRE COMMUNICATION
A NOVEL STRATEGY TO PROVIDE SECURE CHANNEL OVER WIRELESS TO WIRE COMMUNICATION Prof. Dr. Alaa Hussain Al- Hamami, Amman Arab University for Graduate Studies [email protected] Dr. Mohammad Alaa Al-
Modeling Registers and Counters
Lab Workbook Introduction When several flip-flops are grouped together, with a common clock, to hold related information the resulting circuit is called a register. Just like flip-flops, registers may
Security Analysis of DRBG Using HMAC in NIST SP 800-90
Security Analysis of DRBG Using MAC in NIST SP 800-90 Shoichi irose Graduate School of Engineering, University of Fukui hrs [email protected] Abstract. MAC DRBG is a deterministic random bit generator
Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah
(DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de [email protected] NIOS II 1 1 What is Nios II? Altera s Second Generation
A New 128-bit Key Stream Cipher LEX
A New 128-it Key Stream Cipher LEX Alex Biryukov Katholieke Universiteit Leuven, Dept. ESAT/SCD-COSIC, Kasteelpark Arenerg 10, B 3001 Heverlee, Belgium http://www.esat.kuleuven.ac.e/~airyuko/ Astract.
Computer Security: Principles and Practice
Computer Security: Principles and Practice Chapter 20 Public-Key Cryptography and Message Authentication First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Public-Key Cryptography
Cryptography and Network Security Block Cipher
Cryptography and Network Security Block Cipher Xiang-Yang Li Modern Private Key Ciphers Stream ciphers The most famous: Vernam cipher Invented by Vernam, ( AT&T, in 1917) Process the message bit by bit
lundi 1 octobre 2012 In a set of N elements, by picking at random N elements, we have with high probability a collision two elements are equal
Symmetric Crypto Pierre-Alain Fouque Birthday Paradox In a set of N elements, by picking at random N elements, we have with high probability a collision two elements are equal N=365, about 23 people are
Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: [email protected]. Sequential Circuit
Modeling Sequential Elements with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: [email protected] 4-1 Sequential Circuit Outputs are functions of inputs and present states of storage elements
AsicBoost A Speedup for Bitcoin Mining
AsicBoost A Speedup for Bitcoin Mining Dr. Timo Hanke March 31, 2016 (rev. 5) Abstract. AsicBoost is a method to speed up Bitcoin mining by a factor of approximately 20%. The performance gain is achieved
Hash Functions. Integrity checks
Hash Functions EJ Jung slide 1 Integrity checks Integrity vs. Confidentiality! Integrity: attacker cannot tamper with message! Encryption may not guarantee integrity! Intuition: attacker may able to modify
FPGA Implementation of IP Packet Segmentation and Reassembly in Internet Router*
SERBIAN JOURNAL OF ELECTRICAL ENGINEERING Vol. 6, No. 3, December 2009, 399-407 UDK: 004.738.5.057.4 FPGA Implementation of IP Packet Segmentation and Reassembly in Internet Router* Marko Carević 1,a,
Latch Timing Parameters. Flip-flop Timing Parameters. Typical Clock System. Clocking Overhead
Clock - key to synchronous systems Topic 7 Clocking Strategies in VLSI Systems Peter Cheung Department of Electrical & Electronic Engineering Imperial College London Clocks help the design of FSM where
