A New Algorithm for Carry-Free Addition of Binary Signed-Digit Numbers


2014 IEEE 22nd International Symposium on Field-Programmable Custom Computing Machines

Klaus Schneider and Adrian Willenbücher
Embedded Systems Group, University of Kaiserslautern, Kaiserslautern, Germany
{schneider, willenbuecher}@cs.uni-kl.de

Abstract

Signed-digit (SD) numbers generalize traditional radix numbers by allowing negative digits within a certain range. Typically, this leads to redundant number representations that can be used to avoid the carry propagation problem of addition of radix numbers. Unfortunately, as proved by Avizienis, the standard algorithm for carry-free addition of SD numbers does not work for the binary case. In this paper, we therefore construct a special algorithm for the carry-free addition and subtraction of binary SD numbers, i.e., addition and subtraction of $n$-digit numbers are performed with circuits of depth $O(1)$ and size $O(n)$. This is possible by computing, in addition to the transfer digits used by the standard algorithm, one additional bit that allows us to distinguish relevant cases to avoid propagation of dependencies. The additional bit and the transfer digit used to compute the sum digit at position $i$ depend only on the summands' digits at positions $i$ and $i-1$, so that all sum digits can be computed with a hardware circuit of a depth that is independent of the number of digits. We first explain the basics of the standard addition algorithm to derive the additional information needed to fix the algorithm for the binary case. After proving the correctness of our algorithm, we present experimental results that show that our implementation clearly outperforms two's complement addition even for small numbers, and saves 50% of the required chip area compared to other carry-free implementations.

I. INTRODUCTION

Although there are many other number systems, simple radix numbers to a base $B > 0$ are still popular in computer arithmetic.
An $n$-digit radix-$B$ number is thereby given as a sequence of digits $[x_{n-1},\ldots,x_0]$ with $x_i \in \{0,\ldots,B-1\}$ that denotes the following natural number:

$$[x_{n-1},\ldots,x_0]_B := \sum_{i=0}^{n-1} x_i \cdot B^i$$

It is well known that the addition of radix-$B$ numbers suffers inherently from carry propagation: In the worst case, a carry is generated when adding the least significant digits $x_0$ and $y_0$, and is then propagated from the rightmost digits $x_0, y_0$ to the leftmost digits $x_{n-1}, y_{n-1}$. As a consequence, simple carry-ripple adders have depth $O(n)$. (The depth of a circuit is the length of the longest path from inputs to outputs. Circuits with a depth depending on the number of digits $n$ limit the clock speed of synchronous circuits in terms of $n$.) Even though this can be reduced to a depth of $O(\log(n))$, e.g., by carry-lookahead adders [1], the depth still grows with the number of digits.

For radix-$B$ numbers, it is not difficult to see that addition, subtraction, multiplication, division and comparison operations of $n$-digit numbers all require a depth of at least $O(\log(n))$ since the digits of the results depend on all digits of the operands. For all basic operations, optimal $O(\log(n))$ algorithms are known, even though these sometimes require substantial mathematical effort [2]–[4].

Since this minimal $O(\log(n))$ depth cannot be improved for radix-$B$ numbers, one has to consider non-conventional number systems for improvements. For example, residue number systems (RNS) [5], [6] encode a number $x$ by its moduli $(x_1,\ldots,x_n) := ((x \bmod p_1),\ldots,(x \bmod p_n))$ that are unique for numbers $x \in \{0,\ldots,(\prod_{i=1}^{n} p_i) - 1\}$ for relatively prime numbers $p_i$. Addition, subtraction, and multiplication can be done in parallel on the moduli, and thus, with a depth $O(1)$. Division can only be done by iterative methods like Newton-Raphson or Goldschmidt iteration, which lead again to a depth of $O(\log(n))$.
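To make the position-wise nature of RNS arithmetic concrete, here is a minimal Python sketch. The moduli (3, 5, 7) and the helper names `to_rns`, `rns_add`, `rns_mul` and `from_rns` are assumptions chosen for illustration; they do not come from the paper.

```python
from math import prod

MODULI = (3, 5, 7)  # hypothetical pairwise coprime moduli; unique range is 0..104


def to_rns(x):
    """Encode x by its residues modulo each p_i."""
    return tuple(x % p for p in MODULI)


def rns_add(a, b):
    """Position-wise addition: no information flows between positions."""
    return tuple((ai + bi) % p for ai, bi, p in zip(a, b, MODULI))


def rns_mul(a, b):
    """Position-wise multiplication, again without any carries."""
    return tuple((ai * bi) % p for ai, bi, p in zip(a, b, MODULI))


def from_rns(r):
    """Decode by brute-force search (a stand-in for the costly back-conversion)."""
    for x in range(prod(MODULI)):
        if to_rns(x) == r:
            return x
    raise ValueError("not a valid residue tuple")
```

Each output position depends on exactly one position of each operand, which is why the depth does not grow with the number of moduli. The brute-force decoder is only a placeholder for a proper conversion.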
The main problems for RNS numbers are, however, that comparison ($<$) is not possible and that conversions to and from radix numbers are relatively expensive.

An alternative to RNS numbers are signed-digit (SD) numbers [3], [7]–[12] that allow negative digits of a range $\{-D,\ldots,+D\}$ with $D < B$ for radix-$B$ numbers. Due to the redundant number representation, addition and subtraction can be implemented with a depth of $O(1)$, i.e., independent of the number of digits, while multiplication, division, and comparison can still be implemented with a depth of $O(\log(n))$. The key to carry-free addition is thereby to switch to another representation of the sum in case carries would have to be generated (see Section II-A). However, the standard algorithm for addition and subtraction of SD numbers [7] does not work for the important base $B = 2$, as we will also explain in Section II-A. For this reason, Parhami [8] and others [13] suggested recoding the given input numbers so that the later addition and subtraction of binary numbers becomes carry-free.

In this paper, we prove that the standard algorithm of Avizienis can be refined to correctly handle binary SD numbers. Avizienis' algorithm computes for two digits $x_i$ and $y_i$ a transfer digit $t_{i+1} \in \{-1, 0, +1\}$ and an interim sum $w_i$

such that the sum digit $s_i$ can be computed as $s_i = t_i + w_i$. Our algorithm computes an additional condition $l_i$ that stores some important information to define the transfer and sum digits. Our transfer digits $t_i$ depend on the operand digits $x_{i-1}, y_{i-1}, x_{i-2}, y_{i-2}$, and the additional condition $l_i$ depends on $x_{i-1}, y_{i-1}$ only, so that our algorithm still has depth $O(1)$.

We implemented our algorithm on FPGAs and compared its speed and area requirements with previous approaches to SD addition and also with a carry-lookahead adder. It turned out that our algorithm is faster than a hybrid carry-lookahead/carry-ripple adder for more than 24 bits on our hardware platform, and requires just about 50% of the chip area of other SD addition circuits.

Our paper is organized as follows: In Section II, we discuss Avizienis' algorithm for adding SD numbers. In Section III, we first analyze why that algorithm does not work for the case of binary numbers, and then develop a solution for this problem in Section III-B. To demonstrate the efficiency of our algorithm, we present experimental results in Section IV.

II. PREVIOUS WORK

In this section, we review known results about signed-digit numbers. To this end, we provide new proofs that allow us to discuss in the next section where the difficulties in defining a carry-free addition for binary SD numbers come from.

A. Signed-Digit Numbers

Avizienis introduced in [7] the following SD numbers to a radix $B > 1$ and a digit set $\{-D,\ldots,+D\}$:

Definition 1: Given some number $D$ and a radix $B > 1$, a sequence $[x_{n-1},\ldots,x_0]$ of digits $x_i \in \{-D,\ldots,+D\}$ encodes the following integer:

$$[x_{n-1},\ldots,x_0]_{D,B} := \sum_{i=0}^{n-1} x_i \cdot B^i$$

There may be several SD representations of the same number. For example, for $B = 3$ and $D = 2$, the value 5 can be encoded as $[2,-1]$, $[1,2]$ or $[1,-1,-1]$.
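The redundancy in Definition 1 can be explored with a short Python sketch. The function names `sd_value` and `representations` are hypothetical, and the brute-force enumeration is only meant for tiny digit counts.

```python
from itertools import product


def sd_value(digits, B):
    """Value of [x_{n-1}, ..., x_0] (most significant digit first) to radix B."""
    value = 0
    for d in digits:
        value = value * B + d
    return value


def representations(value, n, D, B):
    """All n-digit SD encodings of `value` with digits in {-D, ..., +D}."""
    return [ds for ds in product(range(-D, D + 1), repeat=n)
            if sd_value(ds, B) == value]
```

Calling `representations(5, 3, 2, 3)` lists every three-digit encoding of 5 for $B = 3$ and $D = 2$, including the encodings named in the example above (padded with a leading zero where necessary).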
To understand the different redundant representations of a number, we list the following well-known theorem without proof:

Theorem 1 (Uniqueness of Division with Remainder): For all integers $x, y \in \mathbb{Z}$ with $y \neq 0$, there are uniquely defined numbers $q, r \in \mathbb{Z}$ with $x = q \cdot y + r$ and $0 \le r < |y|$. We therefore write $q := (x \;\mathrm{div}\; y)$ and $r := (x \bmod y)$.

By the above theorem, we conclude the following result:

Lemma 1 (SD Number Representations): $x = [x_{n-1},\ldots,x_0]_{D,B} = [x'_{n-1},\ldots,x'_0]_{D,B}$ implies $x'_0 = x_0 + k \cdot B$ for some $k \in \mathbb{Z}$.

Proof: Using $y_1 := [x_{n-1},\ldots,x_1]_{D,B}$ and $y'_1 := [x'_{n-1},\ldots,x'_1]_{D,B}$, we obviously have $x = y_1 \cdot B + x_0 = y'_1 \cdot B + x'_0$, and therefore $x'_0 - x_0 = (y_1 - y'_1) \cdot B$ holds. Hence, $x'_0 - x_0$ is a multiple of $B$, so that the proposition holds with $k := y_1 - y'_1$.

Due to the redundant representations of a number, it is not possible to reduce equality testing to checking the equality of the corresponding digits. However, due to the (constant depth) reduction $x = y \Leftrightarrow x - y = 0$, checking equality can be reduced to checking whether the result is zero. This is possible with depth $O(\log(n))$ if zero has a unique representation (i.e., all digits being zero). To be able to check equality of SD numbers, Avizienis therefore imposed that $D < B$ must hold because of the following result:

Theorem 2 (Unique Representation of Zero): The number 0 has a unique representation as SD number $[x_{n-1},\ldots,x_0]_{D,B}$ if and only if $D < B$ holds.

Proof: For any $n$, we have $[x_{n-1},\ldots,x_0]_{D,B} = 0$ for $x_i = 0$. For any other representation $[x'_{n-1},\ldots,x'_0]_{D,B} = 0$ with $x'_0 \neq x_0$, we would have $x'_0 - x_0 = x'_0 = k \cdot B$ with $k \neq 0$ by the previous lemma. However, this is impossible iff $x'_0 \in \{-D,\ldots,+D\} \subseteq \{-(B-1),\ldots,B-1\}$ holds. Hence, we see that $x_0 = 0$ is uniquely determined for $x = 0$ if and only if $D < B$ holds. Then, we have $[x_{n-1},\ldots,x_1]_{D,B} = 0$, and the same argument applies to the next digit $x_1$, and so on.

For example, we have $[1,-B]_{D,B} = [-1,B]_{D,B} = 0$ if we would allow $D = B$.
Hence, we always assume $D < B$ in the following to ensure the unique representation of 0. This uniqueness result can be generalized to other least significant digits $x_0$: Assume first that $B \le 2D$ (and thus $B - D \le D$) holds, so that we can partition the legal digits $\{-D,\ldots,+D\}$ into the following intervals:

$$\underbrace{-D,\ldots,D-B}_{\mathcal{D}_{-1}},\quad \underbrace{D-B+1,\ldots,B-D-1}_{\mathcal{D}_{0}},\quad \underbrace{B-D,\ldots,D}_{\mathcal{D}_{+1}}$$

By Lemma 1, the digits $\mathcal{D}_{-1} := \{-D,\ldots,D-B\}$ and $\mathcal{D}_{+1} := \{B-D,\ldots,D\}$ can be mapped to each other by either adding or subtracting $B$, while for the digits $\mathcal{D}_0 := \{D-B+1,\ldots,B-D-1\}$ no legal digits are obtained this way. Thus, digits in $\mathcal{D}_0$ are uniquely determined, while digits in either $\mathcal{D}_{-1}$ or $\mathcal{D}_{+1}$ have exactly one alternative. Choosing the alternative, we have to either increment or decrement the next digit $x_{i+1}$, and then the same discussion can be repeated for $x_{i+1}$.

However, if $B > 2D$ holds, then there are no alternatives left for the digits (since $x_0 - B \le D - B < D - 2D = -D$). Hence, to ensure redundancy, we have to impose as second constraint $B \le 2D$ (in addition to $D < B$) to obtain the following result:

Lemma 2 (Redundancy of SD Representations): For any SD number $x = [x_{n-1},\ldots,x_0]_{D,B}$ with $D < B \le 2D$, the following holds:

- If $(x \bmod B) \in \{0,\ldots,B-D-1\}$, then $x_0$ is uniquely defined as $x_0 := (x \bmod B)$.
- If $(x \bmod B) \in \{B-D,\ldots,D\}$, then either $x_0 = (x \bmod B)$ or $x_0 = (x \bmod B) - B$ holds, thus there are exactly two solutions for $x_0$.
- If $(x \bmod B) \in \{D+1,\ldots,B-1\}$, then $x_0$ is uniquely defined as $x_0 := (x \bmod B) - B$.

Table I. POSSIBLE DECOMPOSITIONS $u_i = x_i + y_i = t_{i+1} \cdot B + w_i$ WITH $x_i, y_i, w_i \in \{-D,\ldots,+D\}$ ASSUMING $D < B \le 2D$.

Range of $u_i$ and possible decompositions $u_i = t_{i+1} \cdot B + w_i$ with $w_i \in \{-D,\ldots,+D\}$:

- $u_i \in \{-2D,\ldots,-D-1\}$: $(t_{i+1}, w_i) = (-1, B + u_i)$ with $w_i \in \{B-2D,\ldots,B-D-1\} \subseteq \{-D+1,\ldots,D-1\}$
- $u_i \in \{-D,\ldots,-B+D\}$: $(t_{i+1}, w_i) = (-1, B + u_i)$ with $w_i \in \{B-D,\ldots,D\} \subseteq \{-D+1,\ldots,D\}$, or $(t_{i+1}, w_i) = (0, u_i)$ with $w_i \in \{-D,\ldots,-B+D\} \subseteq \{-D,\ldots,D-1\}$
- $u_i \in \{-(B-D-1),\ldots,B-D-1\}$: $(t_{i+1}, w_i) = (0, u_i)$ with $w_i \in \{-(B-D-1),\ldots,B-D-1\} \subseteq \{-D+1,\ldots,D-1\}$
- $u_i \in \{B-D,\ldots,D\}$: $(t_{i+1}, w_i) = (0, u_i)$ with $w_i \in \{B-D,\ldots,D\} \subseteq \{-D+1,\ldots,D\}$, or $(t_{i+1}, w_i) = (+1, -B + u_i)$ with $w_i \in \{-D,\ldots,-B+D\} \subseteq \{-D,\ldots,D-1\}$
- $u_i \in \{D+1,\ldots,2D\}$: $(t_{i+1}, w_i) = (+1, -B + u_i)$ with $w_i \in \{D-B+1,\ldots,2D-B\} \subseteq \{-D+1,\ldots,D-1\}$

The constraint $D < B$ is added to ensure the unique representation of zero (to ensure that we can check equality of SD numbers), while the second constraint $B \le 2D$ is added to ensure a minimal redundancy that can be exploited for a carry-free addition as explained below. Note that Avizienis imposed a stronger second constraint $B < 2D$ that then excludes the case $B = 2$. We will see in the following discussion why he did so and why we will not be that strict.

The above lemma is the key to constructing a carry-free addition algorithm: If two SD numbers $[x_{n-1},\ldots,x_0]_{D,B}$ and $[y_{n-1},\ldots,y_0]_{D,B}$ have to be added, we may first consider the expression $[u_{n-1},\ldots,u_0]_{D,B}$ with $u_i := x_i + y_i$. Since each $x_i$ and each $y_i$ is a legal digit, we have $-2D \le u_i \le 2D$. According to Avizienis, each $u_i$ is decomposed into an outgoing transfer digit $t_{i+1}$ and an interim sum digit $w_i$ so that $x_i + y_i = u_i = t_{i+1} \cdot B + w_i$ holds. Due to $-2B < -2D \le x_i + y_i \le 2D < 2B$, it follows that $t_{i+1} \in \{-1, 0, +1\}$ holds for all such decompositions.
Note that a particular choice $t_{i+1} \in \{-1, 0, +1\}$ determines the range of $u_i = t_{i+1} \cdot B + w_i$, so that we can easily prove the following lemma (note that $D < B \le 2D$ implies $-B-D < -2D < -D \le -B+D < 0 < B-D \le D < 2D < B+D$):

Lemma 3: For given digits $x_i, y_i \in \{-D,\ldots,+D\}$ with $D < B \le 2D$, the number $u_i = x_i + y_i$ can be decomposed as $u_i = t_{i+1} \cdot B + w_i$ with $w_i \in \{-D,\ldots,+D\}$ and $t_{i+1} \in \{-1, 0, +1\}$ as shown in Table I.

The proof is easily obtained by checking the cases mentioned in Table I. The final step of the computation now consists in computing the sum digits $s_i := w_i + t_i$ by means of the transfer and interim sum digits. We have to make sure that these additions will not produce a carry. For this reason, Avizienis demanded that $w_i \in \{-D+1,\ldots,+D-1\}$ must hold, which is also possible according to the following lemma:

Lemma 4: For given digits $x_i, y_i \in \{-D,\ldots,+D\}$ with $D < B < 2D$, the number $u_i = x_i + y_i$ can be decomposed as $u_i = t_{i+1} \cdot B + w_i$ with $w_i \in \{-D+1,\ldots,+D-1\}$ as shown in Table I.

Proof: The proof is easily obtained by checking all the cases mentioned in Table I. Note that the cases with $u_i \in \{-D,\ldots,-B+D\}$ and $u_i \in \{B-D,\ldots,D\}$ allow two different decompositions, and for each case, there is one $u_i$ that produces an interim sum $w_i \notin \{-D+1,\ldots,+D-1\}$. In that case, however, we use the other possible decomposition and can therefore ensure $w_i \in \{-D+1,\ldots,+D-1\}$.

Since it is possible to find a decomposition with $w_i \in \{-D+1,\ldots,+D-1\}$, it is now possible to compute the final sum digits $s_i := w_i + t_i$ without producing a carry! However, the reader might have noted that we had to strengthen the constraint $D < B \le 2D$ used before to $D < B < 2D$ to make this possible.
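The case analysis behind Lemmas 3 and 4 can be replayed mechanically. The following Python sketch enumerates the decompositions of Table I for concrete parameters; the function names and the sample parameters are assumptions made for illustration.

```python
def decompositions(u, B, D):
    """All (t, w) with u == t*B + w, t in {-1, 0, +1} and w in {-D, ..., +D}."""
    return [(t, u - t * B) for t in (-1, 0, 1) if -D <= u - t * B <= D]


def has_carry_safe_choice(B, D):
    """True iff every u in {-2D, ..., 2D} has a decomposition with |w| <= D-1."""
    return all(any(abs(w) <= D - 1 for _, w in decompositions(u, B, D))
               for u in range(-2 * D, 2 * D + 1))
```

For example, `has_carry_safe_choice(10, 6)` holds because $D < B < 2D$, whereas `has_carry_safe_choice(2, 1)` and `has_carry_safe_choice(4, 2)` do not, since both have $B = 2D$. This matches the need to strengthen the constraint to $B < 2D$.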
Based on the above lemma, the carry-free addition due to Avizienis is now as follows:

Theorem 3 (Carry-Free Addition by Avizienis): The addition of SD numbers $x = [x_{n-1},\ldots,x_0]_{D,B}$ and $y = [y_{n-1},\ldots,y_0]_{D,B}$ with $D < B < 2D$ can be computed in depth $O(1)$ with $O(n)$ work (gates) as follows:

1) for $i \in \{0,\ldots,n-1\}$, compute $u_i := x_i + y_i$
2) for $i \in \{0,\ldots,n-1\}$, compute
$$t_{i+1} := \begin{cases} +1 & \text{if } u_i \ge +D \\ -1 & \text{if } u_i \le -D \\ 0 & \text{if } -D < u_i < +D \end{cases}$$
3) for $i \in \{0,\ldots,n-1\}$, compute digits $s_i := t_i + \underbrace{u_i - t_{i+1} \cdot B}_{=: w_i}$ with $t_0 := 0$

The final sum is then the SD number $s = [t_n, s_{n-1},\ldots,s_0]_{D,B}$.

Each of the above steps can be performed in parallel, so that the sum can be computed in three steps. Moreover, in the

cases $u_i \in \{-D,\ldots,-B+D\}$ and $u_i \in \{B-D,\ldots,D\}$ the algorithm prefers the decomposition with $t_{i+1} = 0$ except for the cases $u_i = \pm D$, where the other possible decomposition is used. This way, we always have $w_i \in \{-D+1,\ldots,+D-1\}$ and therefore, the final addition $s_i := t_i + w_i$ produces a legal digit.

Other operations can be implemented as follows:

- Subtraction of $x$ and $y$ can be simply performed by addition of $x$ and $-y = [-y_{n-1},\ldots,-y_0]_{D,B}$, which can also be done with depth $O(1)$ and work $O(n)$ (see footnote 2).
- Checking equality of $x$ and $y$ is reduced to checking whether $x - y = 0$ holds. The subtraction can be done with depth $O(1)$ and work $O(n)$, but checking that all obtained digits are zero requires depth $O(\log n)$ and work $O(n)$.
- Comparing $x < y$ is reduced to testing for $x - y < 0$. The subtraction can be done with depth $O(1)$ and work $O(n)$, but checking the sign may require depth $O(\log(n))$ since some of the leading digits can be zero (the sign of the first non-zero digit determines the sign).
- Multiplication can be obtained by adding the partial products $x \cdot y_i \cdot B^i$, which can be arranged with a depth of $O(\log(n))$ and work $O(n^2)$ [14], [15].
- Division can be implemented by multiplication of the integer reciprocal, requiring depth $O(\log(n))$ and work $O(n^2)$ [2].

Hence, SD numbers are an interesting number representation that leads to efficient arithmetic algorithms.

B. Binary SD Numbers

Avizienis already noted that his algorithm does not work for binary SD numbers for the reasons we explained in the previous section. Using the weaker constraints $D < B \le 2D$, we can reconsider Table I, which reduces for $B = 2$ and $D = 1$ to the following decompositions:

    u_i    (t_{i+1}, w_i)
    -2     (-1, 0)
    -1     (0, -1) or (-1, +1)
     0     (0, 0)
    +1     (0, +1) or (+1, -1)
    +2     (+1, 0)

As can be seen, there is no decomposition that always allows us to achieve that $w_i \in \{-D+1,\ldots,+D-1\} = \{0\}$ holds. For this reason, it was widely accepted that there is no carry-free addition for general binary SD numbers.
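The three parallel steps of Theorem 3 translate almost literally into code. The Python sketch below is an illustration only, with hypothetical function names. Digits are given most significant first, and the sample parameters $B = 10$, $D = 6$ are an assumption satisfying $D < B < 2D$.

```python
def sd_value(digits, B):
    """Value of an SD number given most significant digit first."""
    value = 0
    for d in digits:
        value = value * B + d
    return value


def avizienis_add(x, y, B, D):
    """Add two equally long SD numbers; returns n+1 digits, MSD first."""
    n = len(x)
    xs, ys = x[::-1], y[::-1]                  # index i now means position i
    u = [xs[i] + ys[i] for i in range(n)]      # step 1: position-wise sums
    # step 2: transfer digits, each depending on a single u_i (t_0 = 0)
    t = [0] + [1 if ui >= D else -1 if ui <= -D else 0 for ui in u]
    # step 3: sum digits s_i = t_i + w_i with w_i = u_i - t_{i+1} * B
    s = [t[i] + u[i] - t[i + 1] * B for i in range(n)]
    return [t[n]] + s[::-1]
```

With these parameters, `avizienis_add([3, 6, -5], [6, 6, 4], 10, 6)` yields `[1, 0, 2, -1]`, i.e., $355 + 664 = 1019$ with only legal digits. Running the same rule with $B = 2$, $D = 1$ on `[1, -1]` and `[0, 0]` yields `[1, -2, 1]`, which has the right value but contains the illegal digit $-2$. This is the failure discussed for the binary case.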
One possible solution is to consider a radix $B = 2^k$ and to represent digits $x_i$ then as two's complement numbers with $k+1$ bits. The disadvantage is that the depth is increased to $O(\log(k))$ (due to addition of two's complement numbers with $k$ bits), as considered in [11]. Since small numbers $k$ can be chosen, this may still be a practical solution. Many papers also consider variants of these SD number representations, e.g., using asymmetric digit sets [12].

Footnote 2: The work of a parallel algorithm is the number of executed operations, i.e., the number of gates of the corresponding circuit.

As another solution, Parhami [8] suggested recoding a given binary SD number $x$ of length $n$ to an equivalent SD number $x'$ of length $n+1$ such that there are no two neighboring digits $x'_{i+1}$ and $x'_i$ with $x'_{i+1} \cdot x'_i = 1$. Unfortunately, the output of his addition algorithm does not satisfy this condition, so that it has to be recoded again before another addition takes place. This not only increases the required chip area, but also adds further latency to each addition. Other works on recoding SD numbers are discussed in [13].

We therefore considered whether it is possible to construct a direct algorithm for the addition of binary SD numbers despite the problems with the decomposition mentioned in the previous section. As we report in the next section, it turns out that there is indeed such an algorithm, and it can be efficiently implemented in hardware.

III. OUR ALGORITHM FOR CARRY-FREE ADDITION OF BINARY SD NUMBERS

A. Analyzing the Problem

Below, we first analyze the problem for base $B = 2$ and then construct a carry-free binary SD addition algorithm. We have to add two digits $x_i$ and $y_i$ of given numbers plus the transfer digit $t_i$ that comes from the neighboring digits to the right.
All of $x_i$, $y_i$, and $t_i$ belong to the digit set $\{-1, 0, +1\}$, and we have to define transfer digits $t_{i+1}$, an interim sum $w_i$, and the final sum digit $s_i$ such that the following constraints hold:

1) $x_i + y_i = 2 \cdot t_{i+1} + w_i$
2) $s_i = t_i + w_i$
3) $t_{i+1}$, $w_i$ and $s_i$ are digits from $\{-1, 0, +1\}$
4) $t_{i+1}$ is defined independent of $t_i$ (to avoid a propagation chain)

To this end, consider Table II: The first three columns list the possible inputs for $x_i$, $y_i$ and $t_i$. The next two columns are values for $t_{i+1}$ and $w_i$ that were computed by the algorithm of the previous section, i.e.,

$$t_{i+1} := \begin{cases} +1 & \text{if } x_i + y_i \ge +1 \\ -1 & \text{if } x_i + y_i \le -1 \\ 0 & \text{otherwise} \end{cases}$$

and $w_i := x_i + y_i - 2 \cdot t_{i+1}$ and $s_i := x_i + y_i + t_i - 2 \cdot t_{i+1}$. As can be seen, the algorithm sometimes computes values for $s_i$ that are not in the allowed range. The symbol * in the rightmost column marks these rows (where the algorithm fails), and we have colored these rows in dark gray. It is not difficult to see that a correct result would have been possible, since $[t_{i+1}, s_i]_{1,2} = [-1, +2]_{1,2} = 0 = [0, 0]_{1,2}$ and $[t_{i+1}, s_i]_{1,2} = [+1, -2]_{1,2} = 0 = [0, 0]_{1,2}$ hold.
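The rows in which the standard rule fails for $B = 2$, $D = 1$ can be recomputed from the formulas above. This is a small Python sketch with hypothetical function names.

```python
from itertools import product

DIGITS = (-1, 0, 1)


def standard_step(x, y, t):
    """Return (t_next, w, s) as computed by the standard rule for B=2, D=1."""
    u = x + y
    t_next = 1 if u >= 1 else -1 if u <= -1 else 0
    w = u - 2 * t_next
    return t_next, w, w + t


def failing_inputs():
    """Input triples (x, y, t) whose sum digit s leaves {-1, 0, +1}."""
    return [(x, y, t) for x, y, t in product(DIGITS, repeat=3)
            if standard_step(x, y, t)[2] not in DIGITS]
```

Exactly four of the 27 input triples fail, namely those with $x_i + y_i = +1$ and $t_i = -1$, or $x_i + y_i = -1$ and $t_i = +1$.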

Table II. VALUES OF $t_{i+1}$ AND $s_i$ FOR STANDARD SD ADDITION. [Columns: $x_i$, $y_i$, $t_i$, $t_{i+1}$, $w_i$, $s_i$; four rows are marked with *.]

However, we cannot simply change these rows in the table to correct the outputs $t_{i+1}$ and $s_i$, since the computation of $t_{i+1}$ must be independent of $t_i$, and should only depend on $x_i$ and $y_i$. Hence, changing the value of $t_{i+1}$ in a row forces us to make the same change in all rows where $x_i$ and $y_i$ have the same values. We therefore say that two input triples $(x_i, y_i, t_i)$ and $(x'_i, y'_i, t'_i)$ are equivalent iff $x_i = x'_i \wedge y_i = y'_i$ holds. The symbol + denotes the rows that are equivalent in this sense to another input that leads to wrong results, and we have colored these rows in a lighter gray.

We therefore see that we have four critical input classes $(x_i, y_i, t_i) = (-1, 0, *)$, $(x_i, y_i, t_i) = (0, -1, *)$, $(x_i, y_i, t_i) = (0, +1, *)$, and $(x_i, y_i, t_i) = (+1, 0, *)$ that refer to the decomposition cases in Table I where two decompositions are possible. Since we have to define a decomposition $2 \cdot t_{i+1} + w_i = x_i + y_i$ independent of $t_i$, there is no solution by the information given in this table. For example, consider the critical input class $(x_i, y_i, t_i) = (-1, 0, *)$: Using $t_{i+1} = -1$ as computed by the algorithm leads to value $s_i = +2$ for $t_i = +1$. Using $t_{i+1} = 0$ instead leads to value $s_i = -2$ for $t_i = -1$, and using $t_{i+1} = +1$ leads to forbidden values of $s_i$ for all values of $t_i$ (see Table III). Thus, it is not possible to define a decomposition for $t_{i+1}$ that only depends on $x_i$ and $y_i$, as remarked by Avizienis!

Table III. ALTERNATIVE VALUES OF $t_{i+1}$ FOR ($x_i = -1$ AND $y_i = 0$). [Columns: $x_i$, $y_i$, $t_i$, followed by pairs $t_{i+1}$, $s_i$ for the alternatives.]

B. Solution

Our algorithm uses additional information that solves the problem explained in the previous section. As the algorithm describes a hardware circuit, we make use of an encoding of the digits $\{-1, 0, +1\}$ by a pair of booleans $(x.0, x.1)$.
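The impossibility argument of Section III-A can be replayed by brute force: for each pair $(x_i, y_i)$ we look for a transfer digit that keeps $s_i$ legal for every incoming $t_i$. The Python sketch below is only an illustration, and its function names are hypothetical.

```python
DIGITS = (-1, 0, 1)


def safe_transfers(x, y):
    """Transfer digits t_next keeping s = x + y + t - 2*t_next legal for every t."""
    return [t_next for t_next in DIGITS
            if all(x + y + t - 2 * t_next in DIGITS for t in DIGITS)]


def critical_classes():
    """Pairs (x, y) for which no t_next depending only on x and y works."""
    return [(x, y) for x in DIGITS for y in DIGITS if not safe_transfers(x, y)]
```

The function `critical_classes()` returns exactly the four classes named above. Some extra information about $t_i$ is therefore unavoidable in these cases.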
There are many encodings of the digits $\{-1, 0, +1\}$, but the following two are the most popular ones:

    Value   sign-value       neg-pos
    -1      (true, true)     (true, false)
     0      (false, false)   (false, false)
    +1      (false, true)    (false, true)

We choose the neg-pos encoding for our algorithm because it lends itself well to a concise description of the logic equations below; in addition, it makes negating a value a simple swap of the pair's elements.

The key idea of our solution is to choose different decompositions $x_i + y_i = 2 \cdot t_{i+1} + w_i$ in the critical cases (with gray color) of Table II. Since we cannot do this based on $x_i$ and $y_i$ only, and since we are not allowed to consider $t_i$, we introduce a new input $l_i$ such that $(t_i = +1 \rightarrow \neg l_i) \wedge (t_i = -1 \rightarrow l_i)$ holds, and we generate an output $l_{i+1}$ that maintains this property as an invariant

$$(t_{i+1} = +1 \rightarrow \neg l_{i+1}) \wedge (t_{i+1} = -1 \rightarrow l_{i+1}) \qquad (1)$$

that is forwarded to the full adder that receives $x_{i+1}$ and $y_{i+1}$ as inputs, while $l_i$ is provided in addition to $t_i$ by the full adder for $x_{i-1}$ and $y_{i-1}$. Using $l_i$, we can then decide which of the two possible decompositions we use in the critical cases (with gray color) of Table II. Note that $l_i$ does not hold the full information of $t_i$, since it is not determined for $t_i = 0$. To establish the above invariant, we define $l_{i+1} := x_i.0 \vee y_i.0$, which means that $l_{i+1}$ holds if and only if at least one of the digits $x_i, y_i$ is $-1$. We prove that equation (1) holds by inspecting Table IV, where the solution computed by our algorithm is given as

the three rightmost columns, and we can also verify that the important equation $x_i + y_i + t_i = 2 \cdot t_{i+1} + s_i$ holds, and that all computed values are legal digits. Note that the inputs in Table IV are arbitrary, but input $l_i$ must respect the invariant mentioned above. We use * in case its value is a don't care (i.e., if $t_i = 0$).

Table IV. VALUES OF $t_{i+1}$ AND $s_i$ FOR OUR SD ADDITION ALGORITHM. [Columns: $x_i$, $y_i$, $t_i$, $l_i$, $l_{i+1}$, $t_{i+1}$, $s_i$, corresponding to the signals x, y, tin, lin, lout, tout, s.]

As can be seen, in case of non-critical inputs (those that are not given in gray color), the decomposition of $x_i + y_i = u_i$ into $t_{i+1} \cdot 2 + w_i$ depends only on $x_i$ and $y_i$, while in the critical cases, it also depends on $l_i$. Using the information of $l_i$, it is possible to choose a decomposition where legal digits are always obtained for $t_{i+1}$ and $s_i$ without generating a carry digit.

It is interesting to note that $l_i$ and $l_{i+1}$ have strong relationships to $t_i$ and $t_{i+1}$ due to the mentioned invariants. However, $l_{i+1}$ only depends on the digits $x_i$ and $y_i$, while $t_{i+1}$ depends on $l_i$, but not on $t_i$. This is very important: In principle, we could replace $l_i$ by $(t_i = -1)$ without making the equations incorrect. However, the hardware circuit would then suffer from carry propagation since $t_{i+1}$ would then depend on $t_i$.

Figure 1 defines a full adder using the Quartz language [16] that can be cascaded to obtain a carry-free binary SD adder. Inputs are declared by ? while outputs are declared with !. The inputs tin, x, and y are thereby pairs of booleans that encode digits $\{-1, 0, +1\}$ via the neg-pos encoding, i.e., $\varepsilon(x.0, x.1) = (x.0 \;?\; {-1} : 0) + (x.1 \;?\; {+1} : 0)$ maps a pair of booleans to the corresponding digit. The module also makes use of local boolean variables w1, w2, w3, w4, w, u1, and u0. w is thereby defined such that it holds if and only if one of the critical input cases is given (the gray shaded ones in Table IV). Variables u1 and u0 are used to define some common subexpressions.

    module SgnFullAdd(
        (bool * bool) ?tin, ?x, ?y, bool ?lin,
        (bool * bool) !tout, !s, bool !lout) {
      bool w1,w2,w3,w4,w,u1,u0;
      // define the critical input cases:
      w1 = !x.0 & !x.1 & y.1;   // x==0 & y==+1
      w2 = !x.0 & !x.1 & y.0;   // x==0 & y==-1
      w3 = !y.0 & !y.1 & x.1;   // y==0 & x==+1
      w4 = !y.0 & !y.1 & x.0;   // y==0 & x==-1
      w = w1 | w2 | w3 | w4;
      u1 = !lin & w;            // tin!=-1 & critical input
      u0 = lin & w;             // tin!=+1 & critical input
      // determine lout := x=-1 | y=-1
      lout = x.0 | y.0;
      // tout.0 holds iff x=y=-1 | tin!=+1 & x+y=-1
      tout.0 = x.0 & y.0 | lin & (w2 | w4);
      // tout.1 holds iff x=y=+1 | tin!=-1 & x+y=+1
      tout.1 = x.1 & y.1 | !lin & (w1 | w3);
      // determine sum digit
      s.0 = tin.0 & !u0 | u1 & !tin.1;
      s.1 = tin.1 & !u1 | u0 & !tin.0;
    }

Figure 1. Implementation of a Full Adder for Binary SD Numbers

As can be seen, tout only depends on x, y, lin; s depends on tin, lin, x, y; and lout depends on x, y. Therefore, there is no dependency from tin to tout, and neither is there one from lin to lout. Dependencies between neighboring full adder modules are shown in Figure 2. As can be seen, a sum digit $s_i$ depends on $x_i, y_i, x_{i-1}, y_{i-1}, x_{i-2}, y_{i-2}$; $l_i$ depends on $x_{i-1}, y_{i-1}$; and $t_i$ depends on $x_{i-1}, y_{i-1}, x_{i-2}, y_{i-2}$.

Figure 2. Dependencies of the Variables in SgnFullAdd
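The equations of Figure 1 can be transcribed into Python to replay the exhaustive check described above. This is a sketch rather than the authors' tooling. The dictionaries `ENC` and `DEC` and the function names are assumptions, and pairs are indexed as `x[0]`, `x[1]` in place of `x.0`, `x.1`.

```python
from itertools import product

ENC = {-1: (True, False), 0: (False, False), 1: (False, True)}  # neg-pos pairs
DEC = {bits: digit for digit, bits in ENC.items()}


def sgn_full_add(tin, x, y, lin):
    """Transcription of the logic equations of module SgnFullAdd."""
    w1 = not x[0] and not x[1] and y[1]   # x == 0 and y == +1
    w2 = not x[0] and not x[1] and y[0]   # x == 0 and y == -1
    w3 = not y[0] and not y[1] and x[1]   # y == 0 and x == +1
    w4 = not y[0] and not y[1] and x[0]   # y == 0 and x == -1
    w = w1 or w2 or w3 or w4
    u1 = (not lin) and w
    u0 = lin and w
    lout = x[0] or y[0]
    tout = ((x[0] and y[0]) or (lin and (w2 or w4)),
            (x[1] and y[1]) or ((not lin) and (w1 or w3)))
    s = ((tin[0] and not u0) or (u1 and not tin[1]),
         (tin[1] and not u1) or (u0 and not tin[0]))
    return tout, s, lout


def check_all_inputs():
    """Check the sum equation and invariant (1) for all admissible inputs."""
    for x, y, t, lin in product(ENC, ENC, ENC, (False, True)):
        if (lin and t == 1) or (not lin and t == -1):
            continue                          # lin violates the input invariant
        tout, s, lout = sgn_full_add(ENC[t], ENC[x], ENC[y], lin)
        t_next, s_digit = DEC[tout], DEC[s]   # KeyError if a pair is illegal
        assert x + y + t == 2 * t_next + s_digit
        assert not (lout and t_next == 1) and not (not lout and t_next == -1)
    return True
```

If any output pair were the illegal combination (true, true), the lookup in `DEC` would raise a `KeyError`. A successful run of `check_all_inputs()` therefore also confirms that all outputs are legal digits.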

It is not difficult to prove that the following theorem holds, where $\varepsilon(x)$ maps the pair of booleans $x = (x.0, x.1)$ to a digit $\{-1, 0, +1\}$ according to the neg-pos encoding:

Theorem 4 (Correctness of SgnFullAdd): If x, y, tin are pairs of booleans that encode digits $\{-1, 0, +1\}$, and if lin is a boolean such that condition $(\mathit{lin} \rightarrow \neg \mathit{tin}.1) \wedge (\neg \mathit{lin} \rightarrow \neg \mathit{tin}.0)$ holds, then the following holds for module SgnFullAdd shown in Figure 1:

- tout and s encode signed binary digits $\{-1, 0, +1\}$
- $(\mathit{lout} \rightarrow \neg \mathit{tout}.1) \wedge (\neg \mathit{lout} \rightarrow \neg \mathit{tout}.0)$
- $\varepsilon(x) + \varepsilon(y) + \varepsilon(\mathit{tin}) = 2 \cdot \varepsilon(\mathit{tout}) + \varepsilon(s)$

Proof: The proof can be made by an exhaustive enumeration of all cases, which has been performed by means of the Averest tool set.

Thus, all bits $l_i$, then all transfer digits, and then all sum digits are computed in three parallel steps, thus requiring time $O(1)$. Hence, we obtained a carry-free addition of binary SD numbers without the need to re-encode the inputs. The crucial fact used here is that we can extract enough information from the next less-significant digits to distinguish the cases where forbidden digits for $s_i$ would be computed within the critical inputs. Note that $l_i$ does not have the complete information to determine $t_i$, since that would lead to a dependency between $t_{i+1}$ and $t_i$ that would introduce a carry chain.

C. Conversion to/from Binary Numbers

Converting radix-2 or two's complement numbers to binary SD numbers does not require any logic resources. For a radix-2 number $x = [x_{n-1},\ldots,x_0]$, the equivalent SD number $x'$ in neg-pos encoding is $x'.0 := [0,\ldots,0]$ and $x'.1 := [x_{n-1},\ldots,x_0]$; for a two's complement number $x = [x_{n-1},\ldots,x_0]$, an equivalent SD number is $x'.0 := [x_{n-1}, 0,\ldots,0]$ and $x'.1 := [0, x_{n-2},\ldots,x_0]$. The correctness of this can be easily seen from the equation $[x_{n-1},\ldots,x_0]_{2C} = -x_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} x_i \cdot 2^i$, where $[x]_{2C}$ denotes the two's complement interpretation of a bitvector $x$.
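The three parallel steps and the conversion from two's complement can be sketched at the digit level in Python. This is an illustrative restatement of the full adder on integer digits instead of boolean pairs, not the hardware description itself. The function names are hypothetical, and digit lists are given most significant digit first.

```python
def sd_value(digits):
    """Value of a binary SD number, most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def from_twos_complement(bits):
    """Two's complement bits (MSB first) to SD digits: only the sign bit is negated."""
    return [-bits[0]] + list(bits[1:])


def sd_add(x, y):
    """Carry-free addition of two n-digit binary SD numbers; returns n+1 digits."""
    n = len(x)
    xs, ys = x[::-1], y[::-1]                  # index i now means position i
    u = [xs[i] + ys[i] for i in range(n)]
    # step 1: l[i+1] holds iff x_i or y_i is -1; l[0] is a don't care (t[0] = 0)
    l = [False] + [xs[i] == -1 or ys[i] == -1 for i in range(n)]
    # step 2: t[i+1] depends on x_i, y_i and l[i] only, never on t[i]
    t = [0] * (n + 1)
    for i in range(n):
        if abs(u[i]) == 2:
            t[i + 1] = u[i] // 2
        elif u[i] == 1 and not l[i]:
            t[i + 1] = 1
        elif u[i] == -1 and l[i]:
            t[i + 1] = -1
    # step 3: sum digits from x_i + y_i + t_i = 2*t_{i+1} + s_i
    s = [u[i] + t[i] - 2 * t[i + 1] for i in range(n)]
    return [t[n]] + s[::-1]
```

The loop in step 2 is sequential only because this is software. No iteration reads a transfer digit written by another iteration, which mirrors the constant depth of the circuit.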
To convert an SD number x back to a radix-2 or a two s complement number, the bitvector [x n 1.0,...,x 0.0] is interpreted as a radix-2 number and subtracted from the radix- 2 number [x n 1.1,...,x 0.1] (since [x n 1,...,x 0] 1,2 = n 1 i=0 (x i.1 x i.0) 2 i ). This requires a single n-bit subtraction which needs time O(log(n)) and returns an (n +1)-bit radix-2/two s complement number. IV. BENCHMARK RESULTS A. Setup We implemented our addition algorithm in hardware on a Xilinx Virtex 5 FPGA, along with Parhami s algorithm [8], and a simple addition of two s complement numbers to make comparisons. On these FPGAs, simple addition is implemented using a dedicated carry logic and fast carry chains, resulting in a combination of carry-lookahead and carry-ripple adders. This method is the fastest and the smallest carry-based addition for all but very high bit-width numbers. For Parhami s method, we chose the signed-value encoding, since it was the one they focused on in [8]. Our benchmarks were set up as follows: To measure latency, we registered the inputs and outputs of the respective adder implementation. The synthesis and implementation tools were set to optimize for clock frequency, and the given latencies are the minimum clock periods which were still routable. For area, the design was solely comprised of the adder circuit, with the FPGA s pins serving as the inputs and the outputs of the adder. The tools were set to optimize for area, and the area is measured in occupied lookup tables (LUTs). For our benchmarks, we assumed that the inputs are given as signed-digit numbers. This is necessary in order to ensure that the input is as general as possible so that the synthesis tools are not able to optimize the circuit unrealistically by exploiting don t-care conditions. We measured the following benchmarks: add2: addition circuit with two n-digit inputs and an (n +1)-digit output add3: addition circuit with three n-digit inputs and an (n +2)-digit output B. 
Results

Table V shows the latency and the maximum frequency of the two-input and the three-input adder for our new addition algorithm and compares them to Parhami's adder. The values were determined for an input width of n = 64, but they are essentially independent of n (with very small deviations due to slight variations in the LUT array and the routing network of the FPGA). We included the values for a 64-bit native addition as a reference.

Table V
LATENCY OF TWO-INPUT AND THREE-INPUT ADDERS IN NANOSECONDS AND MAXIMUM FREQUENCY IN MHZ.

                         add2 (ns / MHz)   add3 (ns / MHz)
Our adder                2.02 /            / 318
Parhami's adder          2.88 /            / 201
Simple adder (64-bit)    3.19 /            / 212

As can be seen, our algorithm is more than 40 % faster than Parhami's SD addition. It also tends to achieve a frequency which is 50 % higher than that of a 64-bit native FPGA addition. This is to be expected, since our algorithm has a constant O(1) latency, while the best latency that any carry-based addition can achieve is O(log n). In fact, our algorithm is so efficient that the break-even point is at n = 24, for which native addition has a latency of 2.11 ns. Interestingly, Parhami's adder is actually slower than native FPGA addition for the three-input case, even though it is faster for the two-input case.

In Table VI, we show the area requirements for the different algorithms. For all of them, the occupied area is proportional to the input width, hence we give the number of LUTs per input digit (measured for n = 64). For example, our method requires 3 LUTs per digit, so adding two 32-digit numbers requires 96 LUTs. As expected, three-input adders need twice the area of two-input adders, since they are just two adders in sequence. Our method requires three times as much area as the native addition algorithm, and less than half the area of Parhami's algorithm.

Table VI
AREA REQUIREMENTS OF TWO-INPUT AND THREE-INPUT ADDERS IN LUTS PER INPUT BIT.

                         add2   add3
Our adder
Parhami's adder
Two's complement adder

Note that in the case of an ASIC implementation, our algorithm would likely perform even better compared to a two's complement adder, since the latter benefits from the dedicated carry-propagation chain on the FPGA, an advantage which would not exist on an ASIC.

V. CONCLUSION

We developed an algorithm for adding binary SD numbers which does not require the recoding step of previous approaches [8]. Our algorithm uses an additional input l_i to determine suitable transfer and interim sum digits, and thereby avoids carry generation. By implementing our addition algorithm on an FPGA, we showed that our method is approximately 40 % faster and needs less than half as much area compared to previous approaches to binary SD addition. It has a lower latency than even the fastest carry-based two's complement addition for input widths as low as 24 bits, allowing it to be used as a replacement in many practical, latency-critical hardware designs.

REFERENCES

[1] P. Kogge and H. Stone, A parallel algorithm for the efficient solution of a general class of recurrences, IEEE Transactions on Computers (T-C), vol. 22, pp ,
[2] P. Beame, S. Cook, and H. Hoover, Log depth circuits for division and related problems, in Foundations of Computer Science (FOCS). West Palm Beach, Florida, USA: IEEE Computer Society, 1984, pp
[3] B. Parhami, Computer Arithmetic Algorithms and Hardware Designs. Oxford University Press,
[4] M. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann,
[5] H. Garner, The residue number system, IRE Transactions on Electronic Computers, vol. 8, pp , June
[6] H. Garner, R. Arnold, B. Benson, C. Brockus, R. Gonzalez, and D. Rozenberg, Residue number systems for computers, University of Michigan, Technical Report , October
[7] A. Avizienis, Signed-digit number representations for fast parallel arithmetic, IRE Transactions on Electronic Computers, vol. 10, no. 3, pp , September
[8] B. Parhami, Carry-free addition of recoded binary signed-digit numbers, IEEE Transactions on Computers (T-C), vol. 37, no. 11, pp ,
[9] B. Parhami, Generalized signed-digit number systems: A unifying framework for redundant number representations, IEEE Transactions on Computers (T-C), vol. 39, no. 1, pp , January
[10] S.-H. Shieh and C.-W. Wu, Asymmetric high-radix signed-digit number systems for carry-free addition, Journal of Information Science and Engineering, vol. 19, no. 6, pp ,
[11] G. Jaberipur and M. Ghodsi, High radix signed digit number systems: Representation paradigms, Scientia Iranica, vol. 10, no. 4, pp ,
[12] S. Gorgin and G. Jaberipur, A family of high radix signed digit adders, in Symposium on Computer Arithmetic (ARITH). Tübingen, Germany: IEEE Computer Society, 2011, pp
[13] M. Joye and S.-M. Yen, Optimal left-to-right binary signed-digit recoding, IEEE Transactions on Computers (T-C), vol. 49, no. 7, pp ,
[14] C. Koc and S. Johnson, Multiplication of signed-digit numbers, Electronics Letters, vol. 30, no. 11, pp ,
[15] C. Hung and B. Parhami, Generalized signed-digit multiplication and its systolic realizations, in Circuits and Systems. Detroit, Michigan, USA: IEEE Computer Society, 1993, pp
[16] K. Schneider, The synchronous programming language Quartz, Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany, Internal Report 375, December
[17] S. Arno and F. Wheeler, Signed digit representation of minimal Hamming weight, IEEE Transactions on Computers (T-C), vol. 42, no. 8, pp , August
[18] A. Booth, A signed binary multiplication technique, Quarterly Journal of Mechanics and Applied Mathematics (QJMAM), vol. 4, no. 2, pp ,
[19] D. Phatak, T. Goff, and I. Koren, Constant-time addition and simultaneous format conversion based on redundant binary representations, IEEE Transactions on Computers (T-C), vol. 50,
[20] G. Reitwiesner, Advances in Computers. Academic Press, 1960, ch. Binary Arithmetic.
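As a rough cross-check of the figures quoted in the Results section, the following Python sketch recomputes the two-input speed ratios and the LUT count from the values that appear in the text and in Table V. The function and variable names are illustrative assumptions and are not taken from the paper; only the numeric constants come from the reported measurements.

```python
# Two-input latencies in nanoseconds, copied from Table V.
# The dictionary keys are illustrative names, not the paper's terminology.
LATENCY_ADD2_NS = {
    "ours": 2.02,
    "parhami": 2.88,
    "native_64bit": 3.19,
}

# LUTs per digit of the two-input adder, as stated in the Results text.
LUTS_PER_DIGIT_OURS = 3


def speedup(reference_ns: float, ours_ns: float) -> float:
    """Return the ratio reference latency / our latency (values > 1 mean we are faster)."""
    return reference_ns / ours_ns


def luts_required(digits: int, luts_per_digit: int = LUTS_PER_DIGIT_OURS) -> int:
    """Area is proportional to the input width, so it scales linearly with the digit count."""
    return digits * luts_per_digit


if __name__ == "__main__":
    ours = LATENCY_ADD2_NS["ours"]
    print(f"ratio vs. Parhami's adder: {speedup(LATENCY_ADD2_NS['parhami'], ours):.2f}")
    print(f"ratio vs. native addition: {speedup(LATENCY_ADD2_NS['native_64bit'], ours):.2f}")
    print(f"LUTs for two 32-digit inputs: {luts_required(32)}")
```

Both ratios exceed the thresholds stated in the text (more than 40 % relative to Parhami's adder and about 50 % relative to the 64-bit native addition), and the linear area model gives the 96 LUTs quoted for 32-digit operands.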


More information

Factoring & Primality

Factoring & Primality Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

More information

A New Euclidean Division Algorithm for Residue Number Systems

A New Euclidean Division Algorithm for Residue Number Systems A New Euclidean Division Algorithm for Residue Number Systems Jean-Claude Bajard and Laurent Stéphane Didier Laboratoire d Informatique de Marseille CMI, Université de Provence, 39 rue Joliot-Curie, 3453

More information

Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio

Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio 1 Anuradha S. Deshmukh, 2 Prof. M. N. Thakare, 3 Prof.G.D.Korde 1 M.Tech (VLSI) III rd sem Student, 2 Assistant Professor(Selection

More information

FPGA Implementation of RSA Encryption Engine with Flexible Key Size

FPGA Implementation of RSA Encryption Engine with Flexible Key Size FPGA Implementation of RSA Encryption Engine with Flexible Key Size Muhammad I. Ibrahimy, Mamun B.I. Reaz, Khandaker Asaduzzaman and Sazzad Hussain Abstract An approach to develop the FPGA of a flexible

More information

CS 103X: Discrete Structures Homework Assignment 3 Solutions

CS 103X: Discrete Structures Homework Assignment 3 Solutions CS 103X: Discrete Structures Homework Assignment 3 s Exercise 1 (20 points). On well-ordering and induction: (a) Prove the induction principle from the well-ordering principle. (b) Prove the well-ordering

More information

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory

More information

A Tool for Generating Partition Schedules of Multiprocessor Systems

A Tool for Generating Partition Schedules of Multiprocessor Systems A Tool for Generating Partition Schedules of Multiprocessor Systems Hans-Joachim Goltz and Norbert Pieth Fraunhofer FIRST, Berlin, Germany {hans-joachim.goltz,nobert.pieth}@first.fraunhofer.de Abstract.

More information

CS101 Lecture 11: Number Systems and Binary Numbers. Aaron Stevens 14 February 2011

CS101 Lecture 11: Number Systems and Binary Numbers. Aaron Stevens 14 February 2011 CS101 Lecture 11: Number Systems and Binary Numbers Aaron Stevens 14 February 2011 1 2 1 3!!! MATH WARNING!!! TODAY S LECTURE CONTAINS TRACE AMOUNTS OF ARITHMETIC AND ALGEBRA PLEASE BE ADVISED THAT CALCULTORS

More information

FPGA-based Multithreading for In-Memory Hash Joins

FPGA-based Multithreading for In-Memory Hash Joins FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded

More information

An Empirical Study of Two MIS Algorithms

An Empirical Study of Two MIS Algorithms An Empirical Study of Two MIS Algorithms Email: Tushar Bisht and Kishore Kothapalli International Institute of Information Technology, Hyderabad Hyderabad, Andhra Pradesh, India 32. tushar.bisht@research.iiit.ac.in,

More information

Solution for Homework 2

Solution for Homework 2 Solution for Homework 2 Problem 1 a. What is the minimum number of bits that are required to uniquely represent the characters of English alphabet? (Consider upper case characters alone) The number of

More information