Huey Ling High-speed Binary Adder Based on the bit pair (ai, bi) truth table, the carry propagate pi and carry generate gi have dominated the carry-lookahead formation process for more than two decades. This paper presents a new scheme in which the new carry propagation is examined by including the neighboring pairs (ai, bi; ai+,, b,+l). This scheme not only reduces the component count in design, but also requires fewer logic levels in adder implementation. n addition, this new algorithm oflers an astonishingly uniform loading in fan-in and fan-out nesting. ntroduction The traditional recursive formula for carry propagation has dominated the carry handling process in the computer industry for more than two decades. Today, adder designs based on a similar technique include Amdahl V6, BM 68, and BM 333. The recursive formulation of carry is based on the bit pair (ai, bi) truth table. By examining the local bit pair, carry propagate pi and carry generate gi are formed. The high-order carries are generated by nesting the pi and g, together. By considering the adjacent bit pairs (a,, bi; ai+l, bi+l), a new recursive formula is obtained for new carry propagation. The comparison between this new scheme and the existing scheme will be discussed in the following sections. The detailed implementation, circuits, and logic level count are also included. Surprisingly, this method offers an astonishingly uniform loading in fan-in/ fan-out nesting. The formation of new carry and sum This paper introduces a new approach to represent the new carry formation and propagation based on the concept of the complementing signal which was introduced in %5 []. To examine the impact of this complementing signal in performing binary addition and complementing signal look-ahead, one should evaluate the formation of Hi and Hi+, as a function of neighboring bit pairs (i, i + ). Let us consider adding two binary numbers A and B together, where A = ao2" + al2"-' + a2y-' +.. + a ip +... + a,2' ; B = b2" + b,2"" + b,2"-' +... + bi2n-i + * *. + bn2'. The relation among the new carry (Hi, Hi+,) and the neighboring bit pairs (ai, b,; ai+l, bi+j can be expressed as in Table [l]; all of these are generated by ai, b, or transmitted through the low-order bits, i +, i + 2,..., with the transmitting-enable switch ON. This sig- nal or new carry can only be terminated when the inhibitor is ON (ai+l + bi+, = ). H, plays both regular carry and complementing signal roles in performing binary addition. By grouping all the Hi, we obtain Hi =f(l, 2, 3, 5, 6, 7, 9,,, 2, 3, 4, 5) = aibi + Hi+l(Gi+lbi+l + ai+,li+, + ai+,bi+,) = aibi + Hi+l(ai+l + bi+,) = ki + H,+,T,+,, () where ki is the new complementing signal, Hi+, is the previous complementary signal, and Ti+, is the previous carry enable switch or the previous stage propagate. Equation () shows that new carry Hi can be formed locally by ki or produced remotely; Hi+, can be produced with the remote stage carry inhibitor not ON (ai+l + bi+, 56 Copyright 98 by nternational Business Machines Corporation. Copying is permitted without payment of royalty provided that () each reproduction is done without alteration and (2) the Journal reference and BM copyright notice are included on the first page. The title and abstract may be used without further permission in computer-based and other information-service systems. Permission to republish other excerpts should be obtained from the Editor. HUEY LNG BM. RES DEVELOP. VOL. 25 NO. 3 MAY 98
= ). The formation of sum Si can be expressed by a similar process. The truth table for Si is shown in Table 2. By grouping all the Si, we obtain Si =f(l, 2, 3, 4, 5, 6, 8, 9,, 3, 4, 5) ; Table The relation of new carry Hi with Hi+, and its neighboring bit pairs (ai, bi; ai+,, bi+,). i Hi = in relation ai bi bi+ with Hi+,, 2, 3 + ai6$hi+,gi+,bi+, + ai+6i+, + ai+,bi+,) Hi = = Q*Hi+l(ai+l + bi+j = [Uibi + Hi+,(Ui+, + bi+)3ai6i = HiT ; 5, 6, 9, + (ai V bi)(ai+l V bi+jri+, 2 3 4 5 Hi+, = Hi+, = Hi+, = Hi = Hi+, = = (ai v bi)(ui+l v bi+l)ri+l ; 6 Hi+, = 4, 8 + (a, V bi)(ii+,bi+j ; 4, 5, 6, 8, 9, -* (ai V bi)[ri+,(ai+, V bi+j + ~i+6i+,l = (ai + bi)(ci + Qmi+](ai+, v bi+j + iii+6i+lq+l + ai+6i+,l = (ai + bi)(ci + bi)(ri+] + cii+6i+l) ; 8 9 2 Hi+, = Hi = Hi+, = Hi+, = Hi+, = Hi+, = X Hi = Qibi + H,+,(a,+, + b,+j ; q = (ai + 6J{Ri+, + ; 4, 5, 6, 8, 9, + (ai + bi)ri = TiRi ; 3 4 5 Hi+, = X Hi+, = X Hi+, = X 3, 4, 5 + aibi~i+,(ai+,bi+, + ai+,6,+, + ai+lbi+,) = a,bihi+&i+] + 4 +J = kihi+,ti+, ; Table 2 Sum Si formation. Si = f(, 2, 3, 4, 5, 6, 8, 9,, 3, 4, 5) = HiT + TiBi + kihi+,ti+, = (Hi V Ti) + kihi+,ti+,. (2) Hi+, = We have obtained a set of recursive formulae for both new cany Hi and sum Si. They are different from the conventional process. Before discovering the difference, let us examine the carry-look-ahead process. 2 Hi+, = 3 Hi+, = 4 5 Hi+, = New carry-look-ahead For ease of discussion. let us consider i = 3. We have 6 Hi+, = H3 = k.3 + H32T32 (34 By substituting i = 3, 29, and 28 in (3a), we obtain 8 9 Hi+, = Hi+, = = k2 + T2Sk2S + T2ST3k3 + T29T3T3k3 + T2ST3T3T32k32 (3b) By following a similar process, we obtain 2 3 Hi+, = H24 = k24 + T25k25 + T2ST26k26 + T25T26T27k27 + T25T26T27T2H28 4 Hi+, = 5 Hi+, = 57 BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98 HUEY LNG
= k2 + T2k2 + T2T22k22 + T2T22T23k23 + T2T22T23T24H24 ; H6 = k6 + T7k7 + T7T8k8 + T7T8T9k9 + T7T8T9T2H2' By substituting (3b) for ( 9, we obtain H]6 = H*6 + zt6h2, where H*6 = k6 + T7k7 + T7T8k8 + T7T8T9k9 ; = T7T8T9T2 * By substituting (3a) and (3b) for (5), we obtain H6 = H*6 + Z*6H; + Z*,,$*,,H*,, + ZT&oZ*,4H28 The asterisk of HT6 represents the fact that HT6 can be implemented with one level of logic. Based on current switching technology, both fan-in and fan-out are equal to four with eight-emitter dotting; H,, can be implemented with two levels of logic. Comparison with the existing scheme Based on the local bit pair (a,, b,), carry C, and sum Si can be written in the form ', = g, + C,+,Pi, g, = a$, ; ( ) Si = a, tl b, V Ci+l, pi = a, + b,. Fori = 6, we have '6 = g6 + c7p6 * By substituting i = 7, 8,..., 9, C6 can be rewritten as '6 = g6 + p6(g7 + p7g8 + p7p8g9 + p7pl@9c2) = g6 + p6g7 + p87g8 + p6p7plsg9 + p87p8p9c2 = G6P + p6pc2 ' ( ) where G, and are the grouping of the following terms: G6P = gl6 + pl6g7 + p6p7g8 + p6p7plsg9 ; (2) '6P = p6p7p8p9 * (3) Similarly, C6 can be written in terms of CZ8: '6 = + pl#3pg2p + P6PP2PG24P + p6pp2~p24~g28p ' Equations (6), (7), and (8) are similar to (ll), (2), and (3); however, H*6 can be implemented with one level of logic, whereas G,, cannot. By expanding (7) and (2) we 58 obtain Equation (4) contains eight terms, whereas (5) contains fifteen. With current available technology, the former can be implemented with one level of logic (this is shown in detail in the next section); the latter can only be implemented with two levels of logic. Let us further examine the ith-digit carry formation. For (l), the carry is generated by local complementing signal k,, and the remote carry Hi+, is controlled by remote bit pair (ai+l + 6i+); whereas for (lo), the carry is generated by local carry g,, and the remote carry C,+, is controlled by local bit pair (a, + b,). From the carry-lookahead point of view, () offers faster resolution, whereas the latter is one stage slower. That is why (4) contains only eight terms, and (5), fifteen. To illustrate the step-by-step operation, two examples are given. Example Assume the contents of A and B registers to be as shown and find their sum; A register OOO B register oooo The k, and Ti can be implemented with one level of logic: k, OOOOOOOOO Ti HUEY LNG BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98
The complementary signals can grouping ki and Ti together. This level of logic: Hi be implemented by process requires one The sum digit Si is implemented in parallel with Hi; the result of Hi will force Si to select one value between Hi = and Hi = : Si OOOO This example demonstrates that it is possible to implement a 32-bit adder with three levels of logic with the hardware constraints indicated in the previous section. The detailed implementation of Si is discussed in the next section. Example 2 Assuming that the contents of index, base, and displacement registers are as shown, compute vir- the tual address. (To test the generality of this scheme, odd contents are purposely chosen; in the normal mode of operation, an EXCPN will occur.) ndex register Base register ooooooooo OOOOOOOOOOOOlOOlllOlllOl Displacement register To implement the carry-save adder (CSA) requires one level of logic: si ci OOOOOOOOOOO OOOOOOOOOOOOOOOOllllOlllllO mplementation of ki and Ti requires one additional level of logic: ki OOOOOOOOOOOOOlOlOOOOOOO Ti mplementation of the complementary signal requires one level of logic to group ki and Ti together: Hi The address digit Si is implemented in parallel with Hi; the result of Hi will force S, to select one value between Hi = and : Si This example demonstrates that the AGEN adder can be implemented with four rather than six levels of logic, as is the case in current machine organization. The detailed implementation of Si (the address) is discussed in the next section. The logic implementation of every fourth bit (i = 3,27,23, 9, 6, 5,,7,3,) is shown in the Appendix of this paper. mplementation The detailed implementation can be divided into two categories: binary addition and subtraction, and address generation. e Addition and subtraction Equation (3b) is a general representation of the new carry-look-ahead process. For ADDTON, k3, = ; therefore, the fifth term in (3b) is dropped and Hz, can be written as HZO = k28 + TZ9kZ9 + T29T3k3 + TZ9T3T3k3 * (44 For SUBTRACTON, there is a HOT ONE carry input from bit 3; thus Hz, can be written as = kzo + T29kZ9 + TZ9T3k3 + T29T3T3k3 + T29T3T3 ' (4b) Equation (2) shows that Si is a function of Hi and Hi+,. For ease of implementation, this equation is rewritten in the form Si = (Hi V Ti) + k,h,+,t,+, = [(kt + q+,t,+,) v Til + k,h,+,ti+, = Hi+,(TiTi+, + k,t,+,) + Ri+,EiTi + kiq + ktiti+,. ( 6) Equation (6) demonstrates that Si can be written in the conditional form S,(Hi+, = ) = ki V Ti ; S,(Hi+, = ) = qt,+, + kiti+, + kif, + &TiTi+,. The general expression of SUM Si can be written as Si = Hi+,(ui+, + bi+,)(a, V b,) + Bi+,(ui V bi) + (ai+] + bi+& v bi). Fori = 3, we have '3 = H3Z(a32 + '32)('3 '3,) + n32(a3] b3]) + (a32 + b32)(~3 V b3). For i =, we obtain So = H,(a, + b,)(ao v bo) + qa v bo) + (a, + b,)(ao v bo) ; ( 7) H = k, + T2kz + T2T3k3 + T2T3T4H4 = H*, + ;H, ; ( 8) 59 BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98 HUEY LNG
H4 = k4 + T5k5 + T,T,k, + T5T,T7k7 + T5T,T7TnHn = H*, + *,H, ; ( 9) H, = H*, + *,H,, ; (2) H,, = H*,, + *,,H,,. (2) By substituting (2) for (2, we have H, = H*, + *,H*,, + *,*,,H,,. (22) Similarly, we obtain H4 and H,: H, = HZ +,*H,* + ;,*H;~ +,*;T~,,; (23) H, = H; + ;H,* + C CH; + T,*,*HTz * * * * + 4 n lzhm. (24) By substituting Eqs. (22), (23), and (24) for Eqs. (8), (9), (2), and (2), Eq. (7) can be written as So = (H*, + *,H*, + *,*,H*, + *,:*,H*,, **** +,4$ 2H(3)(a] + bl)(ao + (H*, + ;H: + *,*,H*, + *,*,*,H*,, + ;:,*;p,,)(a, v bo) + (a, + b,)(ao v bo) ; = (HT + Tg + T,* H,* + Tc,*fQ(a + b,) x ( bo) + z*,z~l*,z*,zh,(al + bl)(ao + (H*, + *,H*, + *,*,H*, + *,*,*,H*,,) X (*,*,*,*,, t B,,)(u, v bo) + (a + b,)(ao v bo). (25) By using the Sklansky conditional-sum method [2], (25). can be written as so = (H*, + *,H; + *,*,H*, + ***, 4~n~:2) x (a, + b,)(ao v bo) + (a, + b,)(ao v bo) + (H*, + *,H*, + *,*,H*, + *,*,*,H*,,) x (ao bo) = + *,~*,*,,(u, + b,)(a V bo) [H,, = + (H*, + *,H*, + *,*,H*, + ;*,*,H*,,) X (*,*,*,:,)(u, V bo) [H,, =. The hardware implementation of So is included in the Appendix. Equation (9) has indicated that H,, can be implemented with two levels of logic. Let us examine the individual terms of So. t is clearly pointed out that they also require only two levels of logic to be implemented. That is to say, when H,, is ready, So can be obtained by using one additional level of logic. We have proved, by using current switching logic, that one can implement a 32-bit adder by consuming only three levels of logic. Address generation n the address generation process, we are dealing with positive numbers only. Therefore, k,, =. The output of the (3, 2) carry-save adder provides the si and ci+, corresponding to the ai and b, bit pairs. n addition, Xi and B, both are 32 bits in length. However, Di has only a 2-bit width. For i = -8, the output of the carry-save adder has a special pattern; si and ci+, will not have the form --- ---. n general, the output of CSA will appear as --- ---. Therefore, for i = 9-3, St appears as usual: St = (Hi V Ti) + kihi+,ti,,. For i = -8, Si appears as si = (Hi v Ti). The detailed implementations for i from,3,7,, 5, 6, 9, 23, 27, and 3 are shown in the Appendix. Summary t is intended in this paper to speed up the carry propagation for examining two bit pairs. The formulation of H*,, contains eight terms as compared to that of the regular carry-look-ahead process, where G,, contains fifteen terms. t is possible to implement H*,, with one level of logic, whereas it is not possible with G6p. The formulation of sum Si in this new process will contain slightly more terms; however, they are not in the critical path. References. H. Ling, High Speed Binary Parallel Adder, EEE Trans. Electron. Computers EC-5, 799-82 (%). 2. J. Sklansky, Conditional-Sum Addition Logic, EEE Trans. Electron. Computers EC-9, 226-23 (96). 6 HUEY LNG BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98
Appendix Level 3 OX pr > 3- x 6 BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98 HUEY LNG
i=3 Level Level 2 Level 3 -ox i=7 Level Level 2 Level 3 62
i'll Level Level 2 R;2 - X.. q A A., c X Ȳ Ti2=l"-Fi 3 i= 6 Level Level 2 Level 3 - Hi L - 4 4 63 BM. RES. DEVELOP. VOL. 25 NO. 3 MAY 98 HUEY LNG
i= 5 Level Level 2 Level 3 Level Level 2 Level 3 S28- - T28 c2-. T28 64 HUEY LNG ibm. RES. DEVELOP. VOL. 25 NO. 3 MAY 98
Level Level 2 Level 3 3 65 BM J. RES. DEVELOP. VOL. 25 NO. 3 MAY 98 HUEY LNG
i=23 Level Level 2 Level 3 Level Level 2 Level 3 Received September,98; revised November 26,98 The author is located at the BM Thomas J. Watson Research Center, Yorktown Heights, New York 598. 66 HUEY LNG BM J. RES. DEVELOP. VOL. 2s NO. 3 MAY lwll