Comparative Analysis of 4-Bit Multipliers Using Low Power 8-Transistor Full Adder Cells

S. Kiruthika 1, R. Nirmal Kumar 2, Dr. S. Valarmathy 3
1 PG Scholar, 2 Assistant Professor, 3 Professor, Department of ECE, Bannari Amman Institute of Technology, India

Abstract-- In recent years, power dissipation has become one of the biggest challenges in VLSI design. Multipliers are the main sources of power dissipation in DSP blocks. In this work, several full adder designs are implemented, each using a different low-power technique. Low-power multipliers built from the different full adder units are designed and their power is compared. The Vedic multiplier is designed using the different types of full adder and the power results are analyzed. The designs are implemented and power results are obtained using the TANNER EDA tool. Tanner SPICE results show that the transistor count and the power required are significantly reduced in the proposed design over the existing design.

Keywords-- Multipliers, Full adders, CMOS circuit, XOR-XNOR, Low power, Multiplexer, Delay.

I. INTRODUCTION
Multipliers and full adders form the basic building blocks of all digital VLSI circuits. Their design has undergone considerable improvement, motivated by three basic design goals: minimizing the transistor count, minimizing the power consumption, and increasing the speed. Great effort has been concentrated on low-power microelectronics due to the rapid development of laptops, portable systems and cellular networks. The adder is the core element of complex arithmetic circuits, as it is used in the arithmetic logic unit (ALU), in the floating-point unit, and for address generation in cache or memory access. The extensive use of addition in arithmetic functions attracts many researchers to this field.
The full adder design in static CMOS, with complementary pull-up PMOS and pull-down NMOS networks, is the most conventional one, but it requires as many as 28 transistors. The transmission function full adder (TFA) cell is based on transmission function theory and has 16 transistors. A later design reduces the count to only 14 transistors using a lower-power XOR design and transmission gates. Twenty full-adder cells have also been reported, most of which were novel circuits at that time; they were formed from combinations of various XOR/XNOR, Sum and Carry modules. All of the above circuits can operate with full output voltage swing. To pursue an even lower transistor count and lower power consumption, pass transistor logic (PTL) can be used in lieu of transmission gates. A full adder called the static energy-recovery full adder (SERF) uses only 10 transistors and has been reported to be the least power consuming. Note that in PTL, the output voltage swing may be degraded due to the threshold loss problem.

II. PREVIOUS WORK
The full adder function [2-4] can be described as follows: the addition of two 1-bit inputs A and B with the forestage carry Cin produces the two 1-bit outputs Sum and Carry.

A. XOR Full Adder
The performance of the XOR gates can significantly improve the performance of the adder. Early XOR gate designs used either eight or six transistors, and these are conventionally used in most designs [2]. The 10T full adder shown in figure (2) uses the 4T XOR shown in figure (1).

Figure 1: 4T XOR Circuit

A survey of the literature reveals a wide spectrum of XOR gates that have been realized over the years. This design shows a remarkable improvement in power-delay product. It also reduces the power consumption and the silicon area.

Figure 2: 10T XOR Full adder

B. XNOR Full Adder
The XNOR full adder is also called the SERF adder. The static energy-recovery full (SERF) adder [3] shown in figure (4) requires only 10 transistors to implement a full adder. An intermediately generated XNOR(A, B) signal is shared to generate the carry-out and the sum outputs. The SERF adder consists of two XNOR circuits and one MUX circuit; each XNOR is designed using four transistors, as shown in figure (3).

Figure 3: XNOR (4T) circuit
Figure 4: XNOR full adder (10T)

The circuit is reported to be extremely low power because it does not contain a direct path to ground, and the charge stored at the load capacitance is reapplied to the control gates (energy recovery). Eliminating the path to ground reduces the total power consumption by reducing the short-circuit power consumption [8].

C. CLRCL Full Adder
The complementary and level restoring carry logic (CLRCL) full adder is shown in figure (5). The goal is to reduce the circuit complexity and to achieve faster cascaded operation. The strategy is to avoid multiple threshold voltage losses in the carry chain by proper level restoring. DC and transient analysis shows that the CLRCL adder encounters only one threshold voltage loss and requires the minimum VDD. In addition, the performance edge of the CLRCL circuit in both speed and energy consumption becomes even more significant as the word length of the adder increases. The major limitation of the CLRCL design is a skew between the inputs to the various subsections of the full adder [4]. The CLRCL adder consists of three MUX circuits and two inverters.
Compared to the XNOR and XOR full adders, the CLRCL adder circuit consumes less power and has a lower delay.

Figure 5: CLRCL Full adder

D. Shannon Full Adder
The fourth type of full adder is a 12-transistor full adder using Shannon-based pass transistor logic. The Shannon full adder circuit shown in figure (6) combines the multiplexing operation for the sum operation and the Shannon theorem for the carry operation; the sum and carry circuits are designed based on the standard full adder equations. An input C and its complement are used as the control signals of the sum circuit. Two complementary inputs (C and B) are used in the full adder carry circuit to balance the circuit and to avoid floating wires. In this circuit, all of the pass inputs are connected to the VDD line so that the pass gates are always on. The control input terminals are connected to the function inputs. In the proposed adder 2, from Table I, instead of supplying all the inputs externally, the internal output of the SUM circuitry acts as an input to the carry logic.

Figure 6: Shannon full adder

III. THE PROPOSED FULL ADDER DESIGN
In this paper, several different designs are included for performance comparison with the proposed new 8T adder. Altogether, four full adders are analyzed with respect to the number of transistors used, their power dissipation and their delay, including the proposed new 8T adder. In order to compare the performance of the proposed new 8T adder with previously reported adders, extensive simulation studies have been carried out on the different types of adders.

A. New XOR full adder
The proposed full adder circuit uses two XOR gates and one multiplexer, and gives a lower power-delay product than the 10T full adder based on the 4T XOR. It requires only eight transistors, the lowest transistor count found so far in the literature.
The full adder function can be described as follows: the addition of two 1-bit inputs A and B with the forestage carry Cin produces the two 1-bit outputs Sum and Carry (Cout), where

Sum = A xor B xor Cin
Cout = (A and B) or (Cin and (A xor B))
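The two equations can be checked exhaustively against ordinary integer addition. The following is a minimal behavioral sketch in Python, not a transistor-level model; the function names `full_adder` and `truth_table` are illustrative and do not come from the paper.

```python
from itertools import product


def full_adder(a, b, cin):
    """Behavioral model of the 1-bit full adder equations."""
    s = a ^ b ^ cin                      # Sum = A xor B xor Cin
    cout = (a & b) | (cin & (a ^ b))     # Cout = (A and B) or (Cin and (A xor B))
    return s, cout


def truth_table():
    """Return rows (A, B, Cin, Sum, Cout) for all eight input combinations."""
    return [(a, b, cin, *full_adder(a, b, cin))
            for a, b, cin in product((0, 1), repeat=3)]


if __name__ == "__main__":
    print("A B Cin | Sum Cout")
    for a, b, cin, s, cout in truth_table():
        print(f"{a} {b}  {cin}  |  {s}   {cout}")
```

For every row, 2*Cout + Sum equals A + B + Cin, which is the defining property of a full adder.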

The new 8T full adder shown in figure (8) consists of two XOR and one inverter as shown in figure (7).

Figure 7: XOR (3T) circuit
Figure 8: XOR full adder (8T)

B. New XNOR Full Adder
The proposed full adder circuit uses two XNOR gates and one multiplexer, and gives a lower power-delay product than the 10T full adder based on the 4T XNOR. It requires only eight transistors, the lowest transistor count found so far in the literature. The new 8T full adder shown in figure (10) consists of two XNOR and one inverter as shown in figure (9). The design adopts inverter-buffered XOR/XNOR designs to reduce the threshold voltage loss problem. The full adder function can be described as follows: the addition of two 1-bit inputs A and B with the forestage carry Cin produces the two 1-bit outputs Sum and Cout, where

Sum = (A xnor B) xnor Cin
Cout = ((not (A xnor B)) and Cin) or ((A xnor B) and A)

Figure 9: 3T XNOR Circuit
Figure 10: XNOR full adder (8T)

C. CLRCL Full Adder
The full adder function can be described as follows: the addition of two 1-bit inputs A and B with the forestage carry Cin produces the two 1-bit outputs Sum and Cout, where

Sum = ((A xor Cin) and (not Cout)) or ((not (A xor Cin)) and B)
Cout = ((A xor Cin) and B) or ((not (A xor Cin)) and A)
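Both formulations can be read as 2:1 multiplexer selections controlled by an intermediate XNOR or XOR signal. The sketch below assumes that reading: for the XNOR adder, Cout is A when A xnor B is 1 and Cin otherwise; for the CLRCL adder, Cout is B when A xor Cin is 1 and A otherwise, and Sum is the complement of Cout or B under the same control. The placement of the complements is an assumption, and the helper names (`mux`, `xnor_adder`, `clrcl_adder`) are hypothetical. It is a logic-level check only and says nothing about threshold loss or power.

```python
from itertools import product


def mux(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def xnor(a, b):
    return 1 - (a ^ b)


def reference_adder(a, b, cin):
    """Golden model based on integer addition."""
    total = a + b + cin
    return total & 1, total >> 1


def xnor_adder(a, b, cin):
    """XNOR-based full adder: one shared XNOR(A, B) drives Sum and Cout."""
    x = xnor(a, b)
    s = xnor(x, cin)        # Sum = (A xnor B) xnor Cin
    cout = mux(x, cin, a)   # A == B -> Cout = A, otherwise Cout = Cin
    return s, cout


def clrcl_adder(a, b, cin):
    """Multiplexer-based (CLRCL-style) full adder, behavioral view."""
    ctrl = a ^ cin               # control signal A xor Cin
    cout = mux(ctrl, a, b)       # A != Cin -> Cout = B, otherwise Cout = A
    s = mux(ctrl, b, 1 - cout)   # A != Cin -> Sum = not Cout, otherwise Sum = B
    return s, cout


def all_match(adder):
    """True if the adder agrees with the golden model on all 8 inputs."""
    return all(adder(a, b, c) == reference_adder(a, b, c)
               for a, b, c in product((0, 1), repeat=3))


if __name__ == "__main__":
    print("XNOR adder matches reference:", all_match(xnor_adder))
    print("CLRCL adder matches reference:", all_match(clrcl_adder))
```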

From the above equations, the new 8T full adder shown in figure (11) is designed using 2:1 multiplexers. The proposed full adder circuit, which uses three multiplexers and an inverter, requires eight transistors. The design process can be divided into several steps as follows:
1. A xor Cin is needed as a control signal in multiplexers MUX2 and MUX3 to generate Cout and Sum. In this study, A xor Cin is implemented by MUX1, shown in figure (11).
2. The multiplexer circuit MUX2 is adopted in the proposed design to generate Cout, followed by an inverter INV. The inverter has three advantages for the circuit: firstly, it speeds up the carry propagation by acting as a buffer along the carry chain; secondly, it provides the complementary signals needed for the generation of Sum; thirdly, it improves the output voltage swing by acting as a level restoring circuit.
3. The Sum is generated by the multiplexer MUX3, which passes either B or the complement of Cout according to the value of A xor Cin.

Figure 11: CLRCL Full adder

D. Shannon Full Adder
According to the standard full adder equation, the sum circuit needs three inputs. In order to avoid increasing the number of transistors due to the addition of a third input, the following arrangement is made: the CPL XOR gate is multiplied with the complement of input C and the XNOR gate is multiplied with input C, thereby reducing the number of transistors in the sum circuit. The sum and carry are given by

Sum = ((A xor B) and (not C)) or ((A xnor B) and C)
Carry = (A+B) C + (A.B) + (B C ) + (A.B )

Figure 12: Shannon full adder

The proposed Shannon full adder circuit shown in figure (12) combines the multiplexing operation for the sum operation and the Shannon theorem for the carry operation; the sum and carry circuits are designed based on the standard full adder equations. An input C and its complement are used as the control signals of the sum circuit. The two-input XOR gate is developed using the multiplexer method.
The output node of the two-input multiplexer circuit is the differential node.

IV. MULTIPLIER DESIGN
High-speed multiplication is another critical function in a range of very large scale integration (VLSI) applications. Multiplications are expensive and slow operations. Multiplication is an important basic arithmetic operation; it is less common than addition, but it is still essential for microprocessors, digital signal processors and graphics engines. Multiplication is logically carried out by a sequence of addition, subtraction and shift operations. Therefore, high-speed multiplication can be achieved by using high-speed adders. In this paper, three different multipliers are considered for analysis of the adders. Multipliers are structures with many cascaded full adder stages, so the performance of the full adders under heavy cascading can be studied by analyzing the power, delay and power-delay product of the multipliers built from the different adders.

A. Braun Multiplier
Braun's multiplier is an n x m bit parallel multiplier, generally known as the carry save multiplier, and is constructed with m x (n-1) adders and m x n AND gates.
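The adder and AND-gate counts can be illustrated with a bit-level model of a square (n x n) carry-save array. The sketch below is an assumed textbook arrangement rather than the schematic used in this paper: n-1 rows of n-1 carry-save adders followed by a final ripple-carry row, with the first row's carry inputs tied to 0. The names `braun_multiply`, `adders` and `and_gates` are illustrative.

```python
def full_adder(a, b, cin):
    """1-bit full adder used for every cell of the array."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def braun_multiply(x, y, n=4):
    """Unsigned n x n carry-save array multiplier (behavioral sketch).

    Returns (product, number of adders, number of AND gates).
    """
    xb = [(x >> i) & 1 for i in range(n)]
    yb = [(y >> j) & 1 for j in range(n)]
    # pp[j][i] = x_i AND y_j, with weight 2**(i + j)
    pp = [[xb[i] & yb[j] for i in range(n)] for j in range(n)]
    and_gates = n * n
    adders = 0

    bits = [0] * (2 * n)
    bits[0] = pp[0][0]
    sums = pp[0][:]            # row 0 acts as the initial sum vector
    carries = [0] * (n - 1)    # no carries enter the first adder row

    # Carry-save rows: carries move diagonally down, never sideways.
    for j in range(1, n):
        new_sums, new_carries = [], []
        for i in range(n - 1):
            s, c = full_adder(pp[j][i], sums[i + 1], carries[i])
            new_sums.append(s)
            new_carries.append(c)
            adders += 1
        new_sums.append(pp[j][n - 1])   # top bit of the row passes through
        sums, carries = new_sums, new_carries
        bits[j] = sums[0]

    # Final ripple-carry row merges the remaining sums and carries.
    ripple = 0
    for i in range(n - 1):
        s, ripple = full_adder(sums[i + 1], carries[i], ripple)
        bits[n + i] = s
        adders += 1
    bits[2 * n - 1] = ripple

    product = sum(bit << k for k, bit in enumerate(bits))
    return product, adders, and_gates


if __name__ == "__main__":
    print(braun_multiply(13, 11))
```

For n = 4 the model uses n x (n-1) = 12 adders and n x n = 16 AND gates, which agrees with the counts stated above for the square case.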

Braun's multiplier has a glitching problem, which is due to the ripple carry adder in the last stage of the multiplier. Such multipliers can be called non-additive multipliers.

Figure 13: Architecture of Braun Multiplier
Figure 14: Braun Multiplier circuit (4x4)

The schematic diagram of the Braun multiplier is shown in figure (13). Each of the ai x bj product bits is generated in parallel with AND gates. Each partial product is added to the previous sum of partial products. As the carry bits are passed diagonally downward to the next adder stage, there is no horizontal carry propagation for the first four rows. Instead, the respective carry bit is saved for the subsequent adder stage. Ripple carry adders are used at the final stage of the array to output the final result. It is a simple parallel multiplier, generally called the carry save array multiplier. It is restricted to unsigned operands.

B. Vedic Multiplier
The Vedic multiplier is based on the Urdhva Tiryagbhyam sutra (algorithm). These sutras have traditionally been used for the multiplication of two numbers in the decimal number system. In this work, the same ideas are applied to the binary number system to make the proposed algorithm compatible with digital hardware. It is a general multiplication formula applicable to all cases of multiplication. It literally means "vertically and crosswise". It is based on a novel concept in which the generation of all partial products is done concurrently with the addition of these partial products. The algorithm can be generalized to n x n bit numbers. Since the partial products and their sums are calculated in parallel, the multiplier is independent of the clock frequency of the processor. Due to its regular structure, it can be laid out easily in microprocessors, and designers can easily circumvent these problems to avoid catastrophic device failures.
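The vertically-and-crosswise idea can be sketched in software as a column-by-column accumulation of cross products. The following is a minimal word-level illustration for 4 x 4 bits, assuming each column's cross product is summed together with the carry from the previous column; in hardware the cross products are formed in parallel, which this sequential loop does not model. The function name `vedic_multiply_4x4` is hypothetical.

```python
def vedic_multiply_4x4(x, y):
    """Urdhva Tiryagbhyam style multiplication of two 4-bit unsigned numbers."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]

    product = 0
    carry = 0
    for k in range(7):                       # columns of weight 2**0 .. 2**6
        # Cross product: every pair (i, j) with i + j == k
        cross = sum(xb[i] & yb[k - i] for i in range(4) if 0 <= k - i < 4)
        total = cross + carry
        product |= (total & 1) << k          # low bit becomes the result bit
        carry = total >> 1                   # the rest moves to the next column
    product |= carry << 7                    # final carry is the top result bit
    return product


if __name__ == "__main__":
    print(vedic_multiply_4x4(13, 11))
```

Column 0 uses only X0*Y0, column 3 uses all four crosswise pairs, and column 6 uses only X3*Y3, which mirrors the symmetric pattern of the sutra.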

Algorithm for 4 x 4 bit Vedic multiplier using Urdhva Tiryagbhyam (vertically and crosswise) for two binary numbers

CP = Cross Product (vertically and crosswise)

            X3 X2 X1 X0    Multiplicand
            Y3 Y2 Y1 Y0    Multiplier
---------------------------------------
 H  G  F  E  D  C  B  A
---------------------------------------
S7 S6 S5 S4 S3 S2 S1 S0    Product
---------------------------------------

PARALLEL COMPUTATION METHODOLOGY
1. CP of (X0) and (Y0): X0*Y0 = A
2. CP of (X1 X0) and (Y1 Y0): X1*Y0 + X0*Y1 = B
3. CP of (X2 X1 X0) and (Y2 Y1 Y0): X2*Y0 + X0*Y2 + X1*Y1 = C
4. CP of (X3 X2 X1 X0) and (Y3 Y2 Y1 Y0): X3*Y0 + X0*Y3 + X2*Y1 + X1*Y2 = D
5. CP of (X3 X2 X1) and (Y3 Y2 Y1): X3*Y1 + X1*Y3 + X2*Y2 = E
6. CP of (X3 X2) and (Y3 Y2): X3*Y2 + X2*Y3 = F
7. CP of (X3) and (Y3): X3*Y3 = G

Figure 15: Block Diagram of Vedic Multiplier
Figure 16: Vedic Multiplier circuit (4x4)

C. Wallace tree multiplier
The Wallace tree multiplier shown in figure (17) is considerably faster than a simple array multiplier because its height is logarithmic in the word size, not linear. However, in addition to the large number of adders required, the Wallace tree's wiring is much less regular and more complicated. As a result, Wallace trees are often avoided by designers when design complexity is a concern. This class of multipliers is based on a reduction tree in which different schemes for compressing the partial product bits can be implemented. In a tree multiplier, the partial-sum adders are arranged in a tree-like fashion, reducing both the critical path and the number of adders needed, as shown in figure (17). A collection of AND2 gates generates the partial products, or multiples, simultaneously. The multiples are added in a combinational partial product reduction tree using carry save adders, which reduces them to two operands for the final addition.

Figure 17: Architecture of Wallace Tree Multiplier (4x4)
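The reduction of the multiples to two operands can be illustrated at word level. The sketch below assumes 3:2 carry-save compression applied to groups of three operands until only two remain, followed by a single ordinary addition standing in for the final adder. It models the arithmetic only, not the gate-level wiring of a specific Wallace tree, and the names `carry_save_add` and `wallace_multiply_4x4` are illustrative.

```python
def carry_save_add(a, b, c):
    """3:2 compressor on whole words: three operands in, two operands out."""
    sum_word = a ^ b ^ c                            # bitwise sum without carries
    carry_word = ((a & b) | (b & c) | (a & c)) << 1  # carries, shifted one place
    return sum_word, carry_word


def wallace_multiply_4x4(x, y):
    """Multiply two 4-bit unsigned numbers with carry-save reduction."""
    # Multiples generated by AND-ing the multiplicand with each multiplier bit
    operands = [(x << j) if (y >> j) & 1 else 0 for j in range(4)]

    # Reduce groups of three until only two operands remain
    while len(operands) > 2:
        reduced = []
        for i in range(0, len(operands) - 2, 3):
            reduced.extend(carry_save_add(*operands[i:i + 3]))
        leftover = len(operands) % 3
        reduced.extend(operands[len(operands) - leftover:])
        operands = reduced

    # Final carry-propagate addition converts the redundant pair to binary
    return operands[0] + operands[1]


if __name__ == "__main__":
    print(wallace_multiply_4x4(13, 11))
```

For four multiples, two compression levels are needed: four operands become three, and three become two.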

The results from the CSA are in redundant form. Finally, the redundant result is converted into standard binary output at the bottom by the use of a CPA [19], as shown in Fig. 18. The schematic diagram of the unsigned tree multiplier is shown in the figure. In this figure, (Y3, Y2, Y1, Y0) is the multiplicand and (X3, X2, X1, X0) is the multiplier. Voltage sources are applied in place of the input bit patterns. P7P6P5P4P3P2P1P0 is the output of the multiplier, where P0 is the LSB and P7 is the MSB.

Figure 18: Wallace Tree Multiplier circuit (4x4)

V. SIMULATION RESULTS
The result of the comparative study shows that the performance of the proposed 8T full adder cell is the best among all. The 8T full adder cell also occupies the minimum silicon area on chip amongst all the full adders reported so far in the literature. The net effect is that the proposed 8T full adder cell shows much better performance than the other adders available in the literature. The simulation results reveal that the proposed 8T full adder is the best choice when area on chip, threshold loss, delay and power consumption are the main design goals, as shown in Tables 1 and 2.

Table 1 Power Comparison of Full Adders

Adder types   Power (uW)
              Existing method   Proposed method
XOR           2.09              1.80
XNOR          2.62              1.99
CLRCL         2.02              1.23
SHANNON       1.92              1.04

Table 2 Transistor count Comparison of Full Adder cells

Adder types   Transistor count
              Existing method   Proposed method
XOR           10                8
XNOR          10                8
CLRCL         10                8
SHANNON       12                8

Figure 19: Output wave form of full adder

Table 3 Power comparison of multipliers

Multiplier (4x4)          Adder type        Power (mW)
Braun multiplier          Existing method   8.57
                          Proposed method   6.89
Wallace Tree Multiplier   Existing method   7.75
                          Proposed method   5.89
Vedic multiplier          Existing method   7.12
                          Proposed method   4.86

Figure 20: Output wave form of Braun multiplier
Figure 21: Output wave form of Wallace Tree multiplier
Figure 22: Output wave form of Vedic multiplier
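The relative saving implied by each pair of tabulated power values can be computed as (existing - proposed) / existing. The short sketch below applies that formula to the full adder and multiplier power figures reported in Tables 1 and 3; the dictionary and function names are illustrative, and the numbers are copied from the tables rather than re-simulated.

```python
# Power values copied from Table 1 (full adders, microwatts):
# name -> (existing method, proposed method)
ADDER_POWER_UW = {
    "XOR": (2.09, 1.80),
    "XNOR": (2.62, 1.99),
    "CLRCL": (2.02, 1.23),
    "SHANNON": (1.92, 1.04),
}

# Power values copied from Table 3 (4x4 multipliers, milliwatts)
MULTIPLIER_POWER_MW = {
    "Braun": (8.57, 6.89),
    "Wallace Tree": (7.75, 5.89),
    "Vedic": (7.12, 4.86),
}


def percent_saving(existing, proposed):
    """Power saving of the proposed design relative to the existing one, in %."""
    return 100.0 * (existing - proposed) / existing


def savings_table(power_table):
    """Map each design name to its percentage power saving."""
    return {name: percent_saving(existing, proposed)
            for name, (existing, proposed) in power_table.items()}


if __name__ == "__main__":
    for title, table in (("Full adders", ADDER_POWER_UW),
                         ("Multipliers", MULTIPLIER_POWER_MW)):
        print(title)
        for name, saving in savings_table(table).items():
            print(f"  {name:<13} {saving:5.1f} %")
```

Among the multipliers, the Vedic design shows the largest relative reduction and the Braun design the smallest.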

VI. CONCLUSION
The proposed 8T full adders achieve 33% power savings compared with the existing 10T full adders. The multipliers implemented using the proposed 8T full adders have the lowest power-delay product and the lowest delay in all cases, and they also reduce the transistor count of the multiplier. As future work, the low-power multiplier can be implemented in DSP blocks and the power consumption can be calculated. As an example application of DSP blocks in the communication field, filters will be designed and analyzed.

REFERENCES
[1] Jin-Fa Lin and Yin-Tsung Hwang, "A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design," vol. 54, no. 5, May 2007.
[2] E. Abu Shama and M. Bayoumi, "A new cell for low power adders," in Proc. Int. Midwest Symp. Circuits Syst., 1995, pp. 1014-1017. DSP Journal, Volume 9, Issue 1, June 2009.
[3] T. Kowsalya, "Tree Structured Arithmetic circuit by using different CMOS logic styles," ICGST-PDCS, Volume 8, Issue 1, December 2008.
[4] Deepak, G. Meher, P. K. Sluzek, "Performance Characteristics of Parallel and Pipelined Implementation of FIR Filters in FPGA Platform," in Signals, Circuits and Systems 2007 (ISSCS 2007), International Symposium on, 13-14 July 2007.
[5] N. Zhuang and H. Wu, "A new design of the CMOS full adder," IEEE J. Solid-State Circuits, vol. 27, no. 5, pp. 840-844, May 1992.
[6] J. Wang, S. Fang, and W. Feng, "New efficient designs for XOR and XNOR functions on the transistor level," IEEE J. Solid-State Circuits, vol. 29, no. 7, pp. 780-786, Jul. 1994.
[7] A. M. Shams and M. Bayoumi, "A novel high performance CMOS 1-bit full adder cell," IEEE Trans. Circuits Syst. II, Analog Digital Signal Process., vol. 47, no. 5, pp. 478-481, May 2000.
[8] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electronic Computers, vol. EC-13, pp. 14-17, February 1964.
[9] Reto Zimmermann and Wolfgang Fichtner, "Low-Power Logic Styles: CMOS versus Pass-Transistor Logic," IEEE Journal of Solid-State Circuits, Vol. 32, No. 7, April 1997, pp. 1079-1090.
[10] Zhijun Huang, "High level optimization techniques for low power multiplier design," 2003.

Authors Biography
Ms. S. Kiruthika is a PG scholar pursuing her M.E. in VLSI Design at Bannari Amman Institute of Technology, Sathyamangalam, Anna University, Chennai. She received her B.E. (Electrical and Electronics Engineering) degree from M. Kumarasamy College of Engineering, Anna University, Coimbatore, in April 2011. She has attended 3 national and international conferences.

Mr. R. Nirmal Kumar completed his B.E. in Electrical and Electronics Engineering at A.R.J College of Engineering and Technology in 2008 and his M.E. in VLSI Design at Bannari Amman Institute of Technology, Sathyamangalam, in 2010. He has two years of teaching experience at Bannari Amman Institute of Technology, Sathyamangalam, and currently holds the post of Assistant Professor in the Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India. He has published more than 3 research papers in international journals. He has attended 12 international conferences and 10 national conferences.

Dr. S. Valarmathy received her B.E. (Electronics and Communication Engineering) degree and M.E. (Applied Electronics) degree from Bharathiar University, Coimbatore, in April 1989 and January 2000 respectively. She received her Ph.D. degree from Anna University, Chennai, in the area of Biometrics in 2009. She is presently working as Professor and Head of the Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathyamangalam. She has a total of 20 years of teaching experience in various engineering colleges. Her research interests include Biometrics, Image Processing, Soft Computing, Pattern Recognition and Neural Networks.
She is a life member of the Indian Society for Technical Education and a member of the Institution of Engineers. She has published 14 papers in international and national journals and 48 papers in international and national conferences.