Design Methods for Binary to Decimal Converters Using Arithmetic Decompositions



Similar documents
RN-Codings: New Insights and Some Applications

COMBINATIONAL CIRCUITS

RN-coding of Numbers: New Insights and Some Applications

(b) (b) f 1 = x 1 x 2 ( x 3 v x 4 )

An Efficient RNS to Binary Converter Using the Moduli Set {2n + 1, 2n, 2n 1}

Factoring Algorithms

Binary Adders: Half Adders and Full Adders

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8

CHAPTER 3 Boolean Algebra and Digital Logic

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

The string of digits in the binary number system represents the quantity

Sistemas Digitais I LESI - 2º ano

Today. Binary addition Representing negative numbers. Andrew H. Fagg: Embedded Real- Time Systems: Binary Arithmetic

Chapter 1: Digital Systems and Binary Numbers

Chapter 6. Linear Programming: The Simplex Method. Introduction to the Big M Method. Section 4 Maximization and Minimization with Problem Constraints

C H A P T E R. Logic Circuits

CSI 333 Lecture 1 Number Systems

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

Digital System Design Prof. D Roychoudhry Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

α = u v. In other words, Orthogonal Projection

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

Notes on Determinant

Floating Point Fused Add-Subtract and Fused Dot-Product Units

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

Chapter 2 Logic Gates and Introduction to Computer Architecture

Understanding Logic Design

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs.

The BBP Algorithm for Pi

3.Basic Gate Combinations

Mathematics Course 111: Algebra I Part IV: Vector Spaces

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

2011, The McGraw-Hill Companies, Inc. Chapter 3

Continued Fractions and the Euclidean Algorithm

Counters and Decoders

6.2 Permutations continued

AN IMPROVED DESIGN OF REVERSIBLE BINARY TO BINARY CODED DECIMAL CONVERTER FOR BINARY CODED DECIMAL MULTIPLICATION

Zeros of Polynomial Functions

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

Combinational Logic Design

Multipliers. Introduction

BINARY CODED DECIMAL: B.C.D.

An Efficient Architecture for Image Compression and Lightweight Encryption using Parameterized DWT

Implementation and Design of AES S-Box on FPGA

RAM & ROM Based Digital Design. ECE 152A Winter 2012

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

Lecture 3: Finding integer solutions to systems of linear equations

Digital Design. Assoc. Prof. Dr. Berna Örs Yalçın

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

PROGRAMMABLE LOGIC CONTROLLERS Unit code: A/601/1625 QCF level: 4 Credit value: 15 TUTORIAL OUTCOME 2 Part 1

v w is orthogonal to both v and w. the three vectors v, w and v w form a right-handed set of vectors.

LECTURE 5: DUALITY AND SENSITIVITY ANALYSIS. 1. Dual linear program 2. Duality theory 3. Sensitivity analysis 4. Dual simplex method

A High-Performance 8-Tap FIR Filter Using Logarithmic Number System

Combinational-Circuit Building Blocks

Solution of Linear Systems

GREATEST COMMON DIVISOR

Lecture 1: Course overview, circuits, and formulas

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

Discrete Mathematics: Homework 7 solution. Due:

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

BOOLEAN ALGEBRA & LOGIC GATES

MATH10040 Chapter 2: Prime and relatively prime numbers

A New Reversible TSG Gate and Its Application For Designing Efficient Adder Circuits

Lecture 8: Binary Multiplication & Division

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

CS321. Introduction to Numerical Methods

Chapter 3. if 2 a i then location: = i. Page 40

High Speed and Efficient 4-Tap FIR Filter Design Using Modified ETA and Multipliers

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

DESIGN OF AN ERROR DETECTION AND DATA RECOVERY ARCHITECTURE FOR MOTION ESTIMATION TESTING APPLICATIONS

FPGA Implementation of an Extended Binary GCD Algorithm for Systolic Reduction of Rational Numbers

United States Naval Academy Electrical and Computer Engineering Department. EC262 Exam 1

Introduction to Matrix Algebra

Note on some explicit formulae for twin prime counting function

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Introduction to Programming (in C++) Loops. Jordi Cortadella, Ricard Gavaldà, Fernando Orejas Dept. of Computer Science, UPC

Example. Introduction to Programming (in C++) Loops. The while statement. Write the numbers 1 N. Assume the following specification:

8 Primes and Modular Arithmetic

Vector and Matrix Norms

International Journal of Electronics and Computer Science Engineering 1482

Grade 6 Math Circles. Binary and Beyond

Binary, Hexadecimal, Octal, and BCD Numbers

ECE 842 Report Implementation of Elliptic Curve Cryptography

CSE140: Components and Design Techniques for Digital Systems

Let s put together a Manual Processor

NUMBER SYSTEMS. 1.1 Introduction

We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: b + 90c = c + 10b

THE DIMENSION OF A VECTOR SPACE

A New Euclidean Division Algorithm for Residue Number Systems

Design and Analysis of Parallel AES Encryption and Decryption Algorithm for Multi Processor Arrays

INTRODUCTION TO DIGITAL SYSTEMS. IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE

Useful Number Systems

Section IV.1: Recursive Algorithms and Recursion Trees

Continued Fractions. Darren C. Collins

Combinational circuits

Gates, Circuits, and Boolean Algebra

11 Multivariate Polynomials

Transcription:

J. of Mult.-Valued Logic & Soft Computing, Vol., pp. 8 7 Old City Publishing, Inc. Reprints available directly from the publisher Published by license under the OCP Science imprint, Photocopying permitted by license only a member of the Old City Publishing Group Design Methods for Binary to Decimal Converters Using Arithmetic Decompositions Yukihiro Iguchi, Tsutomu Sasao and Munehiro Matsuura Department of Computer Science, Meiji University, Japan E-mail: iguchi@cs.meiji.ac.jp Department of Computer Science and Electronics, Kyushu Institute of Technology, Japan E-mail: sasao@cse.kyutech.ac.jp Received: April, 7. In digital signal processing, radixes other than two are often used for high-speed computation. In the computation for finance, decimal numbers are used instead of binary numbers. In such cases, radix converters are necessary. This paper considers design methods for binary to q-nary converters. It introduces a new design technique based on weighted-sum (WS) functions. The method computes a WS function for each digit by an LUT cascade and a binary adder, then adds adjacent digits with q-nary adders. A 6-bit binary to decimal converter is designed to show the method. Keywords: Radix converter, LUT cascade, arithmetic Decomposition, weightedsum function. INTRODUCTION Arithmetic operations of digital systems are usually done by binary numbers [9]. However, for high-speed digital signal processing, p-nary (p > ) numbers are often used [,6]. Computation for finance usually uses decimal numbers instead of binary numbers. In such cases, conversions between binary numbers and p-nary numbers are necessary. Such operation is radix conversion [,8]. Various methods exist to convert p-nary numbers into q-nary numbers, where p and q. Many of them require large amount of computations. Radix converters can be implemented by table lookup. That is, to store the conversion table in the memory. This method is fast, but requires a large amount of memory. When the number of inputs is large, the memory is too large to implement. Thus, more efficient methods have been developed. MVLSC d6 7/6/5 6: page #

Yukihiro Iguchi et al. In [7], ROMs and adders are used to implement binary to decimal converters. In [], LUT cascades [] are used to implement binary to ternary converters, ternary to binary converters, binary to decimal converters, and decimal to binary converters. In [], weighted-sum functions (WS functions) are introduced to design radix converters by LUT cascades. In [,], LUT cascade and arithmetic decomposition [] are used to implement p-nary to binary converters that require smaller memory. In this paper, we consider a design method of p-nary to q-nary (q > ) converters by using LUT cascades and arithmetic decompositions. To explain the concepts, we use examples of binary to decimal converters, but the method can be easily modified to any numbers of p and q. A 6-bit binary to decimal converter is designed to show the method. A preliminary version of this paper was presented at ISMVL-7 [5]. RADIX CONVERTER. Radix conversion Definition. Let x = (x n,x n,...,x ) p be a p-nary number of n-digit, and let y = (y m,y m,...,y ) q be a q-nary number of m-digit. Given the vector x, the radix conversion is the operation that obtains y satisfying the relation: n x i p i = y j q j, () i= m where x i P, y j Q, P ={,,...,p }, and Q ={,,...,q }. When p(q) is not, they are represented by binary coded p(q)-nary numbers. Definition. Let i be an integer. BIT(i, j) denotes the j-th bit of the binary representation of i, where the LSB is the -th bit. Example. The integer 6 is represented by the binary number (,, ) = (BIT(6, ), BIT(6, ), BIT(6, )). Thus, BIT(6, ) =, BIT(6, ) =, and BIT(6, ) =. Example. In the case of a binary to decimal converter, a decimal number is represented by the binary-coded-decimal (BCD) code. That is, is represented by (), is represented by (),...,8 is represented (), and 9 is represented by (). Note that (), (),...,() are unused codes. Table is the truth table of the -digit binary to decimal converter. j= MVLSC d6 7/6/5 6: page #

Design Methods for Binary to Decimal Converters Binary Decimal Binary Coded Decimal x x x x y y 5 6 7 8 9 5 TABLE Truth table of a binary to decimal converter. In Table, the inputs in the binary representation are denoted by x = (x,x,x,x ). The outputs in the decimal representation are denoted by y = (y,y ), and in the binary-coded-decimal representation are denoted by (BIT(y, ), BIT(y, ), BIT(y, ), BIT(y, ), BIT(y, ), BIT(y, ), BIT(y, ), BIT(y, )).. Conventional realization Random Logic Realization Radix converters can be implemented by using comparators, subtracters, and multiplexers. Example. Consider the 8-digit binary to decimal converter (8bindec) shown in Figure. It converts 8-digit binary numbers x into -digit BCD. The output range is to 55. First, the subtracter S calculates x. The comparator C compares the input x with. When x (x <), BIT(y, ) (), and the multiplexer M passes t = x (t = x) to the next stage. Next, the subtracter S calculates t. The comparator C compares the value t with. When t (t < ), BIT(y, ) (), and the multiplexer MVLSC d6 7/6/5 6: page #

Yukihiro Iguchi et al. x 8 t t t t t 5 M M M M 5 M 6 M S S S S S 5 S 6 - S Y - S Y - S Y - S Y - S Y - Y S > = C C C C C 5 C 6 y > = 8 > = BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) BIT(y,) 8 8 8 } y > = > = } FIGURE 8-digit binary to decimal converter: Random Logic Realization. > = } y M passes t = t (t = t ) to the next stage. In this way, the circuit converts the binary number into the decimal number. Single Memory Realization Figure shows a single memory realization that stores the truth table of the radix converter. It converts n-digit binary-coded p-nary numbers into m-digit binary-coded q-nary numbers. Let d in be the number of bits to represent an input digit. Then, d in = log p. Since the number of input digits is n, the total number of bits in the input is nd in.ap-nary number with n digits takes values from to p n. Let d out be the number of bits in an output digit. Then, d out = log q. The number of output digits is m = n log q p. Note that the most significant digit requires only d outm = log ([p n /q m ]) bits. Thus, the size of the memory is nd in((m ) d out d outm ) bits. This method is simple and fast, but, when the number of digits for the radix converter is large, the memory will be huge. n-digit p-nary number x n-,..., x n.din = n log p LUT m-digit binary coded q-nary number, where m = n.log q p. y m-,..., y log q n.log q p FIGURE n-digit p-nary to q-nary converter: Single memory realization. MVLSC d6 7/6/5 6: page #

Design Methods for Binary to Decimal Converters 5 Example. The 6-digit binary to decimal converter (6bindec) takes the range [, 6 ] = [, 6555]. The output is represented by m = log 6555 =5 digits. d out =, and d outm =. When the converter is realized by a single memory, its size is 6 ((5 ) ) =, 5, 8 bits. LUT Cascade Realization In a single memory realization of a p-nary to q-nary converter, the size of memory tends to be too large. To reduce the amount of hardware, LUT cascades realizations are used in [], where outputs are partitioned into groups. By using the functional decomposition theory [9], we can predict such a realization is feasible or not. To perform functional decompositions, a Binary Decision Diagram for Characteristic Function (BDD_for_CF) [] is used as the data structure. Figure shows the 6-digit binary to decimal converter [], where the LUT cascade with three cells realizes all the outputs. LUTs are implemented by memories. The total memory size is 7,78 bits. This method uses only memory, and the interconnections are limited only to adjacent memories. However, since the logic synthesis uses BDD_for_CFs, when the number of digits is large, the computation time tends to be excessive. On the other hand, the design method in [] finds LUT cascades without using BDDs_for_CF. It produces circuits with a smaller amount of memory than [] by using arithmetic decompositions. Unfortunately, this method can design only p-nary (p >) to binary converters, and is not applicable to binary to q-nary (q >) converters. Realization by Memories and q-nary Adders Design methods of binary to decimal converters using memories and adders are shown in [7]. Figure shows a 6-digit binary to decimal converter [7]. 5 5 Binary number 576-bit LUT 58-bit LUT 7 66-bit LUT K 8K K 8 K K K K BCD number Total Memory Size = 778 bits FIGURE 6-digit binary to decimal converter: LUT Cascade Realization. MVLSC d6 7/6/5 6: page 5 #5

6 Yukihiro Iguchi et al. The features of this circuit are as follows:. In the binary to decimal converter, the input is directly connected to the least significant bit of the output ().. Other 5 inputs are divided into two: The upper 9 bits and the lower 6 bits. The upper 9 bits are connected to the first LUT, and the lower 6 bits are connected to the second LUT. Furthermore, the most significant three bits are connected to the third LUT. For all possible input combinations, each LUT stores corresponding BCD numbers.. The most significant digit (K, K, and K) is obtained by a -input LUT (originally, which was implemented by gates) and a binary adder.. The total amount of memory is 8, 6 bits. 5. The second LUT stores BCD values in excess-6-code. 6. BCD additions are implemented by binary adders and special subtracters. When the carry is added to the next higher order digit, 6 is subtracted from the BCD digit. The original circuit in [7] used a -of-6 decoder which generates memory selection signals for 6 small-scale memories. Since larger memories are available today, we replaced them by a single LUT. With this method, a binary to decimal converter is implemented by using LUTs and q-nary (q = ) adders. This method reduces the total amount of memory by partitioning the inputs into two. Binary number 5 5 9 8 7 6 5 768-bit LUT 5-bit LUT -bit LUT K K K -6/ -6/ -6/ -6/ 8K K 8 8 8 K K BCD number Total Memory Size = 86 bits FIGURE 6-digit binary to decimal converter: Realization by memories and BCD adders MVLSC d6 7/6/5 6: page 6 #6

WS FUNCTION Design Methods for Binary to Decimal Converters 7 The weighted sum function (WS function) is a mathematical model of radix converters, bit-counting circuits, and convolution operations [,]. In this section, we show some properties of WS functions, and give a design method of radix converters by using them. Definition. An n-input WS function [] is defined as n WS( x) = w i x i, () i= where x = (x n,x n,...,x,x ) is the input vector, w = (w n, w n,...,w,w ) is the weight vector consisting of integers. In this paper, we represent p-nary to q-nary conversions with WS functions. From here, unless otherwise noted, w i and x i denote non-negative integers. Definition. Let MIN n (MAX n ) be the minimum (maximum) number represented by an n-variable WS function WS( x) = n i= w ix i, where w i, x i {,,...,p }, and p. When all the input x i are s, a WS function takes its minimum value, MIN n = WS(,,...,) =. When all the input x i are p, the WS function takes its maximum value, MAX n = WS(p,p,...,p ) = n i= {w i (p )} =(p ) n i= w i. Definition 5. Given i<j, [i, j] denotes the set of integers, {i, i,...,j}. Next, we will consider the range of WS functions. Definition 6. Range(f (x)) denotes the range of a function f(x). Example 5. For WS ( x) = x x, and x i {,, }, we have Range(x ) ={,, }, and Range(x ) ={,, 6}. Thus, Range(WS ( x)) = {,,,,, 5, 6, 7, 8} =[, 8]. On the other hand, for WS ( x) = x x, and x i {,, }, we have Range(x ) = {,, }, and Range(x ) = {,, 8}. Thus, Range(WS ( x)) ={,,,, 5, 6, 8, 9, } = [, ]. For WS ( x) = x x, and x i {,, }, we have Range(x ) ={,, }, and Range(x ) ={,, }. Thus, Range(WS ( x)) ={,,,,, 5, 6} = [, 6]. Example 5 shows that Range(WS( x)) =[, MAX n ] holds in some cases and does not in other cases. It depends on the combinations of values of w i. In the case of WS ( x) = x x, MIN =, and MAX = ( ) =. Because the number of values for x i is, WS ( x)takes at most = 9 different values. Therefore, Range(WS ( x)) = [, ]. MVLSC d6 7/6/5 6: page 7 #7

8 Yukihiro Iguchi et al. Lemma. The number of values produced by a WS function WS( x), isat most p n, where WS( x) = n i= w ix i, w i, x i {,,...,p }, and p. Proof. Since the cardinality of the domain is p n, the cardinality of the range is at most p n. The next theorem shows the necessary and sufficient condition for Range(WS( x)) =[, MAX n ]. Theorem. Consider a WS function WS( x) = n i= w ix i, w i, x i {,,..., p }, and p. The necessary and sufficient condition for Range(WS( x)) =[, MAX n ] is w = and w i MAX i ( i n). Proof. Assume that the elements of the weight vector are arranged in the ascending order. First, we will show the sufficiency. Consider the one-variable function, WS( x) = w x. When w =, MAX = p. Since WS( x) takes all the values in [,p ], the theorem holds. Next, assume that the theorem holds for the k-variable WS function, WS k ( x) = k i= w ix i. That is if w = and w i MAX i (i k) then Range(WS k ( x)) =[, MAX k ], and MAX k = (p ) k i= w i. Then, consider the (k )-variable WS function, WS k ( x) = k i= w i x i. Note that WS k ( x) = k i= w i x i = ( k i= w ix i ) (w k x k ). Since, Range(w k x k ) ={,w k, w k,...,(p )w k }, the range for WS k ( x) = ki= w i x i is [, MAX k ] [w k,w k MAX k ] [w k, w k MAX k ] [(p )w k,(p )w k MAX k ]. From the hypothesis of induction, we have w k MAX k. Hence, we have Range(WS k ( x)) =[,(p ) w k MAX k ]=[, MAX k ]. Thus, the theorem holds. Next, we will show the necessity. If w, MAX = (p )w. On the other hand, since x can take only p values, Range(WS( x)) = [, MAX n ]. MAX i. Therefore, Range(WS( x)) = [, MAX n ]. Hence, we have the theorem. Corollary. WS( x) = n i= pi x i takes all the values in [,p n ]. Proof. w = p =, MAX k = k i= (pi (p )) = (p ) k i= pi = (p ) pn p = pn. Since w k = p k = MAX k, Range(WS( x)) =[,p n ] by Theorem. REALIZATION USING LUT CASCADES AND ARITHMETIC DECOMPOSITIONS In this part, we consider design methods of radix converters by using LUT cascades and arithmetic decompositions. MVLSC d6 7/6/5 6: page 8 #8

Design Methods for Binary to Decimal Converters 9 x y 5 6 LUT LUT x 5 6 x 5 6 y 7 8 6 9 LUT 7 8 9 y 6 5 x LUT 5 LUT x y 6 5 5 LUT y LUT LUT (a) Conventional Method (b) Proposed Method FIGURE 5 Realization Using Memories and q-nary Adders. y y y y y Figure 5(a) shows the conventional circuit which is redrawn from Fig. with fewer blocks. This circuit is implemented by using only three LUTs and five decimal adders. In Fig. 5(a), outputs of the middle LUT are connected to q-nary adders. On the other hand, Fig. 5(b) shows the proposed method in this paper. In this method, each LUT stores BCD values, and feeds to a digit and a carry out. Consequently, outputs from each LUT blocks are connected only to the corresponding q-nary adder and the adjacent q-nary adder. However, this method requires m LUTs. Furthermore, the numbers of inputs of LUTs for lower digits are large. From here, we are going to reduce the total amount of memory for LUT cascades using the concept of WS functions.. LUT Cascade Realizations of WS functions Here, we consider the logic function realized by each LUT in Fig. 5(b). Table shows decimal numbers for i (i = 5,,...,). The rows show digits for,,, and. We can obtain functions of LUTs in Fig. 5(b) from Table. For example, the value of the digit for = can be computed with x, x 5, and the carry propagation signal from the lower digit. We obtain the following equations: z = x x 5, () z = x (x x 5 ) x 6x 8x, () z = (x 7 x ) x 8 x 5x 9 7x 5, (5) MVLSC d6 7/6/5 6: page 9 #9

Yukihiro Iguchi et al. x 5 9 8 7 6 5 6 8 7 5 6 8 9 9 5 6 8 6 8 6 8 6 8 TABLE Decimal Number Representations for Power of Two z = (x x 9 ) (x 7 x ) x 5 x 5x 8 6(x 6 x 5) 8x 9(x x ), (6) z = x (x x 5 x 9 x ) (x x 6 x x ) 6(x x 8 x ) 8(x x 7 x x 5 ), (7) y = z z z z z, (8) where z i (i =,,,, ) represents the logic function for LUT i. Since equations () (7) represent WS functions, we can use the properties in Section to construct an LUT cascade having small amount of memory. Also, we can reduce the number of levels for the LUT cascade. From here, we will consider LUT cascade realizations for WS functions that require miminum memory. Assume that the elements of the weight vector are arranged in ascending order. Figure 6 shows an LUT cascade realization of a WS function, where each LUT has only one external input x i. In this case, j-th LUT ( j n ) produces values in [,(p ) j i= w i], r j = log {(p ) j i= w i} -output lines are necessary and sufficient. In this case, r i r j, (i < j) holds. To find the minimum cascades, we have to consider the cases where adjacent LUTs are merged or not. The number of different combinations is n. In this case, input variable x i incidents to LUT i. Each LUT has d = log p two-valued external input lines. With these constraints, we have n combinations for all LUT cascades realizations. The LUT cascade shown in Fig. 7(a) realizes equation (7). Each LUT has only one external input variable x i. As shown in Table, weight coefficients are non-negative integers. Equation (7) has only non-zero weight coefficients. Note that equation (5) representing the digit, has zero weight coefficients w,w,w,w 6,...,w which are multiplied by x,x,x,x 6,...,x. In the circuit realization, terms with zero coefficients are omitted, and the input variables are reordered so that the weights are arranged in increasing order. MVLSC d6 7/6/5 6: page #

Design Methods for Binary to Decimal Converters x d d d d r x r M M M [, (p-)w x ] [, (p-)σw i x i ] [, (p-)σw i x i ] i= i= x d = log p j r j = log (p-)σw i x i= i... r r n-... x n - m=r n- Mn- n- [, (p-)σw i x i ] i= FIGURE 6 LUT cascade realization of a WS function. x (x x 5 x 9 x )(x x 6 x x )6(x x 8 x )8(x x 7 x x 5 ) x x x 5 x 9 x x x 6 x x x x 8 x x x 7 x x 5 6778 bits 5 5 5 5 6 6 6 6 7 7 (a) [,] [,] [,5] [,7] [,9] [,] [,7] [,] [,5] [,] [,7] [,] [,5] [,59] [,67] [,75] (b) x {(x x 5 x 9 x )(x x 6 x x )(x x 8 x )(x x 7 x x 5 )} (c) (x x 5 x 9 x )(x x 6 x x )(x x 8 x )(x x 7 x x 5 ) x x 5 x 9 x x x 6 x x x x 8 x x x 7 x x 5 88 bits (d) 5 5 5 5 6 6 [,] [,] [,] [,] [,6] [,8] [,] [,] [,5] [,8] [,] [,5] [,9] [,] [,7] x x 5 x 9 x x x 6 x x x x 8 x x x 7 x x 5 76 bits (e) 6 5 5 5 6 8 8 8 8 768 [,] [,6] x x 5 x 9 x x x 6 [,] x x [,] x [,5] x 8 x x [,] x 7 [,5] x x 5 [,9] [,7] 76 bits (f) 6 5 5 6 8 56 8 6 768 [,] [,6] [,] [,5] [,5] [,9] [,7] x x 6 x x x x 7 8 56 [,5] x x 5 bits & -bit Adder x x 5 x 9 [,7] 5 x x x 8 x 768 6 [,9] [,7] 6 56 [,] [,] FIGURE 7 LUT Cascade for z. To find the minimum cascades, we consider the cases where adjacent LUTs are merged and not. Let s be the number of LUTs in the LUT cascade. The number of different combinations to consider is s. In this case, the input variable x i incidents to LUT i, and each LUT has d = log p two-valued inputs. With these constraints, we have s different combinations for all LUT cascades realizations. Theorem. Consider the LUT cascade in Fig. 6, where the variable ordering and assignment to the cells are fixed. In this case, the LUT cascade with the minimum memory size can be found among s different combinations. MVLSC d6 7/6/5 6: page #

Yukihiro Iguchi et al. Proof. The number of combinations to merge LUTs or not is s. We can find the minimum by exhausting these combinations. Example 6. Figure 8(a) (d) show the all possible LUT cascade realizations for the LUT cascades with cells shown in (a). The next lemmas show methods to detect mergeable LUTs in an LUT cascade. Lemma. Consider the LUT cascade in Fig. 9, where LUT H has (u k) inputs and (u k) outputs. In this case, two LUTs can be merged into one, and the amount of memory can be reduced. Proof. It is clear from the Fig. 9. Also, the LUT for H can be eliminated, and the size of memory is reduced. By using Lemma, we can find mergeable LUTs. In Fig. 9, we can reduce the LUT H and one level. Lemma. Consider the LUT cascade shown in Fig., where LUT H has (uk)-input and (uk )-output, and LUT G has (uk)-input (uk )- output. In this case, without increasing the amount of memory, two LUTs can be merged into one. Proof. The left LUT cascade realizes the function, G(X i,h(r i,x i )). The right LUT can realize the same function, because the number of primary inputs (outputs) is same as one in the left LUT cascade. (a) (c) A B C A B, C (b) A, B C (d) A, B, C FIGURE 8 All Possible LUT Cascade Realizations for the LUT cascades with cells. Xi Xi Xi Xi k l k l R i- u H uk G r R i- u G' r FIGURE 9 Mergeable LUTs. MVLSC d6 7/6/5 6: page #

Design Methods for Binary to Decimal Converters Xi Xi Xi Xi k k R i- u H uk- uk- u uk- G R i- G' FIGURE Mergeable LUTs with Same Size. H, G, and G are realized by memories, where the size of memories are (uk) (uk ), (uk ) (uk ) = (uk) (uk ), and (uk) (u k ), respectively. Note that the total amount of memories of both circuits are the same. (uk) (uk ) (uk ) (uk ) = (ukl) (uk ). Lemma shows a method to reduce one levels without increasing total amount of memory. For other cases, the merge of adjacent LUTs will increase the total amount of memory. Algorithm. (Merge of LUTs). Obtain the LUT cascade for the WS function where each LUT i has only one external input x i, (i =,,...,s ).. Apply Lemma to the LUT cascade repeatedly.. Apply Lemma to the LUT cascade repeatedly. Lemma and show conditions for merging two LUTs into one without increasing the amount of memory. In other cases, merging two LUTs increases the amount of memory. Theorem. Algorithm produces the LUT cascade with the minimum memory size. Proof. Consider the LUT cascade for the WS function after appling Algorithm. In the LUT cascade, consider two adjacent LUTs shown in Fig.. Let M, M, and M be the sizes of LUTs. Then, we have M = (uk) v, M = (vl) w = v l w, and M = (ukl) w = (uk) l w. Lemma corressponds the case v = u k. Because the LUT cascade realizes a WS function shown in Fig. 6, v u k and v w. Consider the case v<u k. Note that k, l, v, and w are positive integer. We have M = (uk) v (ukl) w M = (vl) w (ukl) w = M, (9) = M. () MVLSC d6 7/6/5 6: page #

Yukihiro Iguchi et al. Xi Xi Xi Xi k l k l Ri- u v w u M M Ri- M w FIGURE Merging Adjacent LUTs. (a) (b) Thus, M M M. From equations (9) and (), we can see that the condition M M = M is holds only if M = M = M. Thus, v = uk. Since v w, the only case is l =, and v = w. These two conditions correspond the case of Lemma. Thus, we have the theorem. Offen, we need the LUT cascade with fewer levels by increasing the amount of memory. This can be done as follows: Let s be the number of LUTs in the LUT cascade obtained by Algorithm. The number of combinations to merge adjacent LUTs or not is s. Among these cascades, select the LUT cascade that satisfies required specification. Example 7. Figure 7 shows the design process of the LUT cascade for the equation (7) by using Algorithm. The integer in each LUT in (e) and (f) denotes the size of the memory. Figure 7(a) shows the LUT cascade for the equation (7), where each LUT has only one external input x i. The total memory size is 6,778, and the number of levels is 6. In the binary to decimal conversion, the input x directly represents the output BIT(y, ) [7]. Thus, we can remove the term x from the equation (7). Next, factor it by two, and we have the formula shown in (b). (d) shows the LUT cascade that corresponds to the equation (c). The total memory size is,88, and the number of levels is 5. (e) shows the LUT cascade that is obtained by applying Lemma to (d) repeatedly. The total memory size is, 76, and the number of levels is 9. (f) shows the LUT cascade that is obtained by applying Lemma to (e) repeatedly. The total memory size is,76, and the number of levels is 7. Note that the outputs of the final LUT are represented by the BCD code, since they are inputs of BCD adders. Example 8. By applying Algorithm to equations () (7), we have LUT cascades. Then, we add the outputs by BCD adders. Figure is the resulting MVLSC d6 7/6/5 6: page #

Design Methods for Binary to Decimal Converters 5 x x 5 x 9 x x x 6 x x x x 8 x x x 7 x x 5 x 5 5 6 6 8 56 8 6 768 [,7] [,] [,6] [,] [,5] [,5] [,9] [,7] x x 9 x 7 x x 5 x x 8 x 6 x 5 x x x 5 7 8 8 6 79 cout [,6] [,] [,] x 7 x x 8 x x 9 [,56] x 5 cin 5 8 6 cout [,7] x x x 5 x x [,9] x cin 6 8 9 cout [,5] [,] x x 5 cin Total Memory Size = 5 bits FIGURE Realization of 6-digit Binary to Decimal Converter using LUT cascades and BCD adders [,] y y y y y 6-digit binary to decimal converter, where the total memory size is 5, and uses four decimal adders. Note that the circuit in Fig. requires 8,6 bits, and uses five BCD adders.. Realization Using Arithmetic Decompositions of WS Functions To further reduce the size of memory, we can use arithmetic decompositions. Theorem. A WS function can be represented as a sum of two WS functions as follows: n WS( x) = w i x i = αws A ( x) WS B ( x), () i= where WS A ( x) = n i= a ix i,ws B ( x) = n i= b ix i, and α is an integer. This is the arithmetic decomposition, where α is the decomposition coefficient. α can be an arbitrary integer, where α< n i= w ix i. In this part, we consider only the case, where α =, X A X B = φ, X A X B = X, and X A, X B, X denote set of variables. From here, we will consider to reduce the total amount of memory of LUT cascades by arithmetic decompositions. When we realize a binary to decimal converter by LUT cascades and BCD adders, the last LUTs have to represent BCD codes. That is, the last LUT need to have the decoding function. Thus, we will decompose the LUT cascade except for the last LUT by a binary adder. MVLSC d6 7/6/5 6: page 5 #5

6 Yukihiro Iguchi et al. x x 6 x x x x 5 x 9 6 8 x 6 x 56 [,7] x x x 8 x [,] x 5 x 7 6 x x 9 x x 5 x 8 8 x x 7 x x x 5 [,5] 5 x x 5 768 56 [,9] [,7] x x x [,] 5 7 [,5] 79 [,] x 9 [,56] x 5 5 [,5] 6 x 7 x x 8 x [,9] [,7] x 8 6 9 [,] x x 5 x x [,5] [,7] [,] Total Memory Size = 788 bits FIGURE 6-digit binary to decimal converter using LUT cascades, binary adders, and BCD adders. 6 x cout cin cout cin cout cin y y y y y When the numbers of bits in two inputs of a binary adder are balanced, we can use a smaller adder. Thus, we try to partition the input variables so that the sum of weight coefficients will balance. Algorithm. (Partition of the inputs into two sets). X A φ; X B φ; IN {w,w,w,...,w n };. If IN = φ then goto step (7);. Let w i be the minimum element in IN;. IN IN {w i }; 5. If ( X A X B ), then X A X A {x i }, else X B X B {x i }; 6. Go to step (); 7. Sets X A and X B represent a partition of the inputs. MVLSC d6 7/6/5 6: page 6 #6

Design Methods for Binary to Decimal Converters 7 Example 9. Figure 7(g) is obtained by the arithmetic decomposition of (f). The total memory size is, bits, and it uses a -bit binary adder. Example. Figure shows the realization of 6bindec. It was obtained by applying Algorithm to Fig.. It uses,788 bits, three binary adders, and four decimal adders. Note that Algorithm produced a balanced decomposition. 5 CONCLUSION In this paper, we presented design methods of p-nary to q-nary converters. For readability, we showed the concept by using the examples for p = and q =. However, in Table, by replacing i by p i in the top row and j by q j in the leftmost column, the method can be easily modified to any radix converters. In this case, a p-nary to q-nary converters are implemented by using LUT cascades, binary adders, and q-nary adders. 6 ACKNOWLEDGMENTS This research is supported in part by the Grant in Aid for Scientific Research of MEXT, the Kitakyushu Area Innovative Cluster Project of MEXT, and Special Research of Meiji University. REFERENCES [] T. Hanyu and M. Kameyama. (995). A mhz pipelined multiplier using.5 v-supply multiplevalued mos current-mode circuits with dual-rail source-coupled logic. IEEE Journal of Solid-State Circuits,, :9 5. [] C. H. Huang. (98). A fully parallel mixed-radix conversion algorithm for residue number applications. IEEE Trans. Comput., :98. [] Y. Iguchi, T. Sasao, and M. Matsuura. (May 6). On design of radix converters using arithmetic decompositions. In 6th International Symposium on Multiple-Valued Logic, Singapore, pages 98. [] Y. Iguchi, T. Sasao, and M. Matsuura. (7). Design methods of radix converters using arithmetic decompositions. IEICE Trans. on Information and Systems,, E9-D (Accepted). [5] Y. Iguchi, T. Sasao, and M. Matsuura. (May 7). On design of radix converters using arithmetic decompositions, binary to decimal converters. In 7th International Symposium on Multiple-Valued Logic, Singapore. [6] I. Koren. (). Computer Arithmetic Algorithms. A. K. Peters, Natick, MA, second edition. [7] S. Muroga. (98). VLSI System Design, pages 9 6. John Wiley & Sons. [8] D. Olson and K. W. Current. (May ). Hardware implementation of supplementary symmetrical logic circuit structure concepts. In th International Symposium on Multiple- Valued Logic, Portland, Oregon, USA. MVLSC d6 7/6/5 6: page 7 #7

8 Yukihiro Iguchi et al. [9] T. Sasao. (999). Switching Theory for Logic Synthesis. Kluwer Academic Publishers. [] T. Sasao. (May 5). Radix converters: Complexity and implementation by lut cascades. In 5th International Symposium on Multiple-Valued Logic, Calgary, Canada, pages 56 6. [] T. Sasao. (6). Analysis and synthesis of weighted-sum functions. IEEE Trans. CAD, 5(5):789 796. [] T. Sasao, Y. Iguchi, and T. Suzuki. (Aug. 5). On lut cascade realizations of fir filters. In DSD5, 8th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, Porto, Portugal, pages 67 7. [] T. Sasao, M. Matsuura, and Y. Iguchi. (June ). A cascade realization of multipleoutput function for reconfigurable hardware. In International Workshop on Logic and Synthesis(IWLS), Lake Tahoe, CA, USA, pages 5. MVLSC d6 7/6/5 6: page 8 #8