Modified Booth Algorithm Carry Save Adder for High-Speed Multiplier


Mahyar Shahsavari
July 2012

Abstract

Designing an optimized processor has been a main concern of computer and hardware designers in recent decades. Many approaches have been tested and implemented. Different methods of addition lead to different methods of multiplication and division. High-speed multipliers are critical to processor performance. In fact, 8.72% of all instructions in a typical scientific program are multiplications (Kyoung, H.L., 2003). In this report, we present a parallel multiplier that applies the modified Booth algorithm together with a Carry Save Adder (CSA). We enhanced the LEON3 soft-core processor from Gaisler. The implementation is based on a Xilinx Virtex-4 board.

1 Introduction

For multiplication, the conventional iterative add-shift methods are inexpensive to implement in hardware, but the resulting execution speeds are too low to satisfy the increasing demand for high-speed computing. Since CPU speeds have increased tremendously in recent years, parallel multipliers can be implemented to meet high-speed requirements. Among the different multiplication techniques, the modified Booth algorithm is very prominent. This technique, combined with a Carry Save Adder (CSA), increases the performance of parallel multipliers. In this paper, we work on enhancing the performance of the 32-bit multiplier of the open-source LEON3 processor. The idea was to integrate computer architecture and computer arithmetic concepts and use them to design the required functional units and optimize overall processor performance. Furthermore, these functional units and the overall processor had to be implemented on a Virtex-4 ML410 FPGA board in order to run the benchmark and measure performance.

The most important principle of computer design is to focus on the common case: when making a design trade-off, favor the frequent case over the infrequent case. For instance, the instruction fetch and decode unit of a processor may be used much more frequently than a multiplier or a divider. Similarly, multiplication is performed more often than division, so more performance can be gained by improving multipliers than by improving dividers.
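
The "favor the frequent case" argument can be made quantitative with Amdahl's law. The sketch below is only an illustration: the function name and the two time fractions are assumptions chosen for the example, not measurements from this report.

```python
def overall_speedup(fraction: float, unit_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only `fraction` of the
    original run time is accelerated by a factor of `unit_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / unit_speedup)


# Hypothetical shares of run time, chosen only to illustrate the trade-off.
for unit, fraction in (("multiplier", 0.20), ("divider", 0.02)):
    print(f"2x faster {unit}: {overall_speedup(fraction, 2.0):.3f}x overall")
```

With the same local speedup, the unit that accounts for the larger share of run time yields the larger overall gain, which is the reasoning behind optimizing the multiplier first.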

2 Present schemes used

There are three methods of multiplication:

- Binary multiplication
- Array multiplier
- Multiplier and Accumulator unit (tree multiplier)

Binary multiplication is a software method, used when the processor has no hardware multiplier. It works, but it is slow. The entire process consists of three steps: partial product generation, partial product reduction and final addition. The next choice, the array multiplier, is a vast improvement in speed over traditional bit-serial multipliers. The array multiplier is very regular in structure and uses only short wires to connect each full adder to the next, so it has a simple and efficient VLSI layout. This method is still not fast enough, and area and power are its most obvious shortcomings. The tree multiplier, in other words the Multiplier and Accumulator (MAC), can be fully parallelized. Efficiency can be improved dramatically by using a high-performance CSA and a higher-radix multiplier. Higher-radix schemes handle more than one bit of the multiplier in each cycle. A higher representation radix leads to fewer digits, so the multiplication algorithm generates fewer partial products and requires fewer cycles.

3 Overview of our design

The multiplication algorithm has four steps, and improving any one of them benefits the whole process. These steps are partial product generation, partial product addition, final addition and accumulation. Techniques [3] such as Baugh-Wooley (BW), the Booth Algorithm (BA) and the Modified Booth Algorithm (MBA) make partial product generation faster and more efficient. For an n-bit multiplier, the number of summands is n, n/2 and n/2 for BW, BA and MBA, respectively. In addition to the encoding step, the BA and MBA algorithms also require generation of the two's complement of the multiplier, which introduces extra delay.
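
To make the radix argument from Section 2 concrete, the following minimal sketch counts how many multiplier digits, and hence partial products, a 32-bit multiplier needs at different radices. The function name is hypothetical and the count ignores any extra digit a particular signed encoding might add.

```python
import math


def partial_product_count(n_bits: int, bits_per_digit: int) -> int:
    """Digits needed to cover an n_bits multiplier when each digit spans
    bits_per_digit bits (radix 2**bits_per_digit); one partial product per digit."""
    return math.ceil(n_bits / bits_per_digit)


for bits_per_digit in (1, 2, 3):
    radix = 2 ** bits_per_digit
    count = partial_product_count(32, bits_per_digit)
    print(f"radix-{radix}: {count} partial products")
```

Going from radix 2 to radix 4 halves the number of partial products, which is the n versus n/2 difference quoted above.
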
The delay for two's-complement generation is not trivial, but it has been consistently neglected in most of the designs proposed in the literature. Partial product addition is improved by choosing a suitable adder. In our design we used a Carry-Save Adder, as illustrated in Figure 1. A 2n-bit accumulator is required to store the final multiplication result.

Figure 1: A partial schematic of the 32-bit CSA adder implementation

The Modified Booth Algorithm [2] is the method we chose for producing the partial products. In the conventional MBA, three-bit strings of the multiplier are scanned and the appropriate operations are carried out on the multiplicand. We express the n-bit numbers A and B by the sequences $a_{n-1} a_{n-2} \ldots a_0$ and $b_{n-1} b_{n-2} \ldots b_0$, respectively. The product of the two numbers can be written as

$$P = A \times B = \sum_{i=0}^{n-1} a_i 2^i \cdot \sum_{j=0}^{n-1} b_j 2^j = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a_i b_j 2^{i+j} \qquad (1)$$

In a straightforward parallel multiplication of two n-bit numbers, all the partial products are generated simultaneously. Since parallel hardware lends itself only to a fixed number of partial products, the algorithm was modified by MacSorley [1] so that it encodes 3-bit strings of the multiplier at a time, with one overlapping bit. The multiplier can be written as

$$B = \sum_{i\ \mathrm{even}}^{n} \left(-2b_{i+1} + b_i + b_{i-1}\right) 2^i = \sum_{i=0}^{n/2} Q_i 4^i \qquad (2)$$

where $Q_i = -2b_{2i+1} + b_{2i} + b_{2i-1}$, with $b_{-1} = 0$ and $Q_i \in \{-2, -1, 0, +1, +2\}$. The product can then be written as

$$P = \sum_{i=0}^{n/2} A\, Q_i 4^i \qquad (3)$$

An encoder accepts three-bit strings of the multiplier as input and outputs the appropriate control signals, as shown in Figure 2. The truth table for the encoder and the operation applied to the multiplicand for each three-bit sequence of the multiplier are shown in Table 1. The control signals generated by the encoder are Z, ADD, 2ADD, 2SUB, SUB and NEG. Z is the signal for which the multiplexer outputs zero. ADD and 2ADD are the signals for which the multiplexer produces the multiplicand and twice the multiplicand, respectively. The SUB and 2SUB control signals make the multiplexer generate the complement of the multiplicand and the complement of twice the multiplicand, respectively. Finally, NEG is 0 or 1 depending on whether the multiplexer generates a positive or a complemented number.
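
The recoding in (2), the complement-plus-NEG selection and the carry-save accumulation can be tied together in a small behavioral model. This is a sketch under stated assumptions, not the hardware described in this report: all function names are hypothetical, the operands are 32-bit two's-complement values, arithmetic is done modulo 2^64, and only the n/2 digits Q_0 to Q_15 are generated (for a sign-extended operand the i = n/2 term in (2) is zero).

```python
N = 32                      # operand width assumed for this sketch
MASK = (1 << (2 * N)) - 1   # work modulo 2**64, the width of the full product


def booth_digits(b: int, n: int = N) -> list[int]:
    """Radix-4 digits Q_i = -2*b[2i+1] + b[2i] + b[2i-1], with b[-1] = 0."""
    bits = [(b >> k) & 1 for k in range(n)]   # two's-complement bits of b
    digits, prev = [], 0                       # prev holds the overlapping bit
    for i in range(0, n, 2):
        digits.append(-2 * bits[i + 1] + bits[i] + prev)
        prev = bits[i + 1]
    return digits


def select_multiple(a: int, q: int) -> tuple[int, int]:
    """Model of the multiplexer: returns (mux_out, neg).

    Negative multiples are delivered as a one's complement (mux_out) plus a
    separate correction bit (neg), in the spirit of SUB/2SUB and NEG."""
    magnitude = a * abs(q)                     # 0, a or 2a
    if q < 0:
        return ~magnitude & MASK, 1
    return magnitude & MASK, 0


def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compressor: x + y + z == sum_word + carry_word (mod 2**64)."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word & MASK, carry_word & MASK


def booth_multiply(a: int, b: int) -> int:
    """Signed 32x32 multiply using Booth digits and carry-save accumulation."""
    s, c = 0, 0
    for i, q in enumerate(booth_digits(b)):
        mux_out, neg = select_multiple(a, q)
        # The weight 4**i is a left shift by 2*i; neg enters at the same weight.
        s, c = carry_save_add(s, c, (mux_out << (2 * i)) & MASK)
        s, c = carry_save_add(s, c, (neg << (2 * i)) & MASK)
    total = (s + c) & MASK                     # one carry-propagate addition
    return total - (1 << (2 * N)) if total >> (2 * N - 1) else total


print(booth_digits(-3)[:2], booth_multiply(7, -3))
```

For the multiplier -3 the two lowest digits are 1 and -1 (1 - 4 = -3) and all higher digits are zero, so `booth_multiply(7, -3)` returns -21. Carries are kept in a separate word until the single final addition, which is the property that makes the CSA attractive for summing many partial products.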

Subtraction can be carried out using two's-complement addition. This involves adding one to the complement of the multiplicand at the LSB for the SUB and 2SUB operations. The extra one is generated by the encoding logic.

Figure 2: The radix-4 schematic using the Booth encoding method

Table 1: Modified Booth algorithm

    Multiplier bits        Booth modified outputs
    b2i+1  b2i  b2i-1      Z  ADD  2ADD  2SUB  SUB  NEG   Mux Out
      0     0     0        1   0    0     0     0    0    0
      0     0     1        0   1    0     0     0    0    A
      0     1     0        0   1    0     0     0    0    A
      0     1     1        0   0    1     0     0    0    2A
      1     0     0        0   0    0     1     0    1    not 2A
      1     0     1        0   0    0     0     1    1    not A
      1     1     0        0   0    0     0     1    1    not A
      1     1     1        1   0    0     0     0    0    0

4 Implement Multiplier

We designed a multiplier that can multiply two 32-bit signed numbers in 3 cycles. The 32 bits of the multiplicand and 16 bits of the multiplier are fed to the multiple generation block. The 16 outputs of the multiple generation block are combined with the sum and carry from the previous cycle. The lower 16 bits of the sum and the lower 15 bits of the carry are fed into a 16-bit CPA to produce the lower 16 bits of the product; after 2 iterations of this process the lower 32 bits of the product are obtained.

After choosing suitable algorithms, the first step is to write the multiplier code and a testbench to verify the correctness of the design. Once the multiplier has been verified on its own, it replaces the multiplier in the full LEON3 project. The VHDL code for the Booth encoder is

    -- Signal types implied by the slices below: a, aa, two_a, a_bar and
    -- two_a_bar are std_logic_vector(31 downto 0); b is std_logic_vector(2 downto 0).
    two_a     <= a(30 downto 0) & '0';  -- shift left to produce 2a
    a_bar     <= not a;                 -- one's complement of a; cin = '1' completes -a
    two_a_bar <= not two_a;             -- one's complement of 2a (LSB is '1'); cin = '1' completes -2a

    -- Select the Booth output for the three multiplier bits in b
    aa <= a         when b = "001" or b = "010"
     else two_a     when b = "011"
     else two_a_bar when b = "100"               -- cin = 1
     else a_bar     when b = "101" or b = "110"  -- cin = 1
     else x"00000000";

    -- Extra one that turns the one's complement into a two's complement
    cin <= '1' when b = "100" or b = "101" or b = "110" else '0';

    -- Sign bit of the selected multiple
    topbit <= a(31)     when b = "001" or b = "010" or b = "011"
         else a_bar(31) when b = "100" or b = "101" or b = "110"
         else '0';

Figure 3: Simulation results of the testbench (ModelSim)

Looking again at Table 1 makes it easier to see how this code checks the three bits of b and, based on them, selects the Booth encoder output as a partial product. Running the testbench of the designed multiplier (Figure 3) confirms that the multiplication results are correct.

5 Timing Report

After implementing our design in ISE, the timing summaries in Table 2 were obtained. In order to run the Dhrystone benchmark, we had to implement the modified processor on the Virtex-4 FPGA, which required placing and routing the design. The actual clock of the design is not the one given in the synthesis report; the speed at which the design can actually run is known only after placing and routing it on the target FPGA. With the radix-4 Booth encoder, in addition to the improvement in timing and minimum period, the number of logic levels decreased as well. For instance, for the setup-path slack with source l3.cpu[0].u0/p0/iu0/r.x.result 3 (FF) and destination l3.cpu[0].u0/cmem0/dme.dtags1.dt1.dt0[1].dtags0/xc2v.x0/a9.x[0].r0 (RAM), the number of logic levels decreased from 15 to 13, which could also save a notable amount of area.

Table 2: Timing Summary

    Processor   Constraints (paths)   Constraints (connections)   Min period   Max freq
    Baseline                                                       ns           MHz
    Modified                                                       ns           MHz

6 Performance Results

After these modifications and the implementation, we compare the two Device Utilization Summaries produced by ISE for the baseline soft core and the modified one. Figures 4 and 5 are snapshots of the design summary from the Xilinx ISE tools.

Figure 4: Device Utilization Summary for Baseline Processor

Figure 5: Device Utilization Summary for Modified Processor

7 Summary and Conclusion

In this report, we have presented an algorithm for faster and more efficient multiplication. Multiplication is used frequently by the processor, so we expected better processor performance. We implemented our design on the LEON3 32-bit open-source soft core. Our platform was a Xilinx Virtex-4 board, and we used the same clock frequency as the original LEON3 configuration (80 MHz). With this modification, the number of occupied slices decreased, which saves area and consequently reduces power consumption. Our multiplier takes 3 clock cycles, so the processor is now faster, as shown in the timing summary section.

For future work, I plan to work on the divider and apply a new, efficient algorithm to it. I am also considering running this core at a higher frequency. For this work, I could not fully investigate power reduction through clock gating and other techniques because of time limitations, but I plan to work on it in the summer. It may be possible to use the low power intelligent tool environment (LITE) with back annotation to investigate power consumption further, but for lack of time I could not pursue that.

As a final comment, I would like to mention that this course project was very interesting and that I learned many things from it. I learned the concept of a soft core, how to use ModelSim and Xilinx ISE, how to write VHDL code, how to test new arithmetic algorithms and ideas, and many other technical matters related to computer arithmetic. Through this exercise, we have seen in practice how different factors affect processor performance, how arithmetic circuits are realized and how they can be improved. The only limitation, which slowed my progress, was working alone without enough feedback.

References

[1] Algirdas Avizienis. Binary-compatible signed-digit arithmetic. In Proceedings of the October 27-29, 1964, Fall Joint Computer Conference, Part I, AFIPS '64 (Fall, part I), New York, NY, USA. ACM.
[2] Shiann-Rong Kuang, Jiun-Ping Wang, and Cang-Yuan Guo. Modified Booth multipliers with a regular partial product array. Trans. Cir. Sys., 56(5), May.
[3] Behrooz Parhami. Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, Oxford, UK.

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2) Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 6 (2013), pp. 683-690 Research India Publications http://www.ripublication.com/aeee.htm Implementation of Modified Booth

More information

Let s put together a Manual Processor

Let s put together a Manual Processor Lecture 14 Let s put together a Manual Processor Hardware Lecture 14 Slide 1 The processor Inside every computer there is at least one processor which can take an instruction, some operands and produce

More information

Multipliers. Introduction

Multipliers. Introduction Multipliers Introduction Multipliers play an important role in today s digital signal processing and various other applications. With advances in technology, many researchers have tried and are trying

More information

(Refer Slide Time: 00:01:16 min)

(Refer Slide Time: 00:01:16 min) Digital Computer Organization Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture No. # 04 CPU Design: Tirning & Control

More information

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language Chapter 4 Register Transfer and Microoperations Section 4.1 Register Transfer Language Digital systems are composed of modules that are constructed from digital components, such as registers, decoders,

More information

Lab 1: Full Adder 0.0

Lab 1: Full Adder 0.0 Lab 1: Full Adder 0.0 Introduction In this lab you will design a simple digital circuit called a full adder. You will then use logic gates to draw a schematic for the circuit. Finally, you will verify

More information

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell. CHAPTER 4 THE ADDER The adder is one of the most critical components of a processor, as it is used in the Arithmetic Logic Unit (ALU), in the floating-point unit and for address generation in case of cache

More information

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw. Sequential Circuit

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw. Sequential Circuit Modeling Sequential Elements with Verilog Prof. Chien-Nan Liu TEL: 03-4227151 ext:34534 Email: jimmy@ee.ncu.edu.tw 4-1 Sequential Circuit Outputs are functions of inputs and present states of storage elements

More information

Floating Point Fused Add-Subtract and Fused Dot-Product Units

Floating Point Fused Add-Subtract and Fused Dot-Product Units Floating Point Fused Add-Subtract and Fused Dot-Product Units S. Kishor [1], S. P. Prakash [2] PG Scholar (VLSI DESIGN), Department of ECE Bannari Amman Institute of Technology, Sathyamangalam, Tamil Nadu,

More information

An Efficient RNS to Binary Converter Using the Moduli Set {2n + 1, 2n, 2n 1}

An Efficient RNS to Binary Converter Using the Moduli Set {2n + 1, 2n, 2n 1} An Efficient RNS to Binary Converter Using the oduli Set {n + 1, n, n 1} Kazeem Alagbe Gbolagade 1,, ember, IEEE and Sorin Dan Cotofana 1, Senior ember IEEE, 1. Computer Engineering Laboratory, Delft University

More information

Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC 433 - Spring 2013

Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC 433 - Spring 2013 Introduction to Xilinx System Generator Part II Evan Everett and Michael Wu ELEC 433 - Spring 2013 Outline Introduction to FPGAs and Xilinx System Generator System Generator basics Fixed point data representation

More information

Counters and Decoders

Counters and Decoders Physics 3330 Experiment #10 Fall 1999 Purpose Counters and Decoders In this experiment, you will design and construct a 4-bit ripple-through decade counter with a decimal read-out display. Such a counter

More information

CHAPTER 3 Boolean Algebra and Digital Logic

CHAPTER 3 Boolean Algebra and Digital Logic CHAPTER 3 Boolean Algebra and Digital Logic 3.1 Introduction 121 3.2 Boolean Algebra 122 3.2.1 Boolean Expressions 123 3.2.2 Boolean Identities 124 3.2.3 Simplification of Boolean Expressions 126 3.2.4

More information

Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics

Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 15 (2014), pp. 1531-1537 International Research Publications House http://www. irphouse.com Design and FPGA

More information

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng Digital Logic Design Basics Combinational Circuits Sequential Circuits Pu-Jen Cheng Adapted from the slides prepared by S. Dandamudi for the book, Fundamentals of Computer Organization and Design. Introduction

More information

Chapter 2 Logic Gates and Introduction to Computer Architecture

Chapter 2 Logic Gates and Introduction to Computer Architecture Chapter 2 Logic Gates and Introduction to Computer Architecture 2.1 Introduction The basic components of an Integrated Circuit (IC) is logic gates which made of transistors, in digital system there are

More information

WEEK 8.1 Registers and Counters. ECE124 Digital Circuits and Systems Page 1

WEEK 8.1 Registers and Counters. ECE124 Digital Circuits and Systems Page 1 WEEK 8.1 egisters and Counters ECE124 igital Circuits and Systems Page 1 Additional schematic FF symbols Active low set and reset signals. S Active high set and reset signals. S ECE124 igital Circuits

More information

LMS is a simple but powerful algorithm and can be implemented to take advantage of the Lattice FPGA architecture.

LMS is a simple but powerful algorithm and can be implemented to take advantage of the Lattice FPGA architecture. February 2012 Introduction Reference Design RD1031 Adaptive algorithms have become a mainstay in DSP. They are used in wide ranging applications including wireless channel estimation, radar guidance systems,

More information

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored? Inside the CPU how does the CPU work? what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored? some short, boring programs to illustrate the

More information

High-Level Synthesis for FPGA Designs

High-Level Synthesis for FPGA Designs High-Level Synthesis for FPGA Designs BRINGING BRINGING YOU YOU THE THE NEXT NEXT LEVEL LEVEL IN IN EMBEDDED EMBEDDED DEVELOPMENT DEVELOPMENT Frank de Bont Trainer consultant Cereslaan 10b 5384 VT Heesch

More information

Systems I: Computer Organization and Architecture

Systems I: Computer Organization and Architecture Systems I: Computer Organization and Architecture Lecture 9 - Register Transfer and Microoperations Microoperations Digital systems are modular in nature, with modules containing registers, decoders, arithmetic

More information

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1. File: chap04, Chapter 04 1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1. 2. True or False? A gate is a device that accepts a single input signal and produces one

More information

Sistemas Digitais I LESI - 2º ano

Sistemas Digitais I LESI - 2º ano Sistemas Digitais I LESI - 2º ano Lesson 6 - Combinational Design Practices Prof. João Miguel Fernandes (miguel@di.uminho.pt) Dept. Informática UNIVERSIDADE DO MINHO ESCOLA DE ENGENHARIA - PLDs (1) - The

More information

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995

UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering. EEC180B Lab 7: MISP Processor Design Spring 1995 UNIVERSITY OF CALIFORNIA, DAVIS Department of Electrical and Computer Engineering EEC180B Lab 7: MISP Processor Design Spring 1995 Objective: In this lab, you will complete the design of the MISP processor,

More information

System on Chip Design. Michael Nydegger

System on Chip Design. Michael Nydegger Short Questions, 26. February 2015 What is meant by the term n-well process? What does this mean for the n-type MOSFETs in your design? What is the meaning of the threshold voltage (practically)? What

More information

Hardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner

Hardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner Hardware Implementations of RSA Using Fast Montgomery Multiplications ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner Overview Introduction Functional Specifications Implemented Design and Optimizations

More information

BINARY CODED DECIMAL: B.C.D.

BINARY CODED DECIMAL: B.C.D. BINARY CODED DECIMAL: B.C.D. ANOTHER METHOD TO REPRESENT DECIMAL NUMBERS USEFUL BECAUSE MANY DIGITAL DEVICES PROCESS + DISPLAY NUMBERS IN TENS IN BCD EACH NUMBER IS DEFINED BY A BINARY CODE OF 4 BITS.

More information

Understanding Logic Design

Understanding Logic Design Understanding Logic Design ppendix of your Textbook does not have the needed background information. This document supplements it. When you write add DD R0, R1, R2, you imagine something like this: R1

More information

Module 3: Floyd, Digital Fundamental

Module 3: Floyd, Digital Fundamental Module 3: Lecturer : Yongsheng Gao Room : Tech - 3.25 Email : yongsheng.gao@griffith.edu.au Structure : 6 lectures 1 Tutorial Assessment: 1 Laboratory (5%) 1 Test (20%) Textbook : Floyd, Digital Fundamental

More information

VHDL Test Bench Tutorial

VHDL Test Bench Tutorial University of Pennsylvania Department of Electrical and Systems Engineering ESE171 - Digital Design Laboratory VHDL Test Bench Tutorial Purpose The goal of this tutorial is to demonstrate how to automate

More information

Flip-Flops, Registers, Counters, and a Simple Processor

Flip-Flops, Registers, Counters, and a Simple Processor June 8, 22 5:56 vra235_ch7 Sheet number Page number 349 black chapter 7 Flip-Flops, Registers, Counters, and a Simple Processor 7. Ng f3, h7 h6 349 June 8, 22 5:56 vra235_ch7 Sheet number 2 Page number

More information

Step : Create Dependency Graph for Data Path Step b: 8-way Addition? So, the data operations are: 8 multiplications one 8-way addition Balanced binary

Step : Create Dependency Graph for Data Path Step b: 8-way Addition? So, the data operations are: 8 multiplications one 8-way addition Balanced binary RTL Design RTL Overview Gate-level design is now rare! design automation is necessary to manage the complexity of modern circuits only library designers use gates automated RTL synthesis is now almost

More information

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc

More information

COMBINATIONAL CIRCUITS

COMBINATIONAL CIRCUITS COMBINATIONAL CIRCUITS http://www.tutorialspoint.com/computer_logical_organization/combinational_circuits.htm Copyright tutorialspoint.com Combinational circuit is a circuit in which we combine the different

More information

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai 2007. Jens Onno Krah (DSF) Soft Core Prozessor NIOS II Stand Mai 2007 Jens Onno Krah Cologne University of Applied Sciences www.fh-koeln.de jens_onno.krah@fh-koeln.de NIOS II 1 1 What is Nios II? Altera s Second Generation

More information

DIGITAL-TO-ANALOGUE AND ANALOGUE-TO-DIGITAL CONVERSION

DIGITAL-TO-ANALOGUE AND ANALOGUE-TO-DIGITAL CONVERSION DIGITAL-TO-ANALOGUE AND ANALOGUE-TO-DIGITAL CONVERSION Introduction The outputs from sensors and communications receivers are analogue signals that have continuously varying amplitudes. In many systems

More information

plc numbers - 13.1 Encoded values; BCD and ASCII Error detection; parity, gray code and checksums

plc numbers - 13.1 Encoded values; BCD and ASCII Error detection; parity, gray code and checksums plc numbers - 3. Topics: Number bases; binary, octal, decimal, hexadecimal Binary calculations; s compliments, addition, subtraction and Boolean operations Encoded values; BCD and ASCII Error detection;

More information

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions CS101 Lecture 26: Low Level Programming John Magee 30 July 2013 Some material copyright Jones and Bartlett 1 Overview/Questions What did we do last time? How can we control the computer s circuits? How

More information

MIMO detector algorithms and their implementations for LTE/LTE-A

MIMO detector algorithms and their implementations for LTE/LTE-A GIGA seminar 11.01.2010 MIMO detector algorithms and their implementations for LTE/LTE-A Markus Myllylä and Johanna Ketonen 11.01.2010 2 Outline Introduction System model Detection in a MIMO-OFDM system

More information

DDS. 16-bit Direct Digital Synthesizer / Periodic waveform generator Rev. 1.4. Key Design Features. Block Diagram. Generic Parameters.

DDS. 16-bit Direct Digital Synthesizer / Periodic waveform generator Rev. 1.4. Key Design Features. Block Diagram. Generic Parameters. Key Design Features Block Diagram Synthesizable, technology independent VHDL IP Core 16-bit signed output samples 32-bit phase accumulator (tuning word) 32-bit phase shift feature Phase resolution of 2π/2

More information

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

INTEGRATED CIRCUITS. For a complete data sheet, please also download: INTEGRATED CIRCUITS DATA SHEET For a complete data sheet, please also download: The IC06 74HC/HCT/HCU/HCMOS Logic Family Specifications The IC06 74HC/HCT/HCU/HCMOS Logic Package Information The IC06 74HC/HCT/HCU/HCMOS

More information

A DA Serial Multiplier Technique based on 32- Tap FIR Filter for Audio Application

A DA Serial Multiplier Technique based on 32- Tap FIR Filter for Audio Application A DA Serial Multiplier Technique ased on 32- Tap FIR Filter for Audio Application K Balraj 1, Ashish Raman 2, Dinesh Chand Gupta 3 Department of ECE Department of ECE Department of ECE Dr. B.R. Amedkar

More information

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2 Lecture Handout Computer Architecture Lecture No. 2 Reading Material Vincent P. Heuring&Harry F. Jordan Chapter 2,Chapter3 Computer Systems Design and Architecture 2.1, 2.2, 3.2 Summary 1) A taxonomy of

More information

ASYNCHRONOUS COUNTERS

ASYNCHRONOUS COUNTERS LB no.. SYNCHONOUS COUNTES. Introduction Counters are sequential logic circuits that counts the pulses applied at their clock input. They usually have 4 bits, delivering at the outputs the corresponding

More information

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems Harris Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems David Harris Harvey Mudd College David_Harris@hmc.edu Based on EE271 developed by Mark Horowitz, Stanford University MAH

More information

3.Basic Gate Combinations

3.Basic Gate Combinations 3.Basic Gate Combinations 3.1 TTL NAND Gate In logic circuits transistors play the role of switches. For those in the TTL gate the conducting state (on) occurs when the baseemmiter signal is high, and

More information

The 104 Duke_ACC Machine

The 104 Duke_ACC Machine The 104 Duke_ACC Machine The goal of the next two lessons is to design and simulate a simple accumulator-based processor. The specifications for this processor and some of the QuartusII design components
