Simulation & Synthesis Using VHDL

Similar documents

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

Binary Division. Decimal Division. Hardware for Binary Division. Simple 16-bit Divider Circuit

This Unit: Floating Point Arithmetic. CIS 371 Computer Organization and Design. Readings. Floating Point (FP) Numbers

The string of digits in the binary number system represents the quantity

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

Floating Point Fused Add-Subtract and Fused Dot-Product Units

1. Give the 16 bit signed (twos complement) representation of the following decimal numbers, and convert to hexadecimal:

Oct: 50 8 = 6 (r = 2) 6 8 = 0 (r = 6) Writing the remainders in reverse order we get: (50) 10 = (62) 8

Binary Number System. 16. Binary Numbers. Base 10 digits: Base 2 digits: 0 1

Binary Numbering Systems

ECE 0142 Computer Organization. Lecture 3 Floating Point Representations

2010/9/19. Binary number system. Binary numbers. Outline. Binary to decimal

Computer Science 281 Binary and Hexadecimal Review

VHDL Test Bench Tutorial

INTRODUCTION TO DIGITAL SYSTEMS. IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE

Levent EREN A-306 Office Phone: INTRODUCTION TO DIGITAL LOGIC

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Systems I: Computer Organization and Architecture

CS101 Lecture 26: Low Level Programming. John Magee 30 July 2013 Some material copyright Jones and Bartlett. Overview/Questions

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC Spring 2013

Aims and Objectives. E 3.05 Digital System Design. Course Syllabus. Course Syllabus (1) Programmable Logic

Multipliers. Introduction

Binary Adders: Half Adders and Full Adders

Sistemas Digitais I LESI - 2º ano

To convert an arbitrary power of 2 into its English equivalent, remember the rules of exponential arithmetic:

Example-driven Interconnect Synthesis for Heterogeneous Coarse-Grain Reconfigurable Logic

Lecture 2. Binary and Hexadecimal Numbers

1. Convert the following base 10 numbers into 8-bit 2 s complement notation 0, -1, -12

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: ext: Sequential Circuit

Chapter 2 Logic Gates and Introduction to Computer Architecture

Numbering Systems. InThisAppendix...

CS321. Introduction to Numerical Methods

Chapter 1: Digital Systems and Binary Numbers

EE361: Digital Computer Organization Course Syllabus

Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML. By Prashanth Tilleti

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

Number Representation

Let s put together a Manual Processor

This 3-digit ASCII string could also be calculated as n = (Data[2]-0x30) +10*((Data[1]-0x30)+10*(Data[0]-0x30));

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

University of St. Thomas ENGR Digital Design 4 Credit Course Monday, Wednesday, Friday from 1:35 p.m. to 2:40 p.m. Lecture: Room OWS LL54

HOMEWORK # 2 SOLUTIO

Digital System Design Prof. D Roychoudhry Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

(Refer Slide Time: 00:01:16 min)

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

University of Texas at Dallas. Department of Electrical Engineering. EEDG Application Specific Integrated Circuit Design

Modeling Latches and Flip-flops

Today. Binary addition Representing negative numbers. Andrew H. Fagg: Embedded Real- Time Systems: Binary Arithmetic

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

CSI 333 Lecture 1 Number Systems

Life Cycle of a Memory Request. Ring Example: 2 requests for lock 17

Numerical Matrix Analysis

Verification & Design Techniques Used in a Graduate Level VHDL Course

CHAPTER 3 Boolean Algebra and Digital Logic

Quartus II Software Design Series : Foundation. Digitale Signalverarbeitung mit FPGA. Digitale Signalverarbeitung mit FPGA (DSF) Quartus II 1

Solution for Homework 2

Useful Number Systems

RAPID PROTOTYPING OF DIGITAL SYSTEMS Second Edition

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

ECE232: Hardware Organization and Design. Part 3: Verilog Tutorial. Basic Verilog

Binary Representation. Number Systems. Base 10, Base 2, Base 16. Positional Notation. Conversion of Any Base to Decimal.

Lab 1: Full Adder 0.0

Digital Systems Design! Lecture 1 - Introduction!!

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics

Agenda. Michele Taliercio, Il circuito Integrato, Novembre 2001

Digital Hardware Design Decisions and Trade-offs for Software Radio Systems

VHDL GUIDELINES FOR SYNTHESIS

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Lecture 8: Binary Multiplication & Division

A High-Performance 8-Tap FIR Filter Using Logarithmic Number System

Understanding Logic Design

FPGA. AT6000 FPGAs. Application Note AT6000 FPGAs. 3x3 Convolver with Run-Time Reconfigurable Vector Multiplier in Atmel AT6000 FPGAs.

Hardware Implementations of RSA Using Fast Montgomery Multiplications. ECE 645 Prof. Gaj Mike Koontz and Ryon Sumner

EE360: Digital Design I Course Syllabus

A Verilog HDL Test Bench Primer Application Note

Digital Electronics Detailed Outline

Attention: This material is copyright Chris Hecker. All rights reserved.

NUMBER SYSTEMS. William Stallings

MACHINE INSTRUCTIONS AND PROGRAMS

Quartus II Introduction for VHDL Users

CHAPTER 5 Round-off errors

Systems I: Computer Organization and Architecture

Positional Numbering System

Numeral Systems. The number twenty-five can be represented in many ways: Decimal system (base 10): 25 Roman numerals:

CS201: Architecture and Assembly Language

CSE140 Homework #7 - Solution

CPEN Digital Logic Design Binary Systems

Microprocessor & Assembly Language

Step : Create Dependency Graph for Data Path Step b: 8-way Addition? So, the data operations are: 8 multiplications one 8-way addition Balanced binary

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing

Architectures and Platforms

Method for Multiplier Verication Employing Boolean Equivalence Checking and Arithmetic Bit Level Description

Floating point package user s guide By David Bishop (dbishop@vhdl.org)

2003 HSC Notes from the Marking Centre Software Design and Development

Transcription:

Floating Point Multipliers: Simulation & Synthesis Using VHDL By: Raj Kumar Singh - B.E. (Hons.) Electrical & Electronics Shivananda Reddy - B.E. (Hons.) Electrical & Electronics BITS, PILANI

Outline Introduction - Multipliers - VHDL & Design Flow Various Architectures (Multipliers) -Simulation -Synthesis -Analysis Conclusion

Real Numbers Numbers with fractions 3/5, 4/7 Pure binary 1001.1010 = 2 4 + 2 0 +2-1 + 2-3 =9.625 Fixed point Very limited Moving or floating point (almost universal) Widely used in computations

Which base do we use? Decimal: great for humans, especially when doing arithmetic Hex: if human looking at long strings of binary numbers, its much easier to convert to hex and look 4 bits/symbol Not good for arithmetic on paper Binary: what computers use; computers do +, -, *, / using this only To a computer, numbers always binary Regardless of how number is written: 32 ten == 32 10 == 0x20 == 100000 2 == 0b100000

Floating Point :Overview Floating point representation Normalization Overflow, underflow Rounding Floating point addition Floating point multiply

Floating Point (IEEE-754) use a fixed number of bits Sign bit S, exponent E, significand F Value: (-1) S x F x 2 E IEEE 754 standard S E F Size Exponent Significand Range Single precision 32b 8b 23b 2x10 +/-38 Double precision 64b 11b 52b 2x10 +/-308

Normalization FP numbers are usually normalized i.e. exponent is adjusted so that leading bit (MSB) of mantissa is 1 Example - Scientific notation where numbers are normalized to give a single digit before the decimal point e.g. 3.123 x 10 3 Because it is always 1, there is no need to store it

FP Overflow / Underflow FP Overflow Analogous to integer overflow Result is too big to represent FP Overflow Result is too small to represent Means exponent is too small (too negative) Both raise Problems, thus need extra Care on their Occurrences in IEEE754

FP Rounding Rounding is important Small errors can save the huge storage FP rounding hardware helps Finally, keep sticky bit that is set whenever 1 bits are lost to the right Differentiates between 0.5 and 0.500000000001 So the rounding can save a huge Memory, of course the price is Accuracy, But that can be paid

Base 2 : Representation Number Base B B symbols per digit: Base 10 (Decimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 Base 2 (Binary): 0, 1 Number representation: d 31 d 30... d 1 d 0 is a 32 digit number value = d 31 B 31 + d 30 B 30 +... + d 1 B 1 + d 0 B 0 Binary: 0,1 (In binary digits called bits ) 0b11010 = 1 2 4 + 1 2 3 + 0 2 2 + 1 2 1 + 0 2 0 = 16 + 8 + 2 #s often written = 26 0b Here 5 digit binary # turns into a 2 digit decimal #

And in Conclusion... We represent things in computers as particular bit patterns: N bits 2 N 1 s complement - mostly abandoned 2 s complement - universal in computing: 10000 10000... 11110 11110 11111 00000 00001... 01111 11111 00000 00001... 01111 Overflow:... numbers are ; computers having finite storage locations, so errors!

VHDL Language Hardware Description Language (HDL) High-level language for to model, simulate, and synthesize digital circuits and systems. History 1980: US Department of Defense Very High Speed Integrated Circuit program (VHSIC) 1987: Institute of Electrical and Electronics Engineers ratifies IEEE Standard 1076 (VHDL 87) 1993: VHDL language was revised and updated Verilog is the other major HDL Syntax similar to C language

Design Cycle: Simulation Functional simulation: simulate independent of FPGA type no timing Timing simulation: simulate after place and routing also (backannotation part) detailed timing Design Entry Simulation Synthesis Place & Route Simulation Program device & test

Terminology Behavioral modeling Describes the functionality of a component/system For the purpose of simulation and synthesis Structural modeling A component is described by the interconnection of lower level components/primitives For the purpose of synthesis and simulation Synthesis: Translating the HDL code into a circuit, which is then optimized Register Transfer Level (RTL): Type of behavioral model used for instance for synthesis

RTL Synthesis Input is RTL code Compilation & translation Generates technology independent netlist RTL schematic (HDL code analysis) Technology mapping Mapping to technology specific structures: Look-up tables (LUT) Registers RAM/ROM DSP blocks Other device specific components/features Logic optimization Implementation analysis (technology view) Design Entry Simulation Synthesis Place & Route Simulation Program device & test

Digital Circuits and VHDL Primitives Most digital systems can be described based on a few basic circuit elements: Combinational Logic Gates: NOT, OR, AND Flip Flop Latch Tri-state Buffer Each circuit primitive can be described in VHDL and used as the basis for describing more complex circuits.

What is an SOC? System-on-a-chip, System LSI, Systemon-Silicon, - Hardware Analog: ADC, DAC, PLL, Tx, Rx, RF Devices Digital: Processor, Memory, Interface, Accelerator, Software Multiplier, Adder etc OS Application What are the differences from an ASIC?

Traditional ASIC Design Flow Specification Development RTL Code Development Functional Verification (Simulation) Floor-planning, Synthesis, DFT Fault Coverage Analysis Timing Verification Floor-planning, Placement and Route Prototyping, Testing, and Characterization

Functional Verification Models Levels Functional Behavioral RTL Logic Gate Switch Circuit

Example: ALU Register A Register B Add Sub Add/Subtract Unit Accumulator CCR Condition Code Register Normally, the accumulator has logical and arithmetic shift capability, both left and right

Symbol for ALU ALU operation a b ALU Zero Result Overflow CarryOut

FP Arithmetic x / (Steps) Check for zero, operands Add/subtract exponents Multiply/divide significands watch sign Normalize Round Double length intermediate results

FP Multiplication: Steps Compute sign, exponent, significand Normalize Shift left, right by 1 Check for overflow, underflow Round Normalize again (if necessary)

FP Multiplication: operations Sign: P s = A s xor B s Exponent: P E = A E + B E Due to bias/excess, must subtract bias e = e1 + e2 E = e + 1023 = e1 + e2 + 1023 E = (E1 1023) + (E2 1023) + 1023 E = E1 + E2 1023 Significand: P F = A F x B F Standard integer multiply (23b or 52b + g/r/s bits) Use Wallace tree of CSAs to sum partial products

Efficient Multiplier Design Radix-4 Booth Encoding Used to generate all partial products. Sign Extension Prevention To prevent sign extension while doing signed number addition (Padding of 1 s). Optimized Wallace Addition Tree To sum up all operands to 2 vectors (sum, carry).

Start Multiplier flowchart Product0 = 1 1. Test Product0 Product0 = 0 1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register 1 0 0 0 x 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 2. Shift the Product register right 1 bit No: < 32 repetitions 32nd repetition? Yes: 32 repetitions Done

Step By Step Analysis

MULTIPLY (unsigned) Paper and pencil example (unsigned): Multiplicand Multiplier Product 1000 1001 1000 0000 0000 1000 01001000 m bits x n bits = m+n bit product Binary makes it easy: 0 => place 0 ( 0 x multiplicand) 1 => place a copy ( 1 x multiplicand) successive refinement

1. Simultaneous Multiplication X2 X1 X0 Y2 Y1 Y0 X2*Y0 X1*Y0 X0*Y0 X2*Y1 X1*Y1 X0*Y1 X2*Y2 X1*Y2 X0*Y2 P4 P3 P2 P1 P0

Multiplier Schematic : Hardware How would we develop this logic?

Multiplying Negative Numbers This does not work when numbers are negative, then for Solution Convert to positive if required Multiply as above If signs were different, negate answer Use Booth s algorithm

Booth s Algorithm Designed to improve speed by using fewer adds Works best on strings of 1 s Example premise 7 = 8 1 0111 = 1000 0001 (3 adds vs 1 add 1 sub) Algorithm modified to allow for multiplication with negative numbers

Booth s Encoding Really just a new way to encode numbers Normally positionally weighted as 2 n With Booth, each position has a sign bit Can be extended to multiple bits 0-> 1 1 0 Binary +1 0-1 0 1-bit Booth +2-2 2-bit Booth

Booth s Algorithm Current bit Bit to right Explanation Example Operation 1 0 Begins run of 1 00001111000 Subtract 1 1 Middle of run of 1 00001111000 Nothing 0 1 End of a run of 1 00001111000 Add 0 0 Middle of a run of 0 00001111000 Nothing

Comparison between various Architectures S. No Algorithms Performance/Parameter s Serial Multiplier (Sequential) Booth Multiplier Combination al Multiplier Wallace Tree Multiplier 1. Optimum Area 110 LUTs 134 LUTs 4 LUTs 16 LUTs 2. Optimum Delay 9 ns 11 ns 9 ns 9 ns 3. Sequential Elements 105 DFFs 103 DFFs ---- ---- 4. Input/Output Ports 67 / 71 50 / 49 4 / 4 24 / 18 5. CLB Slices(%) 57(7.42%) 71(36.98%) 2(1.04%) 8(4.17%) 6. Function Generators 114(7.42%) 141(36.72 %) 7. Data Required Time/ Arrival Time 8. Optimum Clock (MHz) 9.54 ns 8.66 ns 9.54 ns 9.36 ns 4(1.04%) 16(4.17% ) NA 8.61 10 ns 8.52 ns 100 101.9 NA 100 9. Slack 0.89 ns 0.19 ns Unconstraine d path 1.48 ns

Observations on Multiplication Can speed up algorithm by doing 2 bits at a time, instead of just one Using Booth encoding strategy (in more depth) Multiplication algorithm Sequential version are more efficient than combinational in terms of Hardware, Synchronization, speed Can use carry save adders instead of ripple adder A Wallace tree structure to combine the partial products is another excellent enhancement in Architecture

Multiplication Using Recursive Subtraction Suppose there are two numbers M, N. We have to find A=M*N, lets assume the M & N both are B base number And also M<N. A = MN (M*B - N*(M-1)) Next step: Subtract the M*B from MN, Where MN can be found by just writing both the numbers into a large register, And M*B is also easy to generate. It is just shifting towards left of operand with zero padding. Again we will restore the number (M-1) in place of M by just decrementing. The continuous iteration will decrement the M and finally it will reach to 1.

Q and A