FPGA-VHDL implementation of Pipelined Square root Circuit for VLSI Signal Processing Applications

Similar documents
Design and FPGA Implementation of a Novel Square Root Evaluator based on Vedic Mathematics

Implementation of Modified Booth Algorithm (Radix 4) and its Comparison with Booth Algorithm (Radix-2)

Binary Adders: Half Adders and Full Adders

Floating Point Fused Add-Subtract and Fused Dot-Product Units

FPGA Implementation of an Advanced Traffic Light Controller using Verilog HDL

Let s put together a Manual Processor

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

VHDL Test Bench Tutorial

Zuse's Z3 Square Root Algorithm Talk given at Fall meeting of the Ohio Section of the MAA October College of Wooster

Modeling Latches and Flip-flops

Implementation and Design of AES S-Box on FPGA

AN IMPROVED DESIGN OF REVERSIBLE BINARY TO BINARY CODED DECIMAL CONVERTER FOR BINARY CODED DECIMAL MULTIPLICATION

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

Chapter 2 Logic Gates and Introduction to Computer Architecture

An Efficient RNS to Binary Converter Using the Moduli Set {2n + 1, 2n, 2n 1}

1.6 The Order of Operations

COMBINATIONAL CIRCUITS

Design and Implementation of Vending Machine using Verilog HDL

Multiplexers Two Types + Verilog

(Refer Slide Time: 00:01:16 min)

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

CSI 333 Lecture 1 Number Systems

Chapter 9. Systems of Linear Equations

Systems I: Computer Organization and Architecture

Section IV.1: Recursive Algorithms and Recursion Trees

Xilinx ISE. <Release Version: 10.1i> Tutorial. Department of Electrical and Computer Engineering State University of New York New Paltz

Sistemas Digitais I LESI - 2º ano

Innovative improvement of fundamental metrics including power dissipation and efficiency of the ALU system

ON SUITABILITY OF FPGA BASED EVOLVABLE HARDWARE SYSTEMS TO INTEGRATE RECONFIGURABLE CIRCUITS WITH HOST PROCESSING UNIT

Lecture 12: More on Registers, Multiplexers, Decoders, Comparators and Wot- Nots

CSE140 Homework #7 - Solution

Counters and Decoders

Understanding Logic Design

Radicals - Multiply and Divide Radicals

Binary Number System. 16. Binary Numbers. Base 10 digits: Base 2 digits: 0 1

A Parallel Processor for Distributed Genetic Algorithm with Redundant Binary Number

Design and Verification of Nine port Network Router

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

5 Combinatorial Components. 5.0 Full adder. Full subtractor

International Journal of Electronics and Computer Science Engineering 1482

Verification & Design Techniques Used in a Graduate Level VHDL Course

Efficient Interconnect Design with Novel Repeater Insertion for Low Power Applications

Asynchronous counters, except for the first block, work independently from a system clock.

Digital Systems Design! Lecture 1 - Introduction!!

ECE232: Hardware Organization and Design. Part 3: Verilog Tutorial. Basic Verilog

High-Level Synthesis for FPGA Designs

5.1 Radical Notation and Rational Exponents

LINEAR EQUATIONS IN TWO VARIABLES

SIM-PL: Software for teaching computer hardware at secondary schools in the Netherlands

Lecture 2. Binary and Hexadecimal Numbers

Hardware Implementation of the Stone Metamorphic Cipher

NUMBER SYSTEMS. William Stallings

Continued Fractions and the Euclidean Algorithm

Number Systems and Radix Conversion

Unit 1 Number Sense. In this unit, students will study repeating decimals, percents, fractions, decimals, and proportions.

A DA Serial Multiplier Technique based on 32- Tap FIR Filter for Audio Application

PROG0101 Fundamentals of Programming PROG0101 FUNDAMENTALS OF PROGRAMMING. Chapter 3 Algorithms

OUTILS DE DÉMONSTRATION

2 SYSTEM DESCRIPTION TECHNIQUES

Design of Low Power One-Bit Hybrid-CMOS Full Adder Cells

KS3 Computing Group 1 Programme of Study hours per week

A New Reversible TSG Gate and Its Application For Designing Efficient Adder Circuits

Digital Electronics Detailed Outline

DIGITAL COUNTERS. Q B Q A = 00 initially. Q B Q A = 01 after the first clock pulse.

Lab 1: Full Adder 0.0

Introduction to Xilinx System Generator Part II. Evan Everett and Michael Wu ELEC Spring 2013

International Journal of Advancements in Research & Technology, Volume 2, Issue3, March ISSN

Lecture 13 - Basic Number Theory.

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

Technical Aspects of Creating and Assessing a Learning Environment in Digital Electronics for High School Students

ESP-CV Custom Design Formal Equivalence Checking Based on Symbolic Simulation

Reconfigurable Low Area Complexity Filter Bank Architecture for Software Defined Radio

EXPERIMENT 4. Parallel Adders, Subtractors, and Complementors

ECE 3401 Lecture 7. Concurrent Statements & Sequential Statements (Process)

8 Primes and Modular Arithmetic

University of St. Thomas ENGR Digital Design 4 Credit Course Monday, Wednesday, Friday from 1:35 p.m. to 2:40 p.m. Lecture: Room OWS LL54

Rational Exponents. Squaring both sides of the equation yields. and to be consistent, we must have

Lab 1: Introduction to Xilinx ISE Tutorial

DEPARTMENT OF INFORMATION TECHNLOGY

A High Performance Binary to BCD Converter

IJESRT. [Padama, 2(5): May, 2013] ISSN:

Implementing the Functional Model of High Accuracy Fixed Width Modified Booth Multiplier

CS 61C: Great Ideas in Computer Architecture Finite State Machines. Machine Interpreta4on

Solving Rational Equations

Attaining EDF Task Scheduling with O(1) Time Complexity

ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

A CDMA Based Scalable Hierarchical Architecture for Network- On-Chip

9/14/ :38

An Extension to DNA Based Fredkin Gate Circuits: Design of Reversible Sequential Circuits using Fredkin Gates

AC : A PROCESSOR DESIGN PROJECT FOR A FIRST COURSE IN COMPUTER ORGANIZATION

Fast Sequential Summation Algorithms Using Augmented Data Structures

Step : Create Dependency Graph for Data Path Step b: 8-way Addition? So, the data operations are: 8 multiplications one 8-way addition Balanced binary

Introduction to Digital System Design

Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML. By Prashanth Tilleti

Answer Key for California State Standards: Algebra I

Modeling Registers and Counters

Real Roots of Univariate Polynomials with Real Coefficients

Binary Division. Decimal Division. Hardware for Binary Division. Simple 16-bit Divider Circuit

Transcription:

FPGA-VHDL implementation of Pipelined Square root Circuit for VLSI Signal Processing Applications Arpita Jena M.Tech Scholar, Dept. of ECE, VLSI DesignCUTM, SoET, BBSR, Odisha Siba Ku. Panda Asst. professor, Dept. of ECE CUTM, SoET, BBSR, Odisha ABSTRACT An efficient mathematical operation plays an imperative role in achieving the preferred presentation in most of the real time Signal processing applications. In all types of mathematical operations, Square root is an important operation which can be used in VLSI signal processing applications. This paper presents a proficient policy to implement non restoring algorithm based on FPGA in gate level build of VHDL, which uses abundant pipelined architecture. An original basic building block called as controlled- subtract-multiplex (CSM) is introduced here. The pipelined square root circuit is designed using an ever known algorithm called non-restoring algorithm that does not require any floating-point hardware.the designed circuit is simulated and debugged using XILINX ISE 14.1. The architecture is implemented onto SPARTAN 3E family and debugged on Spartan 3 XC3S100E. The main principle of the proposed method is similar with conventional non-restoring algorithm, but it only uses subtract operation and append 01, while add operation and append 11 is not used. The proposed strategy has conducted to implement FPGA successfully. General Terms VLSI, Algorithms, Signal Processing Keywords Square root, VLSI signal processing, VHDL, CSM, FPGA, Non-restoring algorithm, Pipelining. 1. INTRODUCTION Square root calculation is one of the most vital operation in arithmetic calculations. Many algorithms are designed, implemented and improved in order to obtain the square root of a number. But it is a hard task to get an exact result by using different algorithms. They also provide much more complexities. This paper emphasizes on designing of a nonrestoring square root circuit. This paper is structured as follows: The section-2 represents an overview of the previous related works.section-3 gives a brief description of different methods of square root calculation and shows why the non-restoring algorithm provides a better way for square root extraction. The proposed technique used for the square root calculation is described in the section-4 which includes the algorithm used, the basic building block and the circuit diagram. The section-5 represents the implementation and verification of the proposed circuit. The result is analyzed in section-6.it shows the simulation results and test bench waveform output of the design. The conclusion is strained in section-7 2. BACKGROUND & RELATED WORK S.Kavitha etal. [1] described a design of adder based square root circuit using restoring and non-restoring algorithm. The circuit result is compared with different square root circuits and lower power and higher EPI is obtained by using this circuit. Yamin Li etal. [2] presented a very efficient implantation of a new non-restoring square root algorithm which emphasizes on partial reminder rather than each bit of square root in each iteration by using only traditional adder/subtractor. A fully pipelined, high performance implementation of the algorithm requiring minimum number of gates is designed. S.Samavibet etal. [3] presented a classical non-restoring array structure to simplify the circuit without any loss in square root precision or the reminder providing a 64 bit non-restoring square root circuit with 44% lesser area than that of conventional circuit. Yamin Li etal. [4] described a non-restoring square root algorithm and FPGA implementation of two single precision floating point square root circuits.one is low cost iterative implementation requiring a traditional adder/subtractor and the other is pipelined implementation of square root instruction in every clock cycle. John O Leary etal. [5] provides a very efficient hardware implementation of the non-restoring square root algorithm which uses an adder/subtractor, simple combinational logics and registers. ToleSutikno etal. [6] describes an efficient implementation of modified non-restoring square root algorithm and its FPGA implementation which uses a controlled subtract multiplex block as the basic building block. I.Sajid etal. [7] presented an FPGA implementation of the non-restoring algorithm of fixed point square root using pipelined architecture. The system performance is calculated as a function of execution time and power consumption. 3. SQUARE ROOT ALGORITHMS 3.1 Newton s Method The Newton s method [8](also known as Newton-Raphson method) is one of the methods for finding a better approximation to the square roots of a real valued number. But this method requires the derivatives to be calculated directly which needs the analytical expansion for the derivative. This is a bit expensive to evaluate and could not be easily obtained. Before implementation of this Newton s method, the assumptions taken during the derivation of the formulas are needed to be reviewed. If the assumptions in the derivation are not met, the method fails to converge. Due to 20

these difficulties, this method does not provide a better way to determine the square root of a real valued number. 3.2 Digit by digit Method The digit by digit method of square root calculation of real valued number is an easy way to find the result in a sequence manually as compared to other methods. If the expansion of the square root is terminated, then the algorithm is terminated after finding the last digit. The way of preceding of square root calculation depend on the base of the number. This algorithm requires more number of arithmetic operations as well as more computational time. The expansions of the square root also provide complexity.so that it is difficult to determine the exact result. 3.3 Babylonian Method This is the first method for approximate calculation of the square root of a non-negative real number.in this method two values are determined first: one is the overestimate to the square root of the real valued non-negative number and the other is the under estimate. The average of these two supposed to provide the result. But this requires the arithmetic and geometric means which shows the average is always overestimate of the square root. Some computation error also arises during the calculation and this cannot be calculated exactly. Thus this is not an efficient method of square root calculation of real valued non-negative number. 3.4 Rough Estimation This method of square root calculation is easy to calculate the square root of a real non-negative number. An initial seed value is required in this algorithm. It also needs the geometric mean for calculating the square root. This method provides an approximate value of the square root only and a bit complex to calculate.so that it is not an impressive algorithm for square root determination. 3.5 Restoring Algorithm This algorithm is a division algorithm used for square root calculation. It uses the trial and error method in order to determine the square root of a real non-negative number. During the calculations this algorithm has some extra steps which increases the calculation time as well as the execution time. Restoring algorithm also uses multipliers and various events also requires close attention during the execution time. For this restoring algorithm of square root calculation large clock cycle is also required. These limitations of this algorithm lead towards a new algorithm for square root calculation which is known as the Non-restoring algorithm. 3.6 Non-Restoring Algorithm The previous algorithms of square root provide a difficult way to get an exact result and are not efficient to be implemented in FPGA. The non-restoring algorithm [9,10] is an efficient way for calculating the square root of a real valued number. This does not use more number of arithmetic operations. So less computational and execution time is required to determine the result. It also provides less complexity as well as exact result. No multiplier is required for the non-restoring algorithm of square root calculation. This algorithm is most suitable for FPGA implementation of the square root circuit. The proposed algorithm skips the restoring and extra addition steps of the restoring algorithm. The NR algorithm is extremely simple.this algorithm can also be used for various design of divider architectures for efficient VLSI Signal processing applications. [10] 4. TECHNIQUE USED FOR SQUARE ROOT CIRCUIT DESIGN The proposed architecture provides a new and efficient strategy to implement the non-restoring pipelined square root algorithm on FPGA. This paper adopts a fully pipelined architecture in order to design the n-bit square root circuit. A new block is introduced as the controlled subtract-multiplex (CSM) which is used as the basic building block for the design. No additional arithmetic operation is used except the subtraction operation and 01 is appended. So that this design needs less number of pipelined operation as well as reduced hardware consumption. SHIFT LEFT A,Q A=A+M Start A=0 M=DIVISOR Q=DIVIDEND COUT =N Stop SHIFT LEFT A,Q A=A-M COUNT=C Q0=0 Q0=1 OUNT-1 A=A+M COUNT =0 Fig-1 Algorithmic flow chart for non-restoring square root calculation The above figure represents the flow chart for the nonrestoring square root algorithm. Unlike the restoring algorithm this algorithm works on the negative residuals and skips the extra addition operations which results less complicacy. If A is negative, If A is positive, I. The register pair (A, Q) is shifted one bit left II. The contents of register M is added to register A I. The register pair A, Q) is shifted one bit left II. The contents of register M is subtracted from register A After n cycles, 21

I. The quotient is in register A II. If A is positive, it is the reminder else it has to be restored in order to get the reminder. The figure-2 shows the block diagram representation of a 4- bit non-restoring square root circuit which uses the CSM (controlled subtract multiplex) block (Fig-3) as the basic building block. X, Y, B are the input signals of the CSM block and U is the control signal for the multiplexer unit. The output signals of CSM are b 0 (borrow) and d 0 (difference) which are chosen according to the control signal. By using these blocks we can eliminate the circuitry. P3 0 P2 1 0 u E u D b5 d5 b4 d4 Fig-4.1 RTL view of 4 bit non-restoring square root circuit Below Figure 4.2 and 4.3 represents the internal structure and test bench waveform result of 4-bit non-restoring square root circuit. Q1 0 0 0 P1 P0 0 0 1 0 0 u u u u C C1 B A b3 d3 b2 d2 b1 d1 b0 d0 Q0 R3 R2 R1 R0 Fig-2 The 4 bit non-restoring pipelined square root circuit Fig-4.2 Internal structure of 4-bit square root circuit b0 X U MUX Fig-3 Controlled Subtract Multiplex Block d0 5. IMPLEMENTATION & VERIFICATION In this effort, the square root circuit is designed in VHDL- FPGA environment. Then it is successfully synthesized and simulated in Xilinx ISE design suite 14.1.After that it is implemented in field programmable gate array device. Then the results are verified and found correct. 6. RESULT ANALYSIS In this section our main focus is on the simulation results. The below fig-4.1 shows the Register-Transfer-level schematic of the designed 4-bit non-restoring square root. Fig-4.3 Test bench wafeform of 4-bit square root operation 22

Fig-4.4 FPGA implementation of square root circuit Figure 5.1 and 5.2 shows the RTL schematic and internal design of 8-bit non-restoring square root circuit respectively. Fig-5.1 RTL view of 8-bit non-restoring square root circuit Fig-5.2 Internal structure of 8-bit square root circuit The below figure 5.3 shows the test bench waveform result of 8-bit non-restoring square root circuit. Fig-5.3Test bench wafeform of 8-bit square root operation 7. CONCLUSION The performance of the designed Square root Circuit using non-restoring algorithms proved to be better due to the pipelined structure.this initiation here may set path for further efficient designs. The performance of the projected method in terms of simulation results has been presented which shows that it is superior and well done. Hence this method out performs its counterparts. 8. REFERENCES [1] B.C.Senthilpari,, S. Kavitha Proposed low power,high speed,adder-based,65 nm square root circuit journal of Microelectronics elsevier Science 2011. [2] Yamin Li and wanming Chu A new non-restoring squareroot algorithm and its VLSI implementation IEEE International Conference 1996. [3] S.Samavi, A.Sadrabadi, A.Fanian.2008. Modular array structure of non-restoring square root circuit Journal of system architecture-elsevier Science 2008. [4] Yamin Li and Wanming Chu Implementation of single precision floating point square root on FPGAs 5 th annual IEEE symposium 1997. [5] John O Leary Non-restoring integer square root-a case study in design by principled optimization 1994. [6] Tole Sutikno etal. An efficient implementation of nonrestoring square root algorithm in gate level International association of computer science and information technology 2011 [7] I. Sajid, M.M.Ahmed Pipelined implementation of fixed point square root in FPGA using modified non-restoring algorithm 2 nd internal conference IEEE 2011 [8] Liang-Kai-Wang and M. J. Schulte Decimal floating point square root using Newton-Raphson iteration 16 th international conference IEEE 2005 [9] Yamin Li and Wanming Chu. Parallel array implementations of non-restoring square root algorithm 2005 [10] S.k panda and Arati sahu A Novel Vedic Divider Architecture with Reduced Delay for VLSI Applications IJCA,vol 120-No.17, june-2015 23

9. AUTHOR PROFILE Mrs. Arpita Jena was born on March 29, 1992 in Cuttack, Odisha, India. She has completed the B.Tech degree in Electronics and Communication Engineering, from Seemant Engineering Collage, Mayurbhanj.Currently pursuing her M.tech degree in VLSI Design Engineering from ECE, Centurion University of Technology and Management, Bhubaneswar, Odisha 751020, India, in the academic period of 2014-2016. Mr.Siba Kumar Panda was born on November 09, 1989.He received the B.Tech degree in Electronics & Communication Engineering from Biju Patnaik University of Technology, odisha in 2012 and M.Tech degree in VLSI Signal Processing Specialization from Veer Surendra Sai University of Technology, odisha in 2014. Currently he is working as an Assistant Professor at Centurion University of Technology and Management, Bhubaneswar, Odisha. He also awarded the University Silver Medal for best Electronics & Telecommunication Engineering Post Graduate for the academic year 2012-2014 at VSSUT, Odisha. He has five number of publications in various VLSI journals His research area of interest includes VLSI implementation of Vedic Mathematics,Digital VLSI design,vlsi Optimization and VLSI Signal Processing. IJCA TM : www.ijcaonline.org 24