An Integer Square Root Algorithm



Similar documents
More Verilog. 8-bit Register with Synchronous Reset. Shift Register Example. N-bit Register with Asynchronous Reset.

Modeling Registers and Counters

Introduction to Digital Design Using Digilent FPGA Boards Block Diagram / Verilog Examples

Lab #5: Design Example: Keypad Scanner and Encoder - Part 1 (120 pts)

Registers & Counters

Multiplexers Two Types + Verilog

Traffic Light Controller. Digital Systems Design. Dr. Ted Shaneyfelt

ECE 451 Verilog Exercises. Sept 14, James Barnes

Understanding Verilog Blocking and Non-blocking Assignments

Modeling Sequential Elements with Verilog. Prof. Chien-Nan Liu TEL: ext: Sequential Circuit

Computer organization

ECE232: Hardware Organization and Design. Part 3: Verilog Tutorial. Basic Verilog

Sprites in Block ROM

CHAPTER 11: Flip Flops

Lab 17: Building a 4-Digit 7-Segment LED Decoder

E158 Intro to CMOS VLSI Design. Alarm Clock

ETEC 2301 Programmable Logic Devices. Chapter 10 Counters. Shawnee State University Department of Industrial and Engineering Technologies

LAB #4 Sequential Logic, Latches, Flip-Flops, Shift Registers, and Counters

A New Paradigm for Synchronous State Machine Design in Verilog

ECE 3401 Lecture 7. Concurrent Statements & Sequential Statements (Process)

Experiment # 9. Clock generator circuits & Counters. Eng. Waleed Y. Mousa

Counters and Decoders

State Machines in VHDL

if-then else : 2-1 mux mux: process (A, B, Select) begin if (select= 1 ) then Z <= A; else Z <= B; end if; end process;

Binary Adders: Half Adders and Full Adders

Design of Digital Circuits (SS16)

(Refer Slide Time: 00:01:16 min)

Finite State Machine Design and VHDL Coding Techniques

COMBINATIONAL and SEQUENTIAL LOGIC CIRCUITS Hardware implementation and software design

Let s put together a Manual Processor

Digital Circuit Design Using Xilinx ISE Tools

A Verilog HDL Test Bench Primer Application Note

Life Cycle of a Memory Request. Ring Example: 2 requests for lock 17

Chapter 2 Verilog HDL for Design and Test

Vending Machine using Verilog

COMBINATIONAL CIRCUITS

INTRODUCTION TO DIGITAL SYSTEMS. IMPLEMENTATION: MODULES (ICs) AND NETWORKS IMPLEMENTATION OF ALGORITHMS IN HARDWARE

Name: Class: Date: 9. The compiler ignores all comments they are there strictly for the convenience of anyone reading the program.

The 104 Duke_ACC Machine

Take-Home Exercise. z y x. Erik Jonsson School of Engineering and Computer Science. The University of Texas at Dallas

16-bit ALU, Register File and Memory Write Interface

Module 3: Floyd, Digital Fundamental

DEPARTMENT OF INFORMATION TECHNLOGY

PROGRAMMABLE LOGIC CONTROLLERS Unit code: A/601/1625 QCF level: 4 Credit value: 15 OUTCOME 3 PART 1

Decimal Number (base 10) Binary Number (base 2)

Lab 1: Full Adder 0.0

Modeling Latches and Flip-flops

Programming A PLC. Standard Instructions

Memory Elements. Combinational logic cannot remember

FINITE STATE MACHINE: PRINCIPLE AND PRACTICE

Gates, Circuits, and Boolean Algebra

Microprocessor & Assembly Language

Chapter 7: Advanced Modeling Techniques

EXPERIMENT 8. Flip-Flops and Sequential Circuits

Lecture 8: Synchronous Digital Systems

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

FORDHAM UNIVERSITY CISC Dept. of Computer and Info. Science Spring, Lab 2. The Full-Adder

CS 61C: Great Ideas in Computer Architecture Finite State Machines. Machine Interpreta4on

Divide: Paper & Pencil. Computer Architecture ALU Design : Division and Floating Point. Divide algorithm. DIVIDE HARDWARE Version 1

Counters are sequential circuits which "count" through a specific state sequence.

To design digital counter circuits using JK-Flip-Flop. To implement counter using 74LS193 IC.

2 SYSTEM DESCRIPTION TECHNIQUES

PLL Dynamic Reconfiguration Author: Karl Kurbjun and Carl Ribbing

ALGORITHMS AND FLOWCHARTS. By Miss Reham Tufail

2011, The McGraw-Hill Companies, Inc. Chapter 5

Combinational Logic Design Process

Chapter 5. Sequential Logic

VENDING MACHINE. ECE261 Project Proposal Presentaion. Members: ZHANG,Yulin CHEN, Zhe ZHANG,Yanni ZHANG,Yayuan

EE 261 Introduction to Logic Circuits. Module #2 Number Systems

Below is a diagram explaining the data packet and the timing related to the mouse clock while receiving a byte from the PS-2 mouse:

Coding Guidelines for Datapath Synthesis

Arbitration and Switching Between Bus Masters

5 Combinatorial Components. 5.0 Full adder. Full subtractor

Manchester Encoder-Decoder for Xilinx CPLDs

Verilog: Blocks

Jianjian Song LogicWorks 4 Tutorials (5/15/03) Page 1 of 14

CHAPTER 3 Boolean Algebra and Digital Logic

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

A Lesson on Digital Clocks, One Shots and Counters

Chapter 13: Verification

A Lesson on Digital Clocks, One Shots and Counters

Programming Logic controllers

DIGITAL COUNTERS. Q B Q A = 00 initially. Q B Q A = 01 after the first clock pulse.

Wiki Lab Book. This week is practice for wiki usage during the project.

VHDL Test Bench Tutorial

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

Basic Pulse Width Modulation

Cascaded Counters. Page 1 BYU

The components. E3: Digital electronics. Goals:

PLL frequency synthesizer

Sequential Logic Design Principles.Latches and Flip-Flops

9 Control Statements. 9.1 Introduction. 9.2 Objectives. 9.3 Statements

MICROPROCESSOR. Exclusive for IACE Students iacehyd.blogspot.in Ph: /422 Page 1

VHDL GUIDELINES FOR SYNTHESIS

Using the HT46R46 I/O Ports to Implement Half-Duplex SPI Communication

A Practical Parallel CRC Generation Method

EC313 - VHDL State Machine Example

Digital Design. Assoc. Prof. Dr. Berna Örs Yalçın

3. VERILOG HARDWARE DESCRIPTION LANGUAGE

Sistemas Digitais I LESI - 2º ano

Transcription:

An Integer Square Root Algorithm 71 Example 24 An Integer Square Root Algorithm The C algorithm shown in Fig. 2.8 performs an integer square root of the input a as shown in Table 2.1. Note from Table 2.1 that the difference between successive squares, delta, is just the sequence of odd numbers. Also note that the while loop is executed as long as square is less than or equal to a. Thus, the value of the square root, delta/2 1, occurs in the row following the square value in Table 2.1. unsigned long sqrt(unsigned long a){ unsigned long square = 1; unsigned long delta = 3; while(square <= a){ square = square + delta; delta = delta + 2; } return (delta/2-1 } Figure 2.8 Integer square root algorithm Table 2.1 Illustrating the algorithm in Fig. 2.8 n square = n^2 delta delta/2-1 0 0 1 1 3 2 4 5 1 3 9 7 2 4 16 9 3 5 25 11 4 6 36 13 5 7 49 15 6 8 64 17 7 9 81 19 8 10 100 21 9 11 121 23 10 12 144 25 11 13 169 27 12 14 196 29 13 15 225 31 14 16 256 33 15 17 289

72 Example 24 Our goal is to implement this algorithm in hardware using the same method that we used in Example 23. The datapath for this square root algorithm is shown in Fig. 2.9. We will limit the input value a[7:0] to eight bits so we can connect them to the eight switches sw[7:0]. Thus, the register areg will be an 8-bit register. From Table 2.1 the maximum value of delta will be 33. This suggests that the size of the delta register should be 6 bits. However, when we are done the final value of delta is divided by 2 and then 1 is subtracted from this result. If the value is 33 but we only use a 5-bit register for delta then the value in delta will be 1. When we divide this by 2 we will get 0, and when we subtract 1 from 0 we get 11111, the lower four bits of which are equal to 15 the correct result. Thus, we can get by with a 5-bit register for delta. However, from Table 2.1 the maximum value of square in the algorithm will be 256, and therefore the square register must be 9 bits wide. The maximum square root value will be 15 and thus the output will be stored in a 4-bit register. In addition to the four registers areg, sqreg, delreg, and outreg, this datapath contains four combinational modules. The output, lteflg, of the <= module will be 1 if square (the output of sqreg) is less than or equal to a (the output of areg). sw[7:0] s[8:0] 5 9 clr reset ald sqld dld clk clk areg sqreg delreg lteflg square[7:0] outld outreg 4 root[3:0] del2[4:0] reset clk 8 sq[8:0] 9 5 del[4:0] adder8 plus2 a[7:0] square[7:0] 4 del[4:1] <= minus1 outin[3:0] 4 clr clk Figure 2.9 Datapath for square root algorithm The Verilog program in Listing 2.5 implements the complete datapath shown in Fig. 2.9. The four registers have different bus widths and are initialized to different values. The generic register described in the Verilog file regr2.v shown in Listing 2.6 is used to instantiate all four registers in Listing 2.5. The two parameters BIT0 and BIT1 are used to set the lower two bits of the generic register on reset. Note in Listing 2.5 that the value of sqreg is initialized to 1 and the value of delreg is initialized to 3. In Listing 2.5 the combinational modules adder8, plus2, and minus1 are implemented with assign statements, and the module lte is implemented by a separate always block. The output signal lteflg in Fig. 2.9 is sent to the control unit. The control unit will then provide the datapath with the load signals for all of the registers. The design of the control unit will be described next.

An Integer Square Root Algorithm 73 Listing 2.5 SQRTpath.v // Example 24a: Square root datapath module SQRTpath ( input wire clk, input wire reset, input wire ald, input wire sqld, input wire dld, input wire outld, input wire [7:0] sw, output reg lteflg, output wire [3:0] root wire [7:0] a; wire [8:0] sq, s; wire [4:0] del, dp2; wire [3:0] dm1; assign s = sq + {4'b0000, del}; assign dp2 = del + 2; assign dm1 = del[4:1] - 1; // adder8 // plus2 // minus1 always @(*) regr2 #(.N(8),.BIT0(0),.BIT1(0)) areg (.load(ald),.clk(clk),.reset(reset),.d(sw),.q(a) regr2 #(.N(9),.BIT0(1),.BIT1(0)) sqreg (.load(sqld),.clk(clk),.reset(reset),.d(s),.q(sq) if(sq <= {1'b0,a}) lteflg <= 1; else lteflg <= 0;

74 Example 24 Listing 2.5 (cont.) SQRTpath.v regr2 #(.N(5),.BIT0(1),.BIT1(1)) delreg (.load(dld),.clk(clk),.reset(reset),.d(dp2),.q(del) regr2 #(.N(4),.BIT0(0),.BIT1(0)) outreg (.load(outld),.clk(clk),.reset(reset),.d(dm1),.q(root) module Listing 2.6 regr2.v // Example 24b: N-bit register with reset and load // Resets to initial value of lowest 2 bits module regr2 #(parameter N = 4, parameter BIT0 = 1, parameter BIT1 = 1) (input wire load, input wire clk, input wire reset, input wire [N-1:0] d, output reg [N-1:0] q always @(posedge clk or posedge reset) if(reset == 1) q[n-1:2] <= 0; q[0] <= BIT0; q[1] <= BIT1; else if(load == 1) q <= d; module

An Integer Square Root Algorithm 75 Square Root Control Unit Listing 2.5 describes the datapath shown in Fig. 2.9. In this section we will design the control unit that will control this datapath as shown in Fig. 2.10. Note in this figure that the datapath ss the signal lteflg to the control unit, and the control unit ss the register load signals ald, sqld, dld, and outld to the datapath. The control unit will be a state machine that defines the timing sequence of the algorithm. For this square root algorithm there are four states: start, test, update, and done as shown in Fig. 2.11. The program s in the start state, and stays in this state until the go signal goes high. It then goes to the test state, which will test to see if square <= a is true or false. If it is true, then the program goes to the update state, which will update the values of square and delta in the while loop of Fig. 2.8. Otherwise, the program goes to the done state, which computes the return value and stays in this done state. Listing 2.7 is a Verilog program for the control unit in Fig. 2.10, which implements the state diagram shown in Fig. 2.11. clr clk clr clk lteflg go Control Unit ald sqld Datapath sw[7:0] root[3:0] dld outld Figure 2.10 Top-level logic diagram for square root algorithm ~go start go ~lteflg test square += delta delta += 2 done lteflg update delta/2-1 Figure 2.11 State diagram for square root algorithm

76 Example 24 Note that we have implemented the state machine in Listing 2.7 as a Moore machine of the type shown in Fig. 1.37 with three always blocks: the sequential state register block and the two combinational blocks C1 and C2. Note how the C1 block finds the next state by directly implementing the state diagram in Fig. 2.11 with a case statement. The output block C2 also uses a case statement to set the register load signals to the proper values for each state. Verify that these load signals are appropriate for implementing the square root algorithm. Listing 2.7 SQRTctrl.v // Example 24c: Square root control module SQRTctrl ( input wire clk, input wire clr, input wire lteflg, input wire go, output reg ald, output reg sqld, output reg dld, output reg outld reg[1:0] present_state, next_state; parameter start = 2'b00, test =2'b01, update = 2'b10, done = 2'b11; // states // State registers always @(posedge clk or posedge clr) if (clr == 1) present_state <= start; else present_state <= next_state; // C1 module always @(*) case(present_state) start: if(go == 1) next_state = test; else next_state = start; test: if(lteflg == 1) next_state = update; else next_state = done; update: next_state = test; done: next_state = done; default next_state = start; case

An Integer Square Root Algorithm 77 Listing 2.7 (cont.) SQRTctrl.v // C2 module always @(*) ald = 0; sqld = 0; dld = 0; outld = 0; case(present_state) start: ald = 1; test: ; update: sqld = 1; dld = 1; done: outld = 1; default ; case module The top-level design shown in Fig. 2.10 can be implemented in Verilog by simply instantiating the datapath and control unit as shown in Listing 2.8. Note that we have included an output signal done that goes high when the done state is entered. A simulation of this program that computes the square root of 64 is shown in Fig. 2.12. Figure 2.12 Simulation of the square root program in Listing 2.8

78 Example 24 Listing 2.8 SQRT.v // Example 24d: Integer Square Root module sqrt ( input wire clk, input wire clr, input wire go, input wire [7:0] sw, output wire done, output wire [3:0] root wire lteflg, ald, sqld, dld, outld; assign done = outld; SQRTctrl sqrt1 (.clk(clk),.clr(clr),.lteflg(lteflg),.go(go),.ald(ald),.sqld(sqld),.dld(dld),.outld(outld) SQRTpath sqrt2 (.clk(clk),.reset(clr),.ald(ald),.sqld(sqld),.dld(dld),.outld(outld),.sw(sw),.lteflg(lteflg),.root(root) module To test the square root algorithm on the Nexys-2 board we will use the top-level design shown in Fig. 2.13. When btn[3] is pressed to reset the circuit the decimal value of the eight switch settings is displayed on the 7-segment display. We use the 8-bit binary-to-bcd program in Listing 1.14 of Example 12 to do this conversion. When btn[0] is pressed the sqrt component computes the square root of the switch settings, and when the calculation is complete the done signal switches the multiplexer so that the decimal value of the square root is displayed on the 7-segment display. Try it.

An Integer Square Root Algorithm 79 mclk 50 MHz btn[0] btn[3] clk25 clkdiv clr btn[3] sw[7:0] go clr clk sqrt root[3:0] done r[7:0] 1 b[7:0] mux2g 0 clk190 binbcd8 btn[3] cclk clr 0 & p[9:0] x[15:0] x7segb a_to_g[6:0] an[3:0] ld[7:0] Figure 2.13 Top-level design for testing square root algorithm Listing 2.9 sqrt_top.v // Example 24e: sqrt_top module sqrt_top ( input wire mclk, input wire [3:0] btn, input wire [7:0] sw, output wire [7:0] ld, output wire dp, output wire [6:0] a_to_g, output wire [3:0] an wire clk25, clk190, clr, done; wire [15:0] x; wire [9:0] p; wire [3:0] root; wire [7:0] b, r; assign clr = btn[3]; assign r = {4'b0000,root}; assign x = {6'b000000,p}; assign ld = sw; clkdiv U1 (.mclk(mclk),.clr(clr),.clk190(clk190),.clk25(clk25) sqrt U2 (.clk(clk25),.clr(clr),.go(btn[0]),.sw(sw),.done(done),.root(root)

80 Example 24 Listing 2.9 (cont.) sqrt_top.v mux2g #(.N(8)) U3 (.a(sw),.b(r),.s(done),.y(b) binbcd8 U4 (.b(b),.p(p) x7segb U5 (.x(x),.cclk(clk190),.clr(clr),.a_to_g(a_to_g),.an(an),.dp(dp) module