PROGETTO DI SISTEMI ELETTRONICI DIGITALI. Digital Systems Design. Digital Circuits Advanced Topics




Sequential circuit and metastability

Sequential circuit - FSM
A sequential circuit contains:
- Storage elements: latches or flip-flops.
- Combinational logic: implements a multiple-output switching function.
Inputs are signals from the outside; outputs are signals to the outside. The other inputs of the combinational logic, the State (or Present State), are signals from the storage elements. The remaining outputs, the Next State, are inputs to the storage elements.
[Figure: block diagram connecting Inputs, Combinational Logic, Outputs, Storage Elements, State and Next State]

Bistable element: the basic memory block
- Has two stable conditions (states).
- Can be used to store binary symbols.
If Q = high, the feedback to inverter 2 causes its output to be low, which in turn forces the output of inverter 1 to be high. So this is a stable state.
If Q = low, the feedback to inverter 2 causes its output to be high, which in turn forces the output of inverter 1 to be low. So this is the second stable state.

Synchronous and Asynchronous Sequential Circuits
Asynchronous sequential circuits: outputs and state change as soon as an input changes.
Synchronous sequential circuits: outputs and state change depending on a special input (the clock).
The following circuit is called an asynchronous 2-bit binary up counter. The term asynchronous is used because the two flip-flops are not clocked by the same signal.
[Figure: asynchronous 2-bit binary up counter]

Moore vs Mealy FSM

Latch vs. Flip-flop
Latch (inputs D and En, output Q): level-sensitive.
Flip-flop (inputs D and Clk, output Q): edge-sensitive.

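Synchronous design
Combinational logic: larger circuits are difficult to predict.
Synchronous logic: driven by a CLOCK. Registers (flip-flops, memory) hold the intermediate values, and a new output is produced on every clock edge.
[Figure: combinational logic between registers, with intermediate inputs sampled on the CLOCK edges]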

Synchronous design rules
- All flip-flops are clocked only by the same free-running master clock.
- No latches.
- No asynchronous feedback.
- No clock signals are derived from logic gating (unless the gating is produced by a tool, i.e., correct by construction).
- Asynchronous flip-flop inputs (Preset or Clear) are used only for initialization.
- Delays from one flip-flop to another are designed to be less than the clock period.
- Asynchronous inputs pass through one or more flip-flops before being used elsewhere.
- Enable (control) signals are nominally one clock period in length.
Note: synchronous design is not always the best method, but it is to be assumed unless other methods are absolutely necessary.

Timing Waveforms (asynchronous)
[Figure: waveforms of signals A, B, C and Y from 0 ns to 60 ns]

Timing Waveforms (synchronous)
[Figure: waveforms of signals A, B, A_r, B_r, C, D, Clk and Q from 0 ns to 60 ns, with the corresponding registered circuit]

Metastability
In an asynchronous system, the relationship between data and clock is not fixed; therefore, violations of setup and hold times can occur. When this happens, the output may go to an intermediate level between its two valid states and remain there for an indefinite amount of time before resolving itself, or it may simply be delayed before making a normal transition.
[Figure: two stable points, one metastable point]

Single-Stage Synchronizer - Metastability
Minimum data set-up ($t_{su}$) and hold ($t_{hd}$) times must be met for the register to output synchronized data.
The data input to the D flip-flop is asynchronous to the clock. The arrival time of the input data relative to the clock is not known, and a danger zone (decision or metastability window) is created.
After a clock-to-output delay ($t_{co}$), the input data appears at the Q output. If the input data changes inside the danger zone, the Q output is likely to be in a metastable state until the internal silicon settles to either a logic high or low. The extra time required to resolve the logic state is called the resolution time ($t_r$).

Timing Waveforms (Metastability)
[Figure: waveforms of Clk, D_early/Q_early, D_late/Q_late and D_bad/Q_bad from 0 ns to 60 ns for a D flip-flop]
- D_early changes before the setup time.
- D_late changes after the hold time.
- D_bad changes after the setup time or before the hold time, so Q_bad is indeterminate.

Resolving the metastability
The resolving time constant, $\tau$, comes from the expression that describes the probability of a metastable event lasting longer than some time t.
If a FF is in the metastable state at time t, the probability that it resolves over an additional time period Δt is the same regardless of its present age t. The time needed to resolve the metastability must therefore follow an exponential probability distribution, and $\tau$ is the reciprocal of the decay rate of that distribution.
Thus the probability that the system is still in the metastable state at time t is:
$$P(t) = e^{-t/\tau}$$

Calculating the MTBF
The Mean (Average) Time Between Failures, or MTBF, of a system is the reciprocal of the failure rate in the special case when the failure rate is constant, i.e. when failures are exponentially distributed.
Failure rate = Metastability rate × Probability of metastability
The probability of a metastable event lasting longer than some time t is $R(t) = e^{-t/\tau}$.
Assume the data arrives uniformly over the clock cycle T. The probability that data arrives within W in a clock period T is $P = W/T = W f_c$, where W is the metastability window and $f_c$ is the clock frequency.
If the data rate is $f_d$, the rate of metastability becomes $W f_c f_d$, so:
$$\text{Failure rate} = W f_c f_d \, e^{-t_r/\tau}$$
$$\text{MTBF} = \frac{e^{t_r/\tau}}{W f_c f_d}$$

Avoiding metastability
The most common way to avoid metastability is to add one or more synchronizing FFs on the signals that move from one clock domain to the other. This approach allows an entire clock period (less the setup time of the following flip-flop) for a metastable event to resolve itself. It does, however, increase the latency with which the input is observed.
If the clock frequency is $f_c = 1/T$, the resolution time for FF2 is bounded by $T - t_{SU2}$, i.e. the MTBF is:
$$\text{MTBF}_{FF2} = \frac{e^{(T - t_{SU2})/\tau}}{W f_c f_d}$$
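The MTBF expressions above can be evaluated numerically. The following is a minimal sketch; the function names and every device parameter in it are hypothetical values chosen only for illustration, not data from any real device.

```python
import math


def mtbf_seconds(t_r, tau, window, f_clk, f_data):
    """MTBF = exp(t_r / tau) / (W * fc * fd); times in seconds, rates in Hz."""
    return math.exp(t_r / tau) / (window * f_clk * f_data)


def mtbf_two_flop(tau, window, f_clk, f_data, t_su2):
    """MTBF when the second flip-flop leaves T - t_su2 for resolution (T = 1 / fc)."""
    period = 1.0 / f_clk
    return mtbf_seconds(period - t_su2, tau, window, f_clk, f_data)


if __name__ == "__main__":
    # Hypothetical parameters (assumptions, not taken from a data sheet).
    tau = 50e-12      # resolving time constant: 50 ps
    window = 100e-12  # metastability window W: 100 ps
    f_clk = 100e6     # clock frequency fc: 100 MHz
    f_data = 1e6      # data rate fd: 1 MHz
    t_su2 = 0.5e-9    # setup time of the second flip-flop: 0.5 ns

    for t_r in (1e-9, 2e-9, 5e-9):
        value = mtbf_seconds(t_r, tau, window, f_clk, f_data)
        print(f"t_r = {t_r * 1e9:.0f} ns -> MTBF = {value:.3e} s")
    two_flop = mtbf_two_flop(tau, window, f_clk, f_data, t_su2)
    print(f"second flip-flop -> MTBF = {two_flop:.3e} s")
```

Because the resolution time sits in the exponent, each additional multiple of $\tau$ multiplies the MTBF by e, which is why giving the first flip-flop almost a full clock period to settle is so effective.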

Avoiding metastability
[Figure: MTBF for different resolution times]

Estimating device parameters
- A stable synchronous source is used for the measurement.
- The set-up and hold times of the input data relative to the clock can be adjusted while observing the Q output for the metastable event.
- The metastability window, W, can be determined by accurately measuring the clock-to-output propagation delay time ($t_{co}$).
- When the set-up time or hold time is violated, the clock-to-output delay increases.
- When the measured $t_{co}$ is longer than $t_{co\_max}$ (specified in the device data sheet), the device is considered to be in the metastable state.

FPGA - Timing Analysis

Timing analysis
- Enables you to determine the delays between any pair of cells in your design.
- Checks whether your circuit will work under specified conditions.
- Your circuit may be functionally correct and still not work.
On-chip: What is the maximum clock frequency your design can run at? Are there any problems in the implementation due to timing?
Off-chip: Can your design on the FPGA communicate reliably with external devices?

Timing analysis
Static: determines the longest and shortest paths from register to register or from register to I/O boundary. There is no functional verification and no test pattern; each block must have a timing model.
Dynamic: the inputs are considered.
[Figure: registers separated by combinational logic blocks, all driven by clk]

Timing analysis
Clock setup time ($t_{su}$): data that feeds a register via its data or enable input(s) must arrive at the input pin before the register's clock signal is asserted at the clock pin. Clock setup time is the minimum length of time by which this data must arrive before the active clock edge.
Micro $t_{su}$ is the intrinsic setup time of the register (i.e., it is an inherent characteristic of the register and is unaffected by the signals feeding the register).

Timing analysis
Clock hold time ($t_h$): data that feeds a register via its data or enable input(s) must be held at the input pin after the register's clock signal is asserted at the clock pin. Clock hold time is the minimum length of time for which this data must remain stable after the active clock edge.
Micro $t_h$ is the intrinsic hold time of the register.

Timing analysis
Clock-to-output time ($t_{co}$): the time required for a clock signal to travel from an input pin, through a register, to an output pin. This time always represents an external pin-to-pin delay.
Micro $t_{co}$ is the intrinsic clock-to-output delay of the register.

Timing analysis
Clock skew: the difference in arrival time of a clock signal at two different registers. This timing difference occurs when two clock signal paths have different lengths.
Maximum clock frequency (fmax): the fastest speed at which the design clock can run without violating internal setup and hold time requirements.
Slack: the margin by which a timing requirement (e.g., fmax) was met or not met. Positive slack indicates that the circuit met the timing requirement; negative slack indicates that the design contains timing violations.

Timing analysis
Calculating internal fmax: to determine internal fmax, you must first calculate the circuit's clock period. The clock period depends on the data path delay, the clock skew between registers, the source register's clock-to-output time, and the destination register's setup time. The register-to-register delay ($t_{rd}$) in the clock period equation represents the data path delay between two registers.
System fmax: includes external delays, assuming that all input pins are registered just before entering the device, and all output pins are registered just after leaving the device.

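Timing analysis
Internal fmax: register-to-register transfer, minimum clock period (maximum clock frequency)
$$T_{min} = \frac{1}{f_{max}} = \mu t_{co}^{(S)} + B + \mu t_{su}^{(D)} + (C - E)$$
where S and D denote the source and destination registers, B is the register-to-register data delay, and $(C - E)$ is the clock skew term $t_{clk\_skew}$.
[Figure: register-to-register path with signals w, a, b, z and clk]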
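The clock period equation and the slack definition can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: C and E are taken to be the clock path delays to the source and destination registers respectively, and all delay values are invented for illustration.

```python
def min_clock_period(t_co, data_delay, t_su, clk_to_src, clk_to_dst):
    """T_min = t_co + B + t_su + (C - E), all delays in the same unit (here ns).

    Assumption: C (clk_to_src) and E (clk_to_dst) are the clock path delays
    to the source and destination registers.
    """
    return t_co + data_delay + t_su + (clk_to_src - clk_to_dst)


def fmax_mhz(t_min_ns):
    """Maximum clock frequency in MHz for a minimum period given in ns."""
    return 1000.0 / t_min_ns


def setup_slack(required_period, t_min):
    """Positive slack: requirement met. Negative slack: timing violation."""
    return required_period - t_min


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    t_min = min_clock_period(t_co=0.5, data_delay=6.0, t_su=0.3,
                             clk_to_src=1.2, clk_to_dst=1.0)
    print(f"T_min = {t_min:.2f} ns, fmax = {fmax_mhz(t_min):.1f} MHz")
    for required in (10.0, 6.0):
        print(f"required period {required} ns -> slack {setup_slack(required, t_min):+.2f} ns")
```

With these numbers a 10 ns clock constraint is met with positive slack, while a 6 ns constraint produces negative slack and therefore a timing violation.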

Timing analysis
System fmax; clock period = 1/fmax_internal

Pipelining

Pipelining
Clock frequency is defined as the rate at which data flows into the system and appears at the output.
Pipelining decreases the combinational delay by inserting registers in a long combinational path, thus increasing the clock frequency and hence the performance.
For an ideal clock (no skew or jitter), the clock signal reaches all banks of registers simultaneously.
If the FFs are ideal (no $t_{co}$, $t_{su}$ and $t_h$), the maximum frequency FMAX is the reciprocal of the maximum delay path through the combinational logic.

FF and clock non-idealities
FFs have delays.
In a real circuit, the clock input of register B arrives slightly later than at register A because of wire propagation delay: this is the clock skew, TSKW.
Negative clock skew is due to early clocking, i.e. clocking of registers before the relevant data is successfully latched.
The variation between the arrival times of consecutive clock edges at the same point on the chip is defined as clock jitter, TJIT.

Real world circuit
The path in bold is the path with maximum delay between any two flip-flops in the circuit.
[Figure: circuit with the critical path in bold]

Real world circuit
The total delay between the two flip-flops along the path b, f, j, l, m, n, o is:
Assuming equal delays across all the flip-flops in the design (which might not be the actual case), we have the generalized formula for the maximum period as:
The combinational delay in the above equation can be reduced by adding more FFs, thus increasing the maximum frequency at which a circuit can operate.

Pipelining again!
Pipelining:
- splits the critical path (the path with maximum combinational delay) with memory elements between the clock cycles;
- increases the calculations per second, since the clock period per stage is reduced;
- but increases the overhead by adding memory elements.

Performance Increase
Consider a big array of combinational logic between registers.
The latency of the circuit is also the clock period:
Consider the same circuit to be pipelined into n stages.
The pipeline stage with the worst delay limits the clock period:
The latency is n times the clock period:
Ideally, each pipeline stage has equal delay:
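The statements above (worst stage limits the clock period, latency is n times the clock period) can be tried on concrete numbers. This is a small sketch; the stage delays and the per-stage register and clock overhead are hypothetical values.

```python
def pipeline_timing(stage_delays, overhead):
    """Return (clock_period, latency) for a pipeline.

    stage_delays: combinational delay of each stage (a single entry means
                  the circuit is not pipelined).
    overhead:     register and clock overhead added to every stage.
    """
    period = max(stage_delays) + overhead   # worst stage limits the clock
    latency = len(stage_delays) * period    # n clock periods end to end
    return period, latency


if __name__ == "__main__":
    overhead = 1.0  # hypothetical, in ns
    cases = {
        "unpipelined": [12.0],
        "3 stages, unbalanced": [5.0, 4.0, 3.0],
        "3 stages, balanced": [4.0, 4.0, 4.0],
    }
    for name, delays in cases.items():
        period, latency = pipeline_timing(delays, overhead)
        print(f"{name}: period = {period} ns, latency = {latency} ns")
```

The balanced split gives the shortest clock period for a given number of stages, while the latency of either pipelined version is longer than that of the unpipelined circuit because the overhead is paid once per stage.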

Performance Increase
So the minimum possible clock period for any pipeline stage is:
Thus the final latency with this ideal clock period is:
We can now calculate the speed increase of a circuit after pipelining:
If we specify the register and clock overhead as a fraction k of the total clock period of an unpipelined circuit, then we have:
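The equations for these steps did not survive in the transcript. One consistent way to write them, offered as a reconstruction under assumed notation rather than the original slide content, uses $T_{comb}$ for the total combinational delay, $T_{ov}$ for the register and clock overhead per stage, and $n$ for the number of stages:

```latex
\begin{align*}
  % Unpipelined circuit: latency equals the clock period
  T_1 &= T_{comb} + T_{ov}, & L_1 &= T_1 \\
  % n stages: the worst stage limits the clock period
  T_n &= \max_i T_{comb,i} + T_{ov}, & L_n &= n\,T_n \\
  % Ideal case: equal stage delays, T_{comb,i} = T_{comb}/n
  T_n &= \frac{T_{comb}}{n} + T_{ov}, & L_n &= T_{comb} + n\,T_{ov} \\
  % Speed increase after pipelining
  S &= \frac{T_1}{T_n} = \frac{T_{comb} + T_{ov}}{T_{comb}/n + T_{ov}} \\
  % Overhead as a fraction k of the unpipelined period: T_{ov} = k\,T_1
  S &= \frac{1}{(1-k)/n + k} = \frac{n}{1 + (n-1)\,k}
\end{align*}
```

As a check on the last line, with $n = 4$ and $k = 0.1$ the speed increase is $4 / 1.3 \approx 3.08$, and as $k \to 0$ it approaches the ideal value $n$.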

Clock gating and reset strategies

Clock gating
In the traditional synchronous design style, the system clock is connected to the clock pin of every flip-flop in the design. This results in three major components of power consumption:
1. Power consumed by combinational logic whose values change on each clock edge (due to flops driving those combinational cells).
2. Power consumed by the flip-flops (this is non-zero even if the inputs of the flip-flops, and therefore their internal state, are not changing).
3. Power consumed by the clock tree buffers in the design.
Gating the clock path substantially reduces the power consumed by a flip-flop.
Simple gate-based clock gating requires all enable signals to be held constant from the active (rising) edge of the clock until the inactive (falling) edge, to avoid truncating the generated clock pulse prematurely or generating multiple clock pulses (glitches on the clock). This style is to be discarded.

Clock gating
The latch-based clock gating style adds a level-sensitive latch to the design to hold the enable signal from the active edge of the clock until the inactive edge of the clock.
The latch captures the state of the enable signal and holds it until the complete clock pulse has been generated. The enable signal need only be stable around the rising edge of the clock.
Only one input of the gate that turns the clock on and off changes at a time, ensuring that the output is free from glitches or spikes.
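The difference between plain AND gating and latch-based gating can be seen in a small discrete-time simulation. This is an illustrative sketch only: the sample resolution, the enable waveform and all function names are assumptions, and the latch is modelled as transparent while the clock is low.

```python
HALF = 4  # samples per clock half-period (arbitrary resolution)


def make_clock(n_samples):
    """Square wave that starts low: HALF samples low, HALF samples high."""
    return [(t // HALF) % 2 for t in range(n_samples)]


def and_gated(clk, en):
    """Plain AND gating: the enable reaches the gate unfiltered."""
    return [c & e for c, e in zip(clk, en)]


def latch_gated(clk, en):
    """Latch-based gating: enable is captured while clk is low and held while high."""
    out, q = [], 0
    for c, e in zip(clk, en):
        if c == 0:
            q = e          # latch transparent during the low phase
        out.append(c & q)  # gate sees only the held enable while clk is high
    return out


def pulse_widths(signal):
    """Lengths (in samples) of every run of 1s in the signal."""
    widths, run = [], 0
    for level in signal:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


if __name__ == "__main__":
    clk = make_clock(32)
    # Enable falls in the middle of one high phase and rises in the middle of another.
    en = [1 if (2 <= t < 14 or t >= 22) else 0 for t in range(32)]
    print("AND gating pulse widths:  ", pulse_widths(and_gated(clk, en)))
    print("latch gating pulse widths:", pulse_widths(latch_gated(clk, en)))
```

In this run the AND-gated clock contains two pulses shorter than a half-period, while every pulse of the latch-gated clock is a full half-period wide, because the enable can only change the gate input while the clock is low.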

Synchronous reset
Synchronous resets are based on the premise that the reset signal will only affect or reset the state of the flip-flop on the active edge of a clock.
Advantages of using synchronous resets:
- Synchronous resets generally ensure that the circuit is 100% synchronous.
- Synchronous reset logic will synthesize to smaller flip-flops, particularly if the reset is gated with the logic generating the flop input.
- Synchronous resets ensure that reset can only occur at an active clock edge. The clock works as a filter for small reset glitches.
Disadvantages:
- Synchronous resets may need a pulse stretcher to guarantee a reset pulse wide enough to be present during an active edge of the clock.
- A synchronous reset requires a clock in order to reset the circuit. This may be a problem in some cases where a gated clock is used to save power.

Gated Clocks vs Synchronous reset
[Figure: eight-bit binary counter Q(7 downto 0) with gated clock (not recommended): CountEn gates Clk, and the register loads Q + "00000001"]
[Figure: eight-bit binary counter Q(7 downto 0) with free-running clock (recommended): CountEn drives a multiplexer that selects between Q and Q + "00000001" at the register input]

Asynchronous Reset
Asynchronous reset flip-flops incorporate a reset pin into the flip-flop design.
The most obvious advantage favoring asynchronous resets is that the circuit can be reset with or without a clock present.
The biggest problem with asynchronous resets is that they are asynchronous, both at assertion and at de-assertion. The assertion is a non-issue; the de-assertion is the issue. If the asynchronous reset is released at or near the active clock edge of a flip-flop, the output of the flip-flop could go metastable.
Another problem that an asynchronous reset can have, depending on its source, is spurious resets due to noise or glitches on the board or system reset.

Reset synchronizer
The reset synchronizer logic is designed to take advantage of the best of both asynchronous and synchronous reset styles.
An external reset signal asynchronously resets a pair of flip-flops, which in turn drive the master reset signal asynchronously through the reset buffer tree to the rest of the flip-flops in the design. The entire design will be asynchronously reset.

Reset Glitch Filtering

Multiple clock domains

Multiple Clock domains
Designs with multiple clocks can have clocks with different frequencies and/or clocks with the same frequency but different phases between them.
Metastability issues may arise; the system design must be partitioned so that each module works on one clock only.
A synchronizer module has to be made for all signals that cross from one clock domain to another.

Synchronous Clock domain crossing
Clocks originating from the same clock root and having a known phase and frequency relationship between them are known as synchronous clocks. A crossing between such clocks is known as a synchronous clock domain crossing.
Depending on the frequency and phase relationship, synchronizers may or may not be needed.
Synchronous clock domain crossing can be divided into several categories:
- Clocks with the same frequency and zero phase difference: no clock domain crossing.

Synchronous Clock domain crossing (continued)
- Clocks with the same frequency and constant phase difference: tighter constraint on the combinational logic delay due to smaller setup/hold margins.
- Integer multiple clocks: the minimum possible phase difference between the active edges of the two clocks is always equal to the period of the fast clock, i.e. one complete cycle of the faster clock is always available for sampling.

Synchronous Clock domain crossing (continued)
- Rational multiple clocks: the minimum phase difference between the two clocks can be small enough to cause metastability, so synchronization needs to be done.

Transfer of control signals
Two or more flip-flops are cascaded to form a synchronizing circuit.
As previously seen, if the first flop of a synchronizer produces a metastable output, the metastability may be resolved before the output is sampled by the second flip-flop.
This method does not guarantee that the output of the second flip-flop will never go metastable, but it does decrease the probability of metastability.
Adding more flops to the synchronizer will further reduce the probability of metastability.

Transfer of data signals: Synchronous FIFO
Simple synchronous FIFO architecture: reading and writing are done on the same clock.
The read and write addresses are generated by two pointers. A valid write enable increments the write pointer and a valid read enable increments the read pointer.
A Status Block generates the fifo_empty and fifo_full signals.
The Dual Port Memory (DPRAM) can have either synchronous reads (an explicit read signal is provided before the FIFO output is valid) or asynchronous reads (valid data is available as soon as it is written).

FIFO_FULL and FIFO_empty
The FIFO is either full or empty when the read pointer equals the write pointer: it is necessary to distinguish between these two conditions.
The FIFO becomes full when a write causes both pointers to become equal in the next clock. This gives the following condition for asserting the fifo_full signal:
fifo_full = (read_pointer == (write_pointer + 1)) AND write
An alternative approach uses a counter that constantly indicates the number of full or empty locations left in the FIFO: additional hardware is needed.
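The pointer-based flag generation can be illustrated with a small behavioural model. This is a sketch, not the design from the slides: the class and method names are hypothetical, and the fifo_empty condition is an assumption obtained by mirroring the fifo_full condition (a read that makes the pointers equal).

```python
class SyncFifo:
    """Behavioural model of a synchronous FIFO with registered full/empty flags."""

    def __init__(self, depth):
        self.depth = depth
        self.mem = [None] * depth
        self.rd_ptr = 0
        self.wr_ptr = 0
        self.full = False
        self.empty = True

    def clock(self, write=False, data=None, read=False):
        """Apply one clock edge; returns the word read, or None."""
        do_write = write and not self.full   # writes to a full FIFO are ignored
        do_read = read and not self.empty    # reads from an empty FIFO are ignored
        next_wr = (self.wr_ptr + 1) % self.depth
        next_rd = (self.rd_ptr + 1) % self.depth

        out = self.mem[self.rd_ptr] if do_read else None
        if do_write:
            self.mem[self.wr_ptr] = data

        if do_write and not do_read:
            # fifo_full = (read_pointer == write_pointer + 1) AND write
            self.full = self.rd_ptr == next_wr
            self.empty = False
        elif do_read and not do_write:
            # Mirrored condition (assumption): a read makes the pointers equal.
            self.empty = self.wr_ptr == next_rd
            self.full = False
        # Simultaneous read and write: occupancy and flags are unchanged.

        if do_write:
            self.wr_ptr = next_wr
        if do_read:
            self.rd_ptr = next_rd
        return out


if __name__ == "__main__":
    fifo = SyncFifo(depth=4)
    for word in "abcd":
        fifo.clock(write=True, data=word)
    print("full after 4 writes:", fifo.full)
    print("read order:", [fifo.clock(read=True) for _ in range(4)])
    print("empty after 4 reads:", fifo.empty)
```

Registering the flags at the moment the pointers become equal is what distinguishes full from empty, since the pointer comparison alone is the same in both cases.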

Transfer of data signals: Asynchronous FIFO
An asynchronous FIFO is used to transfer data across two asynchronous clock domains.
Unlike handshake signaling, an asynchronous FIFO is used in performance-critical designs, where clock latency is a factor rather than system resources.
The same approach as the synchronous FIFO can be used, with special care taken in generating the FIFO empty and FIFO full signals to avoid metastability conditions.

Transfer of data signals: Asynchronous FIFO

FIFO full timings

FIFO full timings
Since a typical synchronizer circuit consists of at least two FFs, synchronizing the read pointer to the write clock means that a change in the read pointer is seen only after two write clocks.
This blocks additional writes to the FIFO for a few extra cycles, but it is harmless.

FIFO empty timings

FIFO empty timings
For the FIFO empty calculation, the write pointer is synchronized to the read clock and compared against the read pointer.
Because of this, the read side sees delayed writes (a signal delayed by two clocks) and may still indicate FIFO empty even though the FIFO actually holds some data, but this is harmless.

Transfer of data signals: Asynchronous FIFO
Suppose the full and empty signals are generated using a counter that is changing from FFF to 000. Metastability can be avoided by synchronizing the counter, but the sampled value may still be widely off the mark (e.g. when the counter is sampled in the middle of its update).
A possible solution is to count in Gray code, in which only one bit changes from one number to the next. With a synchronized Gray counter the sampled value rarely goes metastable, and the sampled value has at most a one-bit error.
Synchronizing the read or write pointer to the other clock domain means that a pointer change is seen only after two (N) clocks of that domain. This blocks additional writes or reads on the FIFO for a few extra cycles, but it is harmless.
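The FFF to 000 case can be checked directly. The sketch below uses the standard binary-reflected Gray code conversion; the 12-bit width matches the FFF example, and the helper names are illustrative.

```python
WIDTH = 12  # bits, matching the FFF -> 000 example


def bin_to_gray(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Inverse conversion: XOR of all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    last, first = (1 << WIDTH) - 1, 0  # 0xFFF and 0x000
    print("binary bits changing on wrap:", bits_changed(last, first))
    print("Gray bits changing on wrap:  ",
          bits_changed(bin_to_gray(last), bin_to_gray(first)))
```

In binary all twelve bits change on the wrap, so a sample taken during the update could land on almost any value; in Gray code a single bit changes, so the sampled pointer is either the old value or the new one.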