Topics of Chapter 5: Sequential Machines
Memory elements.
Basics of sequential machines.
Clocking issues.
Two-phase clocking.
Testing of combinational (Chapter 4) and sequential (Chapter 5) circuits.

Modern VLSI Design 3e: Chapter 5. Copyright 1998, 2002 Prentice Hall PTR.

Memory elements
Store a value as controlled by the clock.
May have a load signal, etc.
In CMOS, memory is created by: capacitance (dynamic); feedback (static).

Memory element terminology
Latch: transparent when internal memory is being set from input.
Flip-flop: not transparent; reading the input and changing the output are separate events.

Clock terminology
Clock edge: rising or falling transition.
Duty cycle: fraction of the clock period for which the clock is active (e.g., for an active-low clock, the fraction of time the clock is 0).
Memory element parameters
Setup time: time before the clock event during which the data input must be stable.
Hold time: time after the clock event (in the example: falling edge) for which the data input must remain stable.
[Figure: clock and data waveforms]

Dynamic latch
Stores charge on inverter gate capacitance:
[Figure: dynamic latch circuit]

Latch characteristics
Uses a complementary transmission gate to ensure that the storage node is always strongly driven.
Latch is transparent when the transmission gate is closed.
Storage capacitance comes primarily from inverter gate capacitance.

Latch operation
φ = 0: transmission gate is off, inverter output is determined by the storage node.
φ = 1: transmission gate is on, inverter output follows the input.
Setup and hold times are determined by the transmission gate: they must ensure that the value stored through the transmission gate is solid.
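The latch operation described above can be sketched as a small behavioral model. This is a minimal sketch, not a circuit simulation: the class name DynamicLatch and the 0/1 logic abstraction are assumptions introduced here, and charge leakage is ignored.

```python
class DynamicLatch:
    """Behavioral sketch: transmission gate feeding an inverter.

    The storage node stands in for the inverter gate capacitance.
    Names and the 0/1 abstraction are illustrative assumptions.
    """

    def __init__(self, stored=0):
        self.stored = stored  # logic value held on the storage node

    def step(self, phi, d):
        # phi = 1: transmission gate on, storage node follows the input.
        if phi == 1:
            self.stored = d
        # phi = 0: gate off, the node keeps its old value.
        # The inverter always drives the complement of the storage node.
        return 1 - self.stored


latch = DynamicLatch()
trace = [latch.step(phi, d) for phi, d in [(1, 1), (0, 0), (0, 1), (1, 0)]]
```

While φ = 0 the output ignores changes on the input, and it follows the (inverted) input again as soon as φ returns to 1.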
Stored charge leakage
Stored charge leaks away due to reverse-bias leakage current.
Stored value is good for about 1 ms.
Value must be rewritten to remain valid.
If not loaded every cycle, must ensure that the latch is loaded often enough to keep data valid.

Layout
[Figure: latch layout with VDD and VSS rails]

Non-dynamic latches
Must use feedback to restore value.
Some latches are static on one phase (pseudo-static): load on one phase, activate feedback on the other phase.

Recirculating latch
Static on one phase:
[Figure: recirculating latch circuit]
L is a combination of the inverted clock and an enable signal.
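The retention figure on the stored charge leakage slide can be related to circuit quantities with the first-order estimate t = C * ΔV / I. The sketch below uses that estimate; the capacitance, tolerable voltage droop, and leakage current are illustrative assumptions chosen to land near 1 ms, not values from the text.

```python
def retention_time(c_storage, delta_v, i_leak):
    """First-order time for a constant leakage current to discharge
    the storage node by delta_v: t = C * dV / I."""
    return c_storage * delta_v / i_leak


# Assumed, illustrative values (not from the source):
C_STORAGE = 10e-15   # 10 fF of gate capacitance
DELTA_V = 1.0        # 1 V of droop tolerated before the value is lost
I_LEAK = 10e-12      # 10 pA of reverse-bias leakage

t_hold = retention_time(C_STORAGE, DELTA_V, I_LEAK)
min_refresh_hz = 1.0 / t_hold  # the latch must be reloaded at least this often
```

Under these assumptions the node holds its value for roughly a millisecond, so a latch that is not loaded every cycle still needs reloading on the order of a thousand times per second.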
Clocked inverter
[Figure: clocked inverter circuit and symbol]

Clocked inverter operation
φ = 0: both clocked transistors are off, output is floating.
φ = 1: both clocked transistors are on; circuit acts as an inverter to drive the output.

Clocked inverter latch
[Figure: latch built from inverters i1, i2, and i3]

Clocked inverter latch operation
φ = 0: i1 is off, i2-i3 form a feedback circuit.
φ = 1: i2 is off, breaking feedback; i1 is on, driving i3 and the output.
Latch is transparent when φ = 1.
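The clocked inverter latch operation above can be sketched behaviorally. This is a minimal sketch under stated assumptions: a floating output is modeled as None, i1 and i2 are clocked inverters enabled on opposite phases, i3 is a static inverter, and the output is taken from i3. The function and class names are introduced here for illustration.

```python
def clocked_inverter(enabled, a):
    """Return the inverted input when enabled, or None for a floating output."""
    return (1 - a) if enabled else None


class ClockedInverterLatch:
    """Assumed topology: i1 drives the internal node from the input,
    i3 inverts the node to the output, i2 feeds the output back."""

    def __init__(self, node=1):
        self.node = node  # internal storage node between i1/i2 and i3

    def step(self, phi, d):
        out = 1 - self.node                          # i3: static inverter
        forward = clocked_inverter(phi == 1, d)      # i1: on when phi = 1
        feedback = clocked_inverter(phi == 0, out)   # i2: on when phi = 0
        # Exactly one of i1 and i2 drives the node in each phase.
        self.node = forward if forward is not None else feedback
        return 1 - self.node


latch = ClockedInverterLatch()
outputs = [latch.step(phi, d) for phi, d in [(1, 1), (0, 0), (1, 0), (0, 1)]]
```

With φ = 1 the output follows the input; with φ = 0 the i2-i3 loop regenerates the stored value, so it does not decay the way a purely dynamic node does.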
Flip-flops
Not transparent: use multiple storage elements to isolate output from input.
Major varieties: master-slave; edge-triggered.

Master-slave flip-flop
[Figure: master latch followed by slave latch]

Master-slave operation
φ = 0: master latch is disabled; slave latch is enabled, but master latch output is stable, so output does not change.
φ = 1: master latch is enabled, loading value from input; slave latch is disabled, maintaining old output value.

Positive edge-triggered flip-flop
[Figure: positive edge-triggered flip-flop circuit with elements labeled 0/1, i1, and i2, shown for φ = 0 and φ = 1]
Only the input value at the rising clock transition is captured!
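The master-slave operation above can be illustrated with two level-sensitive latches enabled on opposite clock levels. This is a behavioral sketch that uses the polarity stated on the master-slave slide (master enabled when φ = 1, slave enabled when φ = 0); the class names are assumptions introduced here.

```python
class Latch:
    """Level-sensitive latch: transparent while enabled, holds otherwise."""

    def __init__(self):
        self.q = 0

    def step(self, enabled, d):
        if enabled:
            self.q = d
        return self.q


class MasterSlaveFF:
    """Two latches in series, enabled on opposite clock levels."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def step(self, phi, d):
        m = self.master.step(phi == 1, d)     # master loads while phi = 1
        return self.slave.step(phi == 0, m)   # slave passes it on while phi = 0


ff = MasterSlaveFF()
stimulus = [(1, 1), (1, 0), (0, 1), (1, 1), (0, 0), (0, 1)]
trace = [ff.step(phi, d) for phi, d in stimulus]
```

The output never follows the input directly: it changes only when the clock switches from the master's phase to the slave's phase, and it takes the last value the master saw.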
Edge-triggered flip-flop
The edge-triggered flip-flop is the most frequently used memory element in sequential circuitry.
It requires the data to be stable only within the setup and hold window.
A design style is robust when the entire circuit uses exclusively positive (or exclusively negative) edge-triggered flip-flops.

Sequential machines
Use memory elements to make primary output values depend on state + primary inputs.
Varieties:
Mealy: outputs are a function of present state and inputs;
Moore: outputs depend only on state.

Sequential machine definition
Machine computes next state N and primary outputs O from current state S and primary inputs I.
Next-state function: N = δ(I, S).
Output function (Mealy): O = λ(I, S).
Output function (Moore): O = λ(S).

FSM structure (Mealy)
[Figure: primary inputs and current state feed combinational logic, which produces primary outputs and next state; memory holds the state]
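The next-state and output functions in the sequential machine definition map directly onto code. The sketch below runs a Mealy and a Moore machine over the same input sequence; the example machine (output 1 once the input has been 1 for two consecutive cycles) and all state names are hypothetical, chosen only to show the difference between O = λ(I, S) and O = λ(S).

```python
def run_mealy(delta, lam, s0, inputs):
    """Mealy machine: output is a function of input and current state."""
    s, outs = s0, []
    for i in inputs:
        outs.append(lam(i, s))
        s = delta(i, s)
    return outs


def run_moore(delta, lam, s0, inputs):
    """Moore machine: output is a function of current state only."""
    s, outs = s0, []
    for i in inputs:
        outs.append(lam(s))
        s = delta(i, s)
    return outs


# Hypothetical Mealy version: two states remember the previous input.
mealy_delta = lambda i, s: "one" if i == 1 else "zero"
mealy_lam = lambda i, s: 1 if (s == "one" and i == 1) else 0

# Hypothetical Moore version: a third state is needed to carry the output.
def moore_delta(i, s):
    if i == 0:
        return "s0"
    return {"s0": "s1", "s1": "s2", "s2": "s2"}[s]

moore_lam = lambda s: 1 if s == "s2" else 0

inputs = [1, 1, 0, 1, 1, 1]
mealy_out = run_mealy(mealy_delta, mealy_lam, "zero", inputs)
moore_out = run_moore(moore_delta, moore_lam, "s0", inputs)
```

In this example the Moore outputs are the Mealy outputs delayed by one cycle, and the Moore machine needs one more state, which is a common trade-off between the two styles.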
FSM structure (Moore)
[Figure: primary inputs and current state feed combinational logic that produces the next state; memory holds the state; separate combinational logic computes primary outputs from the current state]

Constraints on structure
No combinational cycles.
All components must have bounded delay.

Flip-flop rules
Primary inputs change after clock (φ) edge.
Primary inputs must stabilize before next clock edge.
Rules allow changes to propagate through combinational logic for next cycle.
Flip-flop outputs hold current-state values for next-state computation.

Signals in flip-flop system
[Figure: signal waveforms relative to the positive clock edge]
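The flip-flop rules above amount to keeping data transitions out of the setup and hold window around each clock edge. A minimal sketch of that check follows; the function name, the time units, and the example numbers are assumptions for illustration.

```python
def violates_setup_hold(edge, transitions, t_setup, t_hold):
    """Return True if any data transition falls strictly inside the
    window (edge - t_setup, edge + t_hold) around the clock edge."""
    return any(edge - t_setup < t < edge + t_hold for t in transitions)


# Assumed example: clock edge at t = 10, setup 2, hold 1 (arbitrary units).
EDGE, T_SETUP, T_HOLD = 10.0, 2.0, 1.0

early_ok = violates_setup_hold(EDGE, [3.0, 7.5], T_SETUP, T_HOLD)
too_late = violates_setup_hold(EDGE, [9.0], T_SETUP, T_HOLD)
too_soon_after = violates_setup_hold(EDGE, [10.5], T_SETUP, T_HOLD)
late_ok = violates_setup_hold(EDGE, [11.5], T_SETUP, T_HOLD)
```

Inputs that settle before the setup window opens, or change only after the hold window closes, pass the check; anything inside the window is flagged.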
Signal skew
Machine data signals must obey setup and hold times: avoid signal skew.
[Figure: waveforms for signals a, b, and x with skew δ, showing stable intervals and a region with no stable signal]

Clock skew (1)
Clock must arrive at all memory elements in time to load data.
The maximum difference between clock arrival times is the clock skew.

Clock skew (2)
Clock skew values larger than the flip-flop input-output delay lead to malfunctioning: some computations will be based on the next state rather than the current state.
[Figure: flip-flop delay ε and clock skew δ; is δ > ε?]

Clock distribution (Chapter 7)
Goals: deliver clock to all memory elements with acceptable skew; deliver clock edges with acceptable sharpness.
Clocking network design is one of the greatest challenges in the design of a large chip.
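The clock skew definition and the δ > ε condition from the clock skew slides can be checked numerically. This is a simplified sketch: it applies only the criterion stated on the slide and ignores any logic delay between flip-flops and their hold times. The arrival times and delays are made-up example values in picoseconds.

```python
def clock_skew(arrival_times):
    """Clock skew: maximum difference between clock arrival times."""
    return max(arrival_times) - min(arrival_times)


def skew_is_safe(arrival_times, ff_delay):
    """Slide criterion only: skew must not exceed the flip-flop
    input-output delay (logic delay and hold time are ignored)."""
    return clock_skew(arrival_times) <= ff_delay


# Assumed arrival times at four flip-flops, in picoseconds.
arrivals_ps = [100, 250, 180, 300]

skew_ps = clock_skew(arrivals_ps)
fast_ff_ok = skew_is_safe(arrivals_ps, ff_delay=150)
slow_ff_ok = skew_is_safe(arrivals_ps, ff_delay=250)
```

With a flip-flop delay shorter than the skew, a late-clocked flip-flop can capture a value that an early-clocked one has already updated, which is the next-state error the slide warns about.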
Clock delay varies with position
[Figure: clock delay as a function of position on the chip]

H-tree
[Figure: H-tree clock network]

Clock distribution tree
Clocks are generally distributed via wiring trees.
Want to use low-resistance interconnect to minimize delay.
Use multiple drivers to distribute driver requirements: use optimal sizing principles to design buffers.
Clock lines can create significant crosstalk.

Clock tree
Clock trees are used to balance the delay from the clock source to the flip-flops.
In current-day practice, clock trees are generated during layout because wiring delay is significant.
[Figure: clock tree distributing clk to FF1, FF2, FF3, and FF4]
Clock distribution in high-performance ICs
On-chip clock generation with a PLL (phase-locked loop).
Clock networks shaped as H-trees or grids.
De-skewing circuits to control local skew.
See DEC Alpha example on page 380.
See also: Zhu, Q.K., High-Speed Clock Network Design, Kluwer Academic Publishers, Boston (2003).

Latch-based machines
Latches do not cut combinational logic when clock is active.
Latch-based machines must use multiple ranks of latches.
Multiple ranks require multiple phases of clock.

Two-sided latch constraint
Latch must be open less than the shortest combinational delay.
Period between latching operations must be longer than the longest combinational delay.
[Figure: combinational logic feeding back through a single latch]

Strict two-phase clocking discipline
Strict two-phase discipline is conservative but works.
Can be relaxed later with proper knowledge of constraints.
Strict two-phase machine makes latch-based machine behave more like flip-flop design, but requires multiple phases.
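The two-sided latch constraint above can be written as a pair of inequalities on the combinational path delays. A minimal sketch, assuming all times are in the same unit and that the list of path delays is known; the function name and the numbers are illustrative.

```python
def two_sided_ok(open_time, period, path_delays):
    """Check both sides of the single-latch constraint:
    - the latch is open for less than the shortest combinational delay
    - the time between latching operations exceeds the longest delay
    """
    return open_time < min(path_delays) and period > max(path_delays)


# Assumed combinational path delays in nanoseconds.
paths_ns = [4, 9, 15]

ok = two_sided_ok(open_time=3, period=20, path_delays=paths_ns)
races_through = two_sided_ok(open_time=5, period=20, path_delays=paths_ns)
too_fast = two_sided_ok(open_time=3, period=12, path_delays=paths_ns)
```

The lower bound is what makes single-latch designs fragile: a short path can race through the open latch, which is the problem the strict two-phase discipline removes.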
Strict two-phase architecture
[Figure: strict two-phase machine architecture]

Two-phase clock
Phases must not overlap:
[Figure: φ1 and φ2 waveforms with the non-overlap region marked]

Why it works
Each phase has a one-sided constraint: phase must be long enough for all combinational delays.
If there are no combinational loops, phases can always be stretched to make that section of the machine work.
Total clock period depends on sum of phase periods.

Two-coloring
[Figure: two blocks of combinational logic separated by φ1 and φ2 latches, with signals labeled by phase: I1(s2), O1(s2), s1 on one side; I2(s1), O2(s1), s2 on the other]
Clock period
For each phase, phase period must be longer than sum of: combinational delay; latch propagation delay.
Phase period depends on longest path.

Unbalanced delays
Logic with unbalanced delays leads to inefficient use of logic:
[Figure: stage that would allow a short clock period next to a stage that requires a long clock period]

Retiming
Retiming moves memory elements through combinational logic:
[Figure: register moved across a block of combinational logic]

Retiming properties
Retiming changes encoding of values in registers, but proper values can be reconstructed with combinational logic.
Retiming may increase number of registers required.
Retiming must preserve number of latches around a cycle.
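The clock period rule above (each phase longer than the longest combinational delay plus the latch propagation delay) gives a lower bound on the period of a strict two-phase machine. A minimal sketch, assuming the non-overlap interval between phases is negligible; the function names and delay values are illustrative.

```python
def min_phase_period(path_delays, latch_delay):
    """Lower bound for one phase: longest path plus latch propagation delay."""
    return max(path_delays) + latch_delay


def min_clock_period(phase1_paths, phase2_paths, latch_delay):
    """Lower bound for the full period: sum of the two phase bounds.
    Assumes the non-overlap region is negligible."""
    return (min_phase_period(phase1_paths, latch_delay)
            + min_phase_period(phase2_paths, latch_delay))


# Assumed path delays in nanoseconds for the logic behind each phase.
phase1_ns = [12, 20, 8]
phase2_ns = [30, 25]
LATCH_NS = 2

p1 = min_phase_period(phase1_ns, LATCH_NS)
p2 = min_phase_period(phase2_ns, LATCH_NS)
period = min_clock_period(phase1_ns, phase2_ns, LATCH_NS)
```

Only the longest path in each phase matters, so shortening any other path does not reduce the period; this is why unbalanced logic wastes time and why retiming can help.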
Advanced performance analysis
Latch-based systems always have some idle logic.
Can increase performance by blurring phase boundaries.
Results in cycle time closer to average of phases.

Example with unbalanced stages
One stage is much longer than the other:
[Figure: two-phase machine with unbalanced stages]

Spreading out a phase
[Figure: φ1 latch, combinational logic (30 ns), φ2 latch, combinational logic (70 ns); φ1 = high for 50 ns, φ2 = high for 50 ns]

Problems
Hard to debug: can't stop the system.
Hard to initialize system state.
More sensitive to process variations.
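The benefit of spreading out a phase can be seen from the 30 ns and 70 ns stages in the example. The sketch below compares two estimates under assumptions not stated in the text: the clock phases are symmetric, and latch delays and non-overlap time are ignored.

```python
# Stage delays from the example, in nanoseconds.
stage_delays_ns = [30, 70]

# Strict discipline with symmetric phases (assumption): each phase must be
# long enough for the slower stage, so both phases are sized by the maximum.
strict_symmetric_ns = len(stage_delays_ns) * max(stage_delays_ns)

# Blurred phase boundaries: the long stage borrows time from the short one,
# so the period is set by the total delay (each phase is the average).
spread_ns = sum(stage_delays_ns)
phase_high_ns = spread_ns // len(stage_delays_ns)
```

Under these assumptions the period drops from 140 ns to 100 ns, with each phase high for 50 ns, which matches the phase lengths shown in the example.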
Sequential machine design
Two ways to specify sequential machine:
structure: interconnection of logic gates and memory elements;
function: Boolean description of next-state and output functions.
Best way depends on type of machine being described.

State transition graphs/tables
Basic functional description of FSM.
Symbolic truth table for next-state, output functions: no structure of logic; no encoding of states.
State transition graph and table are functionally equivalent.

State assignment
Encoding bits in symbolic state = state assignment.
State assignment affects: combinational logic area; combinational logic delay; memory element area.

Power optimization
Memory elements stop glitch propagation:
[Figure: memory element placed between logic blocks to block glitches]
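The relationship between a symbolic state transition table and a state assignment can be sketched in code. The three-state machine below is hypothetical, and the two encodings shown (dense binary and one-hot) are common choices picked for illustration rather than a method taken from the text.

```python
# Hypothetical symbolic table: (state, input) -> (next state, output).
transitions = {
    ("idle", 0): ("idle", 0), ("idle", 1): ("run", 0),
    ("run", 0): ("idle", 0),  ("run", 1): ("done", 1),
    ("done", 0): ("idle", 0), ("done", 1): ("done", 1),
}
states = sorted({s for s, _ in transitions})


def binary_assignment(states):
    """Dense binary codes: fewest memory elements."""
    width = max(1, (len(states) - 1).bit_length())
    return {s: format(k, f"0{width}b") for k, s in enumerate(states)}


def one_hot_assignment(states):
    """One bit per state: more memory elements, often simpler logic."""
    n = len(states)
    return {s: format(1 << k, f"0{n}b") for k, s in enumerate(states)}


def encode(table, codes):
    """Replace symbolic states in the table with their assigned codes."""
    return {(codes[s], i): (codes[n], o) for (s, i), (n, o) in table.items()}


binary = binary_assignment(states)
one_hot = one_hot_assignment(states)
encoded = encode(transitions, binary)
```

The same symbolic table yields different logic depending on the codes: the binary assignment needs two memory elements here and the one-hot assignment needs three, which is the area trade-off the state assignment slide lists.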