e.g. τ = 12 ps in 180nm, 40 ps in 0.6 µm Delay has two components where, f = Effort Delay (stage effort)= gh p =Parasitic Delay



Similar documents
Lecture 5: Logical Effort

Gate Delay Model. Estimating Delays. Effort Delay. Gate Delay. Computing Logical Effort. Logical Effort

Weste04r4.fm Page 67 Monday, January 5, :24 AM. 4.1 Introduction

Module 4 : Propagation Delays in MOS Lecture 22 : Logical Effort Calculation of few Basic Logic Circuits

Optimization and Comparison of 4-Stage Inverter, 2-i/p NAND Gate, 2-i/p NOR Gate Driving Standard Load By Using Logical Effort

Lecture 5: Gate Logic Logic Optimization

Sequential 4-bit Adder Design Report

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

CMOS Binary Full Adder

CSE140 Homework #7 - Solution

Latches, the D Flip-Flop & Counter Design. ECE 152A Winter 2012

Pass Gate Logic An alternative to implementing complex logic is to realize it using a logic network of pass transistors (switches).

Analog & Digital Electronics Course No: PH-218

Here we introduced (1) basic circuit for logic and (2)recent nano-devices, and presented (3) some practical issues on nano-devices.

Adder.PPT(10/1/2009) 5.1. Lecture 13. Adder Circuits

RAM & ROM Based Digital Design. ECE 152A Winter 2012

Gates & Boolean Algebra. Boolean Operators. Combinational Logic. Introduction

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

Counters and Decoders

UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Clocking. Figure by MIT OCW Spring /18/05 L06 Clocks 1

Step Response of RC Circuits

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

Contents. Overview Memory Compilers Selection Guide

Chapter 10 Advanced CMOS Circuits

SECTION 2 Transmission Line Theory

COMBINATIONAL and SEQUENTIAL LOGIC CIRCUITS Hardware implementation and software design

Three-Phase Dual-Rail Pre-Charge Logic

E158 Intro to CMOS VLSI Design. Alarm Clock

GETTING STARTED WITH PROGRAMMABLE LOGIC DEVICES, THE 16V8 AND 20V8

High Speed Gate Level Synchronous Full Adder Designs

Layout of Multiple Cells

Alpha CPU and Clock Design Evolution

ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7

PMOS Testing at Rochester Institute of Technology Dr. Lynn Fuller

LFSR BASED COUNTERS AVINASH AJANE, B.E. A technical report submitted to the Graduate School. in partial fulfillment of the requirements

CHAPTER 11: Flip Flops

Lecture 10 Sequential Circuit Design Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

1.1 Silicon on Insulator a brief Introduction

3.Basic Gate Combinations

ATE-A1 Testing Without Relays - Using Inductors to Compensate for Parasitic Capacitance

CMOS Power Consumption and C pd Calculation

Chapter 6 TRANSISTOR-TRANSISTOR LOGIC. 3-emitter transistor.

Lecture 10: Latch and Flip-Flop Design. Outline

Content Map For Career & Technology

CMOS Thyristor Based Low Frequency Ring Oscillator

ANALOG & DIGITAL ELECTRONICS

Lecture 8: Synchronous Digital Systems

Module 7 : I/O PADs Lecture 33 : I/O PADs

Lecture 11: Sequential Circuit Design

Sequential Circuits. Combinational Circuits Outputs depend on the current inputs

Digital to Analog Converter. Raghu Tumati

Vi, fi input. Vphi output VCO. Vosc, fosc. voltage-controlled oscillator

Rectifier circuits & DC power supplies

An Introduction to the EKV Model and a Comparison of EKV to BSIM

ECE124 Digital Circuits and Systems Page 1

Subthreshold Real-Time Counter.

United States Naval Academy Electrical and Computer Engineering Department. EC262 Exam 1

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

Digital Integrated Circuit (IC) Layout and Design

NAME AND SURNAME. TIME: 1 hour 30 minutes 1/6

Push-Pull FET Driver with Integrated Oscillator and Clock Output

International Journal of Electronics and Computer Science Engineering 1482

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

PowerPC Microprocessor Clock Modes

Spread-Spectrum Crystal Multiplier DS1080L. Features

Class 11: Transmission Gates, Latches

TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING

Phase-Locked Loop Based Clock Generators

Document Contents Introduction Layout Extraction with Parasitic Capacitances Timing Analysis DC Analysis

Gates, Circuits, and Boolean Algebra

exclusive-or and Binary Adder R eouven Elbaz reouven@uwaterloo.ca Office room: DC3576

Timing Methodologies (cont d) Registers. Typical timing specifications. Synchronous System Model. Short Paths. System Clock Frequency

Lecture 10: Sequential Circuits

Sequential Logic Design Principles.Latches and Flip-Flops

Understanding Logic Design

A Lesson on Digital Clocks, One Shots and Counters

A Lesson on Digital Clocks, One Shots and Counters

An All-Digital Phase-Locked Loop with High Resolution for Local On-Chip Clock Synthesis

CHAPTER 3 Boolean Algebra and Digital Logic

Digital Logic Elements, Clock, and Memory Elements

Two-level logic using NAND gates

The MOS Transistor in Weak Inversion

+5 V Powered RS-232/RS-422 Transceiver AD7306

Gates. J. Robert Jump Department of Electrical And Computer Engineering Rice University Houston, TX 77251

Application Examples

Sequential Logic. (Materials taken from: Principles of Computer Hardware by Alan Clements )

The components. E3: Digital electronics. Goals:

Lecture 12: More on Registers, Multiplexers, Decoders, Comparators and Wot- Nots

ELEC EXPERIMENT 1 Basic Digital Logic Circuits

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

CSE140: Components and Design Techniques for Digital Systems

Sequential Logic: Clocks, Registers, etc.

Efficient Interconnect Design with Novel Repeater Insertion for Low Power Applications

CSE140: Components and Design Techniques for Digital Systems

AN-6005 Synchronous buck MOSFET loss calculations with Excel model

CHAPTER 16 MEMORY CIRCUITS

Introduction to Digital System Design

Transcription:

Logic Gate Delay Chip designers need to choose: What is the best circuit topology for a function? How many stages of logic produce least delay? How wide transistors should be? Logical Effort Helps make the above decisions. Uses a simple delay model Allows easy hand calculations Compare alternative designs easily Express delay in process independent terms d d abs τ e.g. τ 12 ps in 180nm, 40 ps in 0.6 µm Delay has two components d f+ p where, f Effort Delay (stage effort) gh p Parasitic Delay 1

Logic Gate Delay g logical Effort Measures relative ability of gate to deliver current 1 for inverter h electrical effort C out /C in Ratio of output to input capacitance Sometimes called fanout Normalized Delay: d 6 5 4 3 2 2-input NAND Inverter g 4/3 p 2 d (4/3)h + 2 g 1 p 1 d h + 1 Ef f ort Delay: f p parasitic delay Represents delay of gate driving no load Set by internal parasitic capacitance 1 0 Parasitic Delay: p 0 1 2 3 4 5 Electrical Ef f ort: h C out / C in Again d f + p gh + p 2

Logical Effort Logical Effort: It is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. Can be measured from delay vs. fanout plots Or estimate by counting transistor widths C in 3 g 3/3 A 2 1 Y C in 4 g 4/3 A B 2 2 2 2 Y A B C in 5 g 5/3 4 4 1 1 Y Gate Type Number of Inputs 1 2 3 4 n Inverter 1 NAND 4/3 5/3 6/3 (n+2)/3 NOR 5/3 7/3 9/3 (2n+1)/3 Tristate, Mux 2 2 2 2 2 XOR, XNOR 4,4 6,12,6 8,16,16,8 3

Parasitic Delay Count diffusion capacitance on the output assuming contacted diffusions. Inverter: 3 units of diffusion capacitance, parasitic delay is 3RC τ. Normalized parasitic p inv is. p inv is the ratio of diffusion capacitance to gate capacitance for a particular process. Is considered close to 1 for simplicity More refined parasitic delay estimations can be performed using Elmore delay. Internal diffusion capacitance are considered, delay grows quadratically rather than linearly as estimated by the crude method. Parasitic delay for common gates using the crude method Gate Type Number of Inputs 1 2 3 4 n Inverter 1 NAND 2 3 4 n NOR 2 3 4 n Tristate, Mux 2 4 6 8 2n 4

Example: Ring Oscillator and FO4 inverter Estimate the frequency of a N-stage ring oscillator Logical Effort g 1, Electrical Effort h 1, Parasitic Delay p 1 Stage Delay d 2 Frequency fosc 1 / (2 * N * d) 1 / 4N Period 2N (edge has to propagate twice through the ring to attain original polarity) 31 stage ring oscillator in 0.6 µm technology has frequency of ~ 200MHz. Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort g 1, Electrical Effort h 4, Parasitic Delay p 1 Stage Delay d 5 The FO4 delay is: 200ps in 0.6 µm, 60ps in 180nm, ~f/3 ns in an f µm process. 5

Multistage Logic Networks Logical Effort generalizes to multistage networks Path Logical Effort: G g i Path Electrical Effort: H C outpath ------------------------ C inpath Path Effort: F f i g i h i 10 g 1 1 h 1 x/10 x g 2 5/3 h 2 y/x y g 3 4/3 h 3 z/y z g 4 1 h 4 20/z 20 Thus by analogy is F GH? NO! Due to branching paths. 6

Multistage Logic Networks: Branching Effort 5 15 15 90 90 G 1 H 90/5 18 GH 18 h 1 (15 + 15) / 5 6 h 2 90/15 6 F g 1 g 2 h 1 h2 36 2GH Thus we need to introduce branching effort. 7

Multistage Delay Branching Effort Accounts for branching between stages in path C onpath + C offpath b ----------------------------------------------------- C onpath B b i Path Effort: F GBH Path Effort Delay: D F f i Path Parasitic Delay: P p i Path Delay: D d i D F + P 8

Designing Fast Circuits Delay is the smallest when each stage bears the same effort f ˆ, with N stages in the path f ˆ g i h i F 1 N Thus, minimum delay of N stage path is The above equation helps to find fastest possible delay without calculating gate sizes. Capacitance transformation used to used to calculate gate widths. D NF 1 N + P f ˆ gh C out g ------------ C in Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives Check work by verifying input capacitance specification is met. 9

Logical Effort Example: 3-stage path Logical Effort G(4/3) * (5/3) * (5/3) 100/27 Electrical Effort H 45/8 x Branching Effort G 3 *2 6 Path Effort FGBH 125 Best Stage Effort Parasitic Delay P 2 + 3 + 2 7 ˆ f 3 F 5 A 8 x x y y 45 B 45 Delay D 3 * 5 + 7 22 4.4 FO4 For best sizes work backward y 45 * (5/3) / 5 15 x (15 * 2) * (5/3) / 5 10 Sizes chosen to get equal rise-fall times Check path input capacitance to check values (10 + 10 + 10) (4/3) / 5 8 A P: 4 N: 4 P: 4 N: 6 P: 12 N: 3 45 B 45 10

Best Number of Stages Another important choice is the number of stages in a path Minimum number of stages does not provide best delay in all cases E.g. drive 64-bit datapath with unit inverter Initial Driver 1 1 1 1 D NF 1 N + P N( 64) 1 N + N 8 4 2.8 16 8 23 Datapath Load 64 64 64 64 N: f: D: 1 64 65 2 8 18 3 4 15 Fastest 4 2.8 15.3 11

Best Number of Stages Consider adding inverters at the end of the path? How many produce the best delay? Logic Block: n 1 Stages Path Effort F N - n 1 Extra Inverters D n 1 NF 1 N + p i +( N n 1 )ρ inv i 1 D N F 1 N ln F 1 N + F 1 N + ρ inv 0 Define best stage effort ρ F 1 N ρ inv + ρ( 1 lnρ) 0 12

Best Stage Effort ρ inv + ρ( 1 lnρ) 0 has no closed form solution Neglecting parasitics (ρ inv 0), ρ 2.718 (e) For ρ inv 1, solve numerically for ρ 3.59 Sensitivity analysis: How sensitive is the delay to using exactly the best number of stages? 2.4 < ρ < 6.0 gives delay within 15% of optimal ρ 4 is a convenient choice D(N) /D(N) 1.6 1.4 1.2 1.0 1.51 1.15 1.26 Due to the above simplification: FO4 inverter used as 'representative' logic gate delay in a particular process (ρ6) (ρ 2.4) 0.0 0.5 0.7 1.0 1.4 2.0 N / N 13

Larger Example: Register File Decoder Decoder specifications 16 word register file Each word is 32 bits wide Each bit presents a load of 3 unit-sized transistors on the word line True and complimentary versions of address bits A[3:0] are available Each address input can drive 10 unit-sized transistors. How many stages to use? A[3:0] A[3:0] 32 bits How large should each gate be? How fast can the decoder operate? 4:16 Decoder 16 Register File 16 words 14

Larger Example: Register File Decoder Decoder effort is mainly electrical and branching Electrical Effort H (32 * 3) / 10 9.6 Branching Effort B 8 If we neglect logical effort G (assume G 1) Path Effort F GBH 76.8 Number of stages N log 4 F 3.1 Three stage design: A[3] A[3] A[2] A[2] A[1] A[1] A[0] A[0] 10 10 10 10 10 10 10 10 Logical Effort G 1 * 6/3 * 1 2 Path Effort F GBH 154 Stage Effort ˆ f F 1/3 5.36 Path Delay D 3 ˆ f + 1 + 4 + 1 22.1 Gate sizes z 96 * 1/ 5.36 18 y 18 * 2/ 5.36 6.7 y y z z word[0] 96 units of wordline capacitance word[15] 15

Larger Example: Register File Decoder Compare alternatives with a spreadsheet Design N G P D NAND4 - INV 2 2 5 29.8 NAND2 - NOR2 2 20/9 4 30.1 INV - NAND4 - INV 3 2 6 22.1 NAND4 - INV - INV - INV 4 2 7 21.1 NAND2 - NOR2 - INV - INV 4 20/9 6 20.5 NAND2 - INV - NAND2 - INV 4 16/9 6 19.7 INV - NAND2 - INV - NAND2 - INV 5 16/9 7 20.4 NAND2 - INV - NAND2 - INV - INV - INV 6 16/9 8 21.6 16

Logical Effort: Recap of Definitions TERM STAGE PATH number of stages 1 N logical effort g G electrical effort C out / C in H g i C outpath ------------------------ C inpath branching effort b C onpath + C offpath ----------------------------------------------------- C onpath B b i effort f gh F GBH effort delay f D F f i parasitic delay p P p i delay d f + p D d i D F + P 17

Logical Effort: Method Recap Compute path effort F GBH Estimate best number of stages Nlog 4 F Sketch path with N stages Estimate least delay D Determine best stage effort NF 1 N + P f ˆ F 1 N Find gate sizes using capacitance transformation f ˆ C out g------------ C in Logical Effort Summary Provides a mechanism for designing and discussing fast circuits NANDs are faster than NORs in CMOS Paths are fastest when effort delay is ~4 Path delay is weakly sensitive to stages, sizes Using fewer stages doesn't mean faster circuit Inverters and NAND2 best for driving large loads (caps) BUT REQUIRES PRACTICE TO MASTER!!! 18

Limitations of Logical Effort Chicken and Egg problem Need path to compute G But don't know number of stages without G Simplistic Delay Model Neglects input rise time effects and input arrival times Gate-source capacitance approximation Bootstrapping due to gate to drain capacitance coupling Ignores secondary effects: velocity saturation, body effect etc Does not account for interconnect More applicable to datapath circuits with regular layout structure e.g. adders, mults etc Iterations required in designs with significant interconnect delay Design for maximum speed only, no information about minimum area/power Paths with complex branching are difficult to analyze by hand 19