Lecture 5: Logical Effort



Similar documents
e.g. τ = 12 ps in 180nm, 40 ps in 0.6 µm Delay has two components where, f = Effort Delay (stage effort)= gh p =Parasitic Delay

Gate Delay Model. Estimating Delays. Effort Delay. Gate Delay. Computing Logical Effort. Logical Effort

Weste04r4.fm Page 67 Monday, January 5, :24 AM. 4.1 Introduction

Optimization and Comparison of 4-Stage Inverter, 2-i/p NAND Gate, 2-i/p NOR Gate Driving Standard Load By Using Logical Effort

Module 4 : Propagation Delays in MOS Lecture 22 : Logical Effort Calculation of few Basic Logic Circuits

Lecture 5: Gate Logic Logic Optimization

ECE410 Design Project Spring 2008 Design and Characterization of a CMOS 8-bit Microprocessor Data Path

Gates & Boolean Algebra. Boolean Operators. Combinational Logic. Introduction

Latches, the D Flip-Flop & Counter Design. ECE 152A Winter 2012

Sequential 4-bit Adder Design Report

RAM & ROM Based Digital Design. ECE 152A Winter 2012

1. True or False? A voltage level in the range 0 to 2 volts is interpreted as a binary 1.

NEW adder cells are useful for designing larger circuits despite increase in transistor count by four per cell.

CSE140 Homework #7 - Solution

CMOS Binary Full Adder

CHAPTER 11: Flip Flops

Chapter 2 Logic Gates and Introduction to Computer Architecture

Chapter 10 Advanced CMOS Circuits

Two-level logic using NAND gates

International Journal of Electronics and Computer Science Engineering 1482

Lecture 10: Latch and Flip-Flop Design. Outline

Pass Gate Logic An alternative to implementing complex logic is to realize it using a logic network of pass transistors (switches).

Adder.PPT(10/1/2009) 5.1. Lecture 13. Adder Circuits

Digital Logic Design. Basics Combinational Circuits Sequential Circuits. Pu-Jen Cheng

Counters and Decoders

CMOS Thyristor Based Low Frequency Ring Oscillator

High Speed Gate Level Synchronous Full Adder Designs

Three-Phase Dual-Rail Pre-Charge Logic

E158 Intro to CMOS VLSI Design. Alarm Clock

Layout and Cross-section of an inverter. Lecture 5. Layout Design. Electric Handles Objects. Layout & Fabrication. A V i

Lecture 10: Sequential Circuits

The 104 Duke_ACC Machine

ELEC EXPERIMENT 1 Basic Digital Logic Circuits

CHAPTER 3 Boolean Algebra and Digital Logic

Layout of Multiple Cells

Addressing The problem. When & Where do we encounter Data? The concept of addressing data' in computations. The implications for our machine design(s)

Digital Integrated Circuit (IC) Layout and Design

Fault Modeling. Why model faults? Some real defects in VLSI and PCB Common fault models Stuck-at faults. Transistor faults Summary

Chapter 6 TRANSISTOR-TRANSISTOR LOGIC. 3-emitter transistor.

Lecture 12: More on Registers, Multiplexers, Decoders, Comparators and Wot- Nots

Take-Home Exercise. z y x. Erik Jonsson School of Engineering and Computer Science. The University of Texas at Dallas

Class 11: Transmission Gates, Latches

Memory Basics. SRAM/DRAM Basics

Design Verification and Test of Digital VLSI Circuits NPTEL Video Course. Module-VII Lecture-I Introduction to Digital VLSI Testing

GETTING STARTED WITH PROGRAMMABLE LOGIC DEVICES, THE 16V8 AND 20V8

Gates, Circuits, and Boolean Algebra

EXPERIMENT 3: TTL AND CMOS CHARACTERISTICS

Contents. Overview Memory Compilers Selection Guide

INSTITUTE OF AERONAUTICAL ENGINEERING Dundigal, Hyderabad

Multiplexers Two Types + Verilog

Introduction to CMOS VLSI Design

Let s put together a Manual Processor

Understanding Logic Design

Clocking. Figure by MIT OCW Spring /18/05 L06 Clocks 1

Sequential Logic: Clocks, Registers, etc.

Analog & Digital Electronics Course No: PH-218

TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING

Introducción. Diseño de sistemas digitales.1

LFSR BASED COUNTERS AVINASH AJANE, B.E. A technical report submitted to the Graduate School. in partial fulfillment of the requirements

Objective. Testing Principle. Types of Testing. Characterization Test. Verification Testing. VLSI Design Verification and Testing.

Lecture 8: Synchronous Digital Systems

COMBINATIONAL and SEQUENTIAL LOGIC CIRCUITS Hardware implementation and software design

Low Power AMD Athlon 64 and AMD Opteron Processors

NAME AND SURNAME. TIME: 1 hour 30 minutes 1/6

CHAPTER 16 MEMORY CIRCUITS

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

Introduction to CMOS VLSI Design (E158) Lecture 8: Clocking of VLSI Systems

ECE124 Digital Circuits and Systems Page 1

Mixed Logic A B A B. 1. Ignore all bubbles on logic gates and inverters. This means

Timing Methodologies (cont d) Registers. Typical timing specifications. Synchronous System Model. Short Paths. System Clock Frequency

Subthreshold Real-Time Counter.

Combinational Logic Design Process

HITACHI INVERTER SJ/L100/300 SERIES PID CONTROL USERS GUIDE

Lecture 10 Sequential Circuit Design Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

Chapter 4 Register Transfer and Microoperations. Section 4.1 Register Transfer Language

DESIGN CHALLENGES OF TECHNOLOGY SCALING

EE 42/100 Lecture 24: Latches and Flip Flops. Rev B 4/21/2010 (2:04 PM) Prof. Ali M. Niknejad

361 Computer Architecture Lecture 14: Cache Memory

UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Document Contents Introduction Layout Extraction with Parasitic Capacitances Timing Analysis DC Analysis

What is a System on a Chip?

1.1 Silicon on Insulator a brief Introduction

An All-Digital Phase-Locked Loop with High Resolution for Local On-Chip Clock Synthesis

Here we introduced (1) basic circuit for logic and (2)recent nano-devices, and presented (3) some practical issues on nano-devices.

PMOS Testing at Rochester Institute of Technology Dr. Lynn Fuller

The components. E3: Digital electronics. Goals:

Elementary Logic Gates

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May ILP Execution

Sequential Logic Design Principles.Latches and Flip-Flops

HIGH SPEED AREA EFFICIENT 1-BIT HYBRID FULL ADDER

Memristor-Based Reactance-Less Oscillator

EMBEDDED SYSTEM BASICS AND APPLICATION

Alpha CPU and Clock Design Evolution

Microprocessor & Assembly Language

Module 7 : I/O PADs Lecture 33 : I/O PADs

Vi, fi input. Vphi output VCO. Vosc, fosc. voltage-controlled oscillator

ANN Based Modeling of High Speed IC Interconnects. Q.J. Zhang, Carleton University

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

Content Map For Career & Technology

Use and Application of Output Limiting Amplifiers (HFA1115, HFA1130, HFA1135)

Transcription:

Introduction to CMOS VLSI Design Lecture 5: Logical Effort David Harris Harvey Mudd College Spring 2004

Outline Introduction Delay in a Logic Gate Multistage Logic Networks Choosing the Best Number of Stages Example Summary Slide 2

Introduction Chip designers face a bewildering array of choices What is the best circuit topology for a function? How many stages of logic give least delay? How wide should the transistors be???? Logical effort is a method to make these decisions Uses a simple model of delay Allows back-of-the-envelope calculations Helps make rapid comparisons between alternatives Emphasizes remarkable symmetries Slide 3

Example Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the A[3:0] A[3:0] decoder for a register file. 32 bits Decoder specifications: 16 16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? 4:16 Decoder Register File 16 words Slide 4

Delay in a Logic Gate Express delays in process-independent unit d = d abs τ τ 12 ps in 180 nm process 40 ps in 0.6 µm process Slide 5

Delay in a Logic Gate Express delays in process-independent unit d = d τ abs Delay has two components d = f + p Slide 6

Delay in a Logic Gate Express delays in process-independent unit d Delay has two components d = d abs τ = f + p Effort delay f = gh (a.k.a. stage effort) Again has two components Slide 7

Delay in a Logic Gate Express delays in process-independent unit d = d abs τ Delay has two components d = f + p Effort delay f = gh (a.k.a. stage effort) Again has two components g: logical effort Measures relative ability of gate to deliver current g 1 for inverter Slide 8

Delay in a Logic Gate Express delays in process-independent unit d = d abs τ Delay has two components d = f + p Effort delay f = gh (a.k.a. stage effort) Again has two components h: electrical effort = C out / C in Ratio of output to input capacitance Sometimes called fanout Slide 9

Delay in a Logic Gate Express delays in process-independent unit d Delay has two components d = d abs τ = f + p Parasitic delay p Represents delay of gate driving no load Set by internal parasitic capacitance Slide 10

Delay Plots d = f + p = gh + p NormalizedDelay:d 6 5 4 3 2 2-input NAND g = p = d = Inverter g = p = d = 1 0 0 1 2 3 4 5 ElectricalEffort: h = C out / C in Slide 11

Delay Plots d = f + p = gh + p What about NOR2? NormalizedDelay:d 6 5 4 3 2 2-input NAND Inverter g = 1 p = 1 d = h + 1 Effort Delay: f g = 4/3 p = 2 d = (4/3)h + 2 1 0 Parasitic Delay: p 0 1 2 3 4 5 ElectricalEffort: h = C out / C in Slide 12

Computing Logical Effort DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. Measure from delay vs. fanout plots Or estimate by counting transistor widths A 2 1 Y A B 2 2 2 2 Y A B 4 4 1 1 Y C in = 3 g = 3/3 C in = 4 g = 4/3 C in = 5 g = 5/3 Slide 13

Catalog of Gates Logical effort of common gates Gate type Number of inputs 1 2 3 4 n Inverter 1 NAND 4/3 5/3 6/3 (n+2)/3 NOR 5/3 7/3 9/3 (2n+1)/3 Tristate / mux 2 2 2 2 2 XOR, XNOR 4, 4 6, 12, 6 8, 16, 16, 8 Slide 14

Catalog of Gates Parasitic delay of common gates In multiples of p inv ( 1) Gate type Number of inputs 1 2 3 4 n Inverter 1 NAND 2 3 4 n NOR 2 3 4 n Tristate / mux 2 4 6 8 2n XOR, XNOR 4 6 8 Slide 15

Example: Ring Oscillator Estimate the frequency of an N-stage ring oscillator Logical Effort: g = Electrical Effort: h = Parasitic Delay: p = Stage Delay: d = Frequency: f osc = Slide 16

Example: Ring Oscillator Estimate the frequency of an N-stage ring oscillator Logical Effort: g = 1 Electrical Effort: h = 1 Parasitic Delay: p = 1 Stage Delay: d = 2 Frequency: f osc = 1/(2*N*d) = 1/4N 31 stage ring oscillator in 0.6 µm process has frequency of ~ 200 MHz Slide 17

Example: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: g = Electrical Effort: h = Parasitic Delay: p = Stage Delay: d = Slide 18

Example: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: g = 1 Electrical Effort: h = 4 Parasitic Delay: p = 1 Stage Delay: d = 5 The FO4 delay is about 200 ps in 0.6 µm process 60 ps in a 180 nm process f/3 ns in an f µm process Slide 19

Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort Path Electrical Effort Path Effort G = g i H = C C out-path in-path F = fi = gh i i 10 g 1 = 1 h 1 = x/10 x g 2 = 5/3 h 2 = y/x y g 3 = 4/3 h 3 = z/y z g 4 = 1 h 4 = 20/z 20 Slide 20

Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort Path Electrical Effort Path Effort Can we write F = GH? G = g i H = C out path C in path F = fi = gh i i Slide 21

Paths that Branch No! Consider paths that branch: G = H = 5 15 90 GH = h 1 = 15 90 h 2 = F = GH? Slide 22

Paths that Branch No! Consider paths that branch: G = 1 H = 90 / 5 = 18 5 15 90 GH = 18 h 1 = (15 +15) / 5 = 6 15 90 h 2 = 90 / 15 = 6 F = g 1 g 2 h 1 h 2 = 36 = 2GH Slide 23

Branching Effort Introduce branching effort Accounts for branching between stages in path b = C on path B = b i C C on path off path Now we compute the path effort F = GBH + Note: h i = BH Slide 24

Multistage Delays Path Effort Delay D F = f i Path Parasitic Delay Path Delay P = p i = i = F + D d D P Slide 25

Designing Fast Circuits = i = F + D d D P Delay is smallest when each stage bears same effort fˆ = gh = F i i 1 N Thus minimum delay of N stage path is 1 N D = NF + P This is a key result of logical effort Find fastest possible delay Doesn t require calculating gate sizes Slide 26

Gate Sizes How wide should the gates be for least delay? fˆ = gh= g C = in i C C out in gc i fˆ out i Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives. Check work by verifying input cap spec is met. Slide 27

Example: 3-stage path Select gate sizes x and y for least delay from A to B x x y 45 A 8 x y B 45 Slide 28

Example: 3-stage path x A 8 x x Logical Effort G = Electrical Effort H = Branching Effort B = Path Effort F = Best Stage Effort Parasitic Delay P = Delay D = y y 45 B 45 ˆf = Slide 29

Example: 3-stage path x A y x 45 8 x y B 45 Logical Effort G = (4/3)*(5/3)*(5/3) = 100/27 Electrical Effort H = 45/8 Branching Effort B = 3 * 2 = 6 Path Effort F = GBH = 125 ˆ 3 = F = 5 Best Stage Effort Parasitic Delay P = 2 + 3 + 2 = 7 f Delay D = 3*5 + 7 = 22 = 4.4 FO4 Slide 30

Example: 3-stage path Work backward for sizes y = x = x x y 45 A 8 x y B 45 Slide 31

Example: 3-stage path Work backward for sizes y = 45 * (5/3) / 5 = 15 x = (15*2) * (5/3) / 5 = 10 45 A P: 4 N: 4 P: 4 N: 6 P: 12 N: 3 B 45 Slide 32

Best Number of Stages How many stages should a path use? Minimizing number of stages is not always fastest Example: drive 64-bit datapath with unit inverter InitialDriver 1 1 1 1 D = DatapathLoad 64 64 64 64 N: f: D: 1 2 3 4 Slide 33

Best Number of Stages How many stages should a path use? Minimizing number of stages is not always fastest Example: drive 64-bit datapath with unit inverter InitialDriver 1 1 1 1 D = NF 1/N + P = N(64) 1/N + N 8 4 2.8 16 8 23 DatapathLoad 64 64 64 64 N: f: D: 1 64 65 2 8 18 3 4 15 Fastest 4 2.8 15.3 Slide 34

Derivation Consider adding inverters to end of path How many give least delay? = n 1 1 N + i + i= 1 D NF p N n p D N 1 1 1 N N N ( ) Define best stage effort 1 inv = F lnf + F + p = 0 inv 1 ρ = F N Logic Block: n 1 Stages Path Effort F N - n 1 Extra Inverters p + ρ 1 lnρ = 0 inv ( ) Slide 35

Best Stage Effort p + ρ 1 lnρ = 0 inv ( ) has no closed-form solution Neglecting parasitics (p inv = 0), we find ρ = 2.718 (e) For p inv = 1, solve numerically for ρ = 3.59 Slide 36

Sensitivity Analysis How sensitive is delay to using exactly the best number of stages? 1.6 D(N) / D(N) 1.51 1.4 1.26 1.2 1.15 1.0 (ρ=6) (ρ =2.4) 0.0 0.5 0.7 1.0 1.4 2.0 2.4 < ρ < 6 gives delay within 15% of optimal We can be sloppy! I like ρ = 4 N / N Slide 37

Example, Revisited Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the A[3:0] A[3:0] decoder for a register file. 32 bits Decoder specifications: 16 16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? 4:16 Decoder Register File 16 words Slide 38

Number of Stages Decoder effort is mainly electrical and branching Electrical Effort: H = Branching Effort: B = If we neglect logical effort (assume G = 1) Path Effort: F = Number of Stages: N = Slide 39

Number of Stages Decoder effort is mainly electrical and branching Electrical Effort: H = (32*3) / 10 = 9.6 Branching Effort: B = 8 If we neglect logical effort (assume G = 1) Path Effort: F = GBH = 76.8 Number of Stages: N = log 4 F = 3.1 Try a 3-stage design Slide 40

Gate Sizes & Delay Logical Effort: G = Path Effort: F = Stage Effort: Path Delay: ˆf = D = Gate sizes: z = y = A[3] A[3] A[2] A[2] A[1] A[1] A[0] A[0] 10 10 10 10 10 10 10 10 y z word[0] 96 units of wordline capacitance y z word[15] Slide 41

Gate Sizes & Delay Logical Effort: G = 1 * 6/3 * 1 = 2 Path Effort: F = GBH = 154 Stage Effort: Path Delay: Gate sizes: z = 96*1/5.36 = 18 y = 18*2/5.36 = 6.7 A[3] A[3] A[2] A[2] A[1] A[1] A[0] A[0] 10 10 10 10 10 10 10 10 ˆ 1/3 = F = 5.36 f D = 3fˆ + 1+ 4+ 1= 22.1 y z word[0] 96 units of wordline capacitance y z word[15] Slide 42

Comparison Compare many alternatives with a spreadsheet Design N G P D NAND4-INV 2 2 5 29.8 NAND2-NOR2 2 20/9 4 30.1 INV-NAND4-INV 3 2 6 22.1 NAND4-INV-INV-INV 4 2 7 21.1 NAND2-NOR2-INV-INV 4 20/9 6 20.5 NAND2-INV-NAND2-INV 4 16/9 6 19.7 INV-NAND2-INV-NAND2-INV 5 16/9 7 20.4 NAND2-INV-NAND2-INV-INV-INV 6 16/9 8 21.6 Slide 43

Review of Definitions Term number of stages logical effort electrical effort branching effort effort effort delay parasitic delay delay Stage 1 g h = b = f f p = C C C out in on-path gh + C C on-path d = f + p off-path Path N G = g i H = C C out-path in-path B= b i F D F = GBH = f i P= p i D = d = D + P i F Slide 44

Method of Logical Effort 1) Compute path effort 2) Estimate best number of stages 3) Sketch path with N stages 4) Estimate least delay 5) Determine best stage effort F = GBH N = log 4 F 1 N D = NF + P fˆ = F 1 N 6) Find gate sizes C in i = gc i fˆ out i Slide 45

Limits of Logical Effort Chicken and egg problem Need path to compute G But don t know number of stages without G Simplistic delay model Neglects input rise time effects Interconnect Iteration required in designs with wire Maximum speed only Not minimum area/power for constrained delay Slide 46

Summary Logical effort is useful for thinking of delay in circuits Numeric logical effort characterizes gates NANDs are faster than NORs in CMOS Paths are fastest when effort delays are ~4 Path delay is weakly sensitive to stages, sizes But using fewer stages doesn t mean faster paths Delay of path is about log 4 F FO4 inverter delays Inverters and NAND2 best for driving large caps Provides language for discussing fast circuits But requires practice to master Slide 47