High-Speed Electronics
Mentor User Conference 2005, München
Dr. Alex Huber, hubera@zma.ch
Zentrum für Mikroelektronik Aargau, 5210 Windisch, Switzerland
www.zma.ch
Outline
1. Motivation
2. Speed Limitation of CMOS logic
3. Current Mode Logic: Advantages and Features
4. Current Mode Logic Gates
5. Example Application: Clock-Data Recovery
6. Conclusion
1. Motivation
Applications
Serial data communication (PCI Express, Serial ATA, etc.)
Fiber-optical communication (e.g. 10G Ethernet)
Flash ADC, high-speed ΣΔ ADC
Ultra-wideband (UWB) = wireless digital interconnect
Fractional-N PLL (prescaler)
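Parallel I/O Bottleneck
[Figure: CPU connected to graphics processor, display, hard disk and memory over a parallel bus with data lines D0, D1, ..., Dn and a clock line CLK at f_CLK; coupling capacitance C_coup between neighboring lines]
Internal data path: 64 bit at 3 GHz, internal throughput 24 GByte/s
Parallel bus impairments: cross-talk, clock/data skew, clock jitter, data jitter
Parallel bus limit: f_CLK < 166 ... 500 MHz at l = 1 m ... 10 cm, 16 ... 64 bit
I/O throughput: 1 ... 4 GByte/s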
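The throughput numbers on this slide follow from bus width times clock rate. A minimal sketch of that arithmetic, assuming one transfer per clock cycle and a 64-bit I/O bus (the function name is illustrative, not from the talk):

```python
def throughput_gbyte_per_s(width_bits: int, f_clk_hz: float) -> float:
    """Raw throughput of a parallel bus moving width_bits per clock cycle."""
    return width_bits * f_clk_hz / 8 / 1e9


# Internal data path: 64 bit at 3 GHz
internal = throughput_gbyte_per_s(64, 3e9)

# Parallel I/O bus: 64 bit at the lower and upper clock limits of the slide
io_low = throughput_gbyte_per_s(64, 166e6)
io_high = throughput_gbyte_per_s(64, 500e6)

print(f"internal: {internal:.1f} GByte/s")
print(f"I/O:      {io_low:.2f} ... {io_high:.2f} GByte/s")
```

The gap between the internal figure and the I/O range is the bottleneck the slide refers to.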
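Serial Data Communication
[Figure: the transmitter multiplexes the parallel data D0, D1, ..., Dn (clocked at f_CLK) in a MUX onto one serial line running at f_ser; the receiver consists of DEC, clock recovery (CR), a 2/n frequency divider and a DEMUX that restore D0, D1, ..., Dn and CLK]
Serial clock rate: $f_{ser} = \frac{n}{2}\, f_{CLK}$
Clock not transmitted: CLK is generated at the receiver (Rx)
Example: f_CLK = 2 GHz, n = 16 bit: f_ser = 8 × 2 GHz = 16 GHz (32 Gb/s)
Unlimited number of parallel links. Example: f_CLK = 2 GHz, n = 64 bit: 4 links at 32 Gb/s each: 128 Gb/s = 16 GByte/s
High-speed digital electronics required!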
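The serial-link numbers can be reproduced with a few lines. This is a sketch of the slide's arithmetic only; the factor n/2 reflects that the slide quotes 16 GHz for 32 Gb/s, i.e. two bits per serial clock period. Function names are illustrative.

```python
def serial_clock_hz(n_bits: int, f_clk_hz: float) -> float:
    """f_ser = n/2 * f_CLK (two bits per serial clock period)."""
    return n_bits / 2 * f_clk_hz


def bit_rate_bps(n_bits: int, f_clk_hz: float) -> float:
    """Serial bit rate needed to carry n_bits per parallel clock cycle."""
    return n_bits * f_clk_hz


f_ser = serial_clock_hz(16, 2e9)           # one link, n = 16 bit at 2 GHz
link_rate = bit_rate_bps(16, 2e9)
links = 4                                  # n = 64 bit spread over 4 links
aggregate_gbyte = links * link_rate / 8 / 1e9

print(f"f_ser = {f_ser / 1e9:.0f} GHz, link rate = {link_rate / 1e9:.0f} Gb/s")
print(f"{links} links: {aggregate_gbyte:.0f} GByte/s")
```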
I/O Technology Trend
[Figure: signaling clock rate f_SER (axis from 66 MHz to 20 GHz) versus year (1980, 1990, 2000). Parallel bus architectures: ISA bus, EISA, MCA, VESA, PCI bus, AGP 2x, PCI-X, bounded by the 1 GHz parallel bus limit (cross-talk). Serial bus architectures: IEEE 1394b, Serial-ATA, PCI Express, optical, with the 12 GHz copper signaling limit (attenuation).]
Source: National Instruments
Technologies
Earlier: dominated by bipolar processes (GaAs, Si, SiGe, InP)
Nowadays: CMOS at 130 nm, 90 nm (65 nm)
Question: how to reach > 10 GHz using CMOS technologies?
2. Speed Limitation of CMOS logic
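Review: Some Basics of the MOSFET
[Figure: NMOS (W_n/L) and PMOS (W_p/L) symbols with terminals G, D, S, voltages V_GS and V_DS, and drain current I_D]
DC drain current: $I_D = \frac{C_{ox}\,\mu_{n,p}}{2}\,\frac{W}{L}\,(V_{GS}-U_T)^2$ for $V_{DS} \ge V_{GS}-U_T$
Transconductance: $g_m = \frac{\partial I_D}{\partial V_{GS}} = C_{ox}\,\mu_{n,p}\,\frac{W}{L}\,(V_{GS}-U_T) = \sqrt{2\,C_{ox}\,\mu_{n,p}\,\frac{W}{L}\,I_D}$
[Figure: small-signal model with gate-source capacitance C_GS, controlled current source i_D = g_m · v_GS and drain-source capacitance C_DS]
U_T: threshold voltage
Design variables: W, L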
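The square-law model on this slide is easy to evaluate numerically. The sketch below assumes the saturation form I_D = (C_ox·µ/2)(W/L)(V_GS − U_T)² and uses hypothetical device values (k' = C_ox·µ = 200 µA/V², W/L = 10, U_T = 0.4 V); none of these numbers come from the talk.

```python
def drain_current(k_prime: float, w_over_l: float, v_gs: float, u_t: float) -> float:
    """Saturation drain current, k_prime = C_ox * mu in A/V^2."""
    v_ov = max(v_gs - u_t, 0.0)        # overdrive voltage, zero below threshold
    return 0.5 * k_prime * w_over_l * v_ov ** 2


def transconductance(k_prime: float, w_over_l: float, v_gs: float, u_t: float) -> float:
    """g_m = dI_D/dV_GS = k_prime * (W/L) * (V_GS - U_T)."""
    v_ov = max(v_gs - u_t, 0.0)
    return k_prime * w_over_l * v_ov


K_PRIME = 200e-6     # hypothetical C_ox * mu
W_OVER_L = 10.0      # hypothetical geometry
U_T = 0.4            # hypothetical threshold voltage in V

i_d = drain_current(K_PRIME, W_OVER_L, 0.9, U_T)
g_m = transconductance(K_PRIME, W_OVER_L, 0.9, U_T)
print(f"I_D = {i_d * 1e6:.0f} uA, g_m = {g_m * 1e3:.2f} mS")
```

Both quantities scale linearly with W/L, which is why W and L are the designer's main handles.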
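Speed Limitation of CMOS Gates
[Figure: CMOS inverter (PMOS W_p/L, NMOS W_n/L) between V_DD and ground with input V_in and output V_out, and its small-signal model with C_gs,p, C_ds,p, i_d,p = g_m,p · V_gs,p and C_gs,n, C_ds,n, i_d,n = g_m,n · V_gs,n]
With ideal voltage drive at V_in: $t_d = V_{out}\,C_{ds,p} / i_{d,n}$
For V_p = V_DD/2: W_p ≈ 2...3 W_n (due to µ_n ≈ 2...3 µ_p)
To reduce the delay t_d:
Smaller C_ds,p: reduce L; reduce W_p, which means reducing W_n
Larger i_d,n: reduce L; increase g_m,n, which means increasing W_n
Technology scaling: L = 1 µm (1990) to L = 90 nm (2004)
Given technology: no improvement possible! (the two requirements on W_n conflict)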
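Why transistor sizing does not help can be seen in a small numerical sketch. It assumes, beyond what the slide states, that the PMOS drain capacitance is proportional to W_p, that W_p is a fixed multiple of W_n, and that the NMOS follows the square law; all constants are hypothetical.

```python
C_PER_WIDTH = 1e-9   # hypothetical drain capacitance per metre of width (1 fF/um)
K_PRIME = 200e-6     # hypothetical C_ox * mu in A/V^2
V_OV = 0.6           # hypothetical overdrive voltage in V
RATIO_P_N = 2.5      # W_p / W_n, within the 2...3 range of the slide


def cmos_delay(w_n: float, length: float, v_out: float = 1.0) -> float:
    """t_d = V_out * C_ds,p / i_d,n for an ideally driven inverter."""
    c_ds_p = C_PER_WIDTH * RATIO_P_N * w_n               # grows with W_n
    i_d_n = 0.5 * K_PRIME * (w_n / length) * V_OV ** 2   # also grows with W_n
    return v_out * c_ds_p / i_d_n


for w_n in (1e-6, 10e-6):
    print(f"L = 1 um,  W_n = {w_n * 1e6:4.0f} um: t_d = {cmos_delay(w_n, 1e-6) * 1e12:.1f} ps")
print(f"L = 90 nm, W_n =    1 um: t_d = {cmos_delay(1e-6, 90e-9) * 1e12:.1f} ps")
```

Under these assumptions W_n cancels out of the delay, so only a shorter gate length L helps.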
Reduction of Delay Time
Basic idea: decouple charging current and switch transistor
[Figure: NMOS switch (W_n/L) with input V_in and output V_out; a load connected to V_DD supplies the charging current i_charge]
Load: PMOS, resistor, inductor, ...?
Larger I_charge and smaller W_n reduce t_d
Open question: common-mode voltage V_cm?
Limits: W_n must be large enough for
a) voltage gain g_m,n · R_Load > 1
b) V_ds,sat(i_charge) < V_DD − V_cm − V_Load
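The two limits on W_n can be written as a simple feasibility check. This is only a sketch of the slide's inequalities; the function name and the numeric operating points are hypothetical.

```python
def switch_sizing_ok(g_m: float, r_load: float, v_ds_sat: float,
                     v_dd: float, v_cm: float, v_load: float) -> bool:
    """Check limits a) and b) for the switch transistor."""
    gain_ok = g_m * r_load > 1.0                      # a) voltage gain above 1
    headroom_ok = v_ds_sat < v_dd - v_cm - v_load     # b) enough voltage headroom
    return gain_ok and headroom_ok


# Hypothetical operating points
print(switch_sizing_ok(g_m=2e-3, r_load=1e3, v_ds_sat=0.2, v_dd=1.2, v_cm=0.5, v_load=0.4))
print(switch_sizing_ok(g_m=0.5e-3, r_load=1e3, v_ds_sat=0.2, v_dd=1.2, v_cm=0.5, v_load=0.4))
```

The second call fails limit a): the transistor is too narrow to provide a gain above 1 into the chosen load.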
Current Mode Logic (CML)
Differential topology: independent of V_cm
[Figure: differential pair of two NMOS transistors (W_n/L), each with a load to V_DD, a common tail current source i_0, input V_in and output V_out]
Advantages for high-speed operation:
1. Independent choice of W_n and i_0
2. Differential mode: higher SNR
3. Free choice of load: PMOS, resistor, inductor
4. With inductor: BW_ind ≈ 2 BW_res
f_CLK,CML ≈ (2...3) f_CLK,CMOS at same L
Disadvantages:
1. Static DC power consumption V_DD · i_0
2. Area consumption: 2 × {PMOS | resistor | inductor}, 3 × NMOS
3. Current Mode Logic: Advantages and Features
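CML vs. CMOS: Figures of Merit
Logic swing: CML $\Delta V$; CMOS $V_{DD}$
Load capacitance: $C$
Logic depth: $N$
Delay $t_d$: CML $N R C = N\,\frac{\Delta V\,C}{I}$; CMOS $t_{d,CMOS} = \frac{N\,C\,V_{DD}}{C_{ox}\,\mu\,(V_{DD}-U_t)^{\alpha}}$ (numerator ~V, denominator ~I)
Power $P$: CML $N\,I\,V_{DD}$; CMOS $\frac{N\,V_{DD}^2\,C}{t_{d,CMOS}}$ (with $f_{CLK} = 1/t_d$)
Power-delay product $P\,t_d$: CML $[N\,I\,V_{DD}]\,\left[N\,\frac{\Delta V\,C}{I}\right] = N^2\,V_{DD}\,\Delta V\,C$; CMOS $N\,V_{DD}^2\,C$
Energy-delay product $P\,t_d^2$: CML $\frac{N^3\,V_{DD}\,(\Delta V)^2\,C^2}{I}$; CMOS $N\,V_{DD}^2\,C\,t_{d,CMOS} = \frac{N^2\,V_{DD}^3\,C^2}{C_{ox}\,\mu\,(V_{DD}-U_t)^{\alpha}}$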
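The figures of merit can be evaluated side by side. The sketch below takes the delay and power expressions as t_d = N·ΔV·C/I and P = N·I·V_DD for CML, and t_d = N·C·V_DD/(k·(V_DD − U_t)^α) and P = N·V_DD²·C/t_d for CMOS, where k lumps C_ox·µ into one hypothetical constant. All numeric values are illustrative and not taken from the talk.

```python
def cml_fom(n: int, c: float, delta_v: float, i: float, v_dd: float) -> dict:
    """Delay, power, power-delay and energy-delay product of an N-stage CML path."""
    t_d = n * delta_v * c / i
    p = n * i * v_dd
    return {"t_d": t_d, "P": p, "PDP": p * t_d, "EDP": p * t_d ** 2}


def cmos_fom(n: int, c: float, v_dd: float, k: float, u_t: float, alpha: float) -> dict:
    """Same figures of merit for an N-stage static CMOS path."""
    t_d = n * c * v_dd / (k * (v_dd - u_t) ** alpha)
    p = n * v_dd ** 2 * c / t_d          # switching power at f_CLK = 1 / t_d
    return {"t_d": t_d, "P": p, "PDP": p * t_d, "EDP": p * t_d ** 2}


# Hypothetical example values
cml = cml_fom(n=4, c=10e-15, delta_v=0.4, i=100e-6, v_dd=2.5)
cmos = cmos_fom(n=4, c=10e-15, v_dd=2.5, k=200e-6, u_t=0.5, alpha=1.3)

for name, fom in (("CML", cml), ("CMOS", cmos)):
    print(f"{name:4s} t_d = {fom['t_d'] * 1e12:6.1f} ps, EDP = {fom['EDP']:.3e} J*s")
```

Because the CML energy-delay product is proportional to 1/I, raising the tail current improves both delay and EDP, whereas CMOS has only V_DD as a knob.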
CML vs. CMOS: Trade-Off
Energy-delay product (EDP):
$EDP_{CMOS} = \frac{N^2\,V_{DD}^3\,C^2}{C_{ox}\,\mu\,(V_{DD}-U_t)^{\alpha}}$
$EDP_{CML} = \frac{N^3\,V_{DD}\,(\Delta V)^2\,C^2}{I}$
[Figure: CMOS delay t_d [ps] versus V_DD [V]]
[Figure: CML delay t_d [ps] versus current I [µA]]
[Figure: energy-delay product EDP [pJ·ps] versus t_d [ps] for N = 4]
Example: 0.25 µm technology, V_DD = 2.5 V, PMOS load
No lower bound on CML delay with current!
Advantage for CML: low logic depth, very high speed
Source: J. Musicer, An Analysis of MOS CML for Low Power and High Performance Digital Logic, Master's thesis, UC Berkeley
4. Current Mode Logic Gates
Inverter
[Figure: CML inverter circuit with load resistors R to V_DD, tail current I_0 and load capacitance C]
Logic level: V_h = I_0 · R
Gate delay: t_d ≈ R · C
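Combining the logic level V_h = I_0·R with an RC gate delay shows how current buys speed at a fixed swing. A minimal sketch, assuming t_d = R·C and hypothetical values for swing and load capacitance:

```python
def cml_inverter_delay(v_h: float, i_0: float, c_load: float) -> float:
    """Gate delay for a fixed swing: R = V_h / I_0, t_d = R * C."""
    r = v_h / i_0
    return r * c_load


V_H = 0.4        # hypothetical logic swing in V
C_LOAD = 10e-15  # hypothetical load capacitance in F

t_100u = cml_inverter_delay(V_H, 100e-6, C_LOAD)
t_200u = cml_inverter_delay(V_H, 200e-6, C_LOAD)
print(f"I_0 = 100 uA: t_d = {t_100u * 1e12:.0f} ps")
print(f"I_0 = 200 uA: t_d = {t_200u * 1e12:.0f} ps")
```

Doubling the tail current halves the load resistor for the same swing and therefore halves the delay, at the cost of doubled static power.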
Multiplexer
[Figure: CML multiplexer circuit]
EXOR
[Figure: CML EXOR circuit, shown for the four input combinations]
V_A = 1, V_B = 1: V_Q = −1 (logic 0)
V_A = −1, V_B = 1: V_Q = 1
V_A = 1, V_B = −1: V_Q = 1
V_A = −1, V_B = −1: V_Q = −1 (logic 0)
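With differential levels normalized to +1 (logic 1) and −1 (logic 0), the four cases on the slide reduce to V_Q = −V_A·V_B. A short behavioral sketch (the normalization and function name are illustrative):

```python
def cml_xor(v_a: int, v_b: int) -> int:
    """Behavioral EXOR on normalized differential levels (+1 / -1)."""
    return -v_a * v_b


def to_logic(v: int) -> int:
    """Map a differential level to a logic value: +1 -> 1, -1 -> 0."""
    return 1 if v > 0 else 0


for v_a in (1, -1):
    for v_b in (1, -1):
        v_q = cml_xor(v_a, v_b)
        print(f"V_A = {v_a:+d}, V_B = {v_b:+d}: V_Q = {v_q:+d} (logic {to_logic(v_q)})")
```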
D-Latch
[Figure: CML D-latch circuit]
Master-Slave D-FlipFlop
[Figure: two D-latches connected as master and slave]
Negative-edge-triggered flip-flop with respect to CLK
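A behavioral model makes the negative-edge triggering visible. The sketch assumes the master latch is transparent while CLK is high and the slave while CLK is low; the class names are illustrative and the model ignores all analog effects.

```python
class DLatch:
    """Level-sensitive latch: follows d while enabled, holds otherwise."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, d: int, enable: bool) -> int:
        if enable:
            self.q = d
        return self.q


class MasterSlaveFF:
    """Two latches in series; the output changes on the falling edge of CLK."""

    def __init__(self) -> None:
        self.master = DLatch()
        self.slave = DLatch()

    def step(self, d: int, clk: int) -> int:
        m = self.master.update(d, enable=(clk == 1))    # transparent at CLK high
        return self.slave.update(m, enable=(clk == 0))  # transparent at CLK low


ff = MasterSlaveFF()
trace = [ff.step(d, clk) for d, clk in [(1, 1), (0, 0), (0, 1), (1, 0)]]
print(trace)
```

Each output value appears only when CLK goes low and equals the data seen during the preceding high phase.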
Design Techniques for CML
CMOS design kits typically don't provide CML standard cells (some bipolar/BiCMOS kits might, but typically only in very mature = old = slow technologies, and only symbol/layout)
System design can be done on cell level
Cell design is done with analog (= transistor-level) techniques
Simulation must reflect the analog nature of CML gates (= dependence of delay/transition time on input/output impedance)
Good and useful design technique: VHDL-AMS/Verilog-AMS
5. Example: Clock-Data Recovery for Serial Data Communication
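Clock-Data Recovery (I)
[Figure: receiver with D_in, DEC, clock recovery (CR) at f_ser, 2/n divider and DEMUX delivering D_0, D_1, ..., D_n and CLK]
Analog full-rate architecture
[Figure: PLL consisting of DFF, PFD, loop filter (LF) and VCO, followed by the DEMUX; loop filter: passive, external?]
DFF (CML) and VCO operate at f_ser (e.g. 40 GHz at 40 Gb/s)
Low complexity
Ultra-high-speed technology (SiGe, InP) required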
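Clock-Data Recovery (II)
[Figure: receiver with D_in, DEC, clock recovery (CR) at f_ser, 2/n divider and DEMUX delivering D_0, D_1, ..., D_n and CLK]
n-th rate architecture
[Figure: D_in sampled by 2n DFFs (digital, CML) that deliver D_0, D_1, ..., D_n; the edge detector and loop filter control a VCO with 2n phases (analog?) that is locked to a reference frequency]
Clock frequency at DFF: f_CLK,full-rate / n
2n phases required ($\Delta t = 1/(2 f_{CLK,full\text{-}rate})$)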
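The clocking requirements of an n-th rate receiver follow directly from the two relations on this slide. The sketch below evaluates them for a quarter-rate receiver at 40 Gb/s, taking the full-rate clock as 40 GHz as in the full-rate example; the function name and return structure are illustrative.

```python
def nth_rate_clocking(f_full_rate_hz: float, n: int) -> dict:
    """Clock frequency, phase count and phase spacing of an n-th rate sampler bank."""
    phases = 2 * n
    return {
        "f_dff_hz": f_full_rate_hz / n,           # clock seen by each DFF
        "phases": phases,                         # data and edge sampling phases
        "dt_s": 1.0 / (2.0 * f_full_rate_hz),     # spacing between adjacent phases
        "phase_step_deg": 360.0 / phases,
    }


quarter = nth_rate_clocking(40e9, n=4)
print(f"f_DFF = {quarter['f_dff_hz'] / 1e9:.0f} GHz, {quarter['phases']} phases, "
      f"dt = {quarter['dt_s'] * 1e12:.1f} ps, step = {quarter['phase_step_deg']:.0f} deg")
```

The phase spacing equals one DFF clock period divided by the number of phases, which gives the 0°, 45°, ..., 315° raster for n = 4.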
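Edge Detection
[Figure: D_in sampled by 2n DFFs driven by a 2n-phase clock, producing data samples d_0 ... d_n and edge samples e_0 ... e_n; the edge detector output goes to the loop filter]
Locked (CLK in phase with DATA): the edge samples fall on the data transitions, so <e_n> = 0.5
CLK early with respect to DATA: d_n = e_n
[Figure: timing diagrams of CLK and DATA (D0, D1, D2, D3) with the sampling instants d_0, d_1, ..., d_n, d_n+1 and e_0, e_1 for the locked and the early case]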
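The early/late decision can be sketched as a vote over all data transitions. The code assumes that edge sample e[i] is taken between data samples d[i] and d[i+1], and that an edge sample equal to the preceding data sample indicates an early clock (the d_n = e_n condition of the slide); the complementary case is counted as late. The function name is illustrative.

```python
def early_late_votes(d: list, e: list) -> tuple:
    """Count early and late indications from data samples d and edge samples e."""
    early = late = 0
    for i in range(len(d) - 1):
        if d[i] == d[i + 1]:
            continue            # no data transition: no timing information
        if e[i] == d[i]:
            early += 1          # edge sample still shows the old bit
        else:
            late += 1           # edge sample already shows the new bit
    return early, late


print(early_late_votes([0, 1, 1, 0], [0, 1, 1]))   # clock sampling too early
print(early_late_votes([0, 1, 0], [1, 0]))         # clock sampling too late
```

In lock the edge samples are random at each transition, so early and late votes balance on average.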
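Generation of Variable Clock Phase
VCO: V_ctrl controls the frequency
$\cos([\omega_0 + K_{vco} V_{ctrl}]\,t) = \cos(\omega_0 t + \varphi)$, with $\varphi = K_{vco} V_{ctrl}\,t = K_{vco} \int V_{ctrl}\,dt$
Phase rotator: UP/DN commands shift the phase of $\cos(\omega_0 t)$, taken from the phases φ_1 ... φ_n of an n-phase VCO, in steps of δφ to give $\cos(\omega_0 t + \varphi)$
No integration! ϕ = δϕ n T
[Figure: phase diagram with I = φ_0, Q = φ_1, Ī = φ_2, Q̄ = φ_3]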
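The difference between the two approaches can be illustrated numerically. The sketch assumes that the rotator forms its output as a weighted sum of the quadrature phases I = cos(ω0·t) and Q = cos(ω0·t + 90°), which is one common way to build a phase rotator and is not confirmed by the slide; the VCO phase is approximated by a rectangular sum of the control voltage.

```python
import math


def vco_phase(k_vco: float, v_ctrl_samples: list, dt: float) -> float:
    """VCO: phase is the time integral of the control voltage (rectangular rule)."""
    return k_vco * sum(v_ctrl_samples) * dt


def rotator_output(omega_t: float, phi: float) -> float:
    """Phase rotator: mix I and Q with weights cos(phi) and sin(phi)."""
    i_phase = math.cos(omega_t)                  # I = phase 0 deg
    q_phase = math.cos(omega_t + math.pi / 2)    # Q = phase 90 deg
    return math.cos(phi) * i_phase + math.sin(phi) * q_phase


# Rotator: the commanded phase appears directly at the output
omega_t, phi = 1.234, math.radians(45)
print(rotator_output(omega_t, phi), math.cos(omega_t + phi))

# VCO: a constant control voltage makes the phase grow with time
print(vco_phase(k_vco=2.0, v_ctrl_samples=[0.1] * 10, dt=0.5))
```

The rotator sets the phase directly, whereas the VCO adds an integrator to the loop, which is the point of the "no integration" remark.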
Clock-Data Recovery (III): HIGHSCORE
High-Speed Communication Receiver for 40 Gb/s in CMOS: CTI project of zma, BFH Burgdorf, ETHZ, IBM ZRL
[Figure: receiver block diagram.
CML ("analog") part: 40 Gb/s optical input; 8 parallel samplers delivering data (D) and edge (E) samples; phase rotator and 8-phase DLL with clock phases 0°, 45°, ..., 315°; reference VCO (8 phases), 1x for 8 links.
CMOS ("digital") part: 8:32 DEMUX; 16 D and 16 E samples into the edge detector (early/late); rate reduction; up/down counter; digital loop filter; 4x 10 Gb/s electrical output.
Clock rates marked in the diagram: 10 GHz, 2.5 GHz, 1.25 GHz. Further label: 1x/link.]
HIGHSCORE Key Features
Goal: serial communication receiver (= CDR) at 40 Gb/s
90 nm CMOS, state of the art (IBM)
FO4 delay of CMOS: 20 ps
Typical clock frequencies of CMOS logic: 1.25 GHz
CML operates at f_CLK = 10 GHz, partially inductor-peaked
Quarter-rate architecture
No external passive components (except 1 XTAL)
Fully digital loop filter (complex functions possible!)
Simulation-friendly: VHDL/Verilog representation for system characterization possible
Conclusion
The maximum speed of CMOS logic can only be scaled by technology improvements (reducing the gate length)
CML offers 2 to 3 times higher clock frequencies, with an improved energy-delay product compared to CMOS at the same frequency and the same gate length
CML gates are designed and simulated with analog techniques, while the system function is digital by nature
A main application of very high-speed digital electronics is serial data communication, which provides the highest I/O bandwidth and overcomes the parallel I/O bottleneck
Highly parallel architectures allow fully digital designs with complex functions in CMOS, replacing bulky passive components (and requiring no technology change to SiGe/InP)