A 1.7 Gbps DLL-Based Clock Data Recovery for a Serial Display Interface in 0.35-μm CMOS

Similar documents
A 1.62/2.7/5.4 Gbps Clock and Data Recovery Circuit for DisplayPort 1.2 with a single VCO

A 3.2Gb/s Clock and Data Recovery Circuit Without Reference Clock for a High-Speed Serial Data Link

ISSCC 2003 / SESSION 13 / 40Gb/s COMMUNICATION ICS / PAPER 13.7

Managing High-Speed Clocks

IN RECENT YEARS, the increase of data transmission over

ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.7

Duobinary Modulation For Optical Systems

Equalization/Compensation of Transmission Media. Channel (copper or fiber)

11. High-Speed Differential Interfaces in Cyclone II Devices

Model-Based Synthesis of High- Speed Serial-Link Transmitter Designs

Rong-Jyi YANG, Nonmember and Shen-Iuan LIU a), Member

8 Gbps CMOS interface for parallel fiber-optic interconnects

Clock Recovery in Serial-Data Systems Ransom Stephens, Ph.D.

Analysis and Design of Robust Multi-Gb/s Clock and Data Recovery Circuits

PCI-SIG ENGINEERING CHANGE NOTICE

Pericom PCI Express 1.0 & PCI Express 2.0 Advanced Clock Solutions

A New Programmable RF System for System-on-Chip Applications

BURST-MODE communication relies on very fast acquisition

DS2187 Receive Line Interface

Clocking. Figure by MIT OCW Spring /18/05 L06 Clocks 1

Using Pre-Emphasis and Equalization with Stratix GX

White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces

8B/10B Coding 64B/66B Coding

A PC-BASED TIME INTERVAL COUNTER WITH 200 PS RESOLUTION

Loop Bandwidth and Clock Data Recovery (CDR) in Oscilloscope Measurements. Application Note

A 2 Gbps to 12 Gbps Wide-Range CDR with Automatic Frequency Band Selector

How To Test The Performance Of An Oversampling Cdr In An Fgpa, Jitter And Memory On A Black Box (Cdr) In A Test Program

Timing Errors and Jitter

A Laser Scanner Chip Set for Accurate Perception Systems

A Gigabit Transceiver for Data Transmission in Future HEP Experiments and An overview of optoelectronics in HEP

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2010

Non-Data Aided Carrier Offset Compensation for SDR Implementation

A 10,000 Frames/s 0.18 µm CMOS Digital Pixel Sensor with Pixel-Level Memory

NTE2053 Integrated Circuit 8 Bit MPU Compatible A/D Converter

Explore Efficient Test Approaches for PCIe at 16GT/s Kalev Sepp Principal Engineer Tektronix, Inc

Jitter in PCIe application on embedded boards with PLL Zero delay Clock buffer

Measurement, Modeling and Simulation of Power Line Channel for Indoor High-speed Data Communications

S. Venkatesh, Mrs. T. Gowri, Department of ECE, GIT, GITAM University, Vishakhapatnam, India

A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector

Harmonics and Noise in Photovoltaic (PV) Inverter and the Mitigation Strategies

USB 3.0 CDR Model White Paper Revision 0.5

CLOCK AND DATA RECOVERY CIRCUITS RUIYUAN ZHANG

Application Note. Line Card Redundancy Design With the XRT83SL38 T1/E1 SH/LH LIU ICs

A Combined Clock and Data Recovery Circuit with Adaptive Cancellation of Data-Dependent Jitter

Design and analysis of flip flops for low power clocking system

W a d i a D i g i t a l

Signal integrity in deep-sub-micron integrated circuits

Topics of Chapter 5 Sequential Machines. Memory elements. Memory element terminology. Clock terminology

Hello and welcome to this training module for the STM32L4 Liquid Crystal Display (LCD) controller. This controller can be used in a wide range of

Abstract. Cycle Domain Simulator for Phase-Locked Loops

A 40 Gb/s Clock and Data Recovery Module with Improved Phase-Locked Loop Circuits

DEVELOPMENT OF DEVICES AND METHODS FOR PHASE AND AC LINEARITY MEASUREMENTS IN DIGITIZERS

Lecture 2. High-Speed I/O

Eye Doctor II Advanced Signal Integrity Tools

TIMING-DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING

Low latency synchronization through speculation

PCM Encoding and Decoding:

PowerPC Microprocessor Clock Modes

Alpha CPU and Clock Design Evolution

6.976 High Speed Communication Circuits and Systems Lecture 1 Overview of Course

PLAS: Analog memory ASIC Conceptual design & development status

On-chip clock error characterization for clock distribution system

DDR subsystem: Enhancing System Reliability and Yield

Phase Locked Loop (PLL) based Clock and Data Recovery Circuits (CDR) using Calibrated Delay Flip Flop

PCI Express: The Evolution to 8.0 GT/s. Navraj Nandra, Director of Marketing Mixed-Signal and Analog IP, Synopsys

Keysight Technologies Forward Clocking - Receiver (RX) Jitter Tolerance Test with J-BERT N4903B High-Performance Serial BERT.

Signal Types and Terminations

A Storage Architecture for High Speed Signal Processing: Embedding RAID 0 on FPGA

Link-65 MHz, +3.3V LVDS Transmitter 24-Bit Flat Panel Display (FPD) Link-65 MHz

Selecting the Optimum PCI Express Clock Source

A 3 V 12b 100 MS/s CMOS D/A Converter for High- Speed Communication Systems

A true low voltage class-ab current mirror

Equalization for High-Speed Serial Interfaces in Xilinx 7 Series FPGA Transceivers

Table 1 SDR to DDR Quick Reference

Multiple clock domains

SDI SDI SDI. Frame Synchronizer. Figure 1. Typical Example of a Professional Broadcast Video

Spread Spectrum Clock Generator

Impedance 50 (75 connectors via adapters)

International Journal of Electronics and Computer Science Engineering 1482

Power Reduction Techniques in the SoC Clock Network. Clock Power

DTSB35(53)12L-CD20 RoHS Compliant 1.25G 1310/1550nm(1550/1310nm) 20KM Transceiver

Design and Modelling of Clock and Data Recovery Integrated Circuit in 130 nm CMOS Technology for 10 Gb/s Serial Data Communications

Product Specification. RoHS-6 Compliant 10Gb/s 850nm Multimode Datacom XFP Optical Transceiver FTLX8512D3BCL

路 論 Chapter 15 System-Level Physical Design

Silicon Seminar. Optolinks and Off Detector Electronics in ATLAS Pixel Detector

Chapter 6 PLL and Clock Generator

TIP-VBY1HS Data Sheet

DRM compatible RF Tuner Unit DRT1

MEASUREMENT UNCERTAINTY IN VECTOR NETWORK ANALYZER

Continuous-Time Converter Architectures for Integrated Audio Processors: By Brian Trotter, Cirrus Logic, Inc. September 2008

Application Note Noise Frequently Asked Questions

Achieving New Levels of Channel Density in Downstream Cable Transmitter Systems: RF DACs Deliver Smaller Size and Lower Power Consumption

HOW TO GET 23 BITS OF EFFECTIVE RESOLUTION FROM YOUR 24-BIT CONVERTER

ICS SPREAD SPECTRUM CLOCK SYNTHESIZER. Description. Features. Block Diagram DATASHEET

Calibration Techniques for High- Bandwidth Source-Synchronous Interfaces

How PLL Performances Affect Wireless Systems

A 1-GSPS CMOS Flash A/D Converter for System-on-Chip Applications

LOW POWER DESIGN OF DIGITAL SYSTEMS USING ENERGY RECOVERY CLOCKING AND CLOCK GATING

Transcription:

A 1.7 Gbps DLL-Based Clock Data Recovery for a Serial Display Interface in 0.35-μm CMOS Yong-Hwan Moon, Sang-Ho Kim, Tae-Ho Kim, Hyung-Min Park, and Jin-Ku Kang This paper presents a delay-locked-loop based clock and data recovery (CDR) circuit design with a nb(n+2)b data formatting scheme for a high-speed serial display interface. The nb(n+2)b data is formatted by inserting a 01 clock information pattern in every piece of N-bit data. The proposed CDR recovers clock and data in 1:10 demultiplexed form without an external reference clock. To validate the feasibility of the scheme, a 1.7-Gbps CDR based on the proposed scheme is designed, simulated, and fabricated. Input data patterns were formatted as 10B12B for a high-performance display interface. The proposed CDR consumes approximately 8 ma under a 3.3-V power supply using a 0.35-μm CMOS process and the measured peak-to-peak jitter of the recovered clock is 44 ps. Keywords: Clock data recovery (CDR), delay-locked loop (DLL), nb(n+2)b data formatting scheme, highspeed serial interface, display interface. Manuscript received Mar. 8, 2011; revised June 17, 2011; accepted July 1, 2011. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2011-0026743). Yong-Hwan Moon (phone: +82 32 860 8721, moonyong@inha.edu), Tae-Ho Kim (corresponding author, taeho.kim.inha@gmail.com), and Jin-Ku Kang (jkang@inha.ac.kr) are with the School of Electronics Engineering, Inha University, Incheon, Rep. of Korea. Sang-Ho Kim (sharpksh@gmail.com) is with Siliconworks Co., Ilsan, Gyeonggido, Rep. of Korea. Hyung-Min Park (hyungmin@lginnotek.com) is with LG Innotek Co., Ansan, Gyeonggido, Rep. of Korea. http://dx.doi.org/10.4218/etrij.12.0111.0143 I. Introduction In high-speed serial interface applications, phase-locked loop (PLL)-based clock and data recovery (CDR) is mostly used [1]. However, the PLL-based CDR consumes considerable power and generates accumulated jitter due to the voltage-controlled oscillator. Furthermore, the conventional PLL-based CDR requires an 8B10B encoding scheme for providing DC balance and ample transition edges for clock recovery. Therefore, at the receiver side, 8B10B decoding, comma detection, and word alignment blocks are required. On the other hand, the delaylocked loop (DLL)-based CDR usually consumes less power than the PLL-based CDR. In addition, the voltage-controlled delay line (VCDL) in the DLL-based CDR does not accumulate jitter. However, the conventional DLL-based CDR design requires an external reference clock line [2]-[4]. The proposed DLL-based CDR uses a single data line encoded by the nb(n+2)b data formatting scheme with embedded clock information for a display interface [5]. The proposed CDR operates on the training data patterns and real input data patterns with the 10B12B data formatting scheme. The proposed nb(n+2)b data formatting scheme is realized by inserting a 01 clock information pattern for every piece of N-bit input data. The proposed 10B12B data formatting scheme has a 17% coding overhead. Compared to the 8B10B, the encoding scheme has a 20% coding overhead. Hence, the proposed 10B12B data formatting scheme transmits a higher data rate than the conventional 8B10B encoding scheme. In addition, the proposed nb(n+2)b data formatting scheme requires a simple word alignment operation. The proposed scheme thus has a simpler architecture and less power consumption than the conventional 8B10B encoding scheme. This paper first presents the architecture and operation ETRI Journal, Volume 34, Number 1, February 2012 2012 Yong-Hwan Moon et al. 35

10-bit Data control TX Encoder & serializer RX Equalizer Proposed CDR 10-bit Recovered data Recovered clock Start-end 0 01 inserted 01 inserted 0 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 Training pattern 10-bit data+2-bit (01) signal (1.7 Gbps) Path control signal (a) Clock recovery unit D8 D9 0 1 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 0 1 Real data pattern Fig. 2. Formatted input pattern. D0 D1 D2 142-MHz 12-phase clock 142-MHz 10-bit recovery data Clock recovery unit Recovered CLK Start-end Step 1 Training pattern Path control signal 0 Acquisition of lock state TX Step 3 Real data pattern Path control signal 1 Step 2 After a certain time Multiphase clock gen. (b) Fig. 1. Block diagram of (a) proposed serial interface and (b) proposed DLL-based CDR. principle. Details of the proposed clock recovery unit are then explained. The circuit design and measurement results follow. II. Architecture and Operation Figure 1(a) shows the proposed serial interface block and Fig. 1(b) presents a block diagram of the proposed DLLbased CDR. The transmitter (TX) block has an encoder and a serializer. The main block of the receiver (RX) includes the proposed CDR. The CDR has a clock recovery unit and a data recovery block. In the proposed 1.7-Gbps DLL-based CDR design, the training data and real data patterns are formatted by inserting 01 in every piece of 10-bit input data. In other words, the value n of the proposed nb(n+2)b data formatting scheme is 10. The reason why we choose the 10B12B data formatting is that the circuit is targeted for highdefinition display interface applications with a 10-bit gray level [6], [7]. The proposed CDR recovers the clock from the inserted clock information 01 on the nb(n+2)b formatted serial data. Since the input data is formatted by inserting the 01 pattern in every N-bit, the number of multiphase clocks to be generated is N+2. For 1.7-Gbps CDR design with 10B12B data formatting, the clock recovery unit needs to generate 142-MHz 12 multiphase clocks (phase difference = 588 ps = 1-bit period of 1.7 Gbps). The data recovery block retimes the data using the generated 12 multiphase clocks in the clock recovery unit. The 10B12B formatted data patterns for the proposed CDR are shown in Fig. 2. The training pattern serves as a reference clock for the DLL-based CDR. The duty ratio of the training Step 4 Multiphase clock Data recovery Step 5 Data alignment Start-end generation Fig. 3. Flow chart of proposed CDR. pattern is not a concern as long as the pattern provides a signal transition. The encoded data is transmitted at 1.7 Gbps for handling the actual 1.42-Gbps data rate because the coding overhead of the proposed 10B12B data formatting scheme is 17%. The proposed 10B12B data formatting scheme can be expanded to nb(n+2)b. If n is increased, the data formatting overhead is decreased. However, the number of delay cells in the VCDL increases, and the power consumption and the jitter can be affected. Depending on the targeted high-speed interfaces, the demultiplexing structure, the recovered clock rate, the number of multiphase clocks, and the input data rate can be varied to obtain optimal design conditions. Figure 3 shows the overall operation flow chart. The input data transmission is accomplished with two operation steps and each step sends different data patterns. At the beginning of data transmission, the TX sends the formatted training pattern for a certain period. During the training period, the proposed CDR in the RX circuit is virtually operating as a DLL. The proposed CDR thus generates a recovered clock and acquires the lock state in the training pattern period. After the TX sends the training data pattern, the TX switches to send the real data pattern. During the real data pattern period, the CDR continues to recover the clock by tracking the inserted 01 pattern, and retimes the data. The path control signal is a control signal 36 Yong-Hwan Moon et al. ETRI Journal, Volume 34, Number 1, February 2012

Serial data stream Path control signal Buffer & delay 0 1 MUX Pulse generator LPF 12 multiphase clock Sliding window 8-b multiphase clock CP PD REF FEB (clk 1) (clk 13) VCDL Serial data stream D8 Multiphase clock_i Multiphase clock_(i+2) Generated pulse D9 0 1 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 0 1 D0 Fig. 4. Block diagram of clock recovery unit generated in the TX block. The path control signal distinguishes the training data pattern period from the real data transmission period. The TX sends the training data pattern and real data transmission, while the path control signal is 0 and 1, respectively. Since the proposed DLL-based clock recovery targets high-speed digital display interfaces, the training sequence bits can be transmitted during vertical and horizontal blanking periods. These blanking periods are much longer than the required minimum training sequence period. In other serial link applications, the optimized training sequence period can be as short as possible. The complete transceiver system based on the proposed DLL-based CDR should have a loss of lock (LOL) generated from the RX and control the training/data mode in the TX. If the TX receives the LOL signal from the RX, the TX should start to send the training bits to the RX. Hence, the lock detection function block in the RX should be added for that control, but the function is not included in this prototype. III. Clock and Data Recovery Unit 1. Clock Recovery Unit Figure 4 shows the clock recovery unit of the proposed CDR. While the TX is sending the training data pattern, the path control signal connects the serial data stream (training data pattern) to the VCDL. The clock recovery unit then operates as an ordinary DLL. The training pattern acts as a 142-MHz reference clock signal. The clock recovery unit acquires the lock state through the DLL operation. Although the DLL goes to the lock state, the path maintains the pure DLL operation while the path control signal remains at 0. After the training pattern period, the TX transmits the real data pattern and the path control signal goes to 1. The MUX then selects a signal from the pulse generator. The pulse generator block extracts the clock and continues tracking the clock phase by the inserted 01 pattern. The window signal is produced for masking random data patterns by extracting only the signal transition ( 01 clock information) information. While the encoded Fig. 5. Ideal timing diagram of window and pulse generator under lock state. Data Multiphase clock_i Multiphase clock_i+2 D D-F/F1 Q Clk RST D Clk Q1 Delay 890 ps D-F/F Fig. 6. Block diagram of (a) window generator and (b) pulse generator. (a) Q RST D Q D-F/F2 Clk RST 142 MHz pulse Delay 3.2 ns random input data patterns are provided, the clock recovery unit keeps extracting the transition information and generates the clock signal. The clock recovery unit produces 12 multiphase clocks by the VCDL. The data recovery block recovers the data using the 12 multiphase clocks. Figure 5 depicts the ideal timing diagram of the window and the pulse signal under the lock state. The window signal is generated using the two multiphase clocks (11th and 1st clock signals) by tapping two signals from the VCDL. The window generator and the pulse generator are shown in Fig. 6. The window generator can be realized by a single D flip-flop (D-F/F) with a reset. The window signal marches with the inserted 01 pattern on the data stream. The pulse generator generates a 142-MHz clock by a combination of the window signal and the inserted 01 clock information. The pulse generator consists of two D-F/Fs and two delay cells. The first D-F/F of the pulse generator creates roughly 0.9-ns pulse width by both the window signal and the positive edge of the inserted 01 pattern every 7 ns (period of 142 MHz). The pulse (b) ETRI Journal, Volume 34, Number 1, February 2012 Yong-Hwan Moon et al. 37

becomes enlarged to a 3.2-ns pulse width by the second D-F/F. However, if the delay is increased to more than 3.5 ns, the next positive edge will be not generated by the feedback reset. Considering process, supply voltage, and temperature (PVT) variation for safe operation, the delay is set as 3.2 ns (+300 ps timing margin for PVT variation). The optimal pulse is hence generated with a 45% duty cycle. The duty ratio is not critical in this design, because the data recovery is accomplished with only the rising edge of the clock. The critical point here is precise window position control relative to the inserted 01 position in the encoded input data. To ensure that the window is positioned in the correct place, a sliding window scheme is implemented, as shown in Fig. 7. The window sliding scheme generates four different windows for capturing the inserted 01 pattern under PVT variation. The four windows can be shaped by different combinations of multiphase clocks from DLL. The prototype has four window choices, which are the expected window, +/ shifted windows, and an enlarged window from the optimally expected window. The final window is selected by measuring the relative position between the inserted 01 pattern and the windows. The control pins (s1, s0) are placed externally in this design stage. An adaptive window selecting scheme can be implemented with an extra logic. One possible solution is that the inserted 01 transition timing is oversampled by the finer clocks (sampling resolution of about 120 ps) and the window selection can be made by the relative positioning of the 01 transition based on sampled bits. The finer clocks should be generated with finer delays. In the proposed 10B12B encoding scheme, fixed 01 patterns are inserted into the data stream. Thus, in the frequency domain analysis, there could be some peaks in the spectrum that may cause an electromagnetic interference (EMI) problem in system integration. The spread spectrum clock generator can therefore be used to reduce the EMI problem in the TX. If the transmitted data is spread spectrum data, the CDR should have a proper bandwidth for handling the spread spectrum data. External input S[1:0] MUX _Gen _Gen _Gen _Gen Multiphase clock 4 Multiphase clock 6 Multiphase clock 5 Multiphase clock 7 Multiphase clock 3 Multiphase clock 5 Multiphase clock 3 Multiphase clock 7 Fig. 7. Block diagram of sliding window scheme. Buffer 10b Data D Q D Q D-F/F D-F/F 1~10 11~20 Phase Clk Clk 0~9 RST RST Buffer Path control signal Phase 10 Buffer Data recovery D Q D-F/F 21 Clk Word alignment Fig. 8. Block diagram of data recovery unit. 2. Data Recovery Unit 10b Recovery data Start-end signal Figure 8 shows a block diagram of the data recovery block. The data recovery block consists of D-F/Fs and buffers. The proposed data recovery starts to recover the data when the path control signal goes to 1. The proposed data recovery block recovers the 10-bit real data in parallel at 142 Mbps using 10 different phase clocks (@142 MHz) of the generated 12 multiphase clocks from the clock recovery unit. D-F/Fs 1 to 10 recover each data and D-F/Fs 11 to 20 align the recovered data at phase 10. D-F/F 21 generates the start-end signal, which indicates the start and end of recovered real data. The start-end signal is 0 in the training pattern period. If the start-end signal is 1, then the proposed CDR recovers the data during the real data pattern period. The inserted 01 clock information pattern is not recovered. Thus, the word alignment scheme is easily implemented without an additional logic compared to conventional 8B10B word alignment. IV. Circuit Design 1. DLL Design A DLL circuit used in the clock recovery unit consists of a phase detector (PD), a loop filter, and the VCDL. Figure 9 shows the PD, which is a conventional phase frequency detector with an extra D-F/F and a XOR for excluding the first reference clock edge to be compared with the first feedback clock edge. The last D-F/F (the bottom one) prevents the false lock operation and is a modified version of [8]. The total delay of the VCDL is still designed between 0.5T and 1.5T, where T is the clock period for faster locking. The unit cell of the VCDL is shown in Fig. 10(a) and Fig. 10(b) shows a block diagram of the VCDL in the DLL. The unit cell of the VCDL is based on a current-starved inverter structure for its simplicity and small area occupation. For insensitivity to the supply noise, the unit cell has a cascoded 38 Yong-Hwan Moon et al. ETRI Journal, Volume 34, Number 1, February 2012

Reference (ck1) F/F rst Up Vbp1 Vbp2 VC Feedback (ck13) rst F/F F/F Down IN Vbn2 OUT rst Vbn1 System reset Fig. 9. Circuit diagram of phase detector. current source structure. It has four bias voltages and one voltage control (VC) node for delay adjustment. Each delay stage consists of two inverter cells. Under three different operating conditions, the simulated delay versus control voltage for the VCDL (composed by 26 delay cells and 13 delay stages) is demonstrated in Fig. 11. The total delay is set around between 6 ns and 8.5 ns at a control voltage range of 0.5 V to 2.5 V on the typical operating condition. Corner simulations were carried out for assuring the targeted delay coverage under PVT variations. As shown in Fig. 11, under three process corners, the targeted total delay of 7 ns could be covered with a control voltage between 1.1 V and 1.99 V. The delay value of the VCDL is 7.8 ns for the slow condition and 6.4 ns for the fast condition, respectively, at a control voltage of 1.4 V. Thus, the total delay variations of the VCDL are +800 ps and 600 ps under two extreme operating conditions. Assuming the variations are distributed to each delay stage, the maximum deviations of each delay stage are +61.54 ps and 46.16 ps. The maximum interstage mismatch can then be about 108 ps. Considering the ideal spacing between multiphase clocks is 588 ps (one bit data period of 1.7 Gbps data), the maximum predictable clock phase variation of the interstage is about 18% of the clock timing margin. Thus, there is an ample timing margin for data recovery with input data jitter included. The measured values are also well matched to simulation as shown in Fig. 11. In a liquid crystal display (LCD) interface, high power supply noise is produced by the amplifier in the high-voltage LCD drivers. Therefore, the proposed CDR should recover the clock robustly under the power supply noise. The proposed CDR used a bias circuit highly insensitive to the power supply noise [9]. 2. Equalizer Circuit At higher data rates, the intersymbol interference (ISI) VC Vbp1 Vbp2 Vbn2 Vbn1 (a) D1 D2 D25 D26 Dummy cell (b) CLK1 Dummy cell Fig. 10. Circuit diagram of (a) unit cell and (b)vcdl. Delay (ns) 10 9 10 9 8 7 6 5 Typical Slow Fast Meas. Fig. 11. Measured and simulated total delay of VCDL. Dummy cell 4 0.5 1.0 1.5 2.0 2.5 Control voltage (V) CLK13 corruption caused by the skin effect and the dielectric loss of the electrical channel is exacerbated. To overcome this problem, channel modeling is done and the channel loss should be compensated by the equalizer. In our test environment, the channel is a 30-inch-long trace with FR-4 and the insertion loss was measured as 10 db at 1.2 GHz. An analog equalizer with capacitance degeneration is added to the RX to compensate the high frequency loss caused by the channel. The amount of compensation level is controlled up to 10 db with a 3-bit external control signal (EQ_CTRL). The block diagram and circuits of the equalizer are shown in Fig. 12. ETRI Journal, Volume 34, Number 1, February 2012 Yong-Hwan Moon et al. 39

in R D M1 V out C S R D M2 inb V in C L M1 R D V out Bias Equalizer & full swing gen Data recovery block DLL Pulse & window gen 253 µm bias R S 2C s R S /2 Unit equalizing filter & equivalent 520 µm 1.7-Gbps data through lossy channel 2-stage equalizing filter 2-stage limiting amp. IPI core 1,000 µm 3-bit gain control (external) 2,850 µm Fig. 14. Core layout and chip microphotograph. V out V in+ V in 1.7 Gbps Recovery data 10b Unit LA Fig. 12. Block diagram and circuits of equalizer. FPGA (TX) Path control signal EQ gain control 3b DUT Recovery clock Start-end 2b select FPGA for BER test (dbv) AC response of the equalizing filter with 3-bit VEQ_CTRL (dbv):f(hz) 15.0 Delta Y:0.0 VEQ_CTRL[111] Level X:1.2g y:10.753 VEQ_CTRL[110] 10.0 y:7.088 VEQ_CTRL[000] 5.0 0.0 5.0 100meg y:0.97158 1g f (Hz) y:7.088 10g Fig. 13. AC response of equalizer with EQ_CTRL value. The equalizer s compensation level is increased as the EQ_CTRL[2:0] value increases. The compensation level is as high as 10 db with EQ_CTRL[2:0]= 111. Figure 13 shows the simulated compensation level with various EQ_CTRL values. V. Measurement Results The proposed 1.7-Gbps DLL-based CDR with 10B12B data formatting was fabricated in a 3.3-V, 0.35-μm CMOS process. Figure 14 shows the core layout and the chip microphotograph of the proposed CDR. The area of the CDR core is 520 μm 253 μm. The CDR core consumed approximately 8 ma, and Fig. 15. Measurement setup. the whole chip including the output buffer consumes 34 ma at a 1.7-Gbps input data rate under 3.3-V supply voltage. The device under test (DUT) is implemented with a chip-on-board on the FR4 printed circuit board. Figure 15 shows the measurement environment for bit error rate (BER) test. The proposed CDR was tested with the fieldprogrammable gate array (FPGA) board for generation of encoded patterns. The FPGA board, TX, transmits the training pattern and the real data pattern. Measurement shows that the operating data rate is between 1.2 Gbps and 1.73 Gbps. Table 1 shows several test data patterns of the 10-bit real data and clock information. The FPGA board serializes the 10-bit data with 01 clock information and transmits the serial data to the CDR. The proposed CDR requires two types of data pattern, training and real data. The proposed CDR recovers the clock and data during the real data pattern. The BER test was executed for the real data pattern generated by the pseudo random bit sequence (PRBS) in the FPGA. With 2 31 1 random patterns, no error was detected for a day operation. As shown in Fig. 17, BER test was measured through FPGA s on the TX/RX sides. Since the signal patterns in the transmitter were generated by FPGA programming, the jitter modulation source for measuring jitter tolerance could not be incorporated in the test setup. Therefore, jitter tolerance performance could not be measured. Since the locking time of the DLL operation is measured as about 420 ns, the training period must be longer than 420 ns. Thus, about 720 40 Yong-Hwan Moon et al. ETRI Journal, Volume 34, Number 1, February 2012

Table 1. Several test patterns for checking worst case ISI. Fig. 16. Measured jitter of recovered clock. bits are required for the minimum training period for a 1.7 Gbps input rate. The 38-ps peak-to-peak jitter at 1.2-Gbps data rate and 44-ps peak-to-peak jitter at 1.7 Gbps were measured, respectively. Figure 16 shows the root-mean-square (RMS) jitter and the peak-to-peak jitter of the 142-MHz recovered clock with the 1.7-Gbps real data pattern. The RMS jitter and the peak-topeak jitter of the recovered clock are measured as 4.8 ps and 44 ps, respectively. From the test results on the sliding window scheme, the expected optimal window position shows better results than +/ shifted windows, and the extended window pulse shows the best data recovery performance. This explains why the generated widow pulse width is shortened slightly due to the RC effect on the line. In Table 1, several input test patterns are given to check for a possible ISI effect. To investigate the effect, the prototype circuit includes an equalizer at the input. The input path can be connected to the input buffer directly or through the equalizer by a digital switch. The ISI effect can be evaluated by measuring the data performance with the equalizer and without the equalizer. Under severe ISI input data conditions, the CDR with the equalizer and without the equalizer recovers the clock up to the 20-inch trace line. However, in the 30-inch trace line, only the CDR core with equalizer recovers the clock correctly. Thus, the test indicates that the equalizer should be on at full strength (set 111 for VEQ_CTRL bits) for error free operation in the longer trace line. Figure 17 shows the recovered data in data channel 1 and data channel 7 among 10 data channels in parallel at 142 Mbps for input patterns in Table 1 with the 30-inch FR4 trace line and the equalizer. The eye pattern of the output is shown in Fig. 18. The CDR operates the power supply voltage in a range from 2.9 V to 3.5 V. A performance summary of the proposed CDR is given in Table 2, and a comparison with [1] and [10] is given in Table 3. The summary illustrates the advantages that the No. Clock inform. D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 1 0 1 1 1 1 1 1 1 1 1 1 1 2 0 1 0 1 1 0 1 1 0 0 1 1 3 0 1 0 1 1 1 0 1 0 0 0 1 4 0 1 0 0 0 0 0 1 0 0 0 0 5 0 1 0 0 0 1 1 1 0 1 0 1 6 0 1 0 1 1 1 0 0 0 0 1 0 7 0 1 1 1 1 0 0 1 0 1 1 1 8 0 1 0 0 0 1 0 0 0 1 0 1 9 0 1 1 1 0 0 0 1 1 1 0 1 10 0 1 0 0 0 0 1 1 0 0 1 1 1111011010 (a) 1000101110 (b) Fig. 17. Recovered data in (a) data channel 1 and (b) data channel 7. proposed CDR provides in terms of power consumption and jitter performance. VI. Conclusion A DLL-based CDR has a design based on the 10B12B formatting scheme for application in high-performance display interfaces. The proposed data formatting scheme is realized by inserting a 01 pattern at every 10-bit input. The proposed CDR recovers the clock and data in a 1:10 demultiplexed manner without an external reference clock. The CDR consumes approximately 8 ma at 1.7-Gbps input data under a ETRI Journal, Volume 34, Number 1, February 2012 Yong-Hwan Moon et al. 41

3.3-V power supply using a 0.35-μm CMOS process. The RMS jitter and the peak-to-peak jitter of the recovered clock are measured as 4.8 ps and 44 ps, respectively. Acknowledgment The authors also thank the IDEC program and for its hardware and software assistance for the design and simulation. References Fig. 18. Eye diagram of recovered data (@142 Mbps). Table 2. Measured performance summary. Specification Technology Supply voltage Chip area (only core) Operating data rate Jitter of recovered clock Power supply voltage range Power consumption Locking time Results 0.35-μm CMOS process 3.3 V 520 μm 253 μm 1.2 Gbps to 1.73 Gbps 44.00-ps peak-to-peak @1.7 Gbps 4.80-ps RMS @1.7 Gbps 2.9 V to 3.5 V 8 ma @3.3 V without output buffer 34 ma @3.3 V with output buffer 420 ns Table 3. Performance comparison summary. Specification Proposed [1] [10] Technology 0.35-μm CMOS process 0.28-μm CMOS process 0.25-μm CMOS process Supply voltage 3.3 V 3 V 2.5 V Chip area (only core) Power consumption Jitter of recovered clock 520 μm 253μm 1 ma (DLL)+ 3 ma (data rec.)+ 0.5 ma (pulse gen.)+ 0.2 ma (bias)+ 3.3 ma (equalizer) @ 3.3 V 4.80-ps RMS @1.7 Gbps 1,000 μm 400 μm 67 ma @3 V 11-ps RMS @2 Gbps 2 mm 2 137 ma @2.5 V (total) 120.1-ps peak-to-peak @1.82 Gbps Coding scheme 10B12B 8B10B 24B28B Basic structure DLL-based CDR w/o reference clock PLL-based CDR PLL-based CDR [1] K. Yamguchi et al., A 2.0 Gb/s Clock-Embedded Interface for Full-HD 10-Bit 120-Hz LCD Drivers with 1/5-Rate Noise- Tolerant Phase and Frequency Recovery, Proc. IEEE Int. Solid- State Circuits Conf., Feb. 2009, pp. 192-194. [2] C.N. Chuang and S.I. Liu, A 3-8 GHz Delay-Locked Loop With Cycle Jitter Calibration, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 11, Nov. 2008, pp. 1094-1098. [3] X. Maillard, F. Devisch, and M. Kuijk, A 900-Mb/s CMOS Data Recovery DLL Using Half-Frequency Clock, IEEE J. Solid-State Circuits, vol. 37, no. 6, June 2002, pp. 711-715. [4] A.L. Coban, M.H. Koroglu, and K.A. Ahmed, A 2.5-3.125-Gb/s Quad Transceiver With Second-Order Analog DLL-Based CDRs, IEEE J. Solid-State Circuits, vol. 40, no. 9, Sept. 2005, pp. 1940-1947. [5] T.-H. Kim, S.-H. Kim, and J.-K. Kang, A DLL-Based Clock Data Recovery with a Modified Input Format, IEICE Electron. Express (ELEX), vol. 7, no. 8, Apr. 2010, pp. 539-545. [6] C.S. Jang et al., An Intra Interface of Flat Panel Displays for High-End TV Applications, IEEE Trans. Consum. Electron., vol. 54, no. 3, Aug. 2008, pp. 1447-1452. [7] R.I. McCartney and M.J. Bell, A Third Generation Timing Controller And Column-Driver Architecture Using Point-to-Point differential Signaling, J. Soc. Inf. Display, vol. 13, no. 2, Feb. 2005, pp. 91-97. [8] C.H. Kim et al., A 64-Mbit, 640-MByte/s Bidirectional Data Strobed, Double-Data-Rate SDRAM with a 40-mW DLL for a 256-MByte Memory System, IEEE J. Solid-State Circuits, vol. 33, no. 11, Nov. 1998, pp. 1703-1710. [9] D.A. Johns, K. Martin, Analog Integrated Circuit Design, New York: John Wiley & Sons, Inc., 1997, p. 259. [10] I. Jung et al., A 140-Mb/s to 1.82-Gb/s Continuous-Rate Embedded Clock Receiver for Flat-Panel Displays, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 10. Oct. 2009, pp. 773-777. 42 Yong-Hwan Moon et al. ETRI Journal, Volume 34, Number 1, February 2012

Yong-Hwan Moon received the BS and MS in electrical engineering in 2002 and 2004, respectively, from Inha University, Incheon, Rep. of Korea. He is currently working toward the PhD in the Department of Electrical and Engineering, Inha University. His research interests include high-speed interface and clock and data recovery circuits. Sang-Ho Kim received the BS and MS in electronics engineering from Inha University, Incheon, Rep. of Korea, in 2007 and 2010, respectively. Since 2010, he has been with Siliconworks Co., Ilsan, Rep. of Korea. His research interests include high-speed clock and data recovery and VLSI circuit design. Tae-Ho Kim received the BS and MS in electronics engineering from Inha University, Incheon, Rep. of Korea, in 2007 and 2009, respectively. He is currently working toward the PhD at the same university. His research interests include mixed-mode high-speed serial link interface and signal integrity. Hyung-Min Park received the BS and MS in electronics engineering from Inha University, Incheon, Rep. Korea, in 2009 and 2011, respectively. Since 2011, he has been with LG Innotek Co., Ansan, Rep. of Korea. His research interests include high-speed clock generation and VLSI circuit design. Jin-Ku Kang received his PhD in electrical engineering from North Carolina State University in 1996. From 1983 to 1988, he worked at Samsung Electronics, Inc., Korea. From 1996 to 1997, he was with Intel as a senior design engineer. Since 1997, he has been a professor in the School of Electronics Engineering, Inha University, Rep. of Korea. His research interests are high-speed CMOS VLSI design, mixed-mode IC design, and highspeed serial interface design. ETRI Journal, Volume 34, Number 1, February 2012 Yong-Hwan Moon et al. 43