White Paper Utilizing Leveling Techniques in DDR3 SDRAM Memory Interfaces



Similar documents
Dual DIMM DDR2 and DDR3 SDRAM Interface Design Guidelines

White Paper Understanding Metastability in FPGAs

White Paper FPGA Performance Benchmarking Methodology

Figure 1 FPGA Growth and Usage Trends

DDR subsystem: Enhancing System Reliability and Yield

SOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS

Calibration Techniques for High- Bandwidth Source-Synchronous Interfaces

Enhancing High-Speed Telecommunications Networks with FEC

White Paper Streaming Multichannel Uncompressed Video in the Broadcast Environment

Using Pre-Emphasis and Equalization with Stratix GX

Managing High-Speed Clocks

MAX II ISP Update with I/O Control & Register Data Retention

Source-Synchronous Serialization and Deserialization (up to 1050 Mb/s) Author: NIck Sawyer

DDR4 Memory Technology on HP Z Workstations

White Paper Using the Intel Flash Memory-Based EPC4, EPC8 & EPC16 Devices

LatticeECP3 High-Speed I/O Interface

Using Altera MAX Series as Microcontroller I/O Expanders

White Paper Military Productivity Factors in Large FPGA Designs

FPGAs for High-Performance DSP Applications

11. High-Speed Differential Interfaces in Cyclone II Devices

Table 1: Address Table

MorphIO: An I/O Reconfiguration Solution for Altera Devices

Engineering Change Order (ECO) Support in Programmable Logic Design

Semiconductor Device Technology for Implementing System Solutions: Memory Modules

Intel 965 Express Chipset Family Memory Technology and Configuration Guide

Table 1 SDR to DDR Quick Reference

Configuring Memory on the HP Business Desktop dx5150

Samsung DDR4 SDRAM DDR4 SDRAM

Achieving High Performance DDR3 Data Rates

Using the On-Chip Signal Quality Monitoring Circuitry (EyeQ) Feature in Stratix IV Transceivers

December 2002, ver. 1.0 Application Note 285. This document describes the Excalibur web server demonstration design and includes the following topics:

High-Speed SERDES Interfaces In High Value FPGAs

In-System Programmability

Manchester Encoder-Decoder for Xilinx CPLDs

Intel X38 Express Chipset Memory Technology and Configuration Guide

Using the Agilent 3070 Tester for In-System Programming in Altera CPLDs

MAX 10 Clocking and PLL User Guide

DDR3 memory technology

External Memory Interface Handbook Volume 2: Design Guidelines

White Paper Increase Flexibility in Layer 2 Switches by Integrating Ethernet ASSP Functions Into FPGAs

ADQYF1A08. DDR2-1066G(CL6) 240-Pin O.C. U-DIMM 1GB (128M x 64-bits)

Sentinel-SSO: Full DDR-Bank Power and Signal Integrity. Design Automation Conference 2014

Selecting the Optimum PCI Express Clock Source

White Paper 40-nm FPGAs and the Defense Electronic Design Organization

Application Note for General PCB Design Guidelines for Mobile DRAM

AVR151: Setup and Use of the SPI. Introduction. Features. Atmel AVR 8-bit Microcontroller APPLICATION NOTE

Intel Q35/Q33, G35/G33/G31, P35/P31 Express Chipset Memory Technology and Configuration Guide

Fairchild Solutions for 133MHz Buffered Memory Modules

Video and Image Processing Suite

USB-Blaster Download Cable User Guide

ThinkServer PC DDR2 FBDIMM and PC DDR2 SDRAM Memory options boost overall performance of ThinkServer solutions

White Paper Power-Optimized Solutions for Telecom Applications

Low Power AMD Athlon 64 and AMD Opteron Processors

External Memory Interface Handbook Volume 2: Design Guidelines

Applying the Benefits of Network on a Chip Architecture to FPGA System Design

Understanding CIC Compensation Filters

Memory Module Specifications KVR667D2D4F5/4G. 4GB 512M x 72-Bit PC CL5 ECC 240-Pin FBDIMM DESCRIPTION SPECIFICATIONS

DDR Memory Overview, Development Cycle, and Challenges

Flexible Active Shutter Control Interface using the MC1323x

21152 PCI-to-PCI Bridge

Simplifying System Design Using the CS4350 PLL DAC

Memory Module Specifications KVR667D2D8F5/2GI. 2GB 256M x 72-Bit PC CL5 ECC 240-Pin FBDIMM DESCRIPTION SPECIFICATIONS

1. Memory technology & Hierarchy

DEVELOPMENT OF DEVICES AND METHODS FOR PHASE AND AC LINEARITY MEASUREMENTS IN DIGITIZERS

Switch Fabric Implementation Using Shared Memory

Comparison of DDRx and SDRAM

User s Manual HOW TO USE DDR SDRAM

Maximizing Server Storage Performance with PCI Express and Serial Attached SCSI. Article for InfoStor November 2003 Paul Griffith Adaptec, Inc.

Current vs. Voltage Feedback Amplifiers

EDI s x32 MCM-L SRAM Family: Integrated Memory Solution for TMS320C3x DSPs

ThinkServer PC DDR3 1333MHz UDIMM and RDIMM PC DDR3 1066MHz RDIMM options for the next generation of ThinkServer systems TS200 and RS210

Providing Battery-Free, FPGA-Based RAID Cache Solutions

15. Introduction to ALTMEMPHY IP

BitBlaster Serial Download Cable

Technical Note. Initialization Sequence for DDR SDRAM. Introduction. Initializing DDR SDRAM

1.55V DDR2 SDRAM FBDIMM

Clocking. Figure by MIT OCW Spring /18/05 L06 Clocks 1

GR2DR4B-EXXX/YYY/LP 1GB & 2GB DDR2 REGISTERED DIMMs (LOW PROFILE)

Hardware and Layout Design Considerations for DDR4 SDRAM Memory Interfaces

OpenSPARC T1 Processor

Technical Note Design Guide for Two DDR UDIMM Systems

Using the Altera Serial Flash Loader Megafunction with the Quartus II Software

Digital Multiplexer and Demultiplexer. Features. General Description. Input/Output Connections. When to Use a Multiplexer. Multiplexer 1.

Signal Integrity: Tips and Tricks

Configuration via Protocol (CvP) Implementation in Altera FPGAs User Guide

ECEN689: Special Topics in High-Speed Links Circuits and Systems Spring 2010

White Paper Reduce Total System Cost in Portable Applications Using Zero-Power CPLDs

Introduction to PCI Express Positioning Information

High speed pattern streaming system based on AXIe s PCIe connectivity and synchronization mechanism

Figure 1. Core Voltage Reduction Due to Process Scaling

Features. DDR SODIMM Product Datasheet. Rev. 1.0 Oct. 2011

A Scalable Large Format Display Based on Zero Client Processor

Memory technology evolution: an overview of system memory technologies

D1 D2 D3 D4 D5 D6 D7. Benefits: Stable system operation No trace length matching Low PCB cost

AN LPC1700 timer triggered memory to GPIO data transfer. Document information. LPC1700, GPIO, DMA, Timer0, Sleep Mode

Video Speed Class: The new capture protocol of SD 5.0

Technical Note DDR2 Offers New Features and Functionality

Ethernet. Ethernet Frame Structure. Ethernet Frame Structure (more) Ethernet: uses CSMA/CD

Transcription:

White Paper Introduction The DDR3 SDRAM memory architectures support higher bandwidths with bus rates of 600 Mbps to 1.6 Gbps (300 to 800 MHz), 1.5V operation for lower power, and higher densities of 2 Gbits on a 90-nm process. While this architecture is undoubtedly faster, larger, and lower power per bit, just how does one go about interfacing a DDR3 SDRAM DIMM to an FPGA? Leveling is the key word. Without having the leveling feature built directly into the FPGA I/O structure, interfacing anything to a DDR3 SDRAM DIMM is going to be complicated, costly, and involve numerous external components. So, what is leveling and why is it so important? In order to improve signal integrity for supporting higher performance, JEDEC defined fly-by termination for clocks and the command/address bus. This fly-by topology reduces the simultaneous switching noise (SSN) but causes flight-time skew between clock and data/strobes at every DRAM as the clock and address/command signals traverse the DIMM, as shown in Figure 1. Figure 1. DDR3 SDRAM DIMMs: Flight-Time Skew Reduces SSN, Data Must Be Levelled up to 2 Clock Cycles at the Controller This flight time skew can be up to 0.8 tck, enough spread not to know which of two clock cycles the data may return in. Therefore, the leveling feature was defined for DDR3 memories by JEDEC to enable controllers to compensate for this skew by adjusting the timing per byte lane. The latest FPGAs offer many features that interface with double data rate SDRAM memories for a wide range of applications such as desktops, servers, storage, LCD displays, and networking and communication equipment. However, to work with the newest DRAM technology, DDR3 SDRAM, a robust leveling scheme is required. FPGA I/O Structure FPGAs, such as the recently announced Altera Stratix III device family, provide I/Os capable of high speeds and greater flexibility to support existing and emerging external memory standards. Read Leveling During a read, the memory controller side must compensate for the delays introduced by the fly-by memory topology that impact the read cycle. Leveling should be thought of as more than just I/O delay that appears in the data path. 1T and neg-edge registers are also required to level or align all the data. Each DQS requires a separate phase shift of the resync clock position (PVT compensated). Figure 2 shows two DQS groups returning from the DIMM for the same read command. WP-01034-1.0 November 2007, ver. 1.0 1

Altera Corporation Figure 2. 1T, Neg-Edge and Leveling Registers in a Stratix III I/O Element DLL (PVT compensation) 90 PLL PVT compensated FPGA fabric 90 Stratix III I/O block Initially, each separate DQS is phase-shifted a nominal 90 degrees and the DQ data associated with its group is captured. Then a free-running resynchronization clock (at the same frequency and phase of the DQS) is used to move the data from its capture domain into the leveling circuit shown by the pink and orange links in Figure 2. At this stage, each DQS group has a separate resynchronization clock. Next the DQ data is passed to the 1T registers. Figure 2 shows an example of the 1T register being required in the upper channel to delay the DQ data bits in that particular DQS group. Note that in this example, the 1T registers are not required in the lower channel. This process begins aligning the upper channel to the lower channel. Whether the 1T registers are required or not for any given channel is decided automatically as part of the calibration scheme in free PHY IP core. Both DQS groups are then passed on the neg-edge registers. Again, optional registers are switched in or out at start up by the automatic calibration process, if required. The final stage is to align both the upper and lower channels back onto the same resynchronization clock, thus creating a source synchronous interface that passes a fully aligned, or levelled, single data rate (SDR) data to the FPGA fabric. Write Leveling Similar to read leveling but in reverse, DQS groups are launched at separate times to coincide with a clock arriving at devices on the DIMM, and must meet the tdqss parameters of +/- 0.25 tck. Other FPGA I/O Innovations High-end FPGAs have a host of other innovative I/O features that allow simple and robust interfacing to a range of memory interfaces such as dynamic on-chip termination (OCT), variable I/O delay, and half data rate (HDR) capability, as shown in Figure 3. This remainder of this paper follows these features (from left to right), pausing at each step to examine each in detail. 2

Altera Corporation Figure 3. Useful I/O Features for DDR3 SDRAM Memory Interfaces Dynamic OCT Parallel and serial OCT provide the appropriate line termination and impedance matching for both the read and write busses. This removes the need for external resistors at the FPGA and saves on external component costs, board space, and routing complexity. It also significantly reduces power consumption, because the parallel termination is effectively out of circuit on a write operation. Figure 4 shows the terminations for both read and write operations. Figure 4. Dynamic OCT - Read and Write Operations Read Dynamic OC T FPGA Write Memory Variable Delay for DQ Deskew Variable input and output delay (shown in Figure 5) can be used for trace length mismatch and electrical deskew. The fine input and output delay resolution (i.e., 50-picosecond (ps) steps) can be used for finer inter-dqs deskew (separate to the leveling function) caused either by mismatch in board length or variations in I/O buffers of the FPGA and memory devices, as demonstrated in Table 1. Ultimately, this increases the capture margin for each DQS group. 3

Altera Corporation Figure 5. Static and Dynamic Delay in an I/O Element Run Time Configurable Run Time Configurable Set at Compile Table 1. FPGA I/O Delays Path Variable Static Static Total Input 750 ps 350 ps 2,800 ps 3,900 ps 50 ps 50 ps 400 ps Path Variable I/O Buffer Total Output 1,100 ps 150 ps 1,250 ps 50 ps 50 ps The delay elements can be reached from the FPGA fabric at run time to implement automatic DDR3 deskew algorithms as part of the start-up calibration process. Figure 6 shows an illustration of how DQ data can be deskewed and centered around DQS for extra capture margin. The output delay can also be used to insert a small amount of skew into the output path to intentionally reduce the number of I/Os being switched simultaneously. Figure 6. Conceptual DQ Deskew Within a DQS Group Centered Around 90-Degree Phase-Shifted DQS dq0 dq1 dq2 dq3 dq4 dq5 dq6 dq7 dqs 0 15 30 45 60 75 90 105 120 135 150 165 180 dq0 dq1 dq2 dq3 dq4 dq5 dq6 dq7 dqs 0 15 30 45 60 75 90 105 120 135 150 165 180 Reliable Capture The DQS signals act as the input strobes and must be shifted to an optimal position for capture of read transactions. The phase-shift circuitry (shown in Figure 7) can shift the incoming DQS signals by 0, 22.5, 30, 36, 45, 60, 67.5, 72, 90, 108, 120, 135, 144, or 180, depending on the DLL frequency mode. The shifted DQS signals are then used as clocks at the I/O element input registers. 4

Altera Corporation Figure 7. DQ Capture Circuit The delay-locked loop (DLL) shown in Figure 7 maintains the phase shift in a fixed location across PVT. Figure 8 shows the relationship between the DLL and phase-shift circuitry. Figure 8. DLL and DQS Phase-Shift Circuitry DQS in Offset Reference clock 100 to 400 MHz DLL Block Phase comparator Control signal 16 delay blocks Shifted DQS The DLL uses a frequency reference to generate control signals dynamically for the delay chains in each of the DQS pins, allowing it to compensate for PVT variations. There are four DLLs in a Stratix III device, each located in a corner of the device. Each DLL can reach two sides of the device, allowing support for multiple DDR3 SDRAM memory interfaces on all sides of the device. High-Speed Data Rate Domain Crossing and Design Simplification DDR capture registers and HDR registers allow safe transfer of data from the double data rate domain (data on both edges of the clock), down to the SDR domain (data on single positive edge of clock at the same frequency, but at twice the data width), down to HDR domain (data on the positive edge of clock, but the frequency is now half that of the SDR and the data width is again doubled), making the internal design timing much easier to achieve. Figure 9 shows how the DQ data moves through these various data rate domains. 5

Altera Corporation Figure 9. Stratix III Input Path Registers Die, Package, and Digital Signal Integrity Enhancements The design of an FPGA die and package should provide robust signal integrity for the high-performance memory interfaces (i.e., having an 8:1:1 user I/O to ground and power ratio and optimized signal return paths as shown in Figure 10). In addition, the design should provide OCT, variable slew rate, and programmable drive strength to correctly manage signal quality. Figure 10. Eight User I/Os to Each Power and Ground Conclusion High-performance FPGAs complement high-performance DDR3 SDRAM DIMMs by providing high-memory bandwidth, improved timing margin, and great flexibility in system design. The combination of FPGAs with DDR3 SDRAM supports the high-throughput requirements of today s communication, networking, and digital signal processing systems. Acknowledgements Paul Evans, Product Marketing Manager for Stratix III FPGAs, High-End FPGA Products, Altera Corporation 6

Altera Corporation 101 Innovation Drive San Jose, CA 95134 www.altera.com Copyright 2007 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S. and foreign patents and pending applications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. 7