Advanced Digital IC Design
How Does FPGA Work
Arnaud Taffanel, Peyman Pouyan
2008-2-19

Outline
- FPGA Basics
- Virtex 5
- Power Consumption in FPGAs
- Low Power Approaches

FPGA Basics

CMOS Design Styles
- ASIC
  - Full custom
  - Semi-custom
    - Standard cell
    - Gate array, sea of gates
  - Programmable
    - FPGA
    - CPLD
- Standard IC
Programmable Logic: Main Idea
- Basic idea: a two-dimensional array of logic blocks and flip-flops, with a means for the user to configure
  - the interconnection between the logic blocks
  - the function of each block

Different Programmable Logic Devices
- Types of programmable logic
  - Programmable Logic Array (PLA)
  - Programmable Array Logic (PAL)
  - Complex Programmable Logic Device (CPLD)
  - Field Programmable Gate Array (FPGA)

Programmable Logic Array (PLA)
- Two programmable planes
- Any combination of ANDs / ORs
- Sharing of AND terms across multiple ORs
- Programmable switches between horizontal and vertical lines
[Figure: inputs A, B, C, D feed a programmable AND array followed by a programmable OR array; outputs Q0 to Q3]

Programmable Array Logic (PAL)
- One programmable plane: programmable AND / fixed OR
- Finite combination of ANDs / ORs
- Lower switch count
- Faster than PLAs
[Figure: inputs A, B, C, D feed a programmable AND array followed by a fixed OR; outputs Q0 to Q3]
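The two-plane structure of a PLA can be sketched in a few lines of Python. This is only an illustration: the product terms and output wiring (AND_PLANE, OR_PLANE) are hypothetical and are not taken from the slides. A PAL would correspond to the same AND plane with an OR plane that cannot be changed.

```python
from itertools import product

def eval_pla(and_plane, or_plane, inputs):
    """Evaluate a PLA for one input assignment.

    and_plane: list of product terms. Each term maps an input name to the
               value it requires (True = uncomplemented, False = complemented).
               Inputs missing from a term are not connected to it.
    or_plane:  list of outputs. Each output is a list of product-term indices.
    inputs:    dict mapping input name to bool.
    """
    terms = [all(inputs[name] == want for name, want in term.items())
             for term in and_plane]
    return [any(terms[i] for i in out) for out in or_plane]

# Hypothetical configuration: three product terms, two outputs.
AND_PLANE = [
    {"A": True, "B": True},     # term 0: A.B
    {"C": True, "D": False},    # term 1: C.D'
    {"A": False, "D": True},    # term 2: A'.D
]
OR_PLANE = [
    [0, 1],   # Q0 = A.B + C.D'
    [1, 2],   # Q1 = C.D' + A'.D  (term 1 is shared with Q0)
]

def truth_table():
    """List ((A, B, C, D), (Q0, Q1)) for all 16 input combinations."""
    rows = []
    for a, b, c, d in product([False, True], repeat=4):
        q = eval_pla(AND_PLANE, OR_PLANE, {"A": a, "B": b, "C": c, "D": d})
        rows.append(((a, b, c, d), tuple(q)))
    return rows
```

Term 1 appears in both outputs, which is the sharing of AND terms across multiple ORs that distinguishes a PLA from a PAL.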
Complex Programmable Logic Devices (CPLD)
- Each block contains
  - PAL / PLA
  - Registers
- Interconnect includes
  - Full crossbar
  - Partial interconnect
[Figure: CPLD block diagram: logic blocks and I/O arranged around a programmable interconnect]

Why FPGAs?
- Custom ICs are sometimes designed to replace large amounts of glue logic
  - Reduced system complexity and manufacturing cost, improved performance
- Custom ICs are very expensive to develop
- Custom ICs take a long time to fabricate (time to market, TTM)
- Two kinds of cost must be considered
  - Development cost, sometimes called non-recurring engineering (NRE)
  - Manufacturing cost

Why FPGAs? (Contd.)
- The custom IC approach is suitable only for products
  - with very high volume (which spreads the NRE over many units)
  - that are not time-to-market sensitive
- FPGAs were introduced as an alternative to custom ICs
  - Improved density relative to discrete SSI/MSI components
  - With computer-aided design (CAD) tools, circuits can be implemented in a short time relative to ASICs
  - No physical layout process, no mask making, no IC manufacturing
    - Lowers NRE
    - Shortens TTM
Why FPGAs? (Contd.)
- FPGAs
  - Compete with custom ICs
  - Compete with microprocessors in dedicated and embedded applications

Summary
[Figure: comparison of ASIC, FPGA and microprocessor (MICRO) by performance, NRE, unit cost and TTM]

Programmable Elements Overview
- Antifuse
- SRAM
[Figure: SRAM configuration memory cell with read/write and data lines, controlling routing connections]
Programmable Elements Overview (Contd.)
Technology   Reprogrammable    Volatile   (column header lost)
SRAM         Yes (in-system)   Yes        Medium
Antifuse     No                No         Low

Field Programmable Gate Arrays
- Configurable Logic Block (CLB)
  - Look-up table (LUT)
  - Register
  - Or any kind of logic: adder, multiplier, memory, microprocessor
- Input/Output Block (IOB)
  - Special logic blocks at the periphery of the device for external connections
- Programmable interconnect
  - Wires to connect inputs and outputs to logic blocks

Field Programmable Gate Arrays (Contd.)
- Other FPGA building blocks
  - Clock distribution
  - Embedded memory blocks
  - Special-purpose blocks
    - DSP blocks: hardware multipliers, adders and registers
    - Embedded microprocessors/microcontrollers
    - High-speed serial transceivers

LUT-based Logic Block
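A transmission gate-based LUT
[Figure: LUT built from transmission gates]

2-Input MUX as Programmable Logic
- Input A drives MUX input 0, input B drives MUX input 1, S is the select and F is the output, so F = A when S = 0 and F = B when S = 1

Configuration
A   B   S   F
0   0   0   0
0   X   1   X
0   Y   1   Y
0   Y   X   XY
X   0   Y   XY'
Y   0   X   X'Y
Y   1   X   X + Y
1   0   X   X'
1   0   Y   Y'
1   1   1   1
(X' denotes the complement of X.)

MUX-based Logic Block
[Figure: MUX-based logic block; surviving labels: a&b, c]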
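Both ideas on these slides, a LUT as a small addressable memory and a 2-input MUX wired to realise different functions, can be checked with a short Python sketch. The helper names (mux2, make_lut, lut_read, MUX_CONFIGS) are illustrative, complements are written X' and Y', and the 3-input example function at the end is hypothetical.

```python
from itertools import product

def mux2(a, b, s):
    """2-input multiplexer: output a when s == 0, b when s == 1."""
    return b if s else a

def make_lut(func, n_inputs):
    """Fill a LUT's configuration memory with func's truth table.
    Entry i holds the output for the input bits of i (first input = MSB)."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def lut_read(table, *bits):
    """Read a LUT: the input bits form the address into the table."""
    addr = 0
    for bit in bits:
        addr = (addr << 1) | bit
    return table[addr]

# Each entry: name -> (MUX wiring as a function of x and y, expected function).
MUX_CONFIGS = {
    "0":     (lambda x, y: mux2(0, 0, 0), lambda x, y: 0),
    "X":     (lambda x, y: mux2(0, x, 1), lambda x, y: x),
    "Y":     (lambda x, y: mux2(0, y, 1), lambda x, y: y),
    "XY":    (lambda x, y: mux2(0, y, x), lambda x, y: x & y),
    "XY'":   (lambda x, y: mux2(x, 0, y), lambda x, y: x & (1 - y)),
    "X'Y":   (lambda x, y: mux2(y, 0, x), lambda x, y: (1 - x) & y),
    "X + Y": (lambda x, y: mux2(y, 1, x), lambda x, y: x | y),
    "X'":    (lambda x, y: mux2(1, 0, x), lambda x, y: 1 - x),
    "Y'":    (lambda x, y: mux2(1, 0, y), lambda x, y: 1 - y),
    "1":     (lambda x, y: mux2(1, 1, 1), lambda x, y: 1),
}

def mux_configs_ok():
    """True if every wiring produces its named function for all x, y."""
    return all(wiring(x, y) == expected(x, y)
               for wiring, expected in MUX_CONFIGS.values()
               for x, y in product((0, 1), repeat=2))

# Hypothetical 3-input function stored in a LUT: (a AND b) OR c.
EXAMPLE_LUT = make_lut(lambda a, b, c: (a & b) | c, 3)
```

Calling lut_read(EXAMPLE_LUT, a, b, c) then behaves like the logic function itself, which is why changing the stored bits is enough to change the function of the block.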
Programmable Interconnect
- Fast local interconnect
- Horizontal and vertical lines of various lengths
- Switch matrices

Switch Matrix Structure
[Figure: switch matrix programming illustration, showing the switch matrix and its interconnect points]

Switch Matrix Structure (Contd.)
- 6 pass transistors per switch matrix interconnect point
- Pass transistors act as programmable switches
- Pass transistor gates are driven by configuration memory cells

FPGA Variations
- Families of FPGAs differ in
  - Physical means of implementing user programmability
  - Arrangement of interconnection wires
  - Basic functionality of the logic blocks
- The most significant difference is in the method for providing flexible blocks and connections
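The figure of 6 pass transistors per interconnect point can be illustrated with a small Python model. It assumes, as a labeled assumption not stated on the slide, that four wire segments (called N, E, S, W here) meet at the point and that one transistor joins each pair of segments, which gives C(4, 2) = 6. The names are hypothetical.

```python
from itertools import combinations

SIDES = ("N", "E", "S", "W")
# Assumption: one pass transistor per unordered pair of sides, C(4, 2) = 6.
PASS_TRANSISTORS = list(combinations(SIDES, 2))

def connected_groups(config_bits):
    """Return the sets of sides that are electrically joined.

    config_bits maps a pair such as ("N", "E") to 0 or 1, standing for the
    configuration memory cell that drives that pass transistor's gate.
    Pairs that are missing are treated as 0 (transistor off).
    """
    group = {s: {s} for s in SIDES}
    for a, b in PASS_TRANSISTORS:
        if config_bits.get((a, b), 0):
            merged = group[a] | group[b]
            for s in merged:
                group[s] = merged
    return {frozenset(g) for g in group.values()}
```

For example, connected_groups({("N", "S"): 1}) describes a straight-through vertical connection, while turning on ("N", "E") and ("S", "W") routes two independent signals around the corner.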
Sea-Of-Module Architecture
[Figure: sea-of-module architecture]

Routing Channel Architecture
[Figure: routing channel architecture]

Actel FPGA
- 8-input, single-output combinational logic blocks
- Rows of programmable logic building blocks and rows of interconnect
- Anti-fuse technology
[Figure: Actel floorplan: rows of logic modules and wiring tracks, with I/O buffers, programming and test circuitry on all four sides]
Actel module
- Basic module is a modified 4:1 multiplexer
[Figure: Actel logic module built from 2:1 MUXes, with data inputs D0 to D3, select inputs SOA, SOB, S0 and S1, and output Y]

Actel module (Contd.)
- Implementation of an S-R latch using the Actel FPGA
[Figure: S-R latch built from 2:1 MUXes, with inputs R and S, constants "0" and "1", and output Q]

Actel Interconnect
[Figure: logic module connected to horizontal and vertical tracks through anti-fuses]

XC4000 FPGA Architecture
- SRAM cells throughout the FPGA determine the functionality of the device
XC4000E CLB
- 2 four-input function generators (LUTs)
- 1 three-input function generator
- 2 registers
- Possible functions:
  - Any function of 5 variables
  - Two functions of 4 variables plus one function of 3 variables

Example
- Implement the following functions on a single CLB of the XC4000 FPGA:
  - X = (A' B' (C' + D'))' = A + B + CD
  - Y = AK + BK + CDK + AEJL
- Use look-up table F to implement X
- Use look-up table G for AEJL
- Use F, G and H for Y:
  Y = K(A + B + CD) + AEJL = KX + AEJL = KF + G
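The mapping in the XC4000 example can be checked exhaustively. The sketch below models the three function generators as plain Python functions (lut_f, lut_g and lut_h are illustrative names) and takes X = A + B + CD, the form required by Y = KX + AEJL. It then compares the CLB output with the sum-of-products form of Y for all 256 input combinations.

```python
from itertools import product

def lut_f(a, b, c, d):
    """4-input function generator F: X = A + B + CD."""
    return a | b | (c & d)

def lut_g(a, e, j, l):
    """4-input function generator G: AEJL."""
    return a & e & j & l

def lut_h(f, g, k):
    """3-input function generator H: KF + G."""
    return (k & f) | g

def y_direct(a, b, c, d, e, j, k, l):
    """Y as given: AK + BK + CDK + AEJL."""
    return (a & k) | (b & k) | (c & d & k) | (a & e & j & l)

def clb_matches_spec():
    """True if H(F, G, K) equals Y for every input combination."""
    for a, b, c, d, e, j, k, l in product((0, 1), repeat=8):
        y_clb = lut_h(lut_f(a, b, c, d), lut_g(a, e, j, l), k)
        if y_clb != y_direct(a, b, c, d, e, j, k, l):
            return False
    return True
```

Each generator stays within its input budget: F and G use four inputs each (A is shared), and H uses only F, G and K.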
Virtex 5
- High end of the Xilinx FPGA family
- High performance (550 MHz)
- Low-power design
- 65 nm CMOS process

CLBs
- 2 slices per CLB

Slices
- 4 registers
- 4 LUTs
- Carry logic
- 6-input LUT: more efficient logic (e.g. a 4:1 MUX in a single LUT)
Slices (contd)
- SLICEM: many configurations possible
  - 64x1 single-port RAM
  - 32x1 dual-port RAM
  - 32-stage shift register
  - 64x1 ROM
  - LUT
- SLICEL: only ROM or LUT

Slices (contd)
- 6-input LUT
- Optimizes common logic implementations

DSP Slices
- High-performance DSP slices
- Up to 250 GMACs
- Optimized for single-precision floating point
- 40 operating modes, dynamically selectable
RAM
- 2 types of RAM usable
- Distributed RAM
  - Uses the LUTs as RAM
  - Close to the logic
  - Less RAM
  - Uses CLBs
- Global RAM (block RAM)
  - More RAM
  - 550 MHz
  - Far from the logic
    - Routing problem
    - Speed problem

Power Consumption in FPGAs
1. Static power (5% to 20%)
   - Leakage current: reverse-biased diode leakage current
   - Sub-threshold conduction of transistors
2. Dynamic power (80% to 95%)
   - P_D = 1/2 * C * Vdd^2 * α * f_clk
   - Consumed when the output of a CMOS circuit switches

Dynamic power consumption in FPGA design depends on
- Clock frequency
- Supply voltage
- Switching activity
- Resource utilization

[Figure: power dissipation distribution in Xilinx Virtex-II FPGA]
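A short numerical sketch of the dynamic power formula P_D = 1/2 * C * Vdd^2 * α * f_clk shows how each factor scales the result. The values used (1 nF of switched capacitance, 1.0 V supply, activity 0.25, 100 MHz clock) are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements of any device.

```python
def dynamic_power(c_farads, vdd_volts, alpha, f_clk_hz):
    """Dynamic power in watts: P_D = 1/2 * C * Vdd^2 * alpha * f_clk."""
    return 0.5 * c_farads * vdd_volts ** 2 * alpha * f_clk_hz

# Hypothetical baseline design point.
BASE = dynamic_power(c_farads=1e-9, vdd_volts=1.0, alpha=0.25, f_clk_hz=100e6)

# Halving the clock frequency halves the dynamic power.
HALF_CLOCK = dynamic_power(1e-9, 1.0, 0.25, 50e6)

# Lowering Vdd from 1.0 V to 0.8 V scales power by 0.8^2 = 0.64.
LOWER_VDD = dynamic_power(1e-9, 0.8, 0.25, 100e6)
```

With these numbers the baseline is 12.5 mW. The quadratic dependence on Vdd is why voltage scaling gives a larger saving than an equal relative reduction in frequency, capacitance or switching activity.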
Low Power Approaches in FPGAs

Ways to Reduce Dynamic Power
- Frequency reduction
- Voltage scaling
- Capacitance reduction
  - Input capacitance of the fan-out gates
  - Capacitance associated with programmable interconnects
  - Parasitic capacitance of the gate
- Switching activity reduction
- Switched capacitance reduction
- Resource utilization reduction

Using Built-in Macro Functions for Low Power
- Idea: alternative techniques that use fewer routing resources than the traditional techniques
1. Low-power implementation of register files
2. Low-power implementation of shift registers
[Figure: Xilinx SRLUT input and output ports]
3. Low-power implementation of multipliers and accumulators
Virtex 5 Low Power
- 65 nm CMOS process
  - Potentially more leakage current
  - Solved by triple-oxide process technology
  - Gain in dynamic power
- Many hard IP blocks
  - DSP slices

Virtex 5 Triple Oxide
- 65 nm process -> more static power
- Tens of millions of static configuration transistors

Virtex 5 Triple Oxide (contd)
- 3 oxide thicknesses
  - Thin oxide for the high-speed parts
  - Thick oxide for the 3.3 V I/O
  - Mid-thickness oxide (midox) for the static configuration part

Virtex 5 hard IP block
- RocketIO: low-power serial I/O hard IP block
  - SATA (Serial ATA)
  - Gigabit Ethernet
  - PCI Express
  - Consumes 100 mW at 3.2 Gbps
- Tri-mode Ethernet MAC (10/100/1000)
References
- The Design Warrior's Guide to FPGAs
- Low Power FPGA Design Techniques for Embedded Systems (PhD thesis by Anurag Tiwari)
- WWW.XILINX.COM
- Architecture of FPGAs and CPLDs: A Tutorial, by Stephen Brown and Jonathan Rose, Department of Electrical and Computer Engineering, University of Toronto
- Peter Nilsson: slides of Advanced Digital IC Design

Arnaud Taffanel - Peyman Pouyan