IL2225 Physical Design Nasim Farahini farahini@kth.se
Outline
  Physical Implementation Styles
  ASIC Physical Design Flow
  Floor and Power Planning
  Placement
  Clock Tree Synthesis
  Routing
  Timing Analysis
  Verification and Energy Calculation
1 December 2013 Slide 2
Overview: Digital Design Flow
  System Specification
  Architectural Design
  Logic Synthesis
  Physical Synthesis
  Physical Verification / Sign-off
  Fabrication
  Packaging and Testing
  [Figure annotation: example logic functions X = (AB*CD) + (A+D) + (A(B+C)), Y = (A(B+C)) + AC + D + A(BC+D)]
Physical Design
  Physical design converts a circuit description into a geometric description. This description is used to manufacture a chip.
  Design objectives:
    Power (dynamic/static)
    Performance (frequency)
    Area (cost)
    Yield (cost)
  [Figure: gate-level netlist -> physical layout]
Physical Design Challenges
  1. Design complexity: the number of transistors on the chip is increasing
  2. Scaling: more design rules, manufacturability, variability
  3. Productivity: time-to-market, engineering efficiency
Physical Design Styles
  Full-custom design
    Manual placement of the transistors and wiring.
    Advantages: less area, better performance, less power
    Disadvantages: high engineering effort, long time-to-market, high development cost
  Semi-custom design (standard-cell based)
    Commonly used logic cells are physically designed in advance, characterized, and stored in standard cell libraries.
    Used in Electronic Design Automation
    Routing of inter-cell connections
  Programmable Logic Devices
    Array of logic cells connected via routing channels
    Examples: FPGAs, gate arrays
Standard-Cell Based Physical Design
  Standard cells: layouts of library cells, including logic elements such as gates, flip-flops, and ALU functions.
  The height of the cells is constant.
Physical Design Flow
  Gate Level Netlist
  Floor and Power Planning
  Placement
  Clock Tree Synthesis
  Routing
  Post Route Analysis
  Post Route Verification
  Metal Fill Insertion
  Mask Generation/OPC
  Sign off
  GDSII
  Supporting analyses: Parasitic Extraction, Static Timing Analysis, Power Analysis, Signal Integrity
  [Figure: "A very brief tour of physical design": gate-level circuit, floorplanning, placement, timing analysis, repeater insertion, clock tree synthesis, P/G network / routing, metal fill insertion, mask generation / OPC, reticles. Panels: clock tree, metal wires, mask, mask after OPC]
Cadence SoC Encounter Design Flow
  Input files (generated and verified files from logic synthesis):
    Gate-level netlist (.v file)
    SDC file: Synopsys Design Constraints, generated by the logic synthesis tool
  Technology files:
    LEF file: standard-cell layout information; contains layer, via and macro definitions
    Lib file (.TLF): standard-cell timing information, e.g. delay and capacitance
  Output files:
    GDSII: database file format that is the industry standard for data exchange of IC layout design information
    DEF file: Design Exchange Format, used to output the design so that it is readable by other tools
Design Import
  Design -> Import Design
    Verilog netlist file
    Top level of the design
    .lib/.tlf files
    .LEF file
    .SDC file
Design Import
  Advanced -> Power
Flattening the Netlist: Logic Hierarchy and Physical Hierarchy
  The layout is flat. The netlist is not.
  [Figure: hierarchical netlist under Top, the flat top-level layout view, and the expanded netlist. Expanded: Top -> A1 A2 A3 A4 C1 C2 C3 RAM, where each entry is a leaf cell (standard or macro cell). Source: Columbia University]
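The expansion from a hierarchical netlist to a flat list of leaf cells can be sketched as a recursive walk. This is only an illustration: the hierarchy table below (Top containing A, C and RAM) is an assumption loosely based on the cell names in the figure, and `flatten` is a hypothetical helper, not an Encounter function.

```python
# Hypothetical hierarchy, loosely modelled on the cell names in the figure.
HIERARCHY = {
    "Top": ["A", "C", "RAM"],
    "A": ["A1", "A2", "A3", "A4"],
    "C": ["C1", "C2", "C3"],
}

def flatten(module, hierarchy):
    """Return the leaf cells under `module` in depth-first order."""
    children = hierarchy.get(module)
    if children is None:
        # Not listed as a parent: treat it as a leaf (std or macro cell).
        return [module]
    leaves = []
    for child in children:
        leaves.extend(flatten(child, hierarchy))
    return leaves

flat = flatten("Top", HIERARCHY)
print(flat)
```

A real flow also rewrites instance names to keep the hierarchical path (for example a prefix per parent); that detail is left out here.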
Floor Planning
  The floorplanning problem is to plan the positions and shapes of the modules at the beginning of the design cycle to optimize the circuit performance:
    chip area
    total wirelength
    delay of critical path
    routability
  PS Step 2: Floorplan
    Setting X/Y_BOUNDS of CLUSTER or CELL
    Setting XY location of CELL
    Creation of core area for rough placement
    Creation of SITEs for detailed placement
    Update of port's XY coordinates
    Creation of routing OBSTRUCTION
  (Source: Columbia University)
Automatic Floor Planning
  Automatic floor planning: analyzes the data flow between design blocks based on their connectivity and their location.
  Relative floor planning:
    Captures and defines the placement relationship of floorplan objects independently of the actual coordinates in a floorplan.
    A flexible way to place objects such as modules, blocks, groups, blockages, pin guides, pre-routed wires, and power domains.
    I/O pins can be used as reference objects, but they cannot be relative objects.
Relative Floor Planning
  Pre-route example: S1 and S2 are relative to the object I2 and the Core_Boundary
Floorplanning example: manually floorplanned DRRA fabric
Floor Planning
  Aspect ratio: height / width
  Core utilization: area of standard cells / area of core
  Core to IO boundary: distance from the IO boundary
  Core to die boundary: distance from the die boundary
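The two ratios above are enough to size a rectangular core. The sketch below is a minimal illustration using exactly those definitions; the function name and the numbers are made up for the example.

```python
import math

def core_dimensions(cell_area_um2, utilization, aspect_ratio):
    """Return (width, height) in um of a rectangular core.

    utilization  = area of standard cells / area of core
    aspect_ratio = height / width
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    core_area = cell_area_um2 / utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = aspect_ratio * width
    return width, height

# Illustrative numbers only: 70,000 um^2 of cells at 70% utilization, square core.
w, h = core_dimensions(cell_area_um2=70_000.0, utilization=0.7, aspect_ratio=1.0)
print(f"core: {w:.1f} um x {h:.1f} um")
```

Lower utilization leaves more room for routing at the cost of area.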
Partitioning
  Part of floor planning.
  Standard cells are in a floating state before placement: they have not been assigned a fixed location in the core.
  This is the time to define clusters and regions, to keep time-critical components close together.
  Soft regions: the boundary can change during standard cell placement.
  Hard regions: prevent standard cells from crossing the boundaries.
Power Planning
  Deals with the power distribution network.
  Power nets are considered as special nets.
  Need to consider current density and IR drop.
  Three levels of power distribution:
    Rings: carry VDD and VSS around the chip
    Stripes: carry VDD and VSS from the rings across the chip
    Rails: connect VDD and VSS to the standard cell VDD and VSS
Power Planning
  [Figure: power distribution network showing rings, stripes (vertical or horizontal), rails and special routes for VDD and VSS]
Power Planning
Power Distribution Network
  Example: power distribution network in DRRA
Placement
  Global placement (rough location)
  Detailed placement (legalization)
  Two associated cost functions:
    Reduce the total wiring (routing) length
    Distribute standard cell instances homogeneously in the ASIC core so that vertical and horizontal routing are balanced
Placement Problem Formulation
  Input:
    Blocks (standard cells and macros) B_1, ..., B_n
    Shapes and pin positions for each block B_i
    Nets N_1, ..., N_m
  Output:
    Coordinates (x_i, y_i) for each block B_i
    No overlaps between blocks
    The total wire length is minimized
    The area of the resulting block is minimized, or a fixed die is given
  Other considerations: timing, routability, clock, buffering
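To compare two candidate placements, the total wire length has to be estimated before routing exists. A common estimate is the half-perimeter wirelength (HPWL) of each net's bounding box. The slides do not name an estimator, so HPWL is an assumption here, as are the simplification of one pin per block at the block coordinate and the example data.

```python
def hpwl(nets, positions):
    """Total half-perimeter wirelength.

    nets:      dict net name -> list of block names on that net
    positions: dict block name -> (x, y) coordinate of the block
    """
    total = 0.0
    for blocks in nets.values():
        xs = [positions[b][0] for b in blocks]
        ys = [positions[b][1] for b in blocks]
        # Half the perimeter of the bounding box enclosing the net.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

nets = {"N1": ["B1", "B2"], "N2": ["B2", "B3", "B4"]}
bad = {"B1": (0, 0), "B2": (10, 10), "B3": (0, 10), "B4": (10, 0)}
good = {"B1": (0, 0), "B2": (1, 0), "B3": (1, 1), "B4": (2, 0)}

print("bad placement HPWL: ", hpwl(nets, bad))
print("good placement HPWL:", hpwl(nets, good))
```

The overlap constraint and the timing and routability terms are not modelled in this sketch.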
Global Placement Example
  [Figure: bad placement vs. good placement]
Detailed Placement
  Performed after global placement, to refine the placement based on congestion, timing and power.
  Congestion-driven placement: moves standard cell instances apart so that more routing tracks become available between them.
  Timing-driven placement: optimizes large sets of path delays.
    Net based: tries to control the delay on a signal path by imposing an upper bound on the delay of a net, or a weight on the net.
Floor planned Hard Block Placement
Unconstrained Placement
Clock Tree Synthesis
  Clock tree, general concept: automatic insertion of buffers along the clock path to balance the clock delay to all flip-flops.
  Main concerns for clock design:
    Skew: for increased clock frequency, skew may contribute over 10% of the system cycle time
    Delay: minimize the propagation delay
    Area: number of buffers and total wire length
    Power: the clock switches at every clock cycle, a major power consumer!
    Slew rate: important (sharp transitions)
    Noise: may need shielding; the clock is often a very strong aggressor
  [Figure: unbuffered clock tree vs. buffered/balanced clock tree]
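Skew is the spread of clock arrival times at the flip-flops, so it can be checked against the cycle time directly. A minimal sketch, with made-up arrival times and hypothetical function names:

```python
def clock_skew(arrival_ns):
    """Skew = latest minus earliest clock arrival over all flip-flops."""
    times = list(arrival_ns.values())
    return max(times) - min(times)

def skew_fraction(arrival_ns, period_ns):
    """Skew as a fraction of the clock period."""
    return clock_skew(arrival_ns) / period_ns

# Illustrative clock arrival times (ns) at three flip-flop clock pins.
arrivals = {"ff_a": 0.42, "ff_b": 0.47, "ff_c": 0.40}
skew = clock_skew(arrivals)
frac = skew_fraction(arrivals, period_ns=1.0)
print(f"skew = {skew:.2f} ns ({frac:.0%} of the cycle)")
```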
Clock Distribution: How?
Advanced clock tree synthesis methods
  Zero-skew clock tree synthesis
  Clock tree synthesis considering process variations
Clock Distribution
Clock Distribution
  H-tree: structure balancing. Skew is minimized by making the interconnections to the subunits equal in length.
  Fish-bone: clock tree generation based on structure and load balance.
  [Figure labels: tapping point, structure balance, structure and load balance]
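The equal-length property of an H-tree can be checked with a small recursive construction. This is a geometric sketch only (no buffers, no load model), and the function name and dimensions are assumptions for illustration.

```python
def h_tree_leaves(cx, cy, half, levels, length=0.0):
    """Return [((x, y), wire_length_from_root), ...] for all leaf tapping points.

    At each level the wire runs `half` horizontally, then `half` vertically,
    to reach the centres of four quadrants; `half` is halved per level.
    """
    if levels == 0:
        return [((cx, cy), length)]
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            leaves.extend(
                h_tree_leaves(cx + dx, cy + dy, half / 2, levels - 1,
                              length + 2 * half)
            )
    return leaves

leaves = h_tree_leaves(0.0, 0.0, half=4.0, levels=2)
lengths = {length for _, length in leaves}
print(len(leaves), "tapping points, wire lengths from root:", lengths)
```

Every leaf sees the same wire length, which is why the structure alone gives low skew when the loads are also equal.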
CTS considering process variations
  Process variations cause unpredictable delay variations in transistors and wires -> uncontrollable skew.
  Delay variations in the common part of the clock tree between the launch and capture flops do not cause skew.
  The goal is to minimize the non-common part of the clock tree between the launch and capture clock nodes.
  [Figure: launch flop -> combinational logic -> capture flop, clocked without and with on-chip variation awareness]
Clock Tree Synthesis in SoC Encounter
  Fish-bone routing style
Clock Tree Synthesis in SoC Encounter
  The color difference indicates the clock skew.
Routing Fundamentals
  The goal is to realize the metal/copper connections between the pins of standard cells and macros.
  Input: placed design, fixed number of metal/copper layers
  Goal: routed design that is DRC clean and meets setup/hold timing
  Consists of two phases:
    1. Global route: to estimate the routing congestion
    2. Detail route: to assign the nets to the routing tracks
  [Figure: standard cell pin, vertical routing tracks, horizontal routing tracks]
Interconnect Organization
  In 65 nm technology, up to 12 metal layers are available for routing.
  Higher metal layers: wider, less resistance
    Suitable for global wires and clock nets: less delay, less power consumption
    Power nets are always assigned to the top-level metal layer: less IR drop
Routing Issues for 90nm Technology and Beyond
  1. Timing-driven routing
  2. Signal integrity awareness
  3. DRC
  4. OPC
1- Timing-Driven Routing
  At 90nm, net delay becomes significant.
  The quality of the route can affect timing.
  Optimize critical paths:
    Route some nets first (net weights)
    Order of routing (priorities, e.g. default: clocks 50, others 2)
    Most routing freedom is available at the start
    Use the shortest paths possible
  If you have a congested design, you may need to set the timing-driven effort to low.
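The effect of net weights on routing order can be sketched as a sort by priority. The default values 50 and 2 are the ones quoted above; the function, the `overrides` argument and the net names are hypothetical and do not reflect how any particular router stores its priorities.

```python
# Default priorities quoted on the slide: clocks 50, all other nets 2.
CLOCK_WEIGHT = 50
DEFAULT_WEIGHT = 2

def routing_order(nets, overrides=None):
    """Return net names with the highest-priority nets first.

    nets:      dict net name -> True if the net is a clock
    overrides: optional dict net name -> user-assigned weight
    """
    overrides = overrides or {}

    def weight(name):
        if name in overrides:
            return overrides[name]
        return CLOCK_WEIGHT if nets[name] else DEFAULT_WEIGHT

    # sorted() is stable, so nets of equal weight keep their input order.
    return sorted(nets, key=weight, reverse=True)

nets = {"data0": False, "clk": True, "crit_path": False, "data1": False}
order = routing_order(nets, overrides={"crit_path": 10})
print(order)
```

Nets routed first get the most routing freedom, which is why critical nets are given higher weights.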
2- What is Signal Integrity or SI?
  Change in signal delay caused by crosstalk noise.
  Possible in 2 directions: push-out (delay) and pull-in (speed-up).
  [Figure: net 1 = aggressor, net 2 = victim]
What is SI?
  Glitch caused by crosstalk noise.
  Extra clock cycle! -> Functional failure
  [Figure: aggressor net, victim net, D flip-flop with Clk input, Vdd]
Crosstalk Prevention: Routing
  Routing solutions:
    Limit the length of parallel nets
    Wire spreading (skip track, clocks)
    Shield special nets
    Coupling-free routing
3- DRC (Design Rule Check)
  Design rules: guidelines about the geometry constraints for constructing process masks.
  Information such as:
    Routing layers: width, spacing, pitches
    General rules: a) enclosure, b) space, c) overlap, d) width, e) extension
    Specific rules: antenna rules, metal density rules, minimum area
  A compromise between performance and yield:
    More conservative rules increase the probability of correct circuit function (yield)
    More aggressive rules increase circuit performance (area, power, delay)
  [Figure: geometry examples labelled a to e]
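Two of the general rules, width and space, can be illustrated on axis-aligned rectangles of one layer. This is a toy sketch under stated assumptions: the rule values are invented, shapes are plain rectangles, and touching or overlapping shapes are treated as connected rather than as spacing violations. Real DRC runsets are far more elaborate.

```python
from itertools import combinations

def width(rect):
    """Smaller side of a rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return min(x2 - x1, y2 - y1)

def spacing(a, b):
    """Euclidean distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5

def check(rects, min_width, min_space):
    """Return a list of ("width", i) and ("space", i, j) violations."""
    violations = []
    for i, r in enumerate(rects):
        if width(r) < min_width:
            violations.append(("width", i))
    for (i, a), (j, b) in combinations(enumerate(rects), 2):
        if 0 < spacing(a, b) < min_space:
            violations.append(("space", i, j))
    return violations

# Three horizontal wires; rule values are illustrative, not from any PDK.
wires = [(0, 0, 10, 1), (0, 1.5, 10, 2.5), (0, 4, 10, 4.5)]
violations = check(wires, min_width=1.0, min_space=1.0)
print(violations)
```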
DRC Challenges
  [Figure: number of design rules in the DRC runsets for different technology processes (180, 130, 90, 65, 45 nm); y-axis: count of design rules in the runset, 0 to 800]
  Reasons:
    More metal layers
    Different spacing rules depending on width
    Recommended rules -> general rules
4- Optical Proximity Correction (OPC)
OPC-Aware Routing
  More OPC friendly
Mask Layout Data -> Physical Mask
  [Figure: basic lithography system. Mask layout data -> Resolution Enhancement Techniques -> fracture -> mask writer (ALTA 4700 mask writer) -> physical masks. Source: Schellenberg/IEEE Spectrum]
RC Extraction
  RC extraction is the calculation of all the routed net capacitances and resistances.
  Used for:
    Delay calculation
    Static Timing Analysis
    Circuit simulation
    Signal integrity analysis
Electromigration
  Electromigration is the movement of the lattice ions of the interconnect material as a result of momentum transfer from electrons.
  High current density or irregularly shaped interconnects may cause electromigration to occur within a short time.
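Since current density is the driving quantity, a first-order check divides the wire current by its cross-section. The limit `J_MAX` below is a placeholder and not a value from any technology; real limits depend on the metal, temperature and lifetime target.

```python
def current_density(current_ma, width_um, thickness_um):
    """Current density in mA/um^2 for a rectangular wire cross-section."""
    return current_ma / (width_um * thickness_um)

def min_width_um(current_ma, thickness_um, j_max):
    """Smallest wire width that keeps the density at or below j_max."""
    return current_ma / (j_max * thickness_um)

J_MAX = 5.0  # hypothetical limit in mA/um^2, for illustration only

j = current_density(current_ma=1.0, width_um=0.5, thickness_um=0.2)
w_min = min_width_um(current_ma=1.0, thickness_um=0.2, j_max=J_MAX)
print(f"J = {j:.1f} mA/um^2, limit {J_MAX}; widen to at least {w_min:.2f} um")
```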
Power Analysis
  Power analysis:
    Reduces the risk of IR voltage drops in the power nets
    Reduces electromigration effects due to high current density
  Steps:
    The resistance of the power and ground nets is extracted
    The average current of each transistor connected to the power net is calculated
    The average currents are distributed throughout the power net
    Node voltages and branch currents are calculated
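The last two steps can be illustrated on the simplest possible power net: one rail fed from one end, with equal segment resistances and an average current drawn at each node. This one-dimensional model and all values are assumptions for illustration; a real analysis solves the full extracted resistive grid.

```python
def rail_voltages(vdd, seg_res_ohm, tap_currents_a):
    """Node voltages along a rail fed from one end.

    Each segment carries the sum of all currents drawn downstream of it,
    so the drop accumulates from the feed point towards the far end.
    """
    voltages = []
    v = vdd
    downstream = sum(tap_currents_a)   # branch current in the first segment
    for tap in tap_currents_a:
        v -= seg_res_ohm * downstream  # IR drop across this segment
        voltages.append(v)
        downstream -= tap              # this node's current has left the rail
    return voltages

# Illustrative: 1.2 V supply, 0.1 ohm per segment, four nodes drawing 10 mA each.
volts = rail_voltages(1.2, 0.1, [0.01, 0.01, 0.01, 0.01])
worst_drop = 1.2 - min(volts)
print([round(v, 4) for v in volts], f"worst IR drop = {worst_drop * 1e3:.1f} mV")
```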
Power Analysis
  Average power is sufficient when the "time constants" of the effects are large:
    Battery life
    Thermal analysis
  A VCD is needed to simulate instantaneous power (current):
    Necessary for estimation of Simultaneous Switching Noise (SSN)
    Important for power grid signal integrity analysis
Energy Calculation
  1. Save your design as a Verilog netlist
  2. Simulate the design in NCSim/ModelSim
  3. Create a VCD file
  4. Restore the design in Encounter
  5. Read the VCD activity file
  6. Report power to get the average power
  Energy = P_ave * T_period * No. of cycles
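The energy formula is a plain product once the average power is read from the power report. The numbers below are invented for the example (5 mW, 10 ns clock period, 10,000 cycles) and are not results from the course design.

```python
def energy_joules(p_avg_w, t_period_s, n_cycles):
    """Energy = P_ave * T_period * number of cycles."""
    return p_avg_w * t_period_s * n_cycles

# Hypothetical values: 5 mW average power, 10 ns period, 10,000 simulated cycles.
e = energy_joules(p_avg_w=5e-3, t_period_s=10e-9, n_cycles=10_000)
print(f"energy = {e * 1e9:.1f} nJ")
```

The cycle count should match the simulated window that produced the VCD file, otherwise the average power and the duration do not belong together.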
VCD File Script
  # Advance to time 0 before opening the database
  run -timepoint 0 ns -absolute
  # Open a VCD database and make it the default
  database -open /media/disk-1/mdpu/add2/mac_0.vcd -vcd -default -timescale us
  # Probe all signals at all hierarchy depths below :mtile
  probe -create :mtile -vcd -all -depth all
  run -timepoint 100 us -absolute
  database -close /media/disk-1/mdpu/add2/mac_0.vcd
Power Calculation in Encounter
  restoredesign /home/ali/physicaldesign/icad2/mdpu/Tile_final.enc.dat Tile
  extractrc -outfile file.cap
  read_activity_file -format VCD -vcd_scope Tile_tb/mTile /media/disk-1/mdpu/add2/mac_0.vcd
  reportpower -norailanalysis -outfile /media/disk-1/mdpu/add2/reports/powenc_0.rep
  exit
Verification
  The complete placed and routed design is verified before fabrication.
  Functional verification
    Verification is performed against the behavioral RTL and the pre-layout and post-layout structural descriptions (netlists) for design validation.
  Rule based
  Assertion-based verification
    Assertion macros are expressions that, if false, indicate an error.
    Instantiated in the RTL code.
LVS (Layout vs. Schematic)
  Top-level labels are needed for VDD, VSS, inputs and outputs.
  Steps:
    Extract the designed devices (nmos, pmos, n-well tap, ...)
    Extract the connectivity between them
    Build a netlist
    Compare both netlists
  [Figure: layout with labels vdd, vss, IN and OUT]
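The final comparison step can be sketched for the simple case where both netlists use the same net names, which is what the top-level labels provide for the ports. This is a deliberately simplified illustration: real LVS tools match the two netlists as graphs and do not rely on internal net names. The inverter data and function names are made up.

```python
from collections import Counter

def canonical(netlist):
    """Multiset of devices, each as (type, sorted terminal-to-net pairs)."""
    return Counter(
        (dev_type, tuple(sorted(pins.items()))) for dev_type, pins in netlist
    )

def lvs_match(layout, schematic):
    """True if both netlists contain the same devices with the same connections."""
    return canonical(layout) == canonical(schematic)

# Schematic netlist of an inverter (terminals: gate, drain, source, bulk).
schematic = [
    ("pmos", {"g": "IN", "d": "OUT", "s": "vdd", "b": "vdd"}),
    ("nmos", {"g": "IN", "d": "OUT", "s": "vss", "b": "vss"}),
]
# Extracted layout netlists: same devices in another order, and a miswired one.
layout_ok = list(reversed(schematic))
layout_bad = [
    ("pmos", {"g": "IN", "d": "OUT", "s": "vdd", "b": "vdd"}),
    ("nmos", {"g": "IN", "d": "OUT", "s": "OUT", "b": "vss"}),
]

print(lvs_match(layout_ok, schematic), lvs_match(layout_bad, schematic))
```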