Design for Power User Experience
David Hui, AMD Fellow



Power optimization is holistic
Architecture ASIC Technology CAD Physical Design IP

Design for power challenges
- Low power architectures
- Advanced and efficient power management architecture
- Performance and power scalable architectures enable efficient cross-product IPs
- System-level power profiling and optimization
- Power aware design flow
- Design closure flow in all three phases (Architecture, Design, and Implementation) of the product development cycle
- Power, performance, and area
- Hierarchical low power SOC design
- Predictable power metrics to allow power closure
- Power efficient silicon process and technologies
- Low power interconnect to reduce capacitance
- Leakage power reduction technologies
- IPs and monitor technology enable aggressive DVFS
- Accurate and low overhead on-die voltage and temperature sensors
- Power efficient clock distribution
- Low thermal resistance packaging
- Power efficient ASIC IPs
- Silicon characterization
- Performance and power balanced process technologies
- STA analysis which correlates better to silicon

DFP Strategies
- Power prediction and budgeting
  - Spreadsheet
- RTL power optimization
  - Sequence PowerTheatre
  - Analysis and optimization techniques using relative power
  - Designers update RTL
- Synthesis with fine-grain clock gating option
- Leakage power reduction
  - Power island with CPF adaptation
  - Mixed VT
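The budgeting and relative-power items above can be illustrated with a small sketch. It is not the spreadsheet or tool flow used in the talk; the block names and milliwatt figures are hypothetical placeholders, and the only idea shown is comparing per-block estimates against a budget and ranking blocks by their share of the total.

```python
from dataclasses import dataclass


@dataclass
class BlockPower:
    name: str           # hypothetical block name
    budget_mw: float    # budget assigned at planning time (hypothetical)
    estimate_mw: float  # latest estimate, e.g. from RTL power analysis (hypothetical)


def relative_power(blocks):
    """Return each block's share of the total estimated power."""
    total = sum(b.estimate_mw for b in blocks)
    return {b.name: b.estimate_mw / total for b in blocks}


def over_budget(blocks):
    """Return the names of blocks whose estimate exceeds their budget."""
    return [b.name for b in blocks if b.estimate_mw > b.budget_mw]


blocks = [
    BlockPower("shader", 500.0, 550.0),
    BlockPower("memctl", 200.0, 150.0),
    BlockPower("display", 100.0, 100.0),
]

print(over_budget(blocks))
print(relative_power(blocks))
```

Working with shares rather than absolute numbers keeps the comparison useful early on, when absolute estimates are still poorly calibrated.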

CPF adaptation
- Power Forward Initiative proof point project with GPU
- Collaboration with Cadence
- Develop front-to-back flow and insert CPF wherever possible to replace internal flow
- Wireless uses CPF for simulation
- Both GPU and wireless chips use Conformal LP

CPF Challenges and expectations
- Many years invested in current P&R flow
- Mixed tool flow
- Can't rely on entire tool chain to be power intent aware
- Power gating solution must be area efficient
- Depending on product, minimal performance impact, power efficient rather than low power
- No impact to physical design schedule
- Critical product: low-risk implementation
- Our strategy was to rely extensively on Conformal LP to validate the correctness of the power gating solution
- How to prove correctness?

Key power island technology choices
- Need a single power specification for chip
- Common Power Format (CPF)
- Easily understood by everyone impacted by power
- Automatic checking of design against spec
- Simulation of power functionality in frontend (logical)
- Checking at every major stage of design flow
- Conformal Low Power + CPF + Netlist
- In addition to normal LVS and Logical Equiv Check
- PD place & fix all special power cells in block floorplan phase
- Then allow normal tool flow to run & complete
- IR and in-rush current analysis

CPF flow
- Single CPF is constructed for the entire chip
- Concise description of power strategy
  - Which blocks are ON/OFF, which are AON
  - Isolation rules and the cells to implement them
  - Power switch type (either header or footer) and switched power definition
- For complex GPU the CPF is about 200 lines
  - Easily understood & reviewed
  - Much simpler than timing constraint/exception files
- Push down block CPF using FE
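To make the contents of such a power specification concrete, here is a minimal Python model of the three things the slide lists: which domains are switchable versus always-on, which isolation rules exist, and a check against them. This is an illustrative data model, not CPF syntax and not how any tool parses CPF; the domain names and the rule that every output of a switchable domain needs isolation are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Domain:
    name: str
    switchable: bool  # True for an ON/OFF domain, False for always-on (AON)


@dataclass
class PowerIntent:
    domains: dict = field(default_factory=dict)
    isolation_rules: set = field(default_factory=set)  # (from_domain, to_domain) pairs

    def add_domain(self, name, switchable):
        self.domains[name] = Domain(name, switchable)

    def add_isolation(self, src, dst):
        self.isolation_rules.add((src, dst))


def missing_isolation(intent, crossings):
    """Return crossings driven from a switchable domain that have no isolation rule.

    Simplifying assumption: any signal leaving a switchable domain needs isolation.
    """
    missing = []
    for src, dst in crossings:
        if src == dst:
            continue
        if intent.domains[src].switchable and (src, dst) not in intent.isolation_rules:
            missing.append((src, dst))
    return missing


# Hypothetical domains and crossings, for illustration only.
intent = PowerIntent()
intent.add_domain("core", switchable=True)
intent.add_domain("video", switchable=True)
intent.add_domain("aon", switchable=False)
intent.add_isolation("core", "aon")

crossings = [("core", "aon"), ("aon", "core"), ("video", "aon")]
print(missing_isolation(intent, crossings))
```

Keeping the intent in one small structure is what makes the "automatic checking of design against spec" idea tractable: every later check reads the same source of truth.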

Common CPF commands used in a single design
- define_isolation_cell
- define_always_on_cell
- define_power_switch_cell
- define_library
- create_isolation_rule
- update_isolation_rules
- create_power_mode
- create_global_connection
- create_power_nets
- create_nominal_condition
- update_nominal_condition
- create_power_domain
- update_power_domain
- create_power_switch_rule
- update_power_switch_rule
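As a rough illustration of how one might audit which of these commands a CPF file actually uses, the sketch below tallies the first token of each line. It assumes one command per line with no line continuations, which real CPF files need not obey, and the sample text elides all arguments on purpose rather than guessing at options. `some_other_command` is a made-up name standing in for anything not on the slide.

```python
from collections import Counter

# Command names taken from the slide above.
SLIDE_COMMANDS = {
    "define_isolation_cell", "define_always_on_cell", "define_power_switch_cell",
    "define_library", "create_isolation_rule", "update_isolation_rules",
    "create_power_mode", "create_global_connection", "create_power_nets",
    "create_nominal_condition", "update_nominal_condition", "create_power_domain",
    "update_power_domain", "create_power_switch_rule", "update_power_switch_rule",
}


def tally_commands(cpf_text):
    """Count slide-listed commands and collect any other leading tokens.

    Assumes one command per line and '#' comments; continuations are not handled.
    """
    counts = Counter()
    not_on_slide = set()
    for raw in cpf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        command = line.split()[0]
        if command in SLIDE_COMMANDS:
            counts[command] += 1
        else:
            not_on_slide.add(command)
    return counts, not_on_slide


# Hypothetical input; arguments are deliberately elided.
sample = """
# power domains
create_power_domain ...
create_power_domain ...
create_isolation_rule ...
some_other_command ...
"""

counts, not_on_slide = tally_commands(sample)
print(dict(counts))
print(sorted(not_on_slide))
```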

Checkerboard power switches (highlighted) with taps and AON cells

CPF power island experience (1 of 2)
- CPF file
- Full chip CPF is about 200 lines and relatively easy to create.
- Ideally it should be created by frontend engineers, but PD engineers need to be involved because of their knowledge of PD libraries, PD-only cells and Encounter-only directives
- Block CPF push-down using Encounter works relatively well, but the block CPF size is about 10K lines with a lot of name mapping. Some ease of use/readability is lost
- Physical netlist also alters original chip CPF
- Use Conformal LP as golden CPF parser
- Need to be careful with global and internal (after PG) signals
- Need a way to check power sequencing and time reference for control signals
- Our strategy was to rely extensively on Conformal LP to validate the correctness of the power gating solution
- Since the GPU uses a 3rd-party simulator, power gating cannot be simulated. Power-on sequence rules and signal polarity cannot be checked
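The sequencing gap mentioned above can be pictured with a small checker over a control-signal trace. This is only a sketch of the general power-gating convention (isolate before removing power, restore power before releasing isolation); the signal names `iso_en` and `pwr_en`, their active-high polarity, and the trace format are assumptions, and settling time after power-up is not modeled.

```python
def check_sequence(events):
    """Check a trace of (time, signal, value) events for ordering violations.

    Assumed signals (hypothetical names, active high):
      iso_en = 1 means outputs are isolated
      pwr_en = 1 means the power switch is on
    Initial state: powered on, not isolated.
    """
    iso = 0
    pwr = 1
    violations = []
    for time, signal, value in sorted(events, key=lambda e: e[0]):
        if signal == "pwr_en":
            if value == 0 and iso == 0:
                violations.append(f"t={time}: power removed without isolation")
            pwr = value
        elif signal == "iso_en":
            if value == 0 and pwr == 0:
                violations.append(f"t={time}: isolation released while power is off")
            iso = value
        else:
            raise ValueError(f"unknown signal: {signal}")
    return violations


good_trace = [(10, "iso_en", 1), (20, "pwr_en", 0), (50, "pwr_en", 1), (60, "iso_en", 0)]
bad_trace = [(10, "pwr_en", 0), (20, "iso_en", 1), (50, "iso_en", 0), (60, "pwr_en", 1)]

print(check_sequence(good_trace))
print(check_sequence(bad_trace))
```

A static rule checker cannot see this kind of ordering, which is why the slide calls out the lack of power-aware simulation as a gap.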

CPF power island experience (2 of 2)
- Placement of power island special cells and power switches needs a lot of internal scripting, and the optimization step can potentially break timing.
- Tools have issues with mixed AON logic in power-gated islands
- Tools have some problems with footer power switches
- Hard to assess real area impact
- Difficult to do ECO
- Critical to run rule check at every step when the netlist is changed and at major flow stages (placement, CTS, IPO, etc.)
- We did tape out on schedule and silicon is working as intended
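The advice to rerun the rule check after every netlist-changing stage amounts to a simple gate in the flow driver. The sketch below uses toy stand-ins throughout: the netlist is a set of cell names, the stages are lambdas, and the rule check only verifies that required isolation cells are still present. None of this reflects the actual scripts or tool invocations used in the project.

```python
def run_flow(netlist, stages, rule_check):
    """Apply each stage in order and run rule_check on its output.

    Returns (failing_stage_name, errors), or (None, []) if every stage passes.
    """
    for name, stage in stages:
        netlist = stage(netlist)
        errors = rule_check(netlist)
        if errors:
            return name, errors
    return None, []


# Toy data: cell names are hypothetical.
REQUIRED_ISO_CELLS = {"iso_a", "iso_b"}


def rule_check(netlist):
    """Report required isolation cells that are missing from the netlist."""
    return sorted(REQUIRED_ISO_CELLS - netlist)


stages = [
    ("placement", lambda n: n),               # no netlist change
    ("cts", lambda n: n | {"ckbuf_1"}),       # adds a clock buffer
    ("ipo", lambda n: n - {"iso_b"}),         # optimization wrongly removes a cell
]

start = frozenset({"iso_a", "iso_b", "u1"})
print(run_flow(start, stages, rule_check))
```

Checking after each stage localizes the failure to the step that introduced it, instead of discovering it at the end of the flow.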

CLP Flows
- Block level flow
  - Used extensively throughout APR flow
  - Fast (45 minutes), good diagnostics
  - 1 false error due to top level clock pushdown/incomplete library spec
- Chip block flow
  - Implemented but not used
- Full chip flow
  - 24+ hrs in full flat mode, 4 hrs with specially created power ILMs
  - Need additional directives to handle IO power, e.g. non-standard power rails

Future power challenges
- Process variation continues to increase in deep submicron
- Mode-based IR drop analysis
- Need optimization techniques to reduce gate leakage
- HVT cells are not as effective (performance impact vs. leakage reduction)
- Voltage scaling limited
- More complex power island requirements

A wish list
- Power closure
  - A consistent power analysis from ESL -> RTL -> PD -> silicon
- Power models
  - PVT aware
  - Sigma-based for yield prediction
  - Power mode aware IP
- Need statistical methods for stimulus generation
  - Clock control aware
  - Spatial and temporal
- CPF
  - Translator between CPF/UPF
  - A CPF linter
  - A bi-directional GUI
  - An API interface with other tools
  - More consistent tool interpretation of CPF
  - Better integration of tools that use CPF
  - Ability to code in technology rules, e.g. level shifter requirement based on voltage differential between voltage domains, special rules for switch type, etc.
  - Support bottom-up CPF
  - Hierarchical CPF methodology
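The technology-rule wish (a level shifter requirement derived from the voltage differential between domains) could look something like the following. This is a sketch of the idea only: the 0.1 V threshold, the domain names and the voltages are hypothetical, and real rules depend on the library and often differ by crossing direction.

```python
def level_shifter_requirements(voltages, crossings, max_delta=0.1):
    """Return (src, dst, direction) for crossings that exceed max_delta volts.

    voltages:  mapping of domain name to nominal voltage
    crossings: iterable of (source_domain, destination_domain)
    max_delta: hypothetical threshold above which a level shifter is required
    """
    required = []
    for src, dst in crossings:
        delta = voltages[dst] - voltages[src]
        if abs(delta) > max_delta:
            direction = "low_to_high" if delta > 0 else "high_to_low"
            required.append((src, dst, direction))
    return required


# Hypothetical domains and nominal voltages.
voltages = {"core": 0.9, "io": 1.8, "aon": 0.9}
crossings = [("core", "io"), ("io", "core"), ("core", "aon")]

print(level_shifter_requirements(voltages, crossings))
```

Encoding the rule as data plus a threshold, rather than listing every crossing by hand, is what would let the same specification be reused when domain voltages change.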