«A 32-bit DSP Ultra Low Power accelerator»
E. Beigné (edith.beigne@cea.fr)
CEA LETI MINATEC, Grenoble, France
www.cea.fr

Outline
- Low power SoC challenges: Energy Efficiency
- Fine-Grain AVFS solutions
- FDSOI technology brings a new actuator
- Fine-Grain AVFS with Body Biasing
- Silicon results in FDSOI 28nm
- Future IoT needs and Conclusions
CEA. All rights reserved Diego Puschini ICDV 2014

Low Power SoC challenges
- Application: applicative flexibility, unpredictable load
- Technology: process variability, aging
- Architecture: multi-core architectures, massively parallel architectures
CEA. All rights reserved 2015 COOL Chips E. Beigné

Energy Efficiency
- Power consumption management follows two objectives today:
  - Reduce the overall power, or
  - Increase the global energy efficiency
- Performance no longer means only high frequency, but also low power
- The high-performance / low-power trade-off is difficult to reach
- Goal: minimum power consumed at a given frequency

Dynamic Voltage Frequency Scaling
- DVFS for power-performance trade-off
- [Block diagram: a power domain containing the core, supplied by a V regulator and an F generator under a controller (Ctrl.)]
- [Plots: power vs. frequency; frequency vs. voltage]
- VDD and F are adjusted to ensure performance at minimum consumption

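To make the DVFS trade-off concrete, here is a minimal sketch based on the textbook CMOS dynamic power model P = C·V²·F (leakage ignored). The effective capacitance and the operating points are hypothetical values chosen for illustration; they are not measurements from this chip.

```python
def dynamic_power(c_eff_farad, vdd_volt, freq_hz):
    """Textbook dynamic power model: P = C_eff * VDD^2 * F (leakage ignored)."""
    return c_eff_farad * vdd_volt ** 2 * freq_hz


C_EFF = 1e-10  # hypothetical effective switched capacitance (100 pF)

# Full speed at nominal voltage.
full_speed_w = dynamic_power(C_EFF, 1.0, 1e9)
# Frequency scaling only: half the frequency, same voltage.
half_freq_w = dynamic_power(C_EFF, 1.0, 0.5e9)
# DVFS: half the frequency AND a lower voltage
# (assumes 0.7 V is enough to sustain 500 MHz, which is hypothetical).
dvfs_w = dynamic_power(C_EFF, 0.7, 0.5e9)

print(f"full speed: {full_speed_w * 1e3:.1f} mW")
print(f"F only:     {half_freq_w * 1e3:.1f} mW")
print(f"DVFS:       {dvfs_w * 1e3:.1f} mW")
```

In this model, halving F alone halves the power, while lowering VDD together with F saves more because of the quadratic voltage term.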
Adaptive Voltage Frequency Scaling
- AVFS approach
- [Block diagram: power domain with core, V regulator, F generator and controller (Ctrl.); PVT monitors feed a data fusion block that drives the adjustment]
- [Plot: frequency vs. voltage, showing Fmax]
- F is adjusted to reach Fmax for the given conditions (VDD, T)

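The monitor, data fusion and adjustment chain can be sketched as follows. This is a behavioral illustration only: the conservative "minimum of all monitors" fusion rule, the safety margin, and the frequency generator step are assumptions, not the policy used on the chip.

```python
def fuse_fmax_estimates(estimates_hz):
    """Conservative data fusion (assumed rule): trust the most pessimistic monitor."""
    if not estimates_hz:
        raise ValueError("at least one monitor estimate is required")
    return min(estimates_hz)


def adjust_frequency(estimates_hz, margin=0.05, f_step_hz=10e6):
    """Return the clock frequency to program into the F generator.

    margin    -- hypothetical relative safety margin below the fused Fmax
    f_step_hz -- hypothetical resolution of the frequency generator
    """
    fmax_hz = fuse_fmax_estimates(estimates_hz)
    target_hz = fmax_hz * (1.0 - margin)
    # Round down to the nearest frequency the generator can produce.
    return int(target_hz // f_step_hz) * f_step_hz


# Three hypothetical monitor readings for the current (VDD, T) conditions.
new_freq_hz = adjust_frequency([1.02e9, 0.98e9, 1.00e9])
print(f"{new_freq_hz / 1e6:.0f} MHz")
```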
Body Biasing and Multi-VT in UTBB FD-SOI
- RVT (bulk-like well): NMOS over p-well, PMOS over n-well
  - VBPMOS: -300 mV to 3 V
  - VBNMOS: -3 V to 300 mV
- LVT (flip-well): NMOS over n-well, PMOS over p-well
  - VBPMOS: -3 V to 300 mV
  - VBNMOS: -300 mV to 3 V
[Flatresse et al., ISSCC 2013]

AVFS with Adaptive Body Biasing
- ABB in FD-SOI
  - Ultra Wide Voltage Range (UWVR)
  - Body bias increases efficiency
- [Block diagram: power domain with core, V regulator, F generator, BB generator and controller (Ctrl.); data fusion drives the adjustment]
- [Plot: frequency and efficiency vs. voltage over the UWVR, 0 to 1.3 V]
- VBB should be chosen to minimize consumption
Source: http://www.st.com/web/en/about_st/learn_fd-soi.html

Fine grain AVFS with ABB
- Granularity in complex SoC
  - Fine grain requires per-core actuators
  - Low area/complexity enables fine-grain AVFS with ABB
- Concessions on performance
  - Speed
  - Discretization
  - Switching cost
- [Block diagram: several power domains, each with its own V regulator, F generator, BB generator, controller (Ctrl.), data fusion and adjustment]
- [Plot: efficiency and frequency vs. voltage; example of VDD discretization]
- Actuator constraints must be considered in the management policy

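One way to picture the cost of VDD discretization: a low-area regulator offers only a few voltage levels, so a requested voltage has to be rounded up to the next available one. The sketch below uses hypothetical voltage levels and estimates the dynamic power penalty with the usual V² scaling at fixed frequency; it is an illustration, not the regulator used in this design.

```python
import bisect


def quantize_vdd_up(v_required, levels):
    """Return the lowest available level that is >= v_required."""
    ordered = sorted(levels)
    index = bisect.bisect_left(ordered, v_required)
    if index == len(ordered):
        raise ValueError("required voltage exceeds the highest available level")
    return ordered[index]


def discretization_overhead(v_required, levels):
    """Relative dynamic power overhead at fixed frequency, assuming P ~ V^2."""
    v_applied = quantize_vdd_up(v_required, levels)
    return (v_applied / v_required) ** 2 - 1.0


LEVELS = [0.6, 0.8, 1.0, 1.2]  # hypothetical discrete VDD levels (volts)

print(quantize_vdd_up(0.72, LEVELS))
print(f"{discretization_overhead(0.72, LEVELS) * 100:.1f} % extra dynamic power")
```

A finer set of levels reduces this overhead but increases regulator area and complexity, which is the trade-off the slide points at.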
Summarizing AVFS with ABB in FD-SOI
- VDD generator: low area for fine grain
- F generator: low area for fine grain
- VBB generator: to improve efficiency
- Fmax monitor: for adaptive approach
- T monitor: for adaptive approach
- [Block diagram: power domain with core, V regulator, F generator, BB generator, controller (Ctrl.), data fusion and adjustment]

Something new for AVFS power management
- Dynamic power management today
  - FD-SOI context (UWVR, increased efficiency)
  - New opportunity: extended body bias range
  - Source: http://www.st.com/web/en/about_st/learn_fd-soi.html
- 3 on-chip constrained actuators (V regulator, F generator, BB generator)
- Known target performance (Ftarget)
- n power modes (PM) available, each described by F, VDD, VBB and P
- Which configuration minimizes power, taking actuator constraints into account?

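The question on this slide can be read as a small constrained search: among the available power modes, keep those that reach Ftarget and respect the actuator limits, then take the one with the lowest power. The sketch below is an assumed formulation with made-up power modes; the `PowerMode` type, the `vbb_max` limit, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerMode:
    freq_hz: float   # F
    vdd: float       # VDD (volts)
    vbb: float       # VBB (volts)
    power_w: float   # P


def select_power_mode(modes, f_target_hz, vbb_max=2.0):
    """Lowest-power mode that meets f_target_hz within the body bias limit."""
    feasible = [
        m for m in modes
        if m.freq_hz >= f_target_hz and abs(m.vbb) <= vbb_max
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.power_w)


# Hypothetical power mode table (not measured data).
MODES = [
    PowerMode(0.5e9, 0.6, 0.0, 0.040),
    PowerMode(1.0e9, 0.8, 0.0, 0.150),
    PowerMode(1.0e9, 0.6, 1.5, 0.110),
    PowerMode(1.7e9, 1.0, 0.0, 0.400),
    PowerMode(1.7e9, 0.9, 2.0, 0.320),
]

best = select_power_mode(MODES, f_target_hz=0.9e9)
print(best)
```

With the body bias range restricted (for example `vbb_max=1.0`), the same target is served by the 0.8 V mode instead, which shows how actuator constraints change the answer.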
DSP architecture
- 32-bit VLIW DSP core, one V/F domain
- [Block diagram labels:]
  - Control, Address Gen., Shift Reg., Mux
  - Prog SRAM, Data SRAM
  - Register array: 64x32b, 2 read ports / 1 write port
  - 32b multiplier, bit extension, mux, 40b adder, MAC, bit truncation
  - CORDICs + divider, Comp/Select
  - Fmax tracking: CODA, TMFLT-R, TMFLT-S sensors
  - Serial interface, input/output RAM 2x1Kx32
  - PLL, Variability Power Control (CVP)

Chip Micrograph

Technology      STMicro UTBB FDSOI 28 nm
Transistors     Flip-Well (LVT), L = 24 nm
Core area       1 mm²
DSP benchmark   FFT 1024
VDD range       0.397 V to 1.3 V
VBB range       0 V / ±2 V

References:
- R. Wilson et al., "A 460MHz at 397mV, 2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking," International Solid-State Circuits Conference (ISSCC), pp. 452-453, February 2014
- E. Beigné et al., JSSC 2015

Power distribution: Body Biasing
- Power distribution includes all biasing voltages
- Thin VBB power stripes
- Body biasing supplied through specific IOs (±2 V)

Optimized library sub-set
- Use of available standard cells optimized for low voltage:
  - Extend the subset to the wide voltage range
  - Gate length modulation done through poly biasing
  - Keep the power-delay optimal cells in UWVR
- Cells are characterized over the [0.275 V, 1.4 V] VDD range and the [0 V, ±2 V] VBB range

Optimized Pulsed-latch FF
- Transmission-Gate Pulsed Latches (TGPL)
  - A latch and a pulse generator
  - Same behavior as a conventional master-slave FF
  - Proved to be the most energy-efficient in a large portion of the design space

Flip-flop      D-to-Q (ps)      Energy/cycle (fJ)   Area (µm²)
ST C2MOS       117.5            6                   4.41
ST SA          46.5 (-60 %)     12.5 (+208 %)       6.85 (+55 %)
TGPLMuxScan    30.5 (-74 %)     7.2 (+20 %)         4.73 (+7 %)
TGPLMuxClk     26 (-78 %)       9.1 (+56 %)         5.06 (+14 %)

Timing margins reduction in UWVR: Fmax tracking
- Design in the typical corner instead of a worst-case approach
- Need to compensate for PVT variations in UWVR
  - Reduce energy per operation or increase clock frequency
  - Track the optimal frequency/voltage/biasing point
- The real maximum frequency is not available
- Functional in the UWVR, with low area and power overhead
- Two complementary solutions proposed on-chip
  - Replica-path based: CODA (ClOning DAta paths)
  - Timing-slack-sensor based: TMFLT (TiMing FauLT)

Fmax tracking (solution 1): Path Cloning
- 16 paths cloned at 1 V signoff
- Correlation between Fmax and frequency predictions
  - 5 slots into 21 dies @ 1 V, 25 °C (+4/-3 % accuracy)
  - 1 slot into 21 dies @ [0.8 V, 1.3 V], 25 °C (6 % accuracy)
- Not intrusive, but needs a complementary solution in UWVR

Fmax tracking (solution 2): Slack Monitoring
- 128 TMFLT-Sensors
  - Analyze the registers' slack time
  - 300 ps detection window
- 1 TMFLT-Ring
  - Programmable replica path
  - Time-to-digital measurement
- [Schematic: TMFLT sensor placed on a register-to-register path (reg1, reg2) and the RAM; the sensor raises a Warning when the monitored signal (SIG) switches inside the detection window before the clock edge; programmable delay (SEL 0 to 31), clock gating (CG), EN, CLK, Reset_n]

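The idea of a slack sensor with a detection window can be illustrated with a simple behavioral model: the remaining slack of a path is the clock period minus the data arrival time, and a warning is raised when that slack falls inside the window. Only the 300 ps window comes from the slide; the function names, the three-way classification, and the example timings are assumptions, and this is not a model of the actual TMFLT circuit.

```python
def classify_slack(clock_period_ps, data_arrival_ps, window_ps=300):
    """Classify one monitored path (behavioral model, integer picoseconds).

    "ok"      -- data settles before the detection window opens
    "warning" -- data switches inside the window: slack is getting small
    "fault"   -- data arrives after the capturing clock edge
    """
    slack_ps = clock_period_ps - data_arrival_ps
    if slack_ps < 0:
        return "fault"
    if slack_ps < window_ps:
        return "warning"
    return "ok"


def any_warning(clock_period_ps, arrivals_ps, window_ps=300):
    """True if at least one monitored path is no longer 'ok'."""
    return any(
        classify_slack(clock_period_ps, arrival, window_ps) != "ok"
        for arrival in arrivals_ps
    )


# Hypothetical arrival times of three monitored paths, 1000 ps clock period.
print(classify_slack(1000, 600))             # slack 400 ps
print(classify_slack(1000, 800))             # slack 200 ps
print(any_warning(1000, [500, 650, 800]))
```

A controller could raise the frequency (or lower VDD) until the first warning appears, which is one plausible way to use such sensors for Fmax tracking.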
Fmax tracking (solution 2): Slack Monitoring
- [Plot: frequency estimation error (%) compared to Fmax vs. power supply (mV), 600 to 1300 mV, 21 dies]
- Error of ±4 % at 1 V
- Compared to a worst-case PVT corner (3σ) approach without Fmax tracking:
  - Energy gain of 40.6 % @ 0.6 V
  - Frequency gain of 24 % @ 1 V

DSP speed performance measurements
- [Plot: frequency (MHz) vs. VDD voltage (mV), 300 to 1300 mV, one curve per forward body bias: FBB 0 V, 0.5 V, 1.0 V, 1.5 V, 2.0 V]
- Annotated points: 460 MHz @ 397 mV, 1 GHz @ 570 mV, 2.6 GHz @ 1.3 V
- FBB up to +2 V and Fmax tracking: frequency up to 460 MHz at minimum voltage

DSP energy performance measurements
- [Plot: energy/cycle (pJ/cycle) vs. frequency (MHz), 300 to 2500 MHz, one curve per forward body bias: FBB 0 V, 0.5 V, 1.0 V, 1.5 V, 2.0 V]
- Annotated point: 62 pJ/cycle @ 460 MHz
- For a fixed voltage supply: lowest energy at 460 MHz
- Power consumption: 370 mW at 1 V

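Energy per cycle and power are linked by the clock frequency (P = E_cycle × F). As a quick worked example, the sketch below converts the 62 pJ/cycle at 460 MHz point from this slide into an average power, assuming the reported energy per cycle covers the whole core; the function name is illustrative.

```python
def average_power_w(energy_per_cycle_j, freq_hz):
    """Average power = energy per cycle * cycles per second."""
    return energy_per_cycle_j * freq_hz


# 62 pJ/cycle at 460 MHz, as annotated on the energy plot.
power_w = average_power_w(62e-12, 460e6)
print(f"{power_w * 1e3:.1f} mW")  # prints "28.5 mW"
```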
DSP energy performance measurements (continued)
- [Plot: energy/cycle (pJ/cycle) vs. frequency (MHz), 300 to 2500 MHz, one curve per forward body bias: FBB 0 V, 0.5 V, 1.0 V, 1.5 V, 2.0 V]
- Plot annotations: +59 % frequency, +14 % frequency, -20 % energy/cycle
- With FBB and Fmax tracking:
  - Frequency increased by up to 59 % (at a 100 pJ/cycle target)
  - Energy reduced by 20 % (at a 1.7 GHz target)

Comparison with State of the Art WVR
- [Plot: frequency (MHz, log scale from 1 to 3000) vs. supply voltage (V), 0 to 1.4 V]
- Plotted designs:
  - This work
  - INTEL, 22nm, ISSCC 2012
  - INTEL, 32nm, ISSCC 2012
  - MIT, TI, 28nm, ISSCC 2011
  - NXP, IMEC, 32nm, ISSCC 2011
  - CSEM (100 µW/MHz)
  - TI, CEVA, 40nm
  - TI, 40nm
  - TI (200 µW/MHz)

What about FDSOI for IoT applications?
- Energy harvesting will bring more challenges for supply voltage levels
- Autonomous wireless sensor nodes require even more versatility
  - Moving the operating range from very low power to medium speed
- Leakage is a big constraint
  - Possible use of reverse body biasing and poly biasing
- Analog and RF perform well
  - Towards fully integrated mixed-signal nodes

Conclusion
- Circuit designers are faced with conflicting requirements
  - Decrease the power dissipation AND maintain the performance
  - In a world of increasing parametric variations
- This can be met thanks to a combination of design techniques
  - Reduce design margins and adapt to the environment: Sense & React paradigm
  - Leverage the technology's possibilities: well engineering and back-biasing in FDSOI
- Take-home message
  - Develop technology that brings additional actuators to circuit designers; this will unleash their creativity
  - Think about low-leakage applications like IoT: low-voltage performance, performance adaptation