[EPC-43] How Software and Hardware Can Cooperate To Manage Power Consumption. September 17, 2008

Similar documents
ARM Cortex-A9 MPCore Multicore Processor Hierarchical Implementation with IC Compiler

1. Memory technology & Hierarchy

Hello, and welcome to this presentation of the STM32L4 reset and clock controller.

Comparing Power Saving Techniques for Multi cores ARM Platforms

Am186ER/Am188ER AMD Continues 16-bit Innovation

Overview of the Cortex-M3

Low Power AMD Athlon 64 and AMD Opteron Processors

The new 32-bit MSP432 MCU platform from Texas

Software Engagement with Sleeping CPUs

AN LPC1700 timer triggered memory to GPIO data transfer. Document information. LPC1700, GPIO, DMA, Timer0, Sleep Mode

8-Bit Flash Microcontroller for Smart Cards. AT89SCXXXXA Summary. Features. Description. Complete datasheet available under NDA

EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May ILP Execution

ECU State Manager Module Development and Design for Automotive Platform Software Based on AUTOSAR 4.0

An Introduction to the ARM 7 Architecture

Hello, and welcome to this presentation of the STM32 SDMMC controller module. It covers the main features of the controller which is used to connect

ARM Microprocessor and ARM-Based Microcontrollers

EVALUATING POWER MANAGEMENT CAPABILITIES OF LOW-POWER CLOUD PLATFORMS. Jens Smeds

Designing VM2 Application Boards

big.little Technology: The Future of Mobile Making very high performance available in a mobile envelope without sacrificing energy efficiency

ARM Webinar series. ARM Based SoC. Abey Thomas

CPU Performance. Lecture 8 CAP

Develop a Dallas 1-Wire Master Using the Z8F1680 Series of MCUs

Lesson-16: Real time clock DEVICES AND COMMUNICATION BUSES FOR DEVICES NETWORK

Software based Finite State Machine (FSM) with general purpose processors

ES_LPC4357/53/37/33. Errata sheet LPC4357/53/37/33. Document information

PowerPC Microprocessor Clock Modes

A Survey on ARM Cortex A Processors. Wei Wang Tanima Dey

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

COMPUTER SCIENCE AND ENGINEERING - Microprocessor Systems - Mitchell Aaron Thornton

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-12: ARM

Power Management in the Linux Kernel

How A V3 Appliance Employs Superior VDI Architecture to Reduce Latency and Increase Performance

Fastboot Techniques for x86 Architectures. Marcus Bortel Field Application Engineer QNX Software Systems

Computer Organization. and Instruction Execution. August 22

Green Embedded Computing and the MPC8536E PowerQUICC III Processor

361 Computer Architecture Lecture 14: Cache Memory

Microtronics technologies Mobile:

Selecting the Right MCU Can Squeeze Nanoamps out of Your Next Internet of Things Application

ADVANCED PROCESSOR ARCHITECTURES AND MEMORY ORGANISATION Lesson-17: Memory organisation, and types of memory

Advanced Computer Architecture-CS501. Computer Systems Design and Architecture 2.1, 2.2, 3.2

Chapter 2 Basic Structure of Computers. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Serial port interface for microcontroller embedded into integrated power meter

Microprocessor & Assembly Language

Bandwidth Calculations for SA-1100 Processor LCD Displays

System Design Issues in Embedded Processing

ARM Cortex STM series

Applying Use Cases to Microcontroller Code Development. Chris Gilbert Cypress Semiconductor

Exception and Interrupt Handling in ARM

Improving Energy Efficiency On Asymmetric Multiprocessing Systems

It should be corrected as indicated by the shading below.

Implementing a Digital Answering Machine with a High-Speed 8-Bit Microcontroller

ARM Ltd 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.

Digital Signal Controller Based Automatic Transfer Switch

White Paper. Real-time Capabilities for Linux SGI REACT Real-Time for Linux

Chapter 13. PIC Family Microcontroller

Price/performance Modern Memory Hierarchy

QLIKVIEW SERVER MEMORY MANAGEMENT AND CPU UTILIZATION

Performance Comparison of RTOS

Allows the user to protect against inadvertent write operations. Device select and address bytes are Acknowledged Data Bytes are not Acknowledged

A Brief Tutorial on Power Management in Computer Systems. David Chalupsky, Emily Qi, & Ilango Ganga Intel Corporation March 13, 2007

Developing Embedded Applications with ARM Cortex TM -M1 Processors in Actel IGLOO and Fusion FPGAs. White Paper

Backup Power Domain. AN Application Note. Introduction

Embedded Development Tools

8-bit Microcontroller. Application Note. AVR134: Real-Time Clock (RTC) using the Asynchronous Timer. Features. Theory of Operation.

Emerging IT and Energy Star PC Specification Version 4.0: Opportunities and Risks. ITI/EPA Energy Star Workshop June 21, 2005 Donna Sadowy, AMD

ZigBee Technology Overview

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

The Energy Harvesting Tipping Point for Wireless Sensor Applications

Spacecraft Computer Systems. Colonel John E. Keesee

NAND Flash FAQ. Eureka Technology. apn5_87. NAND Flash FAQ

Linux Process Scheduling Policy

Design Cycle for Microprocessors

Scheduling. Scheduling. Scheduling levels. Decision to switch the running process can take place under the following circumstances:

Module 2. Embedded Processors and Memory. Version 2 EE IIT, Kharagpur 1

Real-Time Clock. * Real-Time Computing, edited by Duncan A. Mellichamp, Van Nostrand Reinhold

An Ultra-low low energy asynchronous processor for Wireless Sensor Networks

Energy-Efficient, High-Performance Heterogeneous Core Design

2

Unit A451: Computer systems and programming. Section 2: Computing Hardware 1/5: Central Processing Unit

Mobile Devices - An Introduction to the Android Operating Environment. Design, Architecture, and Performance Implications

Designing a System-on-Chip (SoC) with an ARM Cortex -M Processor

Keil C51 Cross Compiler

CISC, RISC, and DSP Microprocessors

Homework # 2. Solutions. 4.1 What are the differences among sequential access, direct access, and random access?

PAC52XX Clock Control Firmware Design

USER GUIDE EDBG. Description

AN4646 Application note

big.little Technology Moves Towards Fully Heterogeneous Global Task Scheduling Improving Energy Efficiency and Performance in Mobile Devices

Freescale Semiconductor, Inc. Product Brief Integrated Portable System Processor DragonBall ΤΜ

EMBEDDED SYSTEM BASICS AND APPLICATION

Chapter 2: Computer-System Structures. Computer System Operation Storage Structure Storage Hierarchy Hardware Protection General System Architecture

White Paper Amy Chong Yew Ee Online Sales Account Manager APAC Online Sales Center Intel Corporation. BOM Cost Reduction by Removing S3 State

1. Computer System Structure and Components

Atmel Norway XMEGA Introduction

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Let s put together a Manual Processor

STM32L. Ultra-low-power Cortex -M3 devices

HANDLING SUSPEND MODE ON A USB MOUSE

AN962: Implementing Master-Slave Timing Redundancy in Wireless and Packet- Based Network Applications

EMC Business Continuity for Microsoft SQL Server Enabled by SQL DB Mirroring Celerra Unified Storage Platforms Using iscsi

Transcription:

[EPC-43] How Software and Hardware Can Cooperate To Manage Power Consumption John Carbone, Express Logic, Inc. Bob Boys, ARM UK, Ltd. September 17, 2008

Outline Power Consumption In Today s Embedded Systems CPU, memory, other logic all consume power How processor clock speed relates to power Hardware Technologies for Power Management Sleep, Stop, Standby Modes Clock speed adjustment Software Technologies for Power Management Small memory Efficient RTOS operations Clock management Conclusions

Power Consumption In Today s Embedded Systems Sources of Power Consumption All logic in general Particularly, memory and processor Effect of memory size on power More memory = more power consumption Effect of processor clock rate on power Mode MHz MIPS Power Ratio Savings Normal 5000 5000 1000mW 100.0% - Slow Clock 1000 1000 210mW 21.0% 79.0% Sleep 1 1 14mW 1.4% 98.6% Power Down 0 0 ~0mW ~ 0.0% ~100.0% Source: ARM/National Semiconductors, 2002 http://www.arm.com/pdfs/ns3003_v1a.pdf

Hardware Technologies for Power Management Power saving technology in physical elements Gate level power optimization Multiple voltage domains in design Multi-threshold logic Power gating Silicon on insulator Power saving technology in processor Run Mode Slow Clock Mode Sleep Mode Power Down Mode

Processor Low Power Modes Run Mode Processor continues to fetch and execute instructions Clock to some of the peripherals can be turned off Slow Clock Mode Processor clock can be slowed down (e.g. multiple clock sources, or using clock divider) Clock to some peripherals can be turned off Performance and power both reduced No latency

Sleep Mode Sleep Mode Processor stopped, but timer and interrupts continue to run, or some of the clock signals to processor, system bus and peripherals could be stopped Processor clock can still be running. Since the signal toggling is reduced, power consumption is also reduced. Some parts of the microcontroller could be turned off (e.g. PLL, flash memory) Some sleep modes might require additional processing during wakeup Minimal impact on software Re-awaken on interrupt or other event Low wakeup latency

Power Down Mode Power Down Mode Most logic powered off Greatest power saving Largest latency to re-start Processor state may not be retained, complicating software recovery

Example: ARM Cortex-M3 Cortex-M3 Wake-Up Interrupt (WIC) Controller Allows almost instantaneous (12-18 cycles) return to fully active mode from Sleep Mode Wake-up latency is the response time for the circuit to power-up. Cortex-M3 provides sleep and deep-sleep modes. (Interrupt latency of 12 cycles, 18 cycles if powered up instantly). Using Wake-Up Interrupt Controller (WIC) with deep sleep allow further power reduction. Based on power gating implementation flow - processor states stored in state retention logic cell, while majority of the circuit is powered down. The 18 cycle latency consists of the 12 clock cycle interrupt latency, a number of cycle for the power up sequence, which include the "wake up latency". In the 18 cycle figure we assume it takes just 1 cycle for the power to be ready. In some cases the wakeup latency can take several cycles, but normally less than 10 cycles in the latest semiconductor process.

Hardware Summary ARM s Cortex-M3 is an example of how a processor can be designed for low power applications: Sleep modes can be architecturally defined with sleep instructions and sleep interface. Sleep and Wake can be implemented with low overhead Clock domain separation allows most of the logic to be stopped during sleep Semiconductor technology and EDA tool enhancements also play an important part in lowering power Hardware technology can be controlled by the RTOS Transparent to the application code Easier for the programmer Sleep Mode offers good balance of benefit/transparency

Software s Role In Power Management General Software can affect memory size Software can help reduce need for high clock rate Software s Role In Enabling Reduction of Processor Clock Rate Dynamically Run-time clock management Impact on the application How the RTOS can help

Reduce Memory Size Memory size influences power consumption For small applications, RTOS size can be significant Small memory RTOS can help reduce power consumption Application size can be bigger factor Reduce RTOS Size Start with a small-memory RTOS Use compiler that generates smaller code Reduce Application Size Use compiler that generates smaller code Efficient use of registers can reduce memory accesses, thus reduce power consumption in the memory system Good layout of instruction and data can reduce cache misses Use architecture that provides denser code (ie Thumb-2) Reductions here can be very significant

Reduce Processor Clock Rate Using A Lower Processor Clock Rate Means Lower Power (and lower cost) RTOS A streamlined RTOS can operate on slower processor than a large complex RTOS Fast RTOS services can help reduce power consumption Application Fast application code can help reduce power consumption Compiler Efficient compiler generates faster code to help both RTOS and application use lower MHz processor and increase opportunity to enter power-saving mode

Dynamic Management of Power Consumption What if application demands high-performance? Can t reduce memory or change CPU dynamically, but we can change the CPU by varying its clock rate and hence its power consumption. MHz = Performance MHz = Power Many Applications Have Bursts of Activity Streaming video/audio, rendering, network access Between bursts of high-demand may be periods of inactivity Application can power-down or power-off CPU when not needed Enter power-saving mode when idle Exit power saving mode upon interrupt or event Keep application coherent (oblivious to slowdown)

Idle Time Lowest priority thread, or OS component Runs only when nothing else is going on in the system Always enter sleep mode when idle? Might not be a good idea What is overhead and frequency of timer interrupt? If frequent, reawakening and processing at each timer interrupt might outweigh benefits of power saving. Determine whether gain (power saving) outweighs loss (overhead added to sleep/wake) When is next scheduled event? If soon relative to time to sleep/wake, might be little gain.

Timer Overhead Every N th Clock Tic Generates Interrupt Prescaler operates to interrupt on 1/n clock tics Reduces interrupt frequency from full clock rate (ie: 25ns), which would be impossible to support, to something less frequent (ie: 10ms) Interrupt processing involves overhead Update real-time clock Update application timers Check for scheduled event Generally 1-10μs to service timer interrupt

Idle Time Many periods of inactivity, opportunities for power saving Note 92% system idle time (in this example)

Two Approaches Two approaches to power management through use of processor sleep mode. Always Sleep When Idle Enter sleep mode whenever there is nothing to do, and awaken from sleep to service timer and external interrupts. Keeps system time intact Managed Sleep Eliminate Timer Interrupts During Sleep Stop CPU and defer timer interrupts Delay next timer interrupt until next scheduled event

Always Sleep When Idle Enter sleep mode immediately upon reaching idle state Pros Power consumption reduced during periods of inactivity Easy to implement in RTOS or application Cons Incurs overhead re-awakening and servicing timer interrupt to update timers Net gain/loss depends on frequency and relative magnitude of these overheads Without extremely low overhead mechanisms, this approach might not work well Fast RTOS timer and interrupt servicing helps make this approach viable, but hardware mechanism is key

Managed Sleep Defer timer interrupts and then enter sleep mode Pros Avoids overhead of servicing timer interrupts Enables processor to sleep for longer continuous periods, rather than to re-awaken for each tic Cons Complicates activating next timed event Introduces need to adjust timer upon wakeup Requires 2 services from the RTOS Net gain/loss depends on overhead of sleep enter/exit, which is weighed against idle time until next timed event. Fast RTOS services and timer/interrupt handling help make this approach viable

RTOS Services To Support Managed Sleep (1) Find Next Scheduled Event The first service tells the application the number of timer ticks until the next scheduled event. The application would call this service prior to entering low power mode. The timer would be re-programmed to generate its next tic interrupt just prior to the next scheduled event This avoids missing a scheduled event even though timer interrupts are eliminated

Upon Reawakening (2) How Many Tics Skipped The second service, used upon re-awakening, informs the application how many ticks (at the original tic rate) were skipped The RTOS then updates system time to correct for the tics that were skipped, and resets the system timer to the previous tick rate This reestablishes all internal timers to the state at which they would have been had the system never gone into low power mode Application continues normally

Conclusion Processor technology addresses power management through efficient design and clock adjustment services An efficient compiler can reduce size and execution time, both helpful in reducing power consumption An RTOS can exploit advanced processor technology to enable power conservation with little, if any, application software impact