
Chapter 1 Introduction

1.1 Motivation

A high-performance processor consumes a large amount of power to operate at its high clock rate. For example, a Pentium 4 class processor currently consumes more than 50 W. The increased power consumption demands advanced technology, including thermal packaging, electricity supply, and air conditioning, to handle the resulting heat dissipation. Furthermore, it takes significantly more energy to complete a task because the power consumption of a processor grows cubically with its clock rate. Both concerns considerably hinder the deployment of high-performance processors in low-cost, battery-powered embedded systems. Instead, many modern embedded systems such as cellphones [8], PDAs [21], and Tablet PCs [25] are now equipped with several low-power processors to achieve the same performance at a reduced cost and with a lower energy requirement.

A variety of Instruction Set Architectures (ISAs) and processor cores have been developed, each of which provides the best performance for a specific set of applications. In our local performance study comparing the S3C2410 (an ARM9 processor) [23] and the TI5520 (a TI DSP processor) [27], we observed that the TI5520 consumes 9.2 times more energy than the S3C2410 to execute multiplication instructions. In contrast, the S3C2410 takes 2.2 times more energy than the TI5520 to perform matrix operations.
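The link between a cubic power law and the energy needed per task can be made explicit with a short derivation. It is a sketch under stated assumptions: the task needs a fixed number of cycles C regardless of clock rate, and power follows P(f) = k f^3. The symbols k, C, and t are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
P(f) &= k f^{3}, \qquad t(f) = \frac{C}{f},\\
E(f) &= P(f)\, t(f) = k f^{3} \cdot \frac{C}{f} = k\, C f^{2}.
\end{align*}
```

Under these assumptions, doubling the clock rate halves the execution time but quadruples the energy spent on the task.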

For this reason, many embedded systems adopt a heterogeneous multi-processor (HeMP) design to further reduce their energy consumption. To fully utilize the computational power of such a HeMP system, several research studies [11, 20, 26] have proposed flexible programming paradigms in which a program can be executed on, and migrated among, these heterogeneous processors.

In this paper, we propose a low-power real-time scheduling algorithm for HeMP systems. A number of studies [7, 10, 12, 14] have addressed the scheduling of real-time tasks on a homogeneous multi-processor (HoMP) system. These algorithms schedule tasks to complete before their deadlines while minimizing energy consumption. However, because heterogeneous performance on different processors is not considered, existing work delivers poor energy savings if directly applied to a HeMP system. This observation is confirmed by our experimental results described later. To the best of our knowledge, our work is the first to address low-power real-time scheduling on HeMP systems.

Due to the complexity of this problem, we focus on scheduling a set of n frame-based tasks on m heterogeneous processors to achieve minimum energy consumption. Each task must complete before a common deadline. All tasks are independent and non-preemptible. Finding an optimal solution to this problem takes exponential time. Instead, we provide two algorithms that solve this problem in polynomial time. Both algorithms use a locally optimal analysis to initially partition all tasks among the m processors. The first algorithm takes a greedy approach, migrating tasks out of an overloaded processor to balance the load and reduce energy consumption. It has O(mn log n) time complexity. The second algorithm balances the load by a dynamic programming (DP) method. Its time complexity is O(mnB), where B is the sum of the execution cycles of all tasks.

We find that by simply modifying the traditional HoMP list scheduling method to use the index matrix as a priority basis, we obtain at least a 30% energy improvement over the simplest list scheduling, but this is not good enough. According to our final experimental results, our algorithm needs 40% or less of the energy that list scheduling requires to schedule a set of tasks on a HeMP system. Thus, the scheduling decision strongly influences energy consumption on a HeMP system and deserves careful attention.

The rest of this paper is structured as follows. Section 2 describes the energy model and the task model. Section 3 presents our task-partition method. Section 4 presents the greedy load-balancing algorithm. The DP-based load-balancing algorithm is described in Section 5. Section 6 presents the experimental results. Finally, Section 7 concludes this paper and discusses future work.

1.2 Related Work

The technique of voltage scaling has been widely used to reduce energy consumption by slowing down the processor and extending task execution time. A real-time task must complete its computation before its deadline to avoid failure. A number of low-power real-time scheduling algorithms [3, 4, 30, 31] have been proposed that use this technique to minimize energy consumption without missing any deadline. All of these algorithms address the issue on a single-processor system.
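The trade-off behind voltage scaling on a single processor can be illustrated with a minimal sketch. It assumes the cubic power model P = k * f**3 mentioned in the motivation and a task with a fixed cycle count. The function names, the constant k, and the sample numbers are hypothetical and are not taken from the cited algorithms.

```python
def min_frequency(cycles, deadline, f_max):
    """Lowest clock rate (Hz) that finishes `cycles` within `deadline` seconds."""
    f = cycles / deadline
    if f > f_max:
        raise ValueError("deadline cannot be met even at f_max")
    return f


def task_energy(cycles, f, k=1.0):
    """Energy under the assumed model P = k * f**3, running for cycles / f seconds."""
    return k * f**3 * (cycles / f)


# Illustrative numbers only: 2e8 cycles, a 1 s deadline, and a 400 MHz maximum clock.
cycles, deadline, f_max = 2.0e8, 1.0, 4.0e8
f_slow = min_frequency(cycles, deadline, f_max)
saving = 1.0 - task_energy(cycles, f_slow) / task_energy(cycles, f_max)
print(f"run at {f_slow:.1e} Hz, energy saving vs f_max: {saving:.0%}")
```

With these numbers the task can run at half the maximum clock rate and still meet its deadline, which under the assumed model saves about three quarters of the energy.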

As multi-processor platforms gain popularity, the problem of scheduling real-time tasks on a set of homogeneous processors has recently received a lot of attention [1, 2, 5, 6, 9, 29]. The Proportionate-fair (Pfair) algorithm, proposed by Baruah et al. [2, 5, 6], is optimal: it takes as input a set of periodic tasks and provides a feasible real-time HoMP schedule if such a schedule exists. This algorithm, however, does not consider energy consumption and is not suitable for low-power systems. Anderson et al. [1] proposed a method for finding the optimal number of processors on which a given set of periodic tasks incurs minimum energy consumption. J.-J. Chen et al. [9] find an optimal bound on energy consumption for a set of frame-based tasks, each of which has different power characteristics. All these algorithms focus on HoMP systems. Because they do not consider that a task may perform differently on heterogeneous processors, they cannot be directly applied to HeMP systems.

Several studies [18, 19, 24, 28] have addressed scheduling issues on HeMP systems. Three of them [19, 24, 28] focused on the problem of scheduling a set of dependent tasks to minimize their completion time. Maheswaran et al. [19] solved this problem by dynamically mapping tasks to processors, and Sih et al. [24] proposed a compile-time solution. Topcuoglu et al. [28] improved on this work by providing an efficient solution with reduced time complexity. Neither energy reduction nor real-time constraints are considered in this group of work. In contrast, Luo et al. [18] proposed an algorithm that schedules a set of dependent tasks and completes them within a common deadline while minimizing their total energy consumption. However, all of the above work considered only dependent tasks and cannot be generalized to work with independent and concurrent tasks.

In summary, we propose a novel solution to schedule a set of independent tasks on a HeMP system. Our goal is to complete all tasks within a common deadline while minimizing total energy consumption. To the best of our knowledge, our work is the first to consider the issue of performance differences across heterogeneous processors in low-power real-time scheduling.
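The problem stated in this summary can be sketched in a few lines of code. This is an illustrative heuristic under stated assumptions, not the algorithm developed in the later sections: it first places every task on its lowest-energy processor, then greedily migrates tasks out of any processor whose load exceeds the common deadline, choosing the move that adds the least energy. The function `schedule`, the tables `energy_table` and `time_table`, and all sample values are hypothetical.

```python
def schedule(energy, time, deadline):
    """Assign each task to one processor; return a list of processor indices, or None.

    energy[i][j] and time[i][j] are the assumed energy and execution time
    of task i on processor j. All tasks share the same deadline.
    """
    n, m = len(energy), len(energy[0])

    # Step 1: locally optimal partition that ignores the deadline.
    assign = [min(range(m), key=lambda j: energy[i][j]) for i in range(n)]
    load = [0.0] * m
    for i, j in enumerate(assign):
        load[j] += time[i][j]

    # Step 2: migrate tasks out of overloaded processors, cheapest move first.
    while True:
        over = [j for j in range(m) if load[j] > deadline]
        if not over:
            return assign
        src = over[0]
        best = None  # (extra energy, task index, destination processor)
        for i in range(n):
            if assign[i] != src:
                continue
            for dst in range(m):
                # Skip moves that would overload the destination.
                if dst == src or load[dst] + time[i][dst] > deadline:
                    continue
                extra = energy[i][dst] - energy[i][src]
                if best is None or extra < best[0]:
                    best = (extra, i, dst)
        if best is None:
            return None  # this heuristic found no feasible migration
        _, i, dst = best
        load[src] -= time[i][src]
        load[dst] += time[i][dst]
        assign[i] = dst


# Three tasks, two processors; rows are tasks, columns are processors.
energy_table = [[1.0, 4.0], [2.0, 5.0], [3.0, 3.5]]
time_table = [[4, 2], [4, 2], [4, 3]]
print(schedule(energy_table, time_table, deadline=8))  # prints [0, 0, 1]
```

In this example all three tasks initially prefer processor 0, which overloads it, so the sketch moves the third task because that migration costs the least additional energy. A destination is never overloaded by a move, so each task migrates at most once and the loop terminates. This simple version makes no attempt to match the O(mn log n) bound quoted for the greedy algorithm.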