Chapter 1 Introduction

1.1 Motivation

A high-performance processor consumes a large amount of power to operate at its high clock rate. For example, a Pentium 4 class processor currently consumes more than 50 W. The increased power consumption demands advanced supporting technology, including thermal packaging, power supply, and air conditioning, to deal with heat dissipation. Furthermore, it takes significantly more energy to complete a task because the power consumption of a processor grows cubically with its clock rate. Both concerns considerably hinder the deployment of high-performance processors in low-cost, battery-powered embedded systems. Instead, many modern embedded systems such as cellphones [8], PDAs [21], and Tablet PCs [25] are now equipped with several low-power processors to achieve the same performance at a reduced cost and with a lower energy requirement.

A variety of instruction set architectures (ISAs) and processor cores have been developed, each of which provides the best performance for a specific set of applications. In our local performance study comparing the S3C2410 (an ARM9 processor) [23] and the TI5520 (a TI DSP processor) [27], we observed that the TI5520 consumes 9.2 times more energy than the S3C2410 to execute multiplication instructions. In contrast, the S3C2410 takes 2.2 times more energy than the TI5520 to do matrix operations.
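The link between cubic power growth and the energy needed per task can be made explicit with a short derivation. This is a sketch under a simplifying assumption: the power at clock rate f is exactly P(f) = k f^3, and a task needs a fixed number of cycles C. The symbols k and C are introduced here for illustration and do not come from the measurements above.

```latex
\[
  P(f) = k f^{3}, \qquad
  t(f) = \frac{C}{f}, \qquad
  E(f) = P(f)\, t(f) = k\, C f^{2}.
\]
\[
  \frac{E(2f)}{E(f)} = \frac{k\, C\, (2f)^{2}}{k\, C f^{2}} = 4 .
\]
```

Under this assumption, doubling the clock rate halves the execution time but quadruples the energy spent on the same task. Conversely, a task finishes with the least energy when it runs at the lowest clock rate that still meets its timing requirement.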

For this reason, many embedded systems adopt a heterogeneous multi-processor (HeMP) design to further reduce their energy consumption. To fully utilize the computational power of such a HeMP system, several studies [11, 20, 26] have proposed flexible programming paradigms in which a program can be executed on, and migrated among, these heterogeneous processors.

In this paper, we propose a low-power real-time scheduling algorithm for HeMP systems. A number of studies [7, 10, 12, 14] have addressed the scheduling of real-time tasks on a homogeneous multi-processor (HoMP) system. These algorithms schedule tasks to complete before their deadlines while minimizing energy consumption. However, because they do not consider that a task performs differently on different processors, existing approaches deliver poor energy savings if applied directly to a HeMP system. This observation is confirmed by our experimental results described later. To the best of our knowledge, our work is the first to address low-power real-time scheduling on HeMP systems.

Due to the complexity of this problem, we focus on scheduling a set of n frame-based tasks on m heterogeneous processors to achieve minimum energy consumption. Each task must complete before a common deadline. All tasks are independent and non-preemptible. Finding an optimal solution to this problem takes exponential time. Instead, we provide two algorithms that solve this problem in polynomial time. Both algorithms use a local-optimal analysis to initially partition all tasks among the m processors. The first algorithm takes a greedy approach, migrating tasks out of an overloaded processor to balance the load and reduce energy consumption. It has O(mn log n) time complexity. The second algorithm balances the load with a dynamic programming (DP) method. Its time complexity is O(mnB), where B is the sum of the execution cycles of all tasks.

We find that simply modifying the traditional HoMP list scheduling method to use the index matrix as its priority basis yields at least a 30% energy improvement over the simplest list scheduling, but this is still not good enough. According to our final experimental results, our algorithms need 40% or less of the energy that list scheduling requires to schedule the same set of tasks on a HeMP system. Thus, the scheduling decision strongly influences energy consumption on a HeMP system and deserves careful attention.

The rest of this paper is structured as follows. Section 2 describes the energy model and the task model. Section 3 presents our task-partition method. Section 4 presents the greedy load-balancing algorithm. The DP-based load-balancing algorithm is described in Section 5. Section 6 presents the experimental results. Finally, Section 7 concludes this paper and discusses future work.

1.2 Related Work

The technique of voltage scaling has been widely used to reduce energy consumption by slowing down the processor and extending task execution time. A real-time task must complete its computation before its deadline to avoid failure. A number of low-power real-time scheduling algorithms [3, 4, 30, 31] make use of this technique to minimize energy consumption without missing any deadline. All these algorithms address the issue on a single-processor system.

As multi-processor platforms gain popularity, the problem of scheduling real-time tasks on a set of homogeneous processors has received considerable attention [1, 2, 5, 6, 9, 29]. The Proportionate-fair (Pfair) algorithm, proposed by Baruah et al. [2, 5, 6], is optimal: given a set of periodic tasks, it produces a feasible real-time HoMP schedule whenever such a schedule exists. This algorithm, however, does not consider energy consumption and is not suitable for low-power systems. Anderson et al. [1] proposed a method of finding the optimal number of processors on which a given set of periodic tasks incurs minimum energy consumption. J.-J. Chen et al. [9] found an optimal bound on energy consumption for a set of frame-based tasks, each of which has different power characteristics. All these algorithms focus on HoMP systems. Because they do not consider that a task may perform differently on heterogeneous processors, they cannot be applied directly to HeMP systems.

Several studies [18, 19, 24, 28] have addressed scheduling issues on HeMP systems. Three of them [19, 24, 28] focus on the problem of scheduling a set of dependent tasks to minimize their completion time. Maheswaran et al. [19] solved this problem by dynamically mapping tasks to processors, and Sih et al. [24] proposed a compile-time solution. Topcuoglu et al. [28] improved on this work by providing an efficient solution with reduced time complexity. Neither energy reduction nor real-time constraints are considered in this group of work. In contrast, Luo et al. [18] proposed an algorithm that schedules a set of dependent tasks to complete within a common deadline while minimizing their total energy consumption. However, all of the above work considers only dependent tasks and cannot be generalized to independent and concurrent tasks.

In summary, we propose a novel solution to schedule a set of independent tasks on a HeMP system. Our goal is to complete all tasks within a common deadline while minimizing total energy consumption. To the best of our knowledge, our work is the first to consider the issue of performance differences among heterogeneous processors in low-power real-time scheduling.
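The problem summarized above can be illustrated with a small Python sketch. It is not one of the algorithms proposed in this work. It is an exhaustive baseline that tries all m^n assignments, which is why an exact search becomes impractical as n grows. The sketch rests on assumptions that are not taken from the text: cycles[i][j] is a hypothetical table of the cycles task i needs on processor j, coeff[j] is a hypothetical power coefficient with power equal to coeff[j] * f**3, each processor runs at the single lowest frequency that finishes its load exactly at the deadline, and no maximum frequency is enforced.

```python
from itertools import product


def schedule_energy(assignment, cycles, coeff, deadline):
    """Total energy of one task-to-processor assignment (assumed model).

    assignment[i] is the processor index chosen for task i.
    """
    m = len(coeff)
    load = [0] * m
    for task, proc in enumerate(assignment):
        load[proc] += cycles[task][proc]
    # Assumed model: processor j runs at f = load / deadline, so
    # energy = coeff * f**3 * deadline = coeff * load**3 / deadline**2.
    return sum(coeff[j] * load[j] ** 3 / deadline ** 2 for j in range(m))


def brute_force(cycles, coeff, deadline):
    """Try all m**n assignments and return (best_assignment, best_energy)."""
    n, m = len(cycles), len(coeff)
    best = min(
        product(range(m), repeat=n),
        key=lambda a: schedule_energy(a, cycles, coeff, deadline),
    )
    return best, schedule_energy(best, cycles, coeff, deadline)


if __name__ == "__main__":
    # Two tasks, two processors: each task is cheap on a different processor.
    cycles = [[2, 8], [8, 2]]
    coeff = [1.0, 1.0]
    print(brute_force(cycles, coeff, deadline=2.0))  # ((0, 1), 4.0)
```

In this toy instance, placing each task on the processor that suits it gives an energy of 4.0, while putting both tasks on one processor gives 250.0 under the same assumed model. The gap shows why ignoring heterogeneity can be costly.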