Attaining EDF Task Scheduling with O(1) Time Complexity

Verber Domen
University of Maribor, Faculty of Electrical Engineering and Computer Sciences, Maribor, Slovenia (e-mail: domen.verber@uni-mb.si)

Abstract: For hard real-time systems, Earliest Deadline First (EDF) scheduling is the best solution on single-processor systems. Unfortunately, the EDF algorithm has a complexity of O(N²). Because scheduling interferes with the normal operation of the control application and prolongs reaction times, it would be beneficial to reduce its influence. One way to do this could be to implement EDF scheduling with the lowest possible complexity and the shortest execution time. This can be done using parallel execution, where hardware resources are employed. In this article, an optimal implementation of EDF scheduling is elaborated in which the tasks are kept in a sorted list implemented in hardware. This list is updated whenever a task becomes active, is suspended or is terminated. This virtually eliminates the impact of scheduling on normal application execution. In addition, because it is implemented in hardware, it outperforms any software implementation.

Keywords: Embedded systems, real-time, scheduling algorithms, computational complexity, EDF.

1. INTRODUCTION

This paper discusses real-time embedded systems. In such systems, the given functionality must be performed in a timely fashion, i.e. regardless of the situation in the system, some operations must be completed within a predefined time interval. This can be realized only with proper management of the tasks (jobs). Traditionally, this is performed by the operating system. However, proper management of an embedded system may be a complex and time-consuming task that interferes with the normal operation of the system if performed on the same processor. Because of this, it would be beneficial if such impact could be reduced or eliminated altogether.
One of the solutions is to employ a coprocessor that performs the functionality of the operating system in parallel with the execution of the application. The coprocessor interacts with the application only when the work of the application must be rearranged. Such a solution also decreases the complexity of system design and of the analysis of temporal circumstances that may occur in the system. It also decreases the minimum reaction times and increases the throughput of the system. In addition to task scheduling, the coprocessor must also perform the other functionalities of the operating system (e.g. task synchronization, inter-task communication, etc.).

The coprocessor can be implemented as an additional standard (micro)processor dedicated solely to performing the job of the operating system. Nowadays, multi-core processor solutions are available, and they will become more and more popular in embedded systems. Therefore, an implementation where one of the processor cores is dedicated exclusively to operating-system operations is feasible. However, because of the complexity of the algorithms for real-time operating systems, it may perform poorly irrespective of its processing power. Furthermore, communication with the main processor must be synchronized with the coprocessor's activities, which may additionally prolong the reaction.

The second solution is to employ a customized hardware implementation dedicated to operating-system functionalities. This approach virtually eliminates the impact of the operating system on normal application execution for several reasons. Firstly, the functionality of the operating system may be divided into independent functions, which are then implemented in parallel as separate processes on the hardware device. Secondly, synchronization with the main processor can be implemented asynchronously to the other functionalities and has much smaller overhead.
In addition, because it is implemented with hardware means, this approach outperforms any software implementation in speed. Such an OS-on-a-chip may be used with less powerful main processors, allowing cheaper and more power-efficient solutions. Such a device can also be formally verified and certified for higher safety integrity levels than a software implementation. Furthermore, such a device may serve additional purposes. For example, it can be used as an intelligent I/O device, it may implement the communication layer in distributed control systems, etc. A similar architecture was studied and successfully implemented in the IFATIS project (IFATIS (2005)). With the new approach presented in this paper, further improvements are expected.

The paper focuses on the hardware implementation of one of the most important parts of an operating system: task scheduling. This is only one part of the functionalities that must be performed by the operating system. Nevertheless, it has the greatest influence on the temporal behaviour of the system. In the first part of the paper, a brief introduction to tasks and task scheduling for hard real-time embedded systems is given, with a focus on Earliest Deadline First (EDF) scheduling. In the second part, a modification of the EDF algorithm is studied that executes in constant time
independently of the number of tasks in the system. In the third part, some hardware implementation considerations for such an algorithm are presented.

2. TASKS AND TASK SCHEDULING

Typically, an embedded application consists of several computing processes or tasks. Most often there are many more tasks than processing facilities to execute them; therefore, some tasks must be executed sequentially. To execute tasks effectively, a proper schedule (or arrangement) of tasks must be found that conforms to certain restrictions. This process is known as task scheduling. For simple applications, task scheduling can be performed in advance (a priori). In complex systems, however, where the situation changes frequently, this simple scheme is not adequate. A task may be dormant (inactive), ready for execution (active) or executing on the processor. An active task may be temporarily suspended during execution due to the unavailability of some system resource, the need for synchronization with other tasks, etc. In this case, the schedule of tasks must be determined dynamically.

Traditionally, some kind of priority is used to determine which task must be executed next. The active task with the highest priority is always executed first. If needed, tasks with lower priority may temporarily be suspended to allow the more important ones to run. However, priority-based scheduling strategies are not adequate for hard real-time systems (Cooling (1993)). In these systems, once a task is started, it must be finished prior to some predetermined deadline regardless of the conditions in the system. Therefore, a schedule of tasks must be constructed in such a way that all tasks will meet their deadlines, taking into account their execution times and other time-delaying factors in the system. Such a schedule is called feasible. Several deadline-driven scheduling methods exist.
One of them is the Earliest Deadline First (EDF) strategy, which has been proven to be the best solution for single-processor systems (Liu and Layland (1973)).

2.1 EDF task scheduling

In EDF, the task with the shortest deadline must be executed first. To find it, the scheduler must search through the list of ready tasks for the task with the shortest deadline. This can be represented with a simple pseudo code:

min_task_index = 0
min_deadline = ∞
for i = 1 to n do
    if taskinfo[i].deadline < min_deadline then
        min_deadline = taskinfo[i].deadline
        min_task_index = i

The taskinfo array holds the information of all ready tasks in the system. This process must be executed every time a new task becomes active (i.e. ready for execution) or when the currently running task is suspended or terminated. Termination or suspension of tasks other than the currently running one does not influence the schedule.

However, this is only a part of the work that must be done by the scheduler. The next important thing is to prove that the schedule is feasible (i.e. that all active tasks will meet their deadlines). For EDF, the schedule is feasible if the following condition is fulfilled for each active task:

    Σ_{i=1}^{k} l_i ≤ a_k,    k = 1, ..., n        (1)

This equation states that the sum of the remaining execution times l_i of all tasks T_i scheduled to run before, and including, task T_k must be less than or equal to the relative deadline a_k of task T_k. In other words, the cumulative workload to be performed prior to and during the execution of task T_k must be completed before its deadline. The tasks are sorted by ascending deadlines. Again, the condition determined by the formula is static and must be re-evaluated only when one or more of the tasks change their states.
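For reference, the linear scan above can be transcribed into a short runnable sketch (Python is used here purely for illustration; the dictionary-based task records are a hypothetical stand-in for the taskinfo array):

```python
# Minimal EDF selection: scan the ready list for the earliest deadline.
# The task records are illustrative; the paper keeps them in the taskinfo array.

def pick_edf_task(taskinfo):
    """Return the index of the ready task with the shortest (earliest) deadline."""
    min_task_index = 0
    min_deadline = float("inf")
    for i, task in enumerate(taskinfo):
        if task["deadline"] < min_deadline:
            min_deadline = task["deadline"]
            min_task_index = i
    return min_task_index

tasks = [{"deadline": 30}, {"deadline": 10}, {"deadline": 20}]
print(pick_edf_task(tasks))  # 1 (the task with deadline 10)
```

The scan visits every ready task once, which is the O(N) cost that the rest of the paper sets out to remove.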
This schedulability check can be converted into a form with a doubly nested loop, illustrated with the next pseudo code:

cumulative_finish_time = current_time
for i = 1 to n do
    for j = i+1 to n do
        if taskinfo[j].deadline < taskinfo[i].deadline then
            swap(taskinfo[i], taskinfo[j])
    cumulative_finish_time = cumulative_finish_time + taskinfo[i].remaining_exec_time
    if cumulative_finish_time > taskinfo[i].deadline then
        raise deadline_violation_error

The first part of the outer loop sorts the data according to the deadlines of the tasks. As a side effect, the algorithm also puts the task with the shortest deadline in the first place of the array. Therefore, the search for the task with the shortest deadline can be combined with the feasibility check. In the second part of the outer loop, feasibility is tested by first adding the remaining execution time of the current task to the cumulative execution time and then comparing it to the task's deadline.

2.2 Optimization of the EDF algorithm

As a drawback, the EDF feasibility check described above requires N²/2 iterations (for N active tasks), which is much more than priority-based task scheduling; to find the task with the highest priority, only N iterations are required. The complexity of the algorithm may be reduced if the sorting part of the algorithm is eliminated. This may be achieved by keeping the taskinfo array sorted at all times. However, such an implementation of the EDF algorithm is not yet complete. If the taskinfo array is to be properly maintained at all times, then in addition to task activation, other tasking operations such as task termination and task suspension/continuation must also be considered.
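As a sanity check, the doubly nested loop above can be written out as runnable Python (a sketch only; deadlines are assumed to be absolute times and remaining execution times relative, as in the pseudo code):

```python
# Combined sort-and-feasibility check from the pseudo code above, O(N^2)/2.
# Deadlines are absolute; remaining execution times are relative durations.

def edf_feasible(taskinfo, current_time):
    """Sort tasks by ascending deadline and verify condition (1) for each one."""
    n = len(taskinfo)
    cumulative_finish_time = current_time
    for i in range(n):
        # Selection-style pass: bring the earliest remaining deadline to slot i.
        for j in range(i + 1, n):
            if taskinfo[j]["deadline"] < taskinfo[i]["deadline"]:
                taskinfo[i], taskinfo[j] = taskinfo[j], taskinfo[i]
        cumulative_finish_time += taskinfo[i]["remaining_exec_time"]
        if cumulative_finish_time > taskinfo[i]["deadline"]:
            return False  # deadline_violation_error
    return True

tasks = [
    {"deadline": 25, "remaining_exec_time": 5},
    {"deadline": 10, "remaining_exec_time": 4},
]
print(edf_feasible(tasks, current_time=0))  # True: 4 <= 10 and 9 <= 25
```

After the call, `tasks` is sorted by deadline, so the first element is also the task the scheduler would dispatch next.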
Task activation

When a new task becomes active, the algorithm must find the proper place for it in the array, according to its deadline. The information of the new task must be inserted between existing tasks by shifting the remaining content of the array toward its end. At the same time, the cumulative finish time must be updated and the feasibility check must be performed. This can be described with the following pseudo code:

pos = n+1
for j = 1 to n do
    if taskinfo[j].deadline > curr_data.deadline then
        pos = j
        break
for j = n downto pos do
    taskinfo[j+1] = taskinfo[j]
taskinfo[pos] = curr_data

At the beginning, the first element in the array with a longer deadline than the one to be added is determined. Once this position is found, the remaining data in the array is shifted toward the end and the new data is inserted into the gap.

At first glance, the complexity of this algorithm is O(n). Indeed, if the algorithm is implemented sequentially, this is true. However, both loops can be parallelized, i.e. all iterations of a loop can be executed at the same time. The first loop has no data dependency in it and can easily be parallelized by comparing the current task's deadline with the deadlines already in the table. This can be done in parallel if a comparator is provided for each element in the list. The second loop has a data dependency between loop iterations: several elements of the array are accessed and modified by two iterations. However, this part of the code is only the sequential program representation of a shift operation. In hardware architectures, shift registers and queues are frequently used, where pieces of data are moved in a similar way. Therefore, the second loop can be implemented by linking the elements of the taskinfo table in serial fashion.

This approach also requires a modification of the feasibility check. According to (1), we must sum the remaining execution times of the tasks every time a new task is added.
This requires a loop that cannot be completely parallelized. To avoid it, each element of the taskinfo table must be extended with the cumulative remaining execution time of all tasks before and including the current one. This attribute stands for the cumulative_finish_time variable in the original EDF algorithm. To keep this attribute properly updated, every time new task information is put into the list, its remaining execution time must be added to all elements in the list after it. Notably, those are the same elements that were shifted previously. To get the total remaining execution time for the current task, its value is summed with the value of the element before it. Because the remaining execution time is in relative form and the deadline is kept in absolute form, it must be converted before it can be properly compared with the deadlines. To achieve this, when new data is put into the first cell of the list, its remaining execution time is enlarged by the value of the current time. Thereafter, this value eventually propagates through all elements of the list.

The parallel version of the EDF algorithm may now be summarized as a sequence of four steps. Because of the limitations of sequential programming languages for describing hardware implementations, it is better to use plain text to describe them.

Step 1: Compare the deadlines of all elements in the list with the current one and mark the elements with greater deadlines.

Step 2: Shift all marked elements toward the end of the list by copying all the data. Update (set) the mark for the last element in the list so that it is included in the next step.

Step 3: Fill the gap with the current task's information. Add the remaining execution time of the current task to the cumulative execution time of all marked elements. If the first element in the list is updated, set its cumulative execution time to the value of the current system time.
Step 4: Compare the cumulative execution times with the corresponding deadlines and mark all elements where the deadline would be violated.

Each step has a fixed execution time; therefore, the time complexity of this algorithm becomes O(1). Further improvements are possible if some of the steps can be executed in parallel. If the resources for the fast comparisons of steps 1 and 4 are available, the first and the second pair of steps can each be executed in parallel.

Task termination

When a task ends its execution, it must be removed from the array. The tasks that follow must be shifted toward the beginning of the array, and the cumulative execution times must be properly updated. This can be done with the next sequence of operations.

Step 1: Find the position of the task to be removed in the table by comparing the ID of this task with the IDs in the table.

Step 2: Mark all the elements after that position by comparing their indices with the found one. Store the remaining execution time of the task to be removed in a temporary variable.

Step 3: Shift all marked elements toward the beginning of the list.

Step 4: Subtract the remaining execution time of the deleted task (stored in step 2) from the cumulative execution times of all marked elements. Because the cumulative execution times are only reduced, no deadline violation can occur during this operation.
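The four AddTask steps can be simulated in software to check the bookkeeping. Note that this is a sequential model only — in hardware every step acts on all cells at once — and the field names (deadline, exec_time, cum_finish) are chosen here for illustration:

```python
# Sequential software model of the four parallel AddTask steps. In hardware
# each step touches every EDF cell simultaneously; this loop mimics the result.

def add_task(cells, new, current_time):
    """Insert `new` into the deadline-ordered list; return IDs of violators."""
    # Step 1: mark every cell whose deadline is later than the new task's.
    marks = [c["deadline"] > new["deadline"] for c in cells]
    # Step 2: shifting all marked cells toward the end opens a gap at `pos`.
    pos = marks.index(True) if True in marks else len(cells)
    # Step 3: fill the gap; the first cell anchors times to the current time.
    entry = dict(new)
    prev_finish = cells[pos - 1]["cum_finish"] if pos > 0 else current_time
    entry["cum_finish"] = prev_finish + new["exec_time"]
    cells.insert(pos, entry)
    for c in cells[pos + 1:]:  # the previously marked (shifted) cells
        c["cum_finish"] += new["exec_time"]
    # Step 4: report every cell whose cumulative finish misses its deadline.
    return [c["id"] for c in cells if c["cum_finish"] > c["deadline"]]

cells = []
add_task(cells, {"id": 1, "deadline": 10, "exec_time": 4}, current_time=0)
add_task(cells, {"id": 2, "deadline": 6, "exec_time": 3}, current_time=0)
print(add_task(cells, {"id": 3, "deadline": 8, "exec_time": 5}, current_time=0))
# [1]: after inserting task 3, task 1 would finish at 12 > its deadline of 10
```

In the hardware version, each of the four steps runs in constant time regardless of the list length, which is what yields the O(1) complexity.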
Task suspension/continuation

When a task is suspended or continued after suspension, the state of the task in the taskinfo array must be properly updated. The associated algorithm must find the proper position in the array by comparing the given ID. Then the state of the task at this position is changed.

Step 1: Find and mark the position of the task whose status will be changed by comparing the ID of this task with the IDs in the table.

Step 2: For the marked element, change the state information.

Periodic update of the taskinfo array

Even when no task scheduling operation is in progress, the taskinfo array must be periodically updated. For example, the attribute that represents the remaining execution time of a task must be updated periodically for some tasks. Obviously, this is so for the task that is currently running. However, all suspended tasks are considered to be in the executing state as well: when the execution time of a task is estimated, all periods during which the task would be explicitly suspended must also be included. An alarm should be raised if the value representing the remaining execution time of a task becomes negative. This situation denotes that the predetermined execution time of the task was incorrect.

The notion of suspended tasks also complicates the process of determining the next task to be run. In the simple case, where only one task is considered to be running, the first element of the taskinfo array always identifies this task. On the other hand, if for any reason the first task in the table becomes suspended, the next one should be run, and so forth. For this, a separate process should observe the states of the tasks. The first unsuspended task in the list is marked by this process as the one running on the CPU.

3. IMPLEMENTATION OF THE OS COPROCESSOR

In this section, some notes about the implementation of the EDF scheduling engine are presented. This engine would be a part of a much more universal OS co-processor.
The building of such a co-processor was started using a Field-Programmable Gate Array (FPGA). An FPGA device consists of a set of programmable logic components or logic blocks, which can be configured after manufacturing. The usual number of blocks on an FPGA device is several thousand. Each logic block is capable of performing a simple digital function and/or can be used as a memory element. By changing the settings, different digital devices may be implemented by combining the blocks. Modern FPGA devices may also contain dedicated digital circuits such as memory blocks, clock generators, etc. An FPGA configuration can be specified using a hardware description language (HDL) or a digital schematic. This allows full flexibility at design time. However, for the production phase, an ASIC or custom-built chip would be more cost effective.

3.1 Interface between CPU and OS coprocessor

The main processor communicates with the OS co-processor through a set of registers. A command register issues a specific OS command, and another set of registers is used to pass the parameters. In addition, the results of the operations and the current OS status are presented through yet another set of registers. This is portrayed in Figure 1.

Cmd | TaskID | Param1 | Param2 | CurTaskID | CurTime

Fig. 1. Register set of the OS co-processor

First, the task ID and the parameters are written to the registers. Then the code of the operation is written into the command register and the operation is performed. Not all of the registers are used by each operation.

Several microprocessors allow the extension of the basic instruction set by means of machine instructions with a designated bit pattern. For example, a special prefix is used by the x86 microprocessor architecture to execute floating-point arithmetic instructions. Other microprocessors may use so-called trap instructions or software interrupts. In most operating systems, such instructions are the primary mechanism for executing OS calls.
For tasking operations, the parameters would be the task deadline and its execution time. The CurTaskID register is a read-only register that holds the ID of the task that must be executed as the result of the last operation. Some of the mechanisms used by the OS coprocessor may change this register asynchronously. Because of that, an additional interrupt engine may be implemented to notify the main processor when a context switch must be performed. The basic set of commands for tasking operations would be:

Initialize – prepares the internal data structures of the OS coprocessor. In the case of task scheduling operations, it clears the taskinfo table.

AddTask – adds a new task to the taskinfo table. The parameters are the task's ID, deadline and execution time. The operation updates the taskinfo table and evaluates the next task to be executed. If the new task has the shortest deadline, a context switch is initiated. If any of the tasks would violate its deadline, an alarm is raised.

Suspend – suspends the execution of the task with the given task ID. If the task currently being executed is suspended, the next one to run is determined and a context switch is signalled.

Continue – resumes the execution of a suspended task. Again, it is possible that this task will have the shortest deadline and a context switch will be initiated.
RemoveTask – removes a task from the task table and updates other attributes in the table accordingly. If the currently running task is removed from the task table, the next task to run is determined.

Tick – periodically updates the execution times of the tasks in the task table. For the task that is running and for all suspended tasks, the remaining execution time is decreased. As described before, an alarm is raised if the remaining execution time falls below zero. The execution of this operation may be implemented independently of the main processor by means of a counter. The period of this counter is determined during the initialization of the co-processor.

3.2 Implementation of the taskinfo table

The taskinfo table should be viewed as a set of independent cells or components with the same functionality. We named this component the EDF cell. Multiple cells are linked together to implement the task list, as illustrated in Figure 2. The parameters of the current OS instruction (such as the ID of the task, its deadline and its remaining execution time) are put on a common bus. Then, a series of control signals (not shown in the picture) is generated to execute the different steps of the specific instruction. Apart from the parameters, each cell has two sets of inputs and outputs that are used during the shift phase of some OS instructions. The data from a single cell may be shifted into the next or into the previous cell. Several logic signals are also used to synchronize the shift operations (ShiftMark) or to signal a deadline violation or some other error (ErrMark). On one side of the first cell in the list, some predetermined values are used as inputs. E.g., to determine the cumulative finish time of the first cell in absolute form, the value of the current time must be added. Therefore, one of its inputs is connected to the CurTime register.
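The write sequence on the register interface (parameters first, then the command) can be mocked in a few lines. Note that the command encoding and the dict standing in for the memory-mapped registers are invented here purely for illustration; the paper does not define them:

```python
# Mock of the command sequence on the register set of Fig. 1. The dict stands
# in for memory-mapped registers; the command codes below are hypothetical.

CMD_INITIALIZE, CMD_ADDTASK, CMD_SUSPEND, CMD_CONTINUE, CMD_REMOVETASK, CMD_TICK = range(6)

def issue_add_task(regs, task_id, deadline, exec_time):
    """Parameters are written first; writing Cmd last triggers the operation."""
    regs["TaskID"] = task_id
    regs["Param1"] = deadline    # assumed mapping of parameters to registers
    regs["Param2"] = exec_time
    regs["Cmd"] = CMD_ADDTASK    # in hardware, this write starts the instruction
    return regs

regs = {"Cmd": 0, "TaskID": 0, "Param1": 0, "Param2": 0, "CurTaskID": 0, "CurTime": 0}
issue_add_task(regs, task_id=5, deadline=100, exec_time=20)
print(regs["Cmd"], regs["TaskID"])  # 1 5
```

Writing the command register last matters: the co-processor latches all parameters atomically at the moment the command arrives.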
3.3 Implementation of the EDF cell

Each EDF cell contains a set of registers that hold the different attributes of a task: ID, deadline, remaining execution time, cumulative finish time and task state. These registers are connected to the inputs through multiplexers, allowing values to be loaded either from the common bus or from the adjacent cells. In addition, each cell contains the digital logic responsible for executing the different OS operations. These components can be logically divided into several parts. The purpose of this division is to allow several parts to operate in parallel during the execution of an OS instruction.

The first part tracks the ID of the task that occupies the particular place in the list. This part also contains a comparator that compares the task ID with one of the parameters on the common bus. This is required during some operations to identify a specific cell by its ID. If required, the ID of the task can be put into the CurTaskID result register. The current state of the task is also held here.

The second part of the EDF cell is responsible for the deadline comparison and for marking the cell for the shift operation when a new task becomes active. For this, another comparator is required that compares the stored deadline with one of the parameters.

The third part tracks the remaining execution time of a task and signals an error if it becomes negative. The register that contains this value can either be set to some input value or decremented by one. The two's-complement binary representation of minus one consists of all ones. Therefore, instead of a full comparator, a much simpler AND function can be used to detect a wrong execution-time estimation.

The fourth part of the cell calculates the cumulative finish time and marks deadline violations. For this, an arithmetic unit is required that can add one of the inputs to, or subtract it from, the existing value. For deadline violation detection, yet another comparator is required.
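The simplification used in the third part — detecting counter underflow without a full comparator — relies on −1 being the all-ones pattern in two's complement, so an AND over the register bits suffices. A small sketch (the 16-bit register width is an assumption for illustration):

```python
# In two's complement, -1 is the all-ones pattern, so underflow of the
# remaining-time counter is detected by ANDing its bits (here modelled as a
# mask compare on an assumed 16-bit register) instead of a full comparator.

WIDTH = 16
MASK = (1 << WIDTH) - 1  # all ones: the bit pattern of -1

def expired(counter):
    """True when the register holds all ones, i.e. the counter reached -1."""
    return (counter & MASK) == MASK

val = 1
val = (val - 1) & MASK   # decrements to 0: execution time just ran out
print(expired(val))      # False
val = (val - 1) & MASK   # wraps to 0xFFFF, i.e. -1: the estimate was wrong
print(expired(val))      # True
```

In hardware this is a WIDTH-input AND gate per cell, far cheaper than a magnitude comparator.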
To illustrate the operation: when a new task becomes active, the AddTask operation is performed. First, the comparator responsible for the deadline comparison in the second part of the EDF cell is used to determine all elements with greater deadlines. In the second step, the registers of the marked cells are shifted to the right and, at the same time, the current values of the parameters are put into the first marked cell. In the third step, the cumulative remaining execution times of the marked cells are increased by the execution time currently on the common bus, using the dedicated arithmetic unit in the fourth part of the cell. In the last step, the cumulative execution times are compared (with the comparators in the fourth part) and violations are signalled. The other operations are executed, and can be explained, in a similar fashion.

Fig. 2. Hardware implementation of the ordered list of tasks (EDF cells sharing the Parameters bus and chained through the ShiftMark and ErrMark signals)
3.4 Identification of the running task

To determine which task must be executed next, separate logic is required. As stated before, the task to be executed next on the main CPU is the first task in the list that is not suspended. This can easily be implemented with a priority encoder. The status signals, which indicate whether the task within each cell is suspended, are gathered; then the index of the first cell that is not suspended is identified and produced. This index is then used to access the task ID register of the specific EDF cell. The identification of the next task to be executed runs in parallel with the other parts of the EDF cells.

3.5 Experimental results

The work presented in the paper was tested on an FPGA device. In the experiments, Xilinx Spartan2E devices were used (Xilinx (2009)). No CPU interface has actually been built yet; the experiments were performed only to prove the concept and to evaluate the capabilities of such a device.

Of all the instructions, AddTask is the most complex one. It requires four steps and complex cooperation of all parts of the EDF cell. By using both phases of the clock cycle of the FPGA device, it was possible to reduce the execution time of this instruction to two basic clock cycles. At higher frequencies, all four cycles would probably be required. Nevertheless, this is a superior result to any other implementation. The other operations execute in the same time or less.

The drawback of the proposed approach is that it requires substantial hardware resources (i.e. several comparators, arithmetic units and registers for each EDF cell). The amount is linearly dependent on the number of tasks to be evaluated. For example, in the case of 32 tasks, approximately 2000 slices are required. For comparison, an ad-hoc implementation of the original algorithm requires only approximately 140 slices, independent of the number of tasks. However, the execution time of that algorithm for 32 tasks is about 600 basic clock cycles.
Simple hardware circuits, such as FPGAs, operate best on Boolean and integer quantities; therefore, all of the task information was transformed into these two types. A brief experiment with floating-point representations of time was also performed; however, the amount of silicon consumed was prohibitive.

Some of the implemented solutions depend on signal propagation through all the cells in the array. This presents no problem at moderate clock frequencies and for a moderate number of EDF cells. However, the signal propagation time may become a problem with faster clock sources or when more than about sixty EDF cells are required. Alternative implementations of the problematic parts are being studied.

4. CONCLUSION

Nowadays, the hardware implementation of software algorithms is feasible even for low-cost embedded system solutions. Operating systems have a well-defined and relatively limited set of functionalities. Therefore, it is easy to imagine an OS-on-a-chip solution that may be used in the same way as mathematical coprocessors two decades ago, or as graphical coprocessors are used today. They may even become a part of general-purpose processors in the future. In future work, the hardware implementation of other functionalities of operating systems and of the middleware will be studied. In addition, integration with several microprocessor architectures is planned.

REFERENCES

Cooling J. (1993). Task Scheduler for Hard Real-Time Embedded Systems. Proceedings of the Int'l Workshop on Systems Engineering for Real-Time Applications, IEE, Cirencester, London, 196-201.

IFATIS (2005). IFATIS - Intelligent Fault Tolerant Control in Integrated Systems. http://www.ist-world.org/

Liu C.L. and Layland J.W. (1973). Scheduling algorithms for multiprogramming in a hard real-time environment. Journal of the ACM, 20(1), 46-61.

Xilinx (2009). http://www.xilinx.com