by ORV BALCOM

Simple Task Scheduler Prevents Priority Inversion

Here is a method of task scheduling using a single interrupt that provides a deterministic approach to program timing and I/O processing. This approach can be implemented on any microprocessor system that can generate maskable interrupts at a fixed rate and has sufficient stack space.

Nance Paternoster

Real-time system programs are required to perform a multitude of different tasks at different times. The program's task scheduler controls the time of activation of each task. Often, a general-purpose real-time operating system is used. The task scheduler of most real-time operating systems assigns priorities to these tasks, allowing higher priority tasks to preempt lower priority tasks. In most applications, data must be passed between priority levels. To ensure data integrity, a system of semaphores is used to control access to the common data. This can lead to a problem referred to as priority inversion, a condition in which a low-priority task holds up a higher priority task while the lower task is using the data.

This problem, as well as possible solutions, has been the focus of numerous recent articles in this magazine. Naomi Avigdor discusses the problem and gives various suggestions for dodging the issue. [1] For the more analytically inclined, Ray Obenza provides an overview of rate monotonic analysis, a tool for guaranteeing real-time performance. [2]

E. Douglas Jensen, the technical director of real-time engineering at Digital Equipment Corp., states that real-time generally means real-fast. [3] What is needed, he says, is to differentiate between hard and soft deadlines for real-time computations. Hard real-time conventionally, Jensen writes, is defined as deterministic in the sense that the only critical computations are those with deadlines, and scheduling objective is that these computations must always meet their deadlines, or the system has failed.
This would imply that it is acceptable for some tasks to be missed and for the user to not know where the program is at all times. I don't know about you, but I always want to know what my programs are doing. If the program's state can always be determined (the program is deterministic), there can be no priority inversion.

Do-While Jones, possibly with tongue in cheek, went to the other extreme, extolling the advantages of interrupt-free program designs. [4]

44 EMBEDDED SYSTEMS PROGRAMMING JUNE 1995
Interrupt-driven designs tend to be chaotic and non-deterministic, he writes. While I agree with his statement, I think Jones has left out a very important case between no interrupts and interrupt-driven designs: a program designed to use a single interrupt as a timing and scheduling tool. The interrupt is generated by internal (to the CPU) or external hardware at a fixed rate. Since the interrupt is not asynchronous to the program operation, the program can be deterministic and the program's state can be known at any time. The timing of the interrupt must be based on the external asynchronous requirements of the system.

Since 99.99% of real-time embedded systems, by CPU count, have only one user and are doing only one job, I contend that none of the previous complications are necessary. For all but the most complex single-user systems, you need only ensure that every higher priority task runs to completion before giving control to lower priority tasks. The program state can then be determined at any time. As Jensen pointed out, if there is not enough CPU horsepower, the lower priority tasks may get shorted. What is important is that the tasks, by priority, run to completion.

SINGLE INTERRUPT TASK SCHEDULER

Most embedded real-time microcomputer programs require the handling of asynchronous inputs and outputs. This I/O must be handled in a controlled and timely manner, or elusive program bugs can occur. Real-time programs also have tasks that are synchronous to the program operation. They must be performed at fixed intervals, synchronized to program operation, and sometimes synchronized to some external repetitive input. This article examines a method of task scheduling using a single interrupt that provides a deterministic approach to program timing and I/O processing.
It can be implemented on any microprocessor system that can generate maskable interrupts at a fixed rate and has sufficient stack space. While index registers are not needed, they simplify the programming of reentrant subroutines. The Z80 and MC68HC11 microprocessors meet these requirements.

To look at most articles and advertisements, you would think an embedded system has to have a multitasking real-time operating system. This expectation is a holdover from the days when CPUs were expensive. In those days, mainframes and minis had operating systems that allowed multiple users at the same time. Since there was only one CPU, users had to share it. Generally, these systems allocated a portion of the CPU time to each user on a rotating basis.

Instead of scheduling tasks by allocating various amounts of CPU time based on priorities, take the following approach. The program will run in a continuous timed loop, called the background loop. During each pass through the background, different tasks are performed in order, and then the program enters a tight loop until it is time to start the background again. The background tasks are the lowest priority tasks, so variations in their latency will not affect program performance. Typical background tasks service a display, archive data, or perform built-in test routines.
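The timed background loop described above can be sketched in C for a host machine. This is only an illustration under stated assumptions: every name here (tick_count, TICKS_PER_BACKGROUND, the three task functions) is hypothetical, the period of 64 interrupts is arbitrary, and the interrupt is stood in for by a plain function call so that the sketch terminates when run on a desktop compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the background restarts every 64 interrupts. */
#define TICKS_PER_BACKGROUND 64u

static volatile uint16_t tick_count;   /* bumped once per timer interrupt */
static unsigned display_runs, archive_runs, selftest_runs;

/* Stand-in for the fixed-rate interrupt; on a target this is the ISR. */
static void timer_tick(void) { tick_count++; }

/* Placeholder background tasks (hypothetical). */
static void service_display(void) { display_runs++; }
static void archive_data(void)    { archive_runs++; }
static void built_in_test(void)   { selftest_runs++; }

/* One pass of the background loop: run the tasks in order, then sit in a
   tight loop until it is time to start the background again.
   Returns the tick at which the next pass starts. */
static uint16_t background_pass(uint16_t start_tick)
{
    service_display();
    archive_data();
    built_in_test();

    /* Unsigned 16-bit subtraction keeps the comparison correct when the
       tick counter wraps past 65535. */
    while ((uint16_t)(tick_count - start_tick) < TICKS_PER_BACKGROUND) {
        timer_tick();   /* on hardware the interrupt does this for us */
    }
    return (uint16_t)(start_tick + TICKS_PER_BACKGROUND);
}
```

On a real target the body of the wait loop would be empty, since the interrupt advances the counter; the call to timer_tick() exists only to make the simulation self-driving.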
68HC11 STACK POINTER MOVEMENT

These diagrams show the 68HC11 and Z80 stack pointers and stack contents traced through the activation of the foreground routine and its subsequent interruption. Each horizontal line represents a byte. The approaches differ, since the 68HC11 automatically stacks the registers when interrupted while the Z80 has a duplicate register set that can be used during interrupt servicing.

In the following diagram for the 68HC11, PCH0, PCL0 is the return address to the interrupted background program. PCH1, PCL1 is the return address to the interrupt service routine. PCH2, PCL2 is the return address to the foreground routine after the second interrupt.

Address   First interrupt       Foreground and        Return to interrupted
(Hex)     initiated             2nd interrupt         Background

10FF      < Top of Stack
10FE
10FD      < Assume 4 bytes used by Background at time of 1st interrupt
10FC      .                                           < SP after return
10FB      PCL0 < 1st int        PCL0                  PCL0   from 1st int
10FA      PCH0                  PCH0                  PCH0
10F9      IYL                   IYL                   IYL
10F8      IYH                   IYH                   IYH
10F7      IXL                   IXL                   IXL
10F6      IXH                   IXH                   IXH
10F5      ACCA                  ACCA                  ACCA
10F4      ACCB                  ACCB                  ACCB
10F3      CCR                   CCR                   CCR
10F2      PCL1 <--JSR to-->     PCL1                  . < SP after return from Foreground
10F1      PCH1   Foreground     PCH1
10F0
10EF                            < Assume 3 bytes used by Foreground at time of 2nd interrupt
10EE                            .                     . < SP after 2nd int
10ED                            PCL2 < SP at 2nd int
10EC                            PCH2
10EB                            IYL
10EA                            IYH
10E9                            IXL
10E8                            IXH
10E7                            ACCA
10E6                            ACCB
10E5                            CCR
10E4                            . < SP during 2nd int

Usually, some repetitive tasks must be done at a rate higher than the background rate. These tasks can be synchronous or asynchronous. The synchronous tasks are called foreground tasks. The individual foreground tasks may have different activation rates, but the rates are usually multiples of each other. The foreground is often initiated by an interrupt at a fixed timed interval. Typical foreground tasks are servicing analog input or output, servicing a real-time control loop, or implementing digital filters.
The other tasks are not synchronized with the foreground and background and occur at times determined by sources outside the program. The timing of their occurrence must be considered random with respect to the program timing. Figure 1 depicts the timing of a single background loop of this type of program.

The traditional way to process asynchronous I/O is to let each service request interrupt the background and foreground process. When an asynchronous input or output request is detected, the CPU interrupts the program in process and vectors to the appropriate interrupt service routine. After servicing is complete, control returns to the interrupted program. If there are two or more asynchronous service routines, the program requires some sort of prioritization of the routines. As the number of different functions increases, the complexity of prioritizing the interrupts also increases. This becomes especially obvious while debugging the program with an in-circuit emulator. Elusive bugs caused by a second interrupt during an interrupt service routine are the bane of this approach.

There is a program design that simplifies the program operation and eliminates these problems. The design uses only one CPU-generated interrupt, so prioritizing is unnecessary. Although you might not be able to meet some requirements with this design, it is applicable to most real-time systems. First, let's look more closely at the problems associated with processing asynchronous I/O.

INTERRUPT I/O PROCESSING

In an interrupt service routine, interrupts generally remain disabled to ensure that another interrupt doesn't cause an exit from the first service routine. Consequently, the second request cannot be serviced until the first one is complete. The system requirements specify the maximum latency in servicing a given I/O request. If the length of time required for any other service routine exceeds the latency, the specification will not be met.
During the initial program design, these requirements are usually met. But later, when changes are made, it is very easy to add improvements to one service routine and affect the response time to all others. The insidious part is that since the different I/O requests are asynchronous, the problem will show up randomly.

FIGURE 1 Foreground-background program timing.

Z80 STACK POINTER MOVEMENT

In this diagram for the Z80, the PCH0, PCL0 and PCH2, PCL2 addresses have the same meaning as in the 68HC11 example. PCH3, PCL3 is the address of the foreground service routine and is pushed to the stack before the return from interrupt to activate the foreground.

Address   First interrupt         Foreground and          Return to interrupted
(Hex)     initiated               2nd interrupt           Background

FFFF      < Top of stack
FFFE
FFFD      < Assume 6 bytes used by Background at time of 1st interrupt
FFFC
FFFB
FFFA      .                                               < SP after return
FFF9      PCH0 < 1st int          PCH0                    PCH0   from 1st int to
FFF8      PCL0                    PCL0                    PCL0   Background
FFF7      PCH3 <-Foreground->     .  Foreground stack
FFF6      PCL3   address pushed   .  usage, assume
FFF5             to stack followed .  3 bytes             . < SP after return
FFF4             by a Return      PCH2 < SP at 2nd          from 2nd int
FFF3                              PCL2   interrupt
FFF2                              . < SP during 2nd interrupt

It is a simple matter to see if the multiple prioritized interrupt approach will work. The interrupt service routines must be prioritized, the maximum execution time for each routine determined, and the maximum required response latency for each routine specified. The response latency for a given routine is the maximum time required for any routine plus the sum of the time required for all routines of a higher priority. This assumes that a higher priority routine can only be activated once while waiting for a lower priority routine. If not, the time needed by the routine must be multiplied by the number of times it can be activated. As the number of different interrupt routines increases, this procedure gets very complicated. When there are more than a few routines, the lower priority routine latency lengthens.
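The worst-case check just described can be written down as a short calculation. The following C sketch is an assumption-laden illustration, not a tool from the article: the isr_t structure, its field names, and the example numbers in the usage note are all hypothetical. Routines are assumed to be listed from highest priority (index 0) to lowest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one interrupt service routine. */
typedef struct {
    unsigned exec_us;      /* maximum execution time of the routine */
    unsigned activations;  /* times it can fire while a lower routine waits */
    unsigned limit_us;     /* maximum response latency the spec allows */
} isr_t;

/* Worst-case response latency of routine i: the longest routine of any
   priority (it may already be running with interrupts disabled) plus the
   time of every higher-priority routine, each multiplied by the number
   of times it can be activated. */
static unsigned worst_case_latency_us(const isr_t *r, size_t n, size_t i)
{
    unsigned longest = 0;
    for (size_t k = 0; k < n; k++) {
        if (r[k].exec_us > longest) {
            longest = r[k].exec_us;
        }
    }
    unsigned higher = 0;
    for (size_t k = 0; k < i; k++) {
        higher += r[k].exec_us * r[k].activations;
    }
    return longest + higher;
}

/* Returns 1 if every routine meets its latency limit, 0 otherwise. */
static int all_latencies_met(const isr_t *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (worst_case_latency_us(r, n, i) > r[i].limit_us) {
            return 0;
        }
    }
    return 1;
}
```

With three made-up routines taking 100, 150, and 200 µs, the lowest priority routine sees 200 + 100 + 150 = 450 µs. Letting the first routine fire twice raises that to 550 µs, which shows how a small change to one routine moves the latency of every routine below it.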
On a worst-case basis, it often exceeds the specification. The program designer then resorts to probability theory to say that all those higher priority interrupts could never occur at once and everything will work all right. Sure, until the first demonstration or qualification testing. Also, any change in the code of any service routine requires recalculating the timing.

WITH POLLED I/O PROCESSING

There is a very simple solution for this problem that is applicable to most real-time systems. If the foreground and background processes are interrupted at a sufficiently high rate that any asynchronous I/O latency requirement is met, all I/O can be serviced by simple polling during the single interrupt. Inherent in this approach is dividing the I/O service routines into two parts: one that is required at the higher interrupt rate and the remainder that can be executed at the foreground rate. The following is a detailed explanation of the latter approach.

First, determine the shortest latency required by any function. If it is possible to operate the CPU with a fixed interrupt period of less than half this latency, the approach will probably work. For example, if the program is receiving data at 9,600 baud, it takes about 1,042 µs to receive a start bit, eight data bits, no parity, and one stop bit. If it is possible to operate the CPU with an interrupt every 500 µs, received data will be available every other interrupt on the average. Data will never be available more frequently than three interrupts in a row.

Next, the individual service routines must be split into two parts. The first part, executed during each interrupt service routine, must input or output data if available. The second part, generating output data or processing input data, will be scheduled at the foreground rate.
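The arithmetic behind the serial example can be checked with a few lines of C. This is a sketch under assumptions: the function names are hypothetical, times are in whole microseconds, and a character is taken to be ten bit times (start bit, eight data bits, one stop bit). Ten bits at 9,600 baud works out to roughly 1,042 µs.

```c
#include <assert.h>
#include <stdint.h>

/* Time to receive one character, in microseconds, rounded to nearest. */
static uint32_t char_time_us(uint32_t baud, uint32_t bits_per_char)
{
    return (bits_per_char * 1000000u + baud / 2u) / baud;
}

/* The rule of thumb from the text: the interrupt period should be less
   than half the tightest latency requirement. */
static int period_ok(uint32_t period_us, uint32_t latency_us)
{
    return 2u * period_us < latency_us;
}

/* Fewest interrupts that can fall between two consecutive characters.
   A new character can be ready no more often than once per this many
   interrupts. */
static uint32_t min_interrupts_between_chars(uint32_t period_us,
                                             uint32_t char_us)
{
    return char_us / period_us;
}
```

For example, char_time_us(9600, 10) gives 1042, period_ok(500, 1042) is true while period_ok(600, 1042) is false, and min_interrupts_between_chars(500, 1042) is 2, so a 500 µs interrupt never finds new data on two interrupts in a row.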
The program is initialized to have a single interrupt operating at a fixed rate. The program will then enter the background loop with the interrupt enabled. When the background is interrupted, the code will service all asynchronous requirements by polling. The interrupt code also can maintain any required counters since its rate is constant. At the completion of the interrupt code, the program returns to the background. After a predetermined number of interrupts, the program executes the foreground code instead of returning to the background loop. Interrupts must be enabled during foreground, so that it too can be interrupted. Without this, foreground may overrun the next interrupt and violate the I/O latency requirements. Figure 2 shows this timing relationship.

An example of this approach is a program that sends and receives serial data line by line. Data is passed between the foreground and interrupt code in buffers able to hold a line of data. During each interrupt, the output code tests if the output device is busy. If not, it will send the next character until the buffer is empty. Also, the interrupt input code checks if another input character is available. If so, it will be read and placed in local storage. When the program detects an end of line, it sets a flag to signal the foreground and moves the line from local storage to the input buffer. If moving the data takes too long, two buffers can be used and pointers to the two buffers swapped when the line is complete.

The foreground portion of the output code tests if the output buffer is empty. When the buffer is empty and there is a new line to transmit, the foreground fills the buffer and sets a flag. The interrupt code sees the flag and begins transmitting data. The foreground portion of the input code monitors the flag or pointer from the interrupt code. When a line is received, it begins processing it while the interrupt code receives the next line.
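The receive side of this split, using the two-buffer variant with swapped pointers, might look like the following C sketch. It is a host-side illustration only: the buffer size, the names (rx_poll, fg_take_line, line_ready), and the use of '\n' as the end-of-line character are assumptions, and the polled serial port is represented by function arguments rather than real hardware registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LINE_MAX 32                 /* assumed line buffer size */

static char   buf_a[LINE_MAX], buf_b[LINE_MAX];
static char  *isr_buf = buf_a;      /* being filled by the interrupt code */
static char  *fg_buf  = buf_b;      /* last complete line, for foreground */
static size_t isr_len;
static volatile bool line_ready;    /* set by interrupt, cleared by foreground */

/* Interrupt part: called on every interrupt. char_available stands in
   for polling the receiver status; c is the received character. */
static void rx_poll(bool char_available, char c)
{
    if (!char_available) {
        return;
    }
    if (c == '\n') {
        isr_buf[isr_len] = '\0';
        char *tmp = isr_buf;        /* swap pointers instead of copying */
        isr_buf = fg_buf;
        fg_buf = tmp;
        isr_len = 0;
        line_ready = true;          /* signal the foreground */
    } else if (isr_len < LINE_MAX - 1) {
        isr_buf[isr_len++] = c;     /* excess characters are dropped */
    }
}

/* Foreground part: if a line is ready, copy it out and release the flag.
   out_size must be at least 1. */
static bool fg_take_line(char *out, size_t out_size)
{
    if (!line_ready) {
        return false;
    }
    strncpy(out, fg_buf, out_size - 1);
    out[out_size - 1] = '\0';
    line_ready = false;
    return true;
}
```

The sketch assumes the foreground collects each line before the next one completes; if it does not, the second swap overwrites the unread line, so a real design would either size the foreground rate accordingly or report the overrun.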
FIGURE 2 Interrupt, foreground, background program timing.

The first part of the service routines is executed each interrupt. Since the time between interrupts is no more than half the maximum latency allowed for any service routine, data is guaranteed to be handled in a timely manner. Obviously, the total of the interrupt portions of the service routines and any interrupt housekeeping must execute much more quickly than the period between interrupts. If it doesn't, the CPU needs more horsepower.

INTERRUPT ACTIVATION

The interrupts must be activated at a fixed rate. The MC68HC11 has an internal counter TCNT, a 16-bit register that is continually incremented by the E clock. There is a prescaler that can be set to 4, 8, or 16 during program initialization. The counter cannot be reset by the program, but it can be compared to one of five 16-bit compare registers. When the counter and the compare register are equal, an interrupt will be generated if enabled. After each interrupt, the program must add the number of counts until the next interrupt to the last value in the compare register and load the sum into the register.

The Z80 lacks an internal counter. External counters can consist of discrete devices that can be read as memory, an I/O port, or the Z80's companion device, the counter/timer circuit. The counter/timer circuit is a multiple-channel counter timer that can be used to initiate interrupts at a fixed rate.

The period between interrupts does not have to be exact. Timing of the program will only be based on the cumulative number of interrupt activations. If the period between one interrupt and the next is longer than normal, it will only affect the latency in I/O servicing. So it is acceptable, and often prudent, to disable the interrupts briefly within the foreground or background code, for example while moving blocks of data acquired within the interrupt routine or foreground. Disabling interrupts is only acceptable if they are disabled for a fraction of the interrupt period.
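Returning to the compare-register update described under INTERRUPT ACTIVATION: because the next compare value is computed from the last compare value rather than from the current counter reading, a late interrupt does not accumulate drift. The C sketch below illustrates that arithmetic on a host. It is an assumption-based model, not 68HC11 code: NUM_CLOCKS mirrors the NumClocks constant of Listing 2, and compare_reached is a hypothetical helper that only works while the two values are less than 32,768 counts apart.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CLOCKS 1024u        /* counts between interrupts (assumed) */

static uint16_t last_compare;   /* last value loaded to the compare register */

/* Advance the compare value by one interrupt period. The 16-bit cast
   models the hardware register wrapping from 0xFFFF to 0x0000. */
static uint16_t next_compare(void)
{
    last_compare = (uint16_t)(last_compare + NUM_CLOCKS);
    return last_compare;
}

/* True when the free-running counter 'now' has reached or passed
   'target', allowing for wraparound of the 16-bit counter. */
static int compare_reached(uint16_t now, uint16_t target)
{
    return (uint16_t)(now - target) < 0x8000u;
}
```

Starting from 0xFE00, one update gives 0x0200, since the sum wraps through zero. After 64 updates from zero the value is back at zero, which shows that the schedule stays locked to the counter no matter how late any single interrupt was serviced.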
Generally, the time required to execute the code within the interrupt routine should be a fraction of the repetition period, say 30%. If it takes too long, there will be no time for the rest of the program. Theoretically, the code could take up to two interrupt periods without disastrous results. Good design would not ever let the execution time exceed an interrupt period.

FOREGROUND ACTIVATION

Activating the foreground routines in place of the background is not difficult but requires some planning. Remembering that the foreground will probably be interrupted, it is important that it does not share registers or RAM with the interrupt routines. The timing of the foreground activation can be scheduled by a counter in the interrupt routine. For example, the foreground can be activated every 32 interrupts. If certain foreground tasks occur less frequently than others, it is best to activate the most frequent one directly and let the foreground code schedule the slower tasks.

The foreground is activated by a call from the interrupt routine before the normal return from interrupt. The actual approach differs between the Z80 and the MC68HC11. The way the machine state is saved upon entering an interrupt also differs and must be considered when calling to foreground.

Listing 1 shows the calling procedure for the foreground routines when using a Z80. The code segment demonstrates the portion of the interrupt service routine used to call the foreground routine. In this case, the foreground is called every eight interrupts. The assumed hardware uses a counter to set a flip-flop to activate each interrupt. The flip-flop output is connected to pin 16, INT* of the Z80. When the interrupt is serviced, the flip-flop is reset by writing to a port. A flag is set while in the foreground to ensure that it is not called a second time while it is executing.

When an interrupt is serviced by the Z80, only the return address is pushed to the stack. No registers are saved. Normally, the registers (except the index registers) are swapped with the alternate registers for use during the interrupt servicing. If the index registers are to be used during interrupt servicing, they must be pushed to the stack. All registers must be restored before returning from the interrupt or calling the foreground. To properly return from an interrupt with a Z80, interrupts must be re-enabled with an EI instruction and a RETI executed (a plain RET, as in Listing 1, serves when the Z80 interrupt chain is not used). The return pops the address on the stack, which is normally the program counter location before the interrupt.
To call to the foreground routine, load the HL register with the address of the foreground entry and push it to the stack. This is done before swapping the alternate registers. When the return is executed, the program counter will be set to the foreground code location. Future interrupts will interrupt the foreground in the same manner they interrupted the background. When in the foreground, all registers must be saved by pushing them to the stack. Any future interrupts must use the alternate registers, since the content of the registers belongs to the background. To exit the foreground, restore the registers and execute a RET. The program will return to where it left off in the background.

LISTING 1 Z80 foreground activation.

; Constants:
TRUE      equ  0FFH
FALSE     equ  0
; Port locations
ClrI      equ  0AAH           ; The port to clear interrupt F-F
; Variable storage, based at RAM.
ForeFlag  equ  RAM            ; Foreground lockout flag
Real      equ  ForeFlag+1     ; 2 byte free running interrupt clock

; IntReal: This is where the program
; comes from the interrupt vector.
IntReal:  EX   AF,AF'         ; Swap the registers. Don't use <IX>, <IY>
          EXX
          LD   HL,(Real)      ; Bump interrupt clock
          INC  HL
          LD   (Real),HL
; The other interrupt routines would be
; done here.
; Call foreground every 8 interrupts.
          LD   A,(Real)       ; Low byte of the interrupt clock
          AND  7              ; Every 8 interrupts
          JP   NZ,NotInFgd
          LD   HL,ForeFlag
          LD   A,(HL)         ; Don't re-enter if locked
          CP   TRUE
          JP   Z,NotInFgd
          LD   (HL),TRUE
          LD   HL,Forgnd      ; Set new return
          PUSH HL             ; To the stack
NotInFgd: OUT  (ClrI),A       ; Reset interrupt latch
          EX   AF,AF'         ; Restore registers
          EXX
          EI
          RET
; NOTE: use RETI here if the hardware
; uses the Z80 interrupt chain
; to prioritize interrupts.

; Foreground routines.
; Called every 8 interrupts.
Forgnd:   PUSH AF
          PUSH BC
          PUSH DE
          PUSH HL
          PUSH IX             ; Only store the index regs A/R
          PUSH IY
; Do the foreground routines here.
          POP  IY
          POP  IX
          POP  HL
          POP  DE
          POP  BC
          LD   A,FALSE        ; Release foreground
          LD   (ForeFlag),A
          POP  AF
          RET
It is best to disable interrupts before starting to store or restore the registers and enable the interrupt when complete. If this is not done, an interrupt could occur and push additional data to the stack between the stored registers, which could possibly cause a problem.

Listing 2 shows the calling procedure for the foreground routines when using a 68HC11. The code segment demonstrates the portion of the interrupt service routine used to call to the foreground routine. In this case, the foreground is called every 32 interrupts. A flag is set to ensure that the foreground is not called a second time while it is executing.

LISTING 2 68HC11 foreground activation.

; Constants:
TRUE      equ  0FFH
FALSE     equ  0
NumClocks equ  1024           ; Number of E clocks between interrupts
OC1F      equ  80H            ; Clear Timer interrupt flag
; Port locations, after moving them to page 0.
TOC1      equ  16H            ; Output Compare 1 Register
TFLG1     equ  23H            ; Timer Interrupt Flag Reg. 1
COPRST    equ  3AH            ; Arm/Reset COP Timer Circuitry
; Variable storage, based at RAM.
ForeFlag  equ  RAM            ; Foreground lockout flag
Real      equ  ForeFlag+1     ; 2 byte free running interrupt clock
MastErr   equ  Real+2         ; Location tells program about errors
LastComp  equ  MastErr+1      ; 2 byte storage for last compare value

; IntReal: This is where the program comes from the interrupt vector.
IntReal:  JSR  ResetComp      ; Reset the compare register and COP
          LDD  Real           ; Interrupt counter, 2 byte
          ADDD #1
          STD  Real
; The other interrupt routines would be done here.
          LDAA #FALSE
          CMPA ForeFlag       ; See if flag is False
          BEQ  NotInFgd       ; Skip if not in foreground
; Check if program is still in foreground when it is time to enter it again. If so, it
; constitutes an error and can't be allowed to happen.
          LDAA Real+1         ; Get low byte
          ANDA #1FH           ; Every 32 times
          BNE  ExitInt        ; Still in foreground if not 0
          LDAA #TRUE
          STAA MastErr        ; Show errors if here, overrun
          BRA  ExitInt
NotInFgd: LDAA Real+1         ; Get low byte
          ANDA #1FH           ; Every 32 times
          BNE  ExitInt        ; Not time for foreground yet
          LDAA #TRUE
          STAA ForeFlag       ; Going to do it
; Foreground is called here. There should
; be nothing of value in the registers.
          JSR  DoFgd          ; Off to do the foreground, stack return
          LDAA #FALSE         ; Return here after foreground
          STAA ForeFlag       ; Foreground complete
ExitInt:  RTI                 ; Normal return, cleared I bit will be restored

; ResetComp: Reset the compare register and COP. Reset the interrupt flag bit.
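The scheduling logic of Listing 2 (interrupt counter, lockout flag, and overrun check) can be modeled on a host in C to see how the pieces interact. This is a simulation written for illustration, not a translation meant for a target: the variable names echo the listing's Real, ForeFlag, and MastErr, while nested_ticks is a hypothetical knob that sets how many interrupts arrive while one foreground pass is running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FG_MASK 0x1Fu            /* foreground every 32 interrupts */

static uint16_t real_clock;      /* free-running interrupt count (Real) */
static bool     fore_flag;       /* foreground lockout flag (ForeFlag) */
static bool     mast_err;        /* overrun indicator (MastErr) */
static unsigned fg_runs;         /* number of foreground passes started */
static unsigned nested_ticks;    /* simulated interrupts per foreground pass */

static void interrupt_tick(void);

/* Foreground: runs with interrupts enabled, so further interrupts are
   simulated here as direct calls while the lockout flag is set. */
static void do_foreground(void)
{
    fg_runs++;
    unsigned n = nested_ticks;
    for (unsigned i = 0; i < n; i++) {
        interrupt_tick();
    }
}

/* One fixed-rate interrupt. */
static void interrupt_tick(void)
{
    real_clock++;
    /* Polled I/O servicing would be done here. */
    if (fore_flag) {
        /* Already in the foreground: flag an overrun if it is due again. */
        if ((real_clock & FG_MASK) == 0) {
            mast_err = true;
        }
        return;
    }
    if ((real_clock & FG_MASK) != 0) {
        return;                  /* not time for the foreground yet */
    }
    fore_flag = true;
    do_foreground();
    fore_flag = false;           /* foreground complete */
}
```

If a foreground pass lasts 10 interrupt periods, it finishes well before the next activation and no error is recorded. If it lasts 40 periods, the interrupt that falls 32 periods after its start finds the flag still set and records the overrun, which is the condition the listing reports through MastErr.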
When an MC68HC11 interrupt is serviced, the return address is pushed to the stack followed by the index registers, the accumulators, and the condition code register. Execution of an RTI instruction to return from the interrupt routine will restore the registers, including the condition code register that contains the interrupt enable I bit. The program counter is restored to its position when interrupted.

To call to the foreground routines, the I bit of the condition code register must be cleared to enable interrupts and a JSR to the foreground executed before the RTI of the interrupt routine. Since interrupts are again enabled, the foreground may be interrupted the same as the background. The background's version of the registers is still on the stack, so the foreground need not save the current state of the registers. To exit the foreground, all that is required is a normal RTS to return from a subroutine. The program will return to the interrupt routine (with interrupts enabled) and then to the background via an RTI, restoring the registers.

LISTING 2 continued

ResetComp: LDD  LastComp      ; Reset Timer #1, the interrupt timer
           ADDD #NumClocks
           STD  LastComp
           STD  TOC1          ; Load the new value to the counter, 2 byte
           LDAA #55H          ; Reset the COP
           STAA COPRST
           LDAA #0AAH
           STAA COPRST
           LDAA #OC1F         ; Clear interrupt flag
           STAA TFLG1
           RTS

; DoFgd: Do the foreground routines. The program should be able to use all registers.
DoFgd:     CLI                ; Clear I bit to enable interrupts
; Do the foreground routines here.
           RTS                ; Return, end of foreground

ADDITIONAL CONSIDERATIONS

Subroutines that may be called from both the interrupt routine and the foreground or background must be reentrant. This means that they may be interrupted at any time, reentered and executed to completion before finishing their original activation. The implication is that the routines must not use fixed RAM variable locations. They must either use registers, locations on the stack, or locations referenced to the index registers for variables.

If the CPU has index registers, the best solution for using common subroutines is to allocate separate blocks of RAM for use by the interrupt routine, the foreground, and the background. The index registers can then be used as pointers into these blocks for common subroutine variable storage. The Z80 has three 16-bit registers and an 8-bit accumulator, which will suffice for most variables. The MC68HC11 has only a single 16-bit accumulator, so it usually requires the use of index registers for dynamic variable storage.

SATISFACTORY RESULTS

We've examined the requirements of program task scheduling and asynchronous I/O handling in a real-time system. We've taken a look at a traditional method of prioritized interrupts and a simplified method using a single interrupt. In addition, we've outlined approaches for implementing the single interrupt method for Z80 and MC68HC11 microprocessors.

The single interrupt method will usually give satisfactory results. The key to this approach is that a higher priority task always runs to completion before turning control over to a lower priority task. This approach is deterministic. It can be analyzed for correctness, it will never invert priorities, and it is straightforward to debug. When its performance will satisfy the system requirements, the single interrupt method is the approach of choice.

Orv Balcom has worked in the electronics industry for over 34 years, designing and programming microprocessor applications. In 1971, he founded Brown Dog Engineering, which provides custom engineering in areas of instrumentation and control systems and emphasizes the use of embedded microprocessor solutions. Balcom received his BS in mathematics from Long Beach State College.
He can be reached at (310) 326-8482, or by mail at Brown Dog Engineering, Box 427, Lomita, CA 90717.

RESOURCES

1. Avigdor, Naomi, "Handling Inverted Priorities," Embedded Systems Programming, March 1994, p. 44.
2. Obenza, Ray, "Guaranteeing Real-Time Performance Using RMA," Embedded Systems Programming, May 1994, p. 26.
3. Jensen, E. Douglas, "Eliminating the Hard/Soft Real-Time Dichotomy," Embedded Systems Programming, Oct. 1994, p. 28.
4. Jones, Do-While, "Interrupt Free Design," The Computer Applications Journal, Feb. 1994, p. 36.