Part of: Multiprocessor Systems-on-Chips. Edited by: Ahmed Amine Jerraya and Wayne Wolf. Morgan Kaufmann Publishers, 2005.
Modeling Shared Resources

Context switching implies overhead. On a processing element, pipeline states and register contents must be saved and restored, and caches must be (partially) flushed. On a communication element, the context switch includes all bus arbitration overhead. Context switch overhead in memory access is usually negligible, since modern DRAM types operate using fixed transactions, independent of the process issuing a memory access (except when processes reprogram the DRAM interface), and all other memory types have no context-defining internal states. Context switching time is mostly constant and can, therefore, usually be determined at design time.

Scheduling effects, however, are highly execution time dependent. In this section, we will look at both process scheduling and communication scheduling. There are three main classes of scheduling strategies:

- Static execution order scheduling
- Time-driven scheduling, with subclasses of
    - fixed time slot assignment
    - dynamic time slot assignment
- Priority-driven scheduling, with subclasses of
    - static priority assignment
    - dynamic priority assignment

The efficiency of these models depends heavily on the activation model, as we will see in the following.

2 Static Execution Order Scheduling

Figure 1a gives an example of two processors, CP1 and CP2, communicating over a shared communication element CE1. CP1 runs processes P1, P2, and P5; CP2 runs P3 and P4. Figure 1b shows a Gantt chart of the scheduling sequence. CP1 and CP2 start processing P1 and P3, respectively. When P1 has finished, it sends data to the dependent process P4, followed by a context switch of CP1 to P2. P4 sends data back to CP1, which enables P5 to execute. Now, all processes have been executed and the scheduling sequence can repeat. As seen in the figure, static execution order scheduling is applicable to both processing element and communication scheduling.
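As a minimal sketch of the sequence just described, the following Python fragment derives start and finish times from the fixed execution order on each resource and the data dependencies P1, C1, P4, C2, P5. The execution and transfer times are hypothetical (the text gives none), context switching time is ignored, and the names `static_order_schedule`, `ORDER`, `DURATION`, and `DEPENDS` are illustrative only.

```python
# Hypothetical durations in ms; the example in the text states no numbers.
DURATION = {"P1": 4, "P2": 3, "P5": 2, "P3": 5, "P4": 3, "C1": 1, "C2": 1}

# Fixed (static) execution order per resource, as in the described example.
ORDER = {"CP1": ["P1", "P2", "P5"], "CE1": ["C1", "C2"], "CP2": ["P3", "P4"]}

# Data dependencies: P1 sends C1 to P4, P4 sends C2 back, enabling P5.
DEPENDS = {"C1": ["P1"], "P4": ["C1"], "C2": ["P4"], "P5": ["C2"]}


def static_order_schedule(order, duration, depends):
    """Return (start, finish) dicts for one scheduling period."""
    start, finish = {}, {}
    free_at = {res: 0 for res in order}    # time at which each resource is free
    position = {res: 0 for res in order}   # index of the next entry per resource
    total = sum(len(seq) for seq in order.values())
    while len(finish) < total:
        progressed = False
        for res, seq in order.items():
            if position[res] == len(seq):
                continue                   # this resource has finished its sequence
            op = seq[position[res]]
            deps = depends.get(op, [])
            if all(d in finish for d in deps):
                begin = max([free_at[res]] + [finish[d] for d in deps])
                start[op] = begin
                finish[op] = begin + duration[op]
                free_at[res] = finish[op]
                position[res] += 1
                progressed = True
        if not progressed:
            raise ValueError("static order conflicts with data dependencies")
    return start, finish


if __name__ == "__main__":
    start, finish = static_order_schedule(ORDER, DURATION, DEPENDS)
    for op in sorted(start, key=start.get):
        print(op, start[op], finish[op])
    print("scheduling period:", max(finish.values()))
```

Because the order on every resource is fixed in advance, the scheduler itself reduces to stepping through the per-resource lists, which is the state-machine view of static execution order scheduling.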
FIGURE 1. Static execution order scheduling. (a) Example architecture. (b) Schedule. (t_p: scheduling period; t_csw: context switching time.)

Static process execution has a number of important advantages. It supports interleaved utilization of processing and communication elements, minimizing idle times, since there is full control over the execution order. For the same reason, buffer sizes can be efficiently optimized. Sequences of processes on one processing element can be clustered into one process and then compiled. The compiler will implement an optimized context switch and might be able to find more optimizations across processes. The scheduler is easily implemented as a state machine.

3 Time-Driven Scheduling

Time-driven scheduling is a very flexible scheduling strategy. It assigns time slices to processes or communication links independent of activation, execution times, or data dependencies.
TDMA

The time division multiple access (TDMA) strategy keeps a fixed assignment of time slices to processes or communication links. This assignment is periodically repeated. Figure 2 shows an example. Processes P1, P2, P3, and P4 are assigned 12, 10, 5, and 13 ms, respectively. This results in a total period t_pTDMA of 40 ms. The total execution time of P1 is 45 ms, such that it ends at t_r = 129 ms; t_r is the response time of P1. After that time, the P1 slot remains idle until P1 is activated again. For simplicity, we have omitted the context switching times in the figure. P2 has an execution time of 23 ms and a response time of t_r = 95 ms. It is again activated at t = 150 ms and continues execution at t = 172 ms. P3, with an execution time of 54 ms, has a response time of t_r = 426 ms.

The greatest advantages of TDMA are predictability and simplicity. Processes or communications with arbitrary behavior and activation can be merged on one resource without influencing each other. In effect, the available performance is scaled down according to Equation 1. (For readability, we use this simple upper bound.) This is an excellent property for integration that has been exploited in many integration applications, such as in automotive design (TTP bus) or by the on-chip MicroNetwork offered by Sonics as communication IP. The main limitations are efficiency and long total response times. There is some flexibility, since the time slots can be adapted at system startup time.

FIGURE 2. Scheduling and idle times in TDMA. (Context switching times omitted for simplicity.)
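The response times quoted in the TDMA example can be checked with a few lines of Python. The sketch below assumes, as the example does, that context switching times are omitted. It further assumes that the slots are arranged in the order P1, P2, P3, P4 starting at t = 0 and that the first activations occur at t = 0. The function names `tdma_run` and `respond` and their parameters are illustrative only.

```python
def tdma_run(offset, slot, period, exec_time, activation=0):
    """Return (first_start, finish) of one activation under TDMA.

    offset     start of the process's slot within the TDMA period
    slot       length of the slot
    period     length of the TDMA period
    exec_time  total execution time of this activation
    activation time at which the process is activated
    """
    t, remaining, first_start = activation, exec_time, None
    while remaining > 0:
        phase = (t - offset) % period          # position relative to own slot start
        if phase < slot:                       # inside own slot: execute
            if first_start is None:
                first_start = t
            run = min(slot - phase, remaining)
            t += run
            remaining -= run
        else:                                  # outside own slot: wait for next one
            t += period - phase
    return first_start, t


# Slot assignment from the example (ms); slot order is an assumption.
SLOTS = [("P1", 12), ("P2", 10), ("P3", 5), ("P4", 13)]
PERIOD = sum(length for _, length in SLOTS)    # 40 ms
SLOT_LEN = dict(SLOTS)
OFFSETS = {}
_acc = 0
for _name, _length in SLOTS:
    OFFSETS[_name] = _acc
    _acc += _length


def respond(name, exec_time, activation=0):
    """Convenience wrapper using the slot table above."""
    return tdma_run(OFFSETS[name], SLOT_LEN[name], PERIOD, exec_time, activation)


if __name__ == "__main__":
    print("P1:", respond("P1", 45))
    print("P2:", respond("P2", 23))
    print("P3:", respond("P3", 54))
    print("P2 (activated at 150):", respond("P2", 23, activation=150))
```

Under these assumptions the sketch yields finish times of 129 ms, 95 ms, and 426 ms for P1, P2, and P3, and a restart of P2 at t = 172 ms after its activation at t = 150 ms, in agreement with the example.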
$$ t_{pe,TDMA}(P_i, pe) = \left\lceil \frac{t_{pe}(P_i, pe)}{t_{P_i} - t_{csw}} \right\rceil \cdot t_{pTDMA} \qquad (1) $$

Round Robin

Round-robin scheduling departs from the fixed time slot assignment and terminates a slot if the corresponding process ends. Therefore, slots are omitted or shortened, and the cycle time, t_RR, of the round-robin schedule is time-variant. Figure 3 shows the example of Figure 2, this time for a round-robin strategy. P1 now ends at t_r = 113 ms, but, more impressively, P3 has a response time of t_r = 179 ms. Round robin avoids the idle times of TDMA and reaches maximum resource utilization. On the other hand, process execution is no longer independent, losing the most important integration property of TDMA. P3 only finished so quickly because the other processes were not executing. However, round robin guarantees a minimum resource assignment per process, since under full load conditions it falls back to a TDMA schedule. This is suitable for applications with soft deadlines and quality of service requirements in which a given resource level must be guaranteed.

Again, round robin is applicable to communication and processing. It is found in many applications, such as in the Sonics MicroNetwork for on-chip interconnect or in standard operating systems.

FIGURE 3. Round-robin scheduling. (Cycles 1 to 3 with cycle times t_RR(1), t_RR(2), and t_RR(3).)
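The effect of shortened and omitted slots can be sketched in Python as follows. The slot lengths and the execution times of P1, P2, and P3 are taken from the example; the execution time of P4 is not stated in the text, so the value used here is hypothetical. The sketch also assumes a single activation of every process at t = 0 and no context switching time, so it is not meant to reproduce the activation pattern of Figure 3. The function name `round_robin` is illustrative.

```python
def round_robin(slots, exec_times):
    """Simulate round robin for one activation of each process at t = 0.

    slots       list of (name, maximum slot length), visited cyclically
    exec_times  dict name -> total execution time
    Returns a dict name -> finish time.
    """
    remaining = dict(exec_times)
    finish = {}
    t = 0
    while any(r > 0 for r in remaining.values()):
        for name, max_slot in slots:
            if remaining[name] <= 0:
                continue                       # slot omitted: process has ended
            run = min(max_slot, remaining[name])   # slot shortened if process ends
            t += run
            remaining[name] -= run
            if remaining[name] == 0:
                finish[name] = t
    return finish


SLOTS = [("P1", 12), ("P2", 10), ("P3", 5), ("P4", 13)]
# P4's execution time is a hypothetical value chosen for illustration.
EXEC = {"P1": 45, "P2": 23, "P3": 54, "P4": 30}

if __name__ == "__main__":
    for name, t_finish in sorted(round_robin(SLOTS, EXEC).items()):
        print(name, t_finish)
```

In this setting P3 finishes far earlier than the 426 ms of the TDMA example, because it inherits the resource as soon as the other processes have ended. This is the dependence between processes that the text identifies as the price of the higher utilization.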
4 Priority-Driven Scheduling

Static Priority Assignment

The third class of scheduling strategies uses process or communication priorities. Static priority assignment allows one to offload the scheduling problem to a simple interrupt unit. Vectorized interrupt units reducing interrupt latency and context switching overhead are found even in small 8-bit microcontrollers. A scheduler process or control unit is not needed. Finally, efficient analysis and optimization algorithms are available.

We discuss three different static priority assignment strategies that differ in their activation model. Again, we will look at processing elements, but the same discussion applies to communication.

Model 1. Processes are activated by the arrival of an input event. Input events are periodic with jitter. The process deadline is at the end of the period. Therefore, process execution must be periodic with jitter, as well. The input events and, hence, the process executions may have different periods. This classical and widely used model was first investigated by Liu and Layland. They proved that the optimal solution for single processors is to order the process priorities according to increasing process execution rates, i.e., the process with the shortest period is assigned the highest priority. This rate monotonic scheduling (RMS) is very popular in embedded system design due to its simplicity and ease of analysis. It has, e.g., been extended to cover synchronization for mutually exclusive resource access and multiprocessing. Deadline monotonic scheduling (DMS) is a straightforward extension of RMS for deadlines smaller than a period. A nice property of RMS and DMS in the context of more complex systems is that the processes finish periodically with jitter. This property allows control of buffer sizes and of the load on other system parts that use the output events of these processes as input.

Model 2. Like model 1, except that a subset of the processes is dependent such that a process P2 that depends on P1 can only be executed if process P1 has finished in this period.
Obviously, P1 and P2 must have the same period. This relation can be represented in a task graph. This model has been investigated by Yen and Wolf. The dependencies constrain the possible event jitter. They also imply a static process order for each period. Both can be exploited for better utilization of processors and communication links, for single processors as well as multiprocessors. In effect, the schedule within a period appears like a loosely coupled version of static execution order scheduling.

Model 3. Like model 1, but with arbitrary deadlines.
This seemingly small change has a major impact on optimization, analysis, and system load. On the other hand, this model is frequently found in more complex systems, in which a deadline covers multiple components and subsystems.

FIGURE 4. Static priority scheduling with arbitrary deadlines. (Processes P1, P2, and P3 ordered by priority; periods T1 and T2; busy period of P3.)

Take the example in Figure 4. P1 with execution period T1 has the shortest deadline and is assigned the highest priority. P2 with execution period T2 is next in priority and P3 is last. The first input events are assumed to be available at t = 0, a known worst-case situation. Execution starts with the highest priority task P1. When it has terminated, P2 is executed. Before P2 has even started, the next input event for P2 has arrived at t = T2. As soon as P2 ends, it is started again to process the buffered second input event. The third input event arrives at t = 2T2, before the second P2 execution has ended. The third execution of P2 is interrupted because the second event for P1 has arrived at t = T1. P2 can only resume once this second execution of P1 has ended. P3 can only be processed shortly before t = 2T1.

The execution sequence is rather complex. With its input buffers filled, P2 execution runs in a burst mode with an execution frequency that is only limited by the available processing element performance. This burst execution will lead to a transient P2 output burst that is modulated by P1 execution. A larger system with more priority levels generates complicated event burst sequences. Using another processing element (or a faster bus if we apply this scheduling strategy to communication) will increase the P2 burst frequency and, consequently, the transient load caused by P2 output.
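The burst behavior described above can be reproduced with a small time-stepped simulation of preemptive static priority scheduling with buffered input events. This is only a sketch: the periods and execution times are hypothetical and do not correspond to the values underlying Figure 4, time advances in integer steps, context switching is ignored, and the function name `simulate_static_priority` is illustrative.

```python
def simulate_static_priority(tasks, horizon):
    """Preemptive static priority scheduling with buffered activations.

    tasks    list of (name, period, exec_time), highest priority first;
             input events arrive at t = 0, T, 2T, ... and are buffered
    horizon  number of unit time steps to simulate
    Returns a dict name -> list of completion times.
    """
    pending = {name: [] for name, _, _ in tasks}   # remaining work per buffered event
    done = {name: [] for name, _, _ in tasks}
    for t in range(horizon):
        for name, period, exec_time in tasks:
            if t % period == 0:
                pending[name].append(exec_time)    # new input event is buffered
        for name, _, _ in tasks:                   # run highest priority pending task
            if pending[name]:
                pending[name][0] -= 1
                if pending[name][0] == 0:
                    pending[name].pop(0)
                    done[name].append(t + 1)
                break
    return done


# Hypothetical parameters: (name, period, execution time), priority order.
TASKS = [("P1", 20, 6), ("P2", 5, 3), ("P3", 40, 2)]

if __name__ == "__main__":
    completions = simulate_static_priority(TASKS, horizon=40)
    for name, times in completions.items():
        print(name, times)
```

With these hypothetical values, P2 completes at t = 9, 12, 15, and 18 and then at t = 29, 32, 35, and 38. Within a burst the completions are spaced by the execution time (3) rather than by the period (5), and the longer gap in between includes the second execution of P1.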
On the other hand, we can see in the example that the P2 execution frequency can be easily bounded over a larger time interval. This observation justifies the introduction of event burst models and corresponding analysis approaches.

There are many solutions to the analysis and optimization of model 3 systems. Lehoczky provided a solution to timing analysis for arbitrary deadlines; Audsley et al. proposed an iterative heuristic algorithm for priority assignment. Finally, Tindell and Clark presented an algorithm that extends model 3 to handle periodic input events with jitter and bursts, effectively introducing input-output event model compatibility and, thus, interoperability for static priority systems with arbitrary deadlines. However, this approach applies to single processing element systems only.

Dynamic Priority Assignment

Static priority algorithms cannot reach maximum resource utilization, not even for the simple model 1. To reach higher resource utilization (or shorter deadlines), the priority must be assigned dynamically at run time. The best dynamic priority assignment strategy is the one that gives the process with the earliest deadline the highest priority. The advantage of this earliest deadline first (EDF) scheduling is its flexible response to input event timing and process execution times. However, it depends on the availability of hard deadlines for each process. In SoCs, these deadlines must be derived from global system requirements. If the process execution times are known, then such deadlines can be derived from task graphs, as already shown by Blazewicz and later extended to multirate periodic process systems with circular dependencies by Ziegenbein.

There is a host of work in the domain of dynamic priority assignment and the respective timing analysis for different input models, which cannot be discussed here. In any case, dynamic priority assignment requires a scheduler process running the assignment strategy and thereby observing the (local) system state. Therefore, in SoCs, it is practically restricted to run on microcontrollers.
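The utilization gap between static and dynamic priority assignment can be illustrated with the classical single-processor results of Liu and Layland. The sketch below assumes independent periodic processes without jitter, deadlines equal to periods, preemption, and no context switching or scheduling overhead. Under these assumptions, RMS is guaranteed to meet all deadlines if the utilization does not exceed n(2^(1/n) - 1), while EDF meets all deadlines whenever the utilization does not exceed 1. The process set and the function names are hypothetical.

```python
import math


def utilization(tasks):
    """tasks: list of (exec_time, period). Returns total processor utilization."""
    return sum(c / t for c, t in tasks)


def rms_bound(n):
    """Liu and Layland utilization bound for n processes (sufficient test)."""
    return n * (2 ** (1 / n) - 1)


def rms_sufficient(tasks):
    return utilization(tasks) <= rms_bound(len(tasks))


def edf_schedulable(tasks):
    """EDF test for deadlines equal to periods."""
    return utilization(tasks) <= 1.0


def rm_response_time(tasks, i):
    """Worst-case response time iteration for process i under rate monotonic
    priorities (tasks sorted by increasing period). The iteration stops early
    and returns a value larger than the period if the deadline is missed."""
    c_i, t_i = tasks[i]
    w = c_i
    while True:
        w_next = c_i + sum(math.ceil(w / t) * c for c, t in tasks[:i])
        if w_next == w or w_next > t_i:
            return w_next
        w = w_next


# Hypothetical process set: (execution time, period), sorted by period.
TASKS = [(2, 5), (4, 7)]

if __name__ == "__main__":
    print("utilization:", round(utilization(TASKS), 3))
    print("RMS bound:", round(rms_bound(len(TASKS)), 3))
    print("RMS sufficient test passed:", rms_sufficient(TASKS))
    print("response time iterate of second process:", rm_response_time(TASKS, 1))
    print("EDF schedulable:", edf_schedulable(TASKS))
```

For this process set the utilization is about 0.97. The rate monotonic response time iteration for the second process exceeds its period of 7, so a deadline is missed under static priorities, whereas the EDF condition is satisfied.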
On microcontrollers, dynamic priority assignment adds scheduling overhead to the context switching overhead and eventually also increases power consumption.

5 Resource Sharing: Summary

The efficiency of resource sharing strategies is largely dependent on the activation model. Time-driven scheduling is very robust but reaches less efficiency than other strategies or is only suitable for best-effort applications. Static order scheduling reaches the highest efficiency but also imposes the tightest constraints on the input event stream and requires narrow execution time intervals to reach this efficiency. Static priority scheduling provides good adaptation to a wide range of input event model parameters but can create burst event sequences for the general case of arbitrary deadlines. Static priority scheduling is well supported by algorithms for analysis and optimization. Dynamic priority scheduling provides the highest flexibility but incurs significant scheduling overhead.

For all resource sharing strategies presented in this section, formal analysis techniques have been proposed. Many of these can even be applied manually. Only a small selection has been presented here, which is, however, sufficient background to understand the next level of global performance modeling and analysis.