An ILP Formulation for Task Mapping and Scheduling on Multi-core Architectures


Ying Yi, Wei Han, Xin Zhao, Ahmet T. Erdogan and Tughrul Arslan
University of Edinburgh, The King's Buildings, Mayfield Road, Edinburgh, EH9 3JL, UK

Abstract: Multi-core architectures are increasingly being adopted in the design of emerging complex embedded systems. Key issues in designing such systems are on-chip interconnects, memory architecture, and task mapping and scheduling. This paper presents an integer linear programming formulation for the task mapping and scheduling problem. The technique incorporates profiling-driven loop level task partitioning, task transformations, functional pipelining, and memory architecture aware data mapping to reduce system execution time. Experiments are conducted to evaluate the technique by implementing a series of DSP applications on several multi-core architectures based on dynamically reconfigurable processor cores. The results demonstrate that the proposed technique is able to generate high-quality mappings of realistic applications on the target multi-core architecture, achieving a parallel efficiency of up to 1.3 by employing only two dynamically reconfigurable processor cores.

I. INTRODUCTION

An important trend in embedded systems is the use of multi-core architectures to meet an application's functional and performance requirements. Multi-core designs offer high performance and flexibility while promising low-cost and power-efficient implementations. However, the semiconductor industry is still facing several technological challenges with multi-core systems. Important issues in multi-core designs are the communication infrastructure, memory architecture, and task mapping and scheduling. In multi-core architectures, the performance of the entire system is affected by the execution order of tasks and communications. It is well known that task mapping and task scheduling are highly inter-dependent. Therefore, the two issues need to be handled together in order to obtain an efficient mapping and schedule.
A dynamically reconfigurable (DR) processor combines the flexibility of FPGAs with the programmability found in general purpose processors (CPUs/DSPs) in a unified and easy programming environment. It is a strong candidate for multi-core systems. In our proposed embedded multi-core platform, which has several DR processors [1], the shared memory heavily affects the execution time and power consumption. The time of data transmission between different processors must be considered during scheduling so that the design result conforms to the real situation. In addition, in order to meet the system throughput constraints, the design is pipelined to construct more efficient architectures. Pipelining divides the design into concurrently executing stages, thus increasing the throughput.

In multi-core architectures all parallel tasks in an application have the potential to be executed simultaneously. However, the number of such tasks may exceed the number of available processors. Therefore, task mapping is required to assign the parallel tasks to the available processors. In the past, task merging and task replication have been proposed with the goal of re-allocating tasks when performance bottlenecks are met. Since task merging requires more local memory and task replication needs more processors to implement the same task [2], a multi-core architecture which does not feature sufficient memory and processors will severely limit the available mapping options using the existing methodology.

Application development on multi-core architectures requires the designer, or an automated tool, to divide tasks between available processors and to determine data mappings for the required memory elements. A SystemC-based simulation framework for mapping an application to a platform and evaluating its performance has been presented in [3]. The authors in [4, 5] have introduced scheduling and mapping of parallel applications onto an MPSoC platform. Mapping solutions for bus-based and NoC-based MPSoCs have been described in [6] and [7].
Some automated system-level mapping techniques for application development on network processors have also been proposed [8].

This paper addresses the problem of automated application mapping and scheduling on DR processor based multi-core architectures. An Integer Linear Program (ILP) based approach is proposed for loop level task partitioning, task mapping and pipelined scheduling while taking the communication time into account for embedded applications. The efficacy of the technique is demonstrated by a series of DSP applications. The paper is organized as follows: Section 2 introduces the target DR processor as well as the target multi-core architecture. Section 3 describes the task mapping methodology. Section 4 gives a more detailed description of the problem addressed in this paper. Section 5 describes the proposed ILP based approach to solve the problem. The experimental results are given in Section 6, followed by conclusions in Section 7.

II. TARGET MULTI-CORE ARCHITECTURE

Some applications demand a closer interconnection between the participating processors to achieve the required performance. Such communication can be realised using distributed shared register files. The target multi-core platform is designed for DSP applications, which typically have intensive computations and a stream of input data. The architecture described in a previous work [2] consists of a selectable number of DR processors, which communicate with a shared memory through a full crossbar network. This architecture has been extended and modified by incorporating the shared register file into the system memory architecture in order to support the loop level parallelism proposed in this paper.

/DATE EDAA

The target multi-core architecture is based on a recently introduced DR processor architecture [9]. The DR processor offers computation performance comparable to leading DSP processors with a significant reduction in power consumption [1]. It contains an array of instruction-set functional units connected by a programmable interconnection. The DR processor is realised using an array of Instruction-Cells (ICs) that is reconfigured every processor cycle to map data-paths consisting of both dependent and independent instructions. A salient characteristic of the DR processor is that it is fully customisable at the design stage and can be set according to application requirements. All instruction cells in the DR processor can be connected to the entire shared memory or register file through interface cells. This scheme allows all possible connections between the instruction cells and the shared memory elements, but with a reduced multi-core interconnection complexity. The existing DR processor tool-flow gives full support by providing all the required files for the ILP model, which include a machine description file, a static profiling file, and a task graph (control data dependency graph).

III. PROPOSED MAPPING FLOW

Many DSP algorithms target streaming based applications and need to operate in real time: the DSP application must read input data, process it, and write processed results out before the next input data is ready. A key concern in a DSP system is maintaining real-time execution. It is necessary to pipeline the application into concurrently executing stages to meet the throughput constraints of these systems. The implementation of DSP applications on a multi-core architecture mainly involves partitioning, mapping and scheduling the application tasks onto the processors as well as specifying data mapping and data transmission between these processors. The partitioning, mapping and scheduling of tasks are complex optimisation problems, which need to be solved simultaneously to maximise the throughput.
In addition, the data communication time between different processors must also be taken into account during task mapping and scheduling to maximise the system throughput. The proposed mapping flow allows the designer to explore the application implementations on the target architecture platform as shown in Fig. 1. This paper mainly focuses on the automatic mapping. An ILP-based approach is proposed for loop level task partitioning, task mapping, and pipelined scheduling, while taking the data communication time into account for embedded applications targeted on the multi-core platform. During task partitioning and task mapping, estimates are required of the execution time of the different tasks and of the time for transferring data between processors. It is also necessary to schedule the execution order of these pipelined tasks to improve the system performance. The execution time of a task can be obtained from the profiling file generated by the single DR simulator. The mapping flow starts from the description of an application in standard sequential C code, which is then optimised and profiled for a single DR processor implementation.

[Fig. 1: Mapping methodology]

Application developers can use the generated task mapping and scheduling information and a task-level interface (TLI) to build multi-core application code. The TLI is an application programming interface that can be used for developing parallel application programs on multi-core architectures. It provides services for inter-task communication and task allocation. It must allow parallelism and communication to be made explicit to enable mapping to multi-core architectures. For example, if a task uses an abstract interface for synchronization with other tasks, it hides the detailed implementation of the synchronization. The multi-core application code is compiled and simulated with the single DR processor. The single DR processor simulator generates an execution trace file, which is used as an input to the multi-core simulator (MRPSIM) [9].
MRPSIM is a trace-driven simulator which can correctly and efficiently simulate the run-time multi-core environment, allowing the throughput of the modelled system to be measured. The proposed mapping approach also takes into account the task graph (control data flow graph), the multi-core architecture model, and the static profiling file. The static profiling information contains the timing characteristics for each task and the access frequency for the various data items. The multi-core architecture model, also called the machine description file, consists of the set of processors and the set of memories. These are used for mapping tasks to available processors as well as mapping the various data items to the memory architecture.

Our solution consists of dividing the problem into two stages and solving each consecutively. The first stage assigns and schedules tasks to processors, assuming an idealistic memory mapping, where all data items are mapped to the fastest possible level of memory, ignoring memory capacities. The ILP formulation of this stage includes task merging, task replication, and loop level splitting and fusion. Task merging combines several tasks into a single task that performs the tasks in an efficient order. This technique reduces the number of required processors, but needs more local instruction and data memories. Task replication assigns the same task to several processors such that all instances of the task are executed in parallel. Therefore, task replication needs more processors to implement an application and also more global memory to save the shared data. A new mapping approach is needed when the workload among the processors is unbalanced and task replication cannot be used due to the limited number of available processors. The new mapping approach divides the tasks at basic block level instead of at the function level in order to explore the loop level parallelism.

In the task and communication scheduling process, in order to consider data dependency between tasks and resolve resource contention, we model the scheduling problem with data dependency constraints between tasks and constraints that represent resource contention. The task graph generated by the DR compiler provides basic block and function level control and data dependency. A task is defined as a small procedure, a function or just a basic block. A start task has no incoming arcs, and an end (leaf) task has no outgoing arcs.

An ILP model with and without functional pipelining is proposed in [10], but it can only handle pipelining in a restricted way. The restriction is that the first computation of a task in the (k+1)-th iteration is only possible if all leaf tasks are finished in the k-th iteration. This limitation will generate inefficient solutions for some task graphs, as illustrated in Fig. 2. Fig. 2(a) gives the task graph of a small example. The time interval between two successive iterations of the algorithm is called latency (LT). The overall computation time (OCT) of n frames without functional pipelining is equal to n*OET, where OET is the overall execution time for one frame. The method given in [10] provides a longer latency (LT = OET), which is shown in Fig. 2(b). Our model removes the above limitation, which results in the more efficient mapping and scheduling shown in Fig. 2(c). This approach will be illustrated in detail in the following two sections.

[Fig. 2. Advantage of functional pipelining: (a) task graph; (b) schedule on processors Pm and Pk over time, with OCT = OET + LT = 2*OET; (c) schedule on processors Pm and Pk over time, with OCT = OET + LT < 2*OET]

The second stage performs a mapping of data items to the memory architecture and explores the memory architecture to minimize memory access latencies. Each DR processor can access three types of memory: (a) the shared multi-bank register file, (b) local memory, and (c) shared memory. The local memories are private to a processor and cannot be accessed by other processors. Thus, shared data items, accessed by tasks assigned to different processors, cannot be mapped to these memories. Instead, they are mapped either to the shared multi-bank register file or to shared memory, which can be accessed by the different processors. Here, we adopt an ILP formulation similar to the one given in [8] for this stage.

IV. MAPPING HEURISTICS AND PROBLEM DEFINITION

In this section, we define the problem of task mapping and scheduling for DR processor based multi-core architectures. Given a task graph, a multi-core target architecture with its parameters, and a mapping of tasks and data on the target architecture including processors and memory, the problem is to find a mapping and scheduling of task executions and communication transactions which yields the minimum execution time of the task graph on the target architecture. To solve the problem, the target architecture, task and application definitions are presented first. Then, an ILP formulation or a heuristic algorithm is introduced to map and schedule tasks and communications.

A. Architecture Definition

A target multi-core architecture is specified by the set of processors P and the set of memory elements M. Each processor p is a 2-tuple p = (pid, pcm), where pid is the processor identifier and pcm is the instruction memory of the processor. Each memory element m is given by a 4-tuple m = (mid, mc, mt, nm), where mid is the memory element identifier, mc is the capacity of the memory, mt is the time required to access the memory, and nm is the type of the memory (shared memory, local memory, or shared register file).
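The architecture definition above can be sketched as plain C data types. This is only an illustrative sketch, not the format of the machine description file: the type and function names, the use of bytes and nanoseconds as units, and the greedy "fastest memory that fits" rule are all assumptions made here. The paper itself solves the data mapping with an ILP similar to [8]. The one rule taken directly from the text is that shared data items cannot be placed in a local (private) memory.

```c
#include <assert.h>
#include <stddef.h>

/* Memory type nm of the 4-tuple m = (mid, mc, mt, nm). */
typedef enum { MEM_SHARED, MEM_LOCAL, MEM_SHARED_REG } mem_type;

/* Processor 2-tuple p = (pid, pcm). */
typedef struct {
    int pid;        /* processor identifier */
    unsigned pcm;   /* instruction memory size (bytes, assumed unit) */
} processor;

/* Memory element 4-tuple m = (mid, mc, mt, nm). */
typedef struct {
    int mid;        /* memory element identifier */
    unsigned mc;    /* capacity (bytes, assumed unit) */
    unsigned mt;    /* access time (ns, assumed unit) */
    mem_type nm;    /* shared memory, local memory, or shared register file */
} memory_element;

/* Shared data may only live in a memory that every processor can reach,
 * so local memories are excluded; the item must also fit. */
int can_hold_shared_data(const memory_element *m, unsigned size)
{
    assert(m != NULL);
    return m->nm != MEM_LOCAL && size <= m->mc;
}

/* Hypothetical greedy helper: index of the fastest memory element that can
 * hold a shared item of the given size, or -1 if none can. */
int fastest_shared_memory(const memory_element *mems, size_t count,
                          unsigned size)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!can_hold_shared_data(&mems[i], size))
            continue;
        if (best < 0 || mems[i].mt < mems[(size_t)best].mt)
            best = (int)i;
    }
    return best;
}
```

With a 5 ns shared memory, a 4 ns local memory and a small 2 ns shared register file, a small shared item is placed in the register file, while an item larger than the register file falls back to shared memory.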
The target architecture model based on the above specification is extracted from the multi-core architecture model file.

B. Application Definition

An application is represented by the hierarchical task graph, which is an acyclic directed graph G = <V, E>, where the vertex set V is a set of tasks and the edge set E is a set of communication edges. Each small procedure or basic block is defined as a task, which is a 5-tuple t = (tid, pt, tc, td, nd), where tid is the task identifier, pt is the task execution time excluding delays caused by conflicting memory accesses, tc is the total instruction memory required by the task, td is the amount of memory required to store the local data of the task, and nd is the number of times the local data item is accessed. Each communication edge is described in the form (mid, sid, sd, nsd), where mid is the master task identifier, sid is the slave task identifier, sd is the amount of memory required to store the shared data between the two tasks, and nsd is the number of times the shared data item is accessed. The application task model is profiled to obtain timing characteristics for each task and the access frequency for the various data items. The objective is to obtain a static mapping and pipelined schedule of the task graph on the target multi-core architecture such that the throughput is maximized while satisfying performance constraints. The result of the mapping procedure is the decision of which tasks run on which processors at what time.

V. ILP FORMULATIONS

In this section, we present the ILP formulation that gives an optimal solution for the problem described in Section 4. Our solution is based on the mapping strategy given in [8] and extends the ILP model with pipelined scheduling. The ILP model supports task merging and task replication. It means that a task may be performed on several processors in order to exploit more parallelism. We assume that each processor has its own local memory and that only one task can be executed at a time by one processor. Each DR processor can execute all tasks. Tasks on different processors can be executed in parallel. The ILP formulation incorporates task merging and replication by first assigning processes into batches, which are then assigned or replicated to processors [8].

A short summary of the notation follows. n represents the number of tasks and m is the number of available DR processors.

$T$: set of tasks, $T = \{T_1, T_2, \ldots, T_n\}$
$L$: set of the end (leaf) tasks, $L \subseteq T$
$S$: set of the start tasks, $S \subseteq T$
$P$: set of processors, $P = \{P_1, P_2, \ldots, P_m\}$
$B$: set of batches, $B = \{B_1, B_2, \ldots, B_l\}$ with $l = \min(m, n)$

Constants and variables used in the ILP formulation:

$s_{i,j,k}$: start time of task $T_i$ on processor $P_j$ in the k-th iteration.
$C_{i,i',k}$: communication time for sending data from $T_i$ to $T_{i'}$ in the k-th iteration, if there is an edge $(i, i') \in E$.
$t_{il}$: 1 if task $T_i$ is assigned to batch $B_l$, 0 otherwise.
$b_{lj}$: 1 if batch $B_l$ is assigned to processor $P_j$, 0 otherwise.
$r_{lj}$: 1 if batch $B_l$ is replicated on j processors, 0 otherwise.
$d_{ii'j}$: 1 if tasks $T_i$ and $T_{i'}$ are allocated on the same processor $P_j$ and $T_i$ starts execution before task $T_{i'}$, 0 otherwise.
$x_{ij}$: 1 if processor $P_j$, executing task $T_i$, provides the data for a task $T_{i'}$ with $(i, i') \in E$ when the two tasks are allocated on different processors, 0 otherwise.

An objective function depending on the system throughput (TP) and the latency (LT) needs to be minimised. TP and LT are continuous variables in our ILP model. The weights $k_1$ and $k_2$ of the costs TP and LT can be tuned by the designer. The objective function used in the ILP formulation is:

$$\text{minimise } (k_1 \cdot TP + k_2 \cdot LT)$$

Constraints used in the ILP formulation:

Every task must be assigned to a single batch.

$$\forall i \in T: \quad \sum_{l \in B} t_{il} = 1 \qquad (1)$$

If a batch is replicated on n processors, then exactly n processors must execute that batch.

$$\forall l \in B: \quad \sum_{n} n \cdot r_{ln} = \sum_{j \in P} b_{lj} \qquad (2)$$

Each processor must be assigned to a single batch.

$$\forall j \in P: \quad \sum_{l \in B} b_{lj} = 1 \qquad (3)$$

A batch must be assigned to one or more processors only if there is at least one task assigned to the batch. Otherwise, the batch can be ignored.

$$\forall l \in B: \quad \sum_{j \in P} b_{lj} \le MAX\_VAL \cdot \sum_{i \in T} t_{il} \qquad (4)$$

where MAX_VAL is a very large value.

The instruction size of all the tasks assigned to a batch cannot exceed the size of the available instruction memory of the processor.

$$\forall l \in B, j \in P: \quad \sum_{i \in T} t_{il} \cdot tc(i) \le pcm(j) \qquad (5)$$

The throughput is equal to the maximum effective time over all batches.

$$\forall l \in B: \quad TP \ge \sum_{n} \frac{r_{ln}}{n} \sum_{i \in T} t_{il} \cdot pt(i) \qquad (6)$$

The finishing time of each leaf task is less than or equal to OET.

$$\forall i \in L, j \in P, l \in B: \quad s_{i,j,0} + pt(i) \le OET + (1 - t_{il} b_{lj}) \cdot MAX\_VAL \qquad (7)$$

A data dependency constraint exists between two tasks $T_i$ and $T_{i'}$ if there is an edge between them in the task graph G(V, E). The execution of task $T_i$ has to be finished before the execution of $T_{i'}$ if they are on the same processor [constraint (8.1)]. When the tasks are allocated to different processors, $T_{i'}$ can start $C_{i,i',k}$ time units after $T_i$ has finished [constraint (8.2)]. Since the ILP model supports task replication, a task can be allocated to several processors. We need to consider all possible task allocations of replicated tasks on all processors and also need to know which processor executing task $T_i$ provides the data for task $T_{i'}$. This is done by the variable $x_{ij}$ given at the beginning of this section. In constraints (8) and (9), $t_{ij}$ is read as the indicator that task $T_i$ is executed on processor $P_j$.

$$\forall (i, i') \in E, j \in P: \quad s_{i,j,k} + pt(i) \le s_{i',j,k} + (2 - t_{ij} - t_{i'j}) \cdot MAX\_VAL \qquad (8.1)$$

$$\forall (i, i') \in E, \; j, j' \in P, \; j \ne j': \quad s_{i,j,k} + pt(i) + C_{i,i',k} \le s_{i',j',k} + (2 - t_{i'j'} + t_{ij'} - x_{ij}) \cdot MAX\_VAL \qquad (8.2)$$

Two independent tasks must not be executed on the same processor at the same time, i.e., task $T_i$ is executed either before task $T_{i'}$ ($d_{ii'j} = 1$) (9.1) or after task $T_{i'}$ ($d_{ii'j} = 0$) (9.2) on processor $P_j$.

$$\forall (i, i') \notin E, j \in P: \quad s_{i,j,k} + pt(i) \le s_{i',j,k} + (3 - t_{ij} - t_{i'j} - d_{ii'j}) \cdot MAX\_VAL \qquad (9.1)$$

$$s_{i',j,k} + pt(i') \le s_{i,j,k} + (2 - t_{ij} - t_{i'j} + d_{ii'j}) \cdot MAX\_VAL \qquad (9.2)$$

The start times of all tasks have to be non-negative.

$$\forall i \in T, j \in P: \quad s_{i,j,k} \ge 0 \qquad (10)$$

If processor $P_j$ executes task $T_i$ ($x_{ij} = 1$), then task $T_i$ is assigned to one batch and this batch is allocated to processor $P_j$.

$$\forall i \in T, j \in P: \quad x_{ij} \le \sum_{l \in B} t_{il} b_{lj} \qquad (11)$$

To minimise the OCT it is necessary to begin the start task in iteration (k+1) as soon as possible. Each task without replication should be allocated to the same processor in each iteration. In addition, it is required that each start task of the (k+1)-th iteration can only start on a processor after this task's k-th iteration is finished on this processor. If a task has been replicated, there is no constraint between the different iterations. The latency time is affected by the first start task of the k-th iteration and the first task of the (k+1)-th iteration.

$$\forall i \in S, j \in P, l \in B: \quad s_{i,j,k} + LT \le s_{i,j,k+1} + (1 - t_{il} b_{lj}) \cdot MAX\_VAL \qquad (12)$$

VI. CASE STUDY: LOOP LEVEL PARALLELISM

The following section demonstrates the effectiveness of the proposed mapping methodology using a series of DSP applications. The application set includes (1) a 64-tap Finite Impulse Response (FIR) filter, (2) an Advanced Encryption Standard (AES) application, (3) a Fast Fourier Transform (FFT) application, (4) a smoothing and edging image processing application for a 256*256 grayscale image, and (5) a Freeman demosaicing application for a 1138*850 RGB image. Some compiler optimization techniques have been adopted in our multi-core mapping models, including loop splitting and loop fusion. Loop splitting attempts to simplify a loop or eliminate dependences by breaking it into multiple loops which have the same bodies but iterate over different contiguous portions of the index range. Loop fusion (loop combining) attempts to reduce loop overhead. When two adjacent loops iterate the same number of times, their bodies can be combined as long as they make no reference to each other's data.

Let us consider the 64-point 6-stage radix-2 FFT to demonstrate loop splitting. The mapping solution depends not only on the strategy but also on the architecture design. The multi-core FFT implementation is mainly affected by the number of processors and the shared register file size in the multi-core architecture. To demonstrate the proposed mapping methodology, the FFT application is mapped to several different multi-core architectures including: (a) limited processor cores with limited shared register banks, (b) limited processor cores with sufficient shared register banks, and (c) sufficient processor cores with sufficient shared register banks.
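The loop splitting transformation defined above (same body, different contiguous portions of the index range) can be illustrated with a small generic C sketch. The function names and the scale-and-sum loop body are hypothetical and chosen only for illustration. The sketch assumes that the loop iterations are independent of each other, so the two halves could be given to two processors. In the FFT case study it is the stage loop that is split, and the two parts are instead chained through shared data.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one pass over the whole index range [0, n). */
long scale_sum(const int *in, int *out, size_t n, int gain)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * gain;
        sum += out[i];
    }
    return sum;
}

/* The same loop body restricted to the contiguous range [begin, end). */
long scale_sum_range(const int *in, int *out, size_t begin, size_t end,
                     int gain)
{
    long sum = 0;
    for (size_t i = begin; i < end; i++) {
        out[i] = in[i] * gain;
        sum += out[i];
    }
    return sum;
}

/* Split version: two loops with identical bodies over the two halves.
 * On a multi-core target each call would be mapped to its own processor;
 * here they simply run one after the other. */
long scale_sum_split(const int *in, int *out, size_t n, int gain)
{
    size_t mid = n / 2;
    long first = scale_sum_range(in, out, 0, mid, gain);  /* e.g. Processor 0 */
    long second = scale_sum_range(in, out, mid, n, gain); /* e.g. Processor 1 */
    return first + second;
}
```

Because each half writes a disjoint part of out and the partial sums are combined at the end, the split version produces the same array contents and the same total as the original loop.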
The FFT-I application is mapped onto a multi-core architecture with two processor cores, sufficient local and shared memories and a 32*32 shared register file, as shown in Table 3. The most time consuming part of the application is the 2-level loop body shown in Table 1.

TABLE I. A CODE EXAMPLE OF LOOP SPLITTING

Initial code:

    for (stage=0; stage<stages; stage++) {
        shuffle(in, out, SIZE);
        for (i=0; i<SIZE; i=i+2) {
            getw(&w, (i/2), stage);
            fft(w, out[i], out[i+1], &(in[i]), &(in[i+1]));
        }
    }

Code after loop splitting:

    /* assigned to Processor 0 */
    for (stage=0; stage<stages/2; stage++) {
        shuffle(t1, t2, SIZE);
        for (i=0; i<SIZE; i=i+2) {
            getw(&w, (i/2), stage);
            fft(w, t2[i], t2[i+1], &(t1[i]), &(t1[i+1]));
        }
    }

    /* assigned to Processor 1 */
    for (stage=stages/2; stage<stages; stage++) {
        shuffle(t1, t3, SIZE);
        for (i=0; i<SIZE; i=i+2) {
            getw(&w, (i/2), stage);
            fft(w, t3[i], t3[i+1], &(t1[i]), &(t1[i+1]));
        }
    }

TABLE II. A CODE EXAMPLE OF LOOP FUSION

Initial code:

    /* Smooth: Processor 0 */
    for (y=0; y<height; y++)
        sharedimage[(y*width)+x] = filter(x, y, smooth, image);

    /* Laplacian: Processor 1 */
    for (y=0; y<height; y++)
        result[(y*width)+x] = filter(x, y, laplacian, sharedimage);

Combining code:

    for (y=0; y<height+3; y++) {
        /* Smooth: Processor 0 */
        if (y<height)
            sharedimage[(y*width)+x] = filter(x, y, smooth, image);
        /* Laplacian: Processor 1 */
        if (y>=3)
            result[((y-3)*width)+x] = filter(x, (y-3), laplacian, sharedimage);
    }

The loop is split into two parts: the first part executes the first 3 stages and the second part executes the last 3 stages (Table 1). Since the limited shared register file is not sufficient to save all the shared data, the FFT makes use of shared memory to send data from one processor to another. In the general case, data cannot simply be written to and read from shared memory in a multi-core architecture. The programmer can use the mutex or semaphore instructions defined in the TLI to synchronise the data transfer between processors. If there are sufficient shared registers, the shared data items can be mapped to the shared registers instead of the shared memory.
This detailed implementation, called FFT-II, is given in Table 3. These shared registers are 8 banks of 32*32 bit data registers, whose access time (2ns) is much smaller than that of shared memory (5ns). Therefore, frequently read and written shared data should be mapped to the shared register file in order to reduce memory access time and memory access conflicts. A more efficient mapping can be implemented for a multi-core architecture which has three available processor cores (FFT-III) and sufficient shared registers. The stages are evenly divided among the three processors, and each processor executes 2 stages of the FFT.

A similar technique is adopted for the 64-tap FIR filter. The FIR-I filter is split up and implemented on a two-processor architecture: the first processor executes the first 32 taps and the second processor executes the last 32 taps. The FIR-II application has been implemented on 4 processors with each processor executing 16 taps. A fully parallel implementation of the 64-tap FIR filter requires 64 processors, which is determined by the number of taps.

An image processing application (IMP) includes two stages: image smoothing and edge enhancement. Image smoothing attempts to capture important patterns in the image data while leaving out noise. Edge enhancement is a digital image processing filter that improves the apparent sharpness of an image. The initial implementation is given in Table 2. Edge detection waits for the entire image smoothing to finish before it begins. The implementation performance is limited by synchronization; however, there is no need to wait for the entire image. The two stages can synchronize at every line of pixels. Loop combining is adopted in the IMP application. The combining code is given in Table 2; it achieves nearly twice the performance of the original code and nearly 90% processor efficiency, compared to the original 50% efficiency.

A series of DSP applications targeted on several multi-core architectures are described in Tables 3 and 4.

TABLE III. DATA MAPPING
Columns: Apps.; Memory Capacity: Shared (KB), Local (KB), S. Reg. (b); Memory Access Count: Shared, Local, S. Reg.
Rows: FIR(I), FIR(II), AES, FFT(I), FFT(II), FFT(III), IMP, Freeman.
[Most numeric entries are not legible in this transcription. The only complete row fragment is the FIR(II) memory access counts (shared, local, S. Reg.): 2,062, 5,041, 1,086.]

TABLE IV. IMPLEMENTATION RESULT
Columns: Apps.; No. of Proc.; Execution Time (ms); Speedup; Parallel Efficiency; Average Idle Ratio (%).
Rows: FIR(I), FIR(II), AES, FFT(I), FFT(II), FFT(III), IMP, Freeman.
[Numeric entries are not legible in this transcription.]

To make fair performance comparisons, all applications are executed with 100 frames. The experimental results are based on the following assumptions: the DR processors operate at 500MHz, the shared memory access delay is 5ns, the local private memory access delay is 4ns, and the shared multi-bank register file access delay is 2ns. The memory sizes together with the number of memory access operations for each application are given in Table 3, where each processor has an equal local memory size. A memory access occurs when information is read from or written to a memory unit. Table 3 also provides the average number of memory access operations per frame. Table 4 shows the total execution time, the speedup, the parallel efficiency, and the average idle ratio of the different applications. Speedup refers to how much faster a parallel algorithm is than a corresponding sequential algorithm. The parallel efficiency (PE) metric indicates how efficiently the processors are utilized in solving the problem, and is obtained by dividing the speedup achieved by the number of processor cores used. The idle ratio (IR) metric refers to the ratio between the periods when a DR processor core is idle and the overall simulation time. The IR given in Table 4 provides the average idle rate of each processor. As the results show, the applications with sufficient processors and memory elements achieve both the highest speedup and the highest parallel efficiency, compared to other task mapping solutions.
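The three metrics defined above follow directly from their definitions and can be written as a minimal C sketch. The function names and the sample idle times are hypothetical; only the 2.59x speedup on two cores is a figure reported in this paper's results.

```c
#include <assert.h>

/* Speedup: sequential execution time divided by parallel execution time. */
double speedup(double t_sequential, double t_parallel)
{
    assert(t_parallel > 0.0);
    return t_sequential / t_parallel;
}

/* Parallel efficiency (PE): speedup divided by the number of cores used. */
double parallel_efficiency(double speedup_value, int cores)
{
    assert(cores > 0);
    return speedup_value / cores;
}

/* Average idle ratio (IR): mean over all cores of idle time divided by
 * the overall simulation time. */
double average_idle_ratio(const double *idle_time, int cores,
                          double total_time)
{
    assert(cores > 0 && total_time > 0.0);
    double sum = 0.0;
    for (int c = 0; c < cores; c++)
        sum += idle_time[c] / total_time;
    return sum / cores;
}
```

For the reported 2.59x speedup on two DR processor cores, parallel_efficiency(2.59, 2) evaluates to about 1.295, which matches the parallel efficiency of 1.30 quoted in the results. A value above 1 is what the text calls super-linear speedup.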
In all applications with loop splitting and loop fusion, all processor cores are much more efficient, with very low idle ratios. The FIR filter with sufficient local memory and shared registers gains a super-linear speedup [11], where the speedup is greater than the number of processor cores. The super-linear speedup obtained in this paper is attributed to data locality, which reduces the accesses to the slower shared data memory and dramatically improves the performance. Up to 2.59x super-linear speedup and a parallel efficiency of 1.30 have been achieved with only two DR processor cores with local memory, employing loop splitting. The simulation results show that the proposed mapping methodology with loop level parallelism provides superior performance.

VII. CONCLUSIONS

The focus of this paper is on modeling the task mapping and scheduling problem as an ILP, which allows the use of standard tools for solving it. The proposed mapping technique utilizes profiling-driven task partitioning and loop level transformations. These are fused with loop splitting, loop fusion and a memory aware data mapping in order to reduce system execution time. Several applications based on different multi-core architectures have been generated using our mapping and scheduling tool. Simulation results demonstrate the effectiveness of the proposed mapping and scheduling strategies, showing a parallel efficiency of up to 1.3 for a multi-core architecture with two DR processor cores.

REFERENCES

[1] S. Khawam, I. Nousias, M. Milward, Y. Yi, M. Muir, and T. Arslan, "The Reconfigurable Instruction Cell Array," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16.
[2] Wei Han, Ying Yi, M. Muir, N. Ioannis, T. Arslan, and A. T. Erdogan, "Efficient Implementation of WiMAX Physical Layer on Multi-core Architecture with Dynamically Reconfigurable Processors," Scalable Computing: Practice and Experience, Scientific international journal for parallel and distributed computing, vol. 9.
[3] T. Kempf, M. Doerper, R. Leupers, G. Ascheid, H. Meyr, T. Kogel, and B. Vanthournout, "A modular simulation framework for spatial and temporal task mapping onto multi-processor SoC platforms," in Proceedings of the Conference on Design, Automation and Test in Europe (DATE), 2005.
[4] P. G. Paulin, "Automatic mapping of parallel applications onto multiprocessor platforms: a multimedia application," in Digital System Design, Euromicro Symposium, pp. 2-4.
[5] N. Pazos, A. Maxiaguine, P. Ienne, and Y. Leblebici, "Parallel modeling paradigm in multimedia applications: Mapping and scheduling onto a multi-processor system-on-chip platform," in Proceedings of the International Global Signal Processing Conference, Santa Clara, California.
[6] M. Ruggiero, A. Guerri, D. Bertozzi, F. Poletti, and M. Milano, "Communication-aware allocation and scheduling framework for stream-oriented multi-processor system-on-chip," in Proceedings of the Conference on Design, Automation and Test in Europe (DATE), pp. 3-8.
[7] C. Marcon, A. Borin, A. Susin, L. Carro, and F. Wagner, "Time and Energy Efficient Mapping of Embedded Applications onto NoCs," Asia and South Pacific Design Automation Conference (ASP-DAC), vol. 1.
[8] C. Ostler and K. S. Chatha, "An ILP Formulation for System-Level Application Mapping on Network Processor Architectures," Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6.
[9] Wei Han, Ying Yi, M. Muir, N. Ioannis, T. Arslan, and A. T. Erdogan, "MRPSIM: a TLM based Simulation Tool for MPSoCs targeting Dynamically Reconfigurable Processors," 21st Annual IEEE International SOC Conference, September.
[10] A. Bender, "Design of an Optimal Loosely Coupled Heterogeneous Multiprocessor System," Proceedings of European Design and Test Conference (ED&TC 96).
[11] David Culler, J. P. Singh and A. Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann, 2nd edition, 1999.


VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) 2127472, Fax: (370-5) 276 1380, Email: info@teltonika. VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information

Compiling for Parallelism & Locality. Dependence Testing in General. Algorithms for Solving the Dependence Problem. Dependence Testing

Compiling for Parallelism & Locality. Dependence Testing in General. Algorithms for Solving the Dependence Problem. Dependence Testing Complng for Parallelsm & Localty Dependence Testng n General Assgnments Deadlne for proect 4 extended to Dec 1 Last tme Data dependences and loops Today Fnsh data dependence analyss for loops General code

More information

An RFID Distance Bounding Protocol

An RFID Distance Bounding Protocol An RFID Dstance Boundng Protocol Gerhard P. Hancke and Markus G. Kuhn May 22, 2006 An RFID Dstance Boundng Protocol p. 1 Dstance boundng Verfer d Prover Places an upper bound on physcal dstance Does not

More information

DBA-VM: Dynamic Bandwidth Allocator for Virtual Machines

DBA-VM: Dynamic Bandwidth Allocator for Virtual Machines DBA-VM: Dynamc Bandwdth Allocator for Vrtual Machnes Ahmed Amamou, Manel Bourguba, Kamel Haddadou and Guy Pujolle LIP6, Perre & Mare Cure Unversty, 4 Place Jusseu 755 Pars, France Gand SAS, 65 Boulevard

More information

Availability-Based Path Selection and Network Vulnerability Assessment

Availability-Based Path Selection and Network Vulnerability Assessment Avalablty-Based Path Selecton and Network Vulnerablty Assessment Song Yang, Stojan Trajanovsk and Fernando A. Kupers Delft Unversty of Technology, The Netherlands {S.Yang, S.Trajanovsk, F.A.Kupers}@tudelft.nl

More information

Minimal Coding Network With Combinatorial Structure For Instantaneous Recovery From Edge Failures

Minimal Coding Network With Combinatorial Structure For Instantaneous Recovery From Edge Failures Mnmal Codng Network Wth Combnatoral Structure For Instantaneous Recovery From Edge Falures Ashly Joseph 1, Mr.M.Sadsh Sendl 2, Dr.S.Karthk 3 1 Fnal Year ME CSE Student Department of Computer Scence Engneerng

More information

Rate Monotonic (RM) Disadvantages of cyclic. TDDB47 Real Time Systems. Lecture 2: RM & EDF. Priority-based scheduling. States of a process

Rate Monotonic (RM) Disadvantages of cyclic. TDDB47 Real Time Systems. Lecture 2: RM & EDF. Priority-based scheduling. States of a process Dsadvantages of cyclc TDDB47 Real Tme Systems Manual scheduler constructon Cannot deal wth any runtme changes What happens f we add a task to the set? Real-Tme Systems Laboratory Department of Computer

More information

Real-Time Process Scheduling

Real-Time Process Scheduling Real-Tme Process Schedulng ktw@cse.ntu.edu.tw (Real-Tme and Embedded Systems Laboratory) Independent Process Schedulng Processes share nothng but CPU Papers for dscussons: C.L. Lu and James. W. Layland,

More information

Dynamic Scheduling of Emergency Department Resources

Dynamic Scheduling of Emergency Department Resources Dynamc Schedulng of Emergency Department Resources Junchao Xao Laboratory for Internet Software Technologes, Insttute of Software, Chnese Academy of Scences P.O.Box 8718, No. 4 South Fourth Street, Zhong

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and Ths artcle appeared n a journal publshed by Elsever. The attached copy s furnshed to the author for nternal non-commercal research and educaton use, ncludng for nstructon at the authors nsttuton and sharng

More information