A Multi-criteria Class-based Job Scheduler for Large Computing Farms


R. Baraglia, P. Dazzi, and R. Ferrini
ISTI "A. Faedo", CNR, Pisa, Italy

Abstract

In this paper we propose a new multi-criteria class-based job scheduler able to dynamically schedule a stream of batch jobs on large-scale computing farms. It is driven by several configuration parameters that allow the scheduler to be customized with respect to the goals of an installation. The proposed scheduling policies make it possible to maximize resource usage and to guarantee the QoS requirements of the applications. The proposed solution has been evaluated by simulations using different streams of synthetically generated jobs. To analyze the quality of our solution we propose a new methodology to estimate whether, at a given time, the resources in the system are really sufficient to meet the service level requested by the submitted jobs. Moreover, the proposed solution was also evaluated by comparing it with the Backfilling and Flexible backfilling algorithms. Our scheduler proved able to make good scheduling choices.

Keywords: Job scheduling; Deadline scheduling; Software license scheduling; Computing farm.

1. Introduction

In large computing farms providing utility computing to a large number of users with different functional and non-functional requirements, a scheduler plays a basic role in efficiently and effectively scheduling submitted jobs on the available resources. The objective of the scheduling is to assign tasks to specific resources, maximizing the overall resource utilization and guaranteeing the QoS required by the applications. The scheduling problem has been shown to be NP-complete in its general as well as in some restricted forms [3]. Scheduling in utility computing environments is multi-criteria in nature [8]. In fact, these environments generally manage computational requests that vary dynamically over time, have different computational requirements and constraints, and compete to access shared resources. Even if in the past research efforts have been devoted to developing multi-criteria job scheduling algorithms [7], [6], [1], [2], there is still the need to improve scheduling techniques able to manage an increasing number of jobs and to address all the application and installation requirements, as well as the sets of constraints, of the aforementioned computational environments.

In this paper, we propose a scheduler able to schedule a continuous stream of batch jobs on large-scale computing farms. As a typical scenario we consider a computing farm made up of heterogeneous, single-processor or SMP machines, linked by a low-latency, high-bandwidth network. Some characteristics of the computing nodes (e.g. processor type, memory size, number of CPUs, link bandwidth) are static and known, whereas others are dynamic (e.g. floating sw licenses). The adopted scheduling policies permit us to optimize the scheduling with respect to different, even contrasting, objectives, such as maximizing the resource usage and guaranteeing the non-functional requirements of the applications.

The rest of this paper is organized as follows. Section 2 describes some of the most common job scheduling algorithms. Section 3 gives a description of the problem. Section 4 describes our solution. Section 5 outlines and evaluates our job scheduler. Finally, conclusions and future work are described in Section 6.

2. Related work

Batch job scheduling algorithms are mainly divided into two classes: on-line and offline.
On-line algorithms are those that do not have any knowledge about the whole input job stream: they take decisions for each arriving job without knowing future inputs. Conversely, offline algorithms know all the jobs before taking scheduling decisions. Many of these algorithms are exploited in commercial and open source job schedulers [4]. The Backfilling algorithm [9] is a widely adopted scheduling approach; it is an optimization of the FCFS algorithm [10]. It requires that each job specify its execution time, so that the scheduler can estimate when jobs finish and other ones can be started. The main goal of Backfilling is to exploit a resource reservation approach to improve the FCFS policy by increasing the system resource usage and by decreasing the average job waiting time in the scheduler's queue. In order to improve performance, some backfilling variants, such as Flexible backfilling [5], have been proposed. The Flexible backfilling algorithm is obtained by exploiting a different order of the queued jobs: jobs are prioritized according to the scheduler goals, queued according to their priority value, and then selected for scheduling. Even if the multi-criteria approach seems to be the most viable one to solve the resource management and scheduling problem in heterogeneous and distributed computational environments, only a few research efforts have been made in this direction [1], [7], [2], [11]. In [1] a multi-criteria job scheduler for scheduling a continuous stream of batch jobs on large-scale computing farms is proposed.

It exploits a set of heuristics that drive the scheduler in taking decisions. Each heuristic manages a specific constraint, and contributes to computing the degree of matching between a job and a machine. The scheduler can be extended to manage a wide set of requirements and constraints. In [7] K. Kurowski et al. propose a two-level hierarchical multi-criteria scheduling approach for Grid environments. All participants of a scheduling process, i.e. end-users, Grid administrators and resource providers, express their requirements and preferences by using two sets of parameters: hard constraints and soft constraints. A Grid broker at the higher level exploits the hard constraints to compute a set of feasible solutions, which can be optimized by using soft constraints describing preferences regarding multiple criteria, such as various performance factors, QoS-based parameters, and characteristics of local schedulers. In [2] a bi-criteria algorithm for scheduling moldable jobs on cluster computing platforms is proposed. It exploits two pre-existing algorithms to simultaneously optimize two criteria: job makespan and weighted minimal average completion time. Such criteria are complementary, and well represent the objectives of both users and system administrators. The algorithm was evaluated by simulations using two different synthetic workloads. In [11] a solution based on advanced resource reservation that optimizes resource utilization and user QoS constraints for Grid environments is proposed. It supports advanced reservations to deal with the dynamics of Grids and provides a solution for agreement enforcement. The proposed advanced reservation solution is structured according to a 3-layered negotiation protocol. The preferences of end-users are taken into account to start a negotiation to select the resources to reserve. The user can select the most suitable offer or can decide to re-negotiate by changing some of the constraints. End-user preferences are modeled as utility functions for which end users have to specify required values and negotiation levels. In [6] a schedule-based solution for scheduling a continuous stream of batch jobs on computational Grids is proposed. The solution is based on the EG-EDF (Earliest Gap-Earliest Deadline First) rule and on a Tabu search technique. The EG-EDF rule incrementally builds the schedule for all jobs by applying a technique that fills the earliest existing gaps in the schedule with newly arriving jobs. If no gap is available for an incoming job, the EG-EDF rule uses the Earliest Deadline First (EDF) strategy to include the new job in the existing schedule. The schedule is then optimized by using a Tabu search algorithm that moves jobs into the earliest gaps. Scheduling choices are taken to meet the QoS requested by the submitted jobs and to optimize the hardware resource usage.

3. Problem Description

We consider jobs and machines annotated with information describing their requirements and features, respectively. Jobs in a stream can be sequential or multi-threaded, and all the jobs are independent of each other. Each job has an attached description containing both an identifier and a set of functional and non-functional requirements. Functional requirements include the number of processors, the RAM size and the software licenses a job needs to be executed. Non-functional requirements (also referred to as QoS requirements) are a job slowdown equal to one, a job deadline, and advanced resource reservation.
The description also includes an estimation of the time required to compute the job and the features of the processor used to obtain such an estimation (a benchmark score). Each job is executed on a single machine, and all jobs are preemptable. Job preemption can be performed when either a job submission or a job ending event takes place.

The machines composing the farm are described by a benchmark score, the number and type of CPUs, the size of the installed RAM, and the non-floating (i.e. bound to a machine) and floating (i.e. not bound to any specific machine) software licenses they can run. Each processor installed on a machine has an associated weight. Every machine can execute multiple jobs at the same time in a space-sharing fashion. All the machines support two basic forms of job preemption: stop/restart and suspend/resume. The checkpoint/restart form is possible only if the running job is properly instrumented. Machines are assigned to jobs in the form of sub-machines, namely subsets of a machine's processors. A sub-machine is managed by the scheduler as an instance of the machine from which it originates. Floating sw licenses can be assigned to any machine able to run them. The only limit is that the total number of licenses in use cannot exceed their availability. In our study, we consider the association of licenses to machines. As a consequence, if a set of jobs requiring the same license can be executed on the same machine, only one license copy is accounted for.

4. The scheduler architecture

The proposed scheduler is based on multiple job classes. Each job is assigned to a class on the basis of its functional and/or non-functional requirements. Figure 1 depicts the architecture of our scheduler.

Fig. 1: The scheduler architecture.

Three main components are represented: Job-Dispatcher, Class-Scheduler and Control-Scheduler. The Job-Dispatcher receives, classifies, and dispatches each job to the proper class. A class is an entity characterized by a set of dynamically assigned computational resources, a job queue and a Class-Scheduler. The classes are ranked according to a priority value assigned statically by the installation on the basis of the functional and non-functional requirements they manage. A Class-Scheduler (CLS) is associated with each class. This component is specialized in managing a specific class of job requirements. Each CLS has an associated job queue and a set of resources. The CLS extracts jobs from its queue and allocates resources to run them. In case of resource shortage, it issues a request for additional resources to the Control-Scheduler. A class releases the assigned resources when they have not been used for a predefined quantum of time fixed by the installation.
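To make the interplay between these components more concrete, the following Python skeleton sketches the Job-Dispatcher and Class-Scheduler roles just described; the class, attribute and method names (e.g. request, find_local_resources) are illustrative assumptions, not the paper's actual interfaces.

```python
class ClassScheduler:
    """Skeleton of a CLS: a job queue plus the resources currently
    assigned to its class. Names are illustrative, not the paper's API."""
    def __init__(self, rank, control_scheduler):
        self.rank = rank
        self.queue = []          # jobs of this class, ordered by the class policy
        self.resources = []      # sub-machines / license copies owned by the class
        self.cns = control_scheduler

    def schedule(self):
        for job in list(self.queue):
            res = self.find_local_resources(job)
            if res is None:
                # resource shortage: ask the Control-Scheduler for more
                res = self.cns.request(self, job)
            if res is not None:
                self.queue.remove(job)
                self.start(job, res)

    def find_local_resources(self, job): ...   # look within already-owned resources
    def start(self, job, res): ...             # run the job on the chosen sub-machine


class JobDispatcher:
    """Receives each submitted job, classifies it and forwards it to its CLS."""
    def __init__(self, class_schedulers, classify):
        self.class_schedulers = class_schedulers   # one CLS per class, indexed by id
        self.classify = classify                   # classification rules (Section 4)

    def submit(self, job):
        self.class_schedulers[self.classify(job)].queue.append(job)
```

Queue ordering and resource allocation are deliberately left abstract here, since they are class-specific, as detailed below.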

The Control-Scheduler (CNS) manages the requests issued by the CLSs. The CNS allocates resources to the CLSs in the form of sub-machines or floating sw licenses. As an example, consider a job asking for four processors in a computing farm composed only of eight-processor machines. The CNS assigns four free processors of an available suitable machine to the requesting class. That class becomes the temporary owner of the assigned resources until it releases them. Requests issued by higher ranked classes are scheduled before those issued by lower ranked ones, and requests issued by the same CLS are managed in FCFS order. The CNS defines new sub-machines according to two alternatives: 1) the sub-machine is defined on a machine already assigned to the requesting class; 2) the sub-machine is defined on a machine not assigned to any class, i.e. a free machine. If no machine is available, the CNS can decide to enact a resource stealing process. The definition of a new sub-machine is performed by exploiting the principle of least privilege, namely, among all the available machines the one with the least amount of resources sufficient to satisfy the requested assignment is chosen.

Sub-machines are managed using a data structure consisting of a vector V_FM whose length equals the number of processors P of the largest machine in the computing farm. Each element of the vector contains a list in which each element represents a farm machine. A machine belongs to the list at index i if its current number of available processors equals i. The lists are arranged in increasing order with respect to machine memory size, and then to the number of floating sw licenses the machine can run. In order to find a machine with p processors, a RAM of size r and l licenses, the CNS starts its search from the V_FM[p] element and continues until it finds a machine satisfying the assignment requirements or the vector ends. When a proper machine is found and a new sub-machine is created, the number of available processors of that machine decreases accordingly, and the data structure is updated. The idea behind this data structure is to keep larger machines available for subsequent requests and to reduce machine fragmentation.

Floating sw licenses are managed using a specific data structure consisting of a vector whose length equals the number of available floating sw licenses S. Each vector entry addresses a list storing the number of currently usable copies of a specific license. Each list is structured as three sublists storing, respectively, the number of available copies, the number of copies assigned to a class but not in use, and the number of copies assigned to a class and in use. Floating licenses belonging to the first two sublists are available for assignment to classes, whereas the copies in the third sublist are already assigned. When a free copy of a floating sw license is assigned to a class, it is removed from the first sublist and added to the third sublist. When a job using a floating sw license finishes its execution, the license copy is moved to the second sublist. After an installation-defined quantum of time, licenses in the second sublist are released and moved back to the first sublist.
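As a minimal sketch of the V_FM bookkeeping described above, assuming hypothetical machine attributes such as free_cpus, ram_mb and runnable_licenses, the least-privilege search could look as follows; this is an illustration of the data structure, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    # Hypothetical machine descriptor; the field names are illustrative.
    name: str
    free_cpus: int
    ram_mb: int
    runnable_licenses: set     # floating sw licenses this machine can run

class VFM:
    """Sketch of the V_FM structure: index i holds the machines whose current
    number of free processors equals i, ordered by RAM and runnable licenses."""
    def __init__(self, machines, max_cpus):
        self.slots = [[] for _ in range(max_cpus + 1)]
        for m in machines:
            self.slots[m.free_cpus].append(m)
        for lst in self.slots:
            lst.sort(key=lambda m: (m.ram_mb, len(m.runnable_licenses)))

    def find_submachine(self, p, ram_mb, licenses):
        """Least-privilege search: scan from V_FM[p] upwards and carve a
        sub-machine out of the first machine that satisfies the request."""
        for i in range(p, len(self.slots)):
            for m in self.slots[i]:
                if m.ram_mb >= ram_mb and licenses <= m.runnable_licenses:
                    self.slots[i].remove(m)
                    m.free_cpus -= p            # p processors now form a sub-machine
                    self.slots[m.free_cpus].append(m)
                    self.slots[m.free_cpus].sort(
                        key=lambda x: (x.ram_mb, len(x.runnable_licenses)))
                    return m
        return None   # no suitable machine: the CNS may trigger resource stealing
```

Starting the scan at index p and keeping each list sorted by memory and licenses is what keeps larger, better-equipped machines free for later requests.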
In our study we considered six job classes (0 to 5). The first three classes (0, 1 and 2) manage jobs with both functional and non-functional requirements, whereas the other three (3, 4 and 5) manage jobs with only functional requirements. Jobs are assigned to the classes according to the following criteria:

Class 0: Jobs requiring a slowdown equal to 1. Jobs are managed by the related CLS in First Come First Served (FCFS) order.

Class 1: Jobs requiring advanced resource reservation. Jobs are scheduled according to their closeness to the reservation. Jobs for which the resource reservation fails are discarded. Alternatively, but not in our study, they could be moved to Class 3, 4 or 5 depending on their functional requirements.

Class 2: Jobs with a deadline. Jobs are scheduled according to the expected time at which they have to start in order to meet their deadline. The queue position of a job is determined by exploiting the solution proposed in [6]: the closer the deadline of a job is, the higher its position in the job queue.

Class 3: Sequential or parallel jobs requiring floating sw licenses.

Class 4: Parallel jobs not requiring any floating sw license.

Class 5: Sequential jobs not asking for floating sw licenses.

Jobs within classes 3, 4 and 5 are selected by the related CLS in FCFS order. If a job matches two different classes, it is assigned to the one with the higher priority. The assignment of resources to classes makes it possible to exploit the locality in job requirements. In fact, after an initial transient, it is highly probable that a class managing jobs with similar requirements already owns the resources needed to run the jobs assigned to it.
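The classification criteria above can be read as a priority-ordered rule chain. The sketch below illustrates this, using assumed job attribute names (slowdown_req, reservation, deadline, floating_licenses, n_cpus); because the rules are tested from Class 0 downwards, a job matching two classes is automatically assigned to the higher-priority one.

```python
def classify(job):
    """Sketch of the class-assignment rules of Section 4; the job attributes
    used here are assumed names, not the paper's actual descriptors."""
    if job.slowdown_req == 1:
        return 0        # Class 0: slowdown equal to 1
    if job.reservation is not None:
        return 1        # Class 1: advanced resource reservation
    if job.deadline is not None:
        return 2        # Class 2: deadline
    if job.floating_licenses:
        return 3        # Class 3: needs floating sw licenses
    if job.n_cpus > 1:
        return 4        # Class 4: parallel, no floating licenses
    return 5            # Class 5: sequential, no floating licenses
```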

Resource stealing: When a class lacks free resources to satisfy a request, its CLS issues a request to the CNS. As a consequence, the execution of jobs belonging to classes with a lower priority may have to be interrupted to release the needed resources. However, the interruption of a job has a cost for the computing farm: interrupting a job is convenient only if the gain resulting from the use of the released resources overcomes this cost. To evaluate this cost several parameters should be considered, e.g. the time the candidate job has already spent in execution, the number of sw licenses assigned to that job, etc. In our model a resource r can be moved from a class B to a class A if the following expression is verified: Rank_A > Cost_B(r), where Rank_A is the rank value of class A and Cost_B(r) is the cost associated with the interruption of the jobs running on resource r and belonging to B. Considering a generic class C, a resource r, and the number k of jobs running on r, such cost is computed as:

Cost_C(r) = Rank_C + Σ_{i=1..k} P_C(i), where P_C(i) = W_i · (T_ex(i) / T_tot(i))^{W_dead} + (W_pr · Pr(i)) + (W_l · L(i)).

T_ex(i) is the time spent executing job i and T_tot(i) is the total estimated execution time of job i. W_i is the weight associated with the form of preemption adopted to interrupt the execution of job i: it is small if the job supports checkpoint/restart, larger in case of suspend/resume, and maximum if the only option is stop/restart. W_dead is the weight associated with jobs having a deadline; it equals 1 for jobs without a deadline. W_pr is the weight associated with a processor and Pr(i) is the number of processors assigned to i. W_l is the weight associated with a floating sw license and L(i) is the number of floating sw licenses used by i.

The idea of this approach is to allow an installation to tune the W_i, W_pr, W_l and W_dead values and the class ranks according to its objectives. As an example, suppose that an installation goal is to respect in a very strict way the prioritization given by the job classes. To this end, the ranks associated with two consecutive classes have to differ by a value greater than the maximum value that P_C(i) can assume. Such a value is obtained when a job i (it makes no difference whether there is just one or several jobs, if the overall resource usage is the same) is using all the processors of the largest farm machine and all the available floating sw licenses, and is approaching the end of its execution. In the conducted tests, assuming a largest machine with 1024 processors and weights W_i = 2, W_pr = 1, W_l = 0.5 and W_dead = 2, the rank values of the six classes were fixed as follows: Rank_Class5 = 0, Rank_Class4 = 1500, Rank_Class3 = 3000, Rank_Class2 = 4500, Rank_Class1 = 6000, and Rank_Class0 = 7500.
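The cost model above can be summarized by the following sketch, which assumes per-job fields (t_exec, t_total, w_preempt, w_dead, n_cpus, n_floating_licenses) and global weights as illustrative names rather than the paper's API.

```python
def preemption_cost(class_rank, running_jobs, w_proc, w_lic):
    """Sketch of Cost_C(r) = Rank_C + sum_i P_C(i) for the jobs running on a
    resource r owned by class C; the job field names are assumed."""
    cost = class_rank
    for job in running_jobs:
        progress = job.t_exec / job.t_total               # fraction of work done
        cost += (job.w_preempt * progress ** job.w_dead   # preemption-form term
                 + w_proc * job.n_cpus                    # processors held
                 + w_lic * job.n_floating_licenses)       # floating licenses held
    return cost

def can_steal(requester_rank, victim_class_rank, victim_jobs, w_proc, w_lic):
    # Resource r may move from class B (victim) to class A (requester)
    # only if Rank_A > Cost_B(r).
    return requester_rank > preemption_cost(victim_class_rank, victim_jobs,
                                            w_proc, w_lic)
```

Note how the cost of interrupting a job grows as it approaches completion, so nearly finished jobs are the least attractive victims.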
Resource search: Considering a job i belonging to a class A and requiring Pr_x ≤ P processors and a set L_x of floating sw licenses, the resource search algorithm is structured according to the following steps:

1) Starting from the entry Pr_x, the V_FM data structure is scanned to find jobs to be interrupted.
2) For every machine m suitable for executing i indexed by Pr_x, a list of jobs that could be interrupted is created. The list also includes the free processors.
3) The first N_x jobs whose interruption yields the required Pr_x processors are selected.
4) If, by selecting the first N_x jobs, a number of processors greater than Pr_x is obtained, a refinement step is conducted to adjust the number of selected processors: the list of selected jobs is visited in reverse order to remove the exceeding processors.
5) The cost Cost_r is computed as the sum of the costs related to the jobs to be interrupted on m. If Cost_r is smaller than the costs computed for the other analyzed machines, machine m is selected, and the jobs executing on it are selected to be interrupted.
6) Steps 2 to 5 are repeated from Pr_x + 1 to P to find further machines suitable for executing i.
7) At the end of step 6, the list of jobs that could be interrupted (i.e. the jobs running on the machines with the lowest associated costs) is obtained.

The interruption of the selected jobs may free some of the required licenses. In this case, the found licenses are removed from L_x and the following steps are executed:

1) The floating sw license search starts from the queue of a license l belonging to L_x in the floating sw license data structure.
2) The cost Cost_r due to the interruption of a job using l is computed. The job with the smallest Cost_r is selected and its execution is interrupted.
3) Steps 1 and 2 are repeated until all the licenses needed to run job i are found.
4) At the end of step 3, the set of jobs to interrupt is found.

This phase is the most computationally expensive one. In fact, the search for free processors requires, in the worst case, analyzing all the available machines and floating sw licenses. The search for processors needs to sort the N jobs running on each of the M machines in the farm. Since the sort operation has complexity N log N, in the worst case the resource search algorithm has complexity C = M · N log N. The search for floating sw licenses has, in the worst case, complexity proportional to the number of floating sw licenses.

5. Performance Evaluation

The evaluation of the proposed scheduler was conducted by simulations using different streams of jobs and farms of different sizes. Job and machine parameters have been randomly generated from a uniform distribution in the ranges shown in Table 1. Moreover, we compared our solution with the Backfilling and Flexible backfilling algorithms.

The job priorities of the Flexible backfilling algorithm are updated at each job submission or ending event, and the reservation for the first queued job is maintained through events.

Table 1: Parameters used to generate jobs and machines.
Processor type: 1 - 5
Number of processors:
Benchmark score: 0.5 - 2
RAM: 5 Mb - 5 Gb
Job estimated execution time (secs): 16 - 2
Number of license copies: 5 - 7
Number of different licenses: 2

For each simulation the percentages of jobs with specific functional and non-functional requirements have been generated according to the values shown in Table 2.

Table 2: Percentages used to generate the job streams.
5% requires a slowdown equal to 1
3% has a deadline
5% needs advanced resource reservation
6% needs a software license
3% needs a floating software license
1% needs specific hardware
2% supports checkpointing
4% needs 1 processor
4% needs 2 processors
1% needs 4 processors
0.8% needs 8 processors
0.8% needs 16 processors
0.6% needs 32 processors
0.4% needs 64 processors
0.2% needs 128 processors

The duration of each simulation was set to 43,200 time units (i.e. the number of seconds in 12 hours). For each simulation time unit the system: (1) generates a job and puts it in the Dispatcher's job queue, (2) updates the status of the running jobs, (3) updates the status of the resources, (4) executes the CLSs, (5) executes the CNS, and (6) stores the simulation statistics. In the conducted experiments the number of generated machines varied from 1 to 12, and to obtain stable values each simulation was repeated 5 times with different farm configurations and job streams.

The performance metrics have been evaluated versus the system contention. Usually, this value is roughly computed as ResourceR / ResourceA, where ResourceR is the amount of a specific resource requested by the jobs in the system, and ResourceA is the available amount of that resource. This ratio does not provide accurate information on resource availability because it ignores the constraints implied by job allocation. In fact, all the requirements of a job must be satisfied to allocate it, so the variables describing the available resources cannot be considered independently. To clarify this point, let us suppose that the value computed by the above expression is less than 1. In principle, it indicates an availability of the considered resource; as a consequence, a scheduler should be able to properly allocate the jobs on the available resources. However, this is not always true. As an example, consider an availability of 20 processors in the system and a job to schedule requiring 16 processors. Clearly, if at least 16 of the 20 free processors are not available on the same machine, the job cannot be scheduled, even though a rough analysis would suggest enough processor availability. Unfortunately, in general it can be hard to understand whether a resource shortage is caused by the ineffectiveness of the adopted scheduler or by an insufficient number of available resources. To overcome this problem we introduce an index whose aim is to exploit a simple allocator to estimate, with a certain degree of approximation, whether at a certain time the resources in the system are sufficient to meet all the job requirements. In particular, in this paper we only consider processors when computing the index.
To this end, we considered the following four job scheduling algorithms (but in principle others could also be considered), each one basing its strategy on a different job allocation policy:

1) Largest Machine, which allocates a job on the machine with the largest number of free processors;
2) Smallest Machine, which allocates a job on the machine with the smallest number of free processors;
3) Smallest Residue, which allocates a job on the machine that, after the allocation, is left with the lowest number of free processors;
4) Largest Residue, which allocates a job on the machine that, after the allocation, is left with the largest number of free processors.

These algorithms were evaluated to find the one leading to the best processor usage in the simulated environment. To this end, a workload able to use all the available processors of the simulated farm was designed according to the following four steps:

1) random generation of a set of machines;
2) generation of a proper set of jobs for each machine;
3) random distribution of all the processors belonging to each machine among the generated set of jobs;
4) assignment of the generated jobs to a free computation slot in such a way that they all finish their execution on the target machine at a fixed time.

The index is computed as (Processors_r + Processors_q) / Processors_a, where Processors_r is the number of processors requested by the allocated jobs, Processors_q is the number of processors requested by the jobs not yet allocated, and Processors_a is the number of available processors. The higher the index value is, the higher the system contention. Smallest Residue is the method that obtained the best results in the 5 simulations we conducted varying the number and type of the machines inside the simulated farm. This is the allocator we used for computing the index. It behaves as a sort of probe measuring processor availability throughout a simulation, and it is executed each time a job execution is started.
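A compact sketch of the Smallest Residue probe and of the index computation is given below; the job and machine attribute names (n_cpus, free_cpus) are assumptions used only for illustration.

```python
def smallest_residue_allocate(job, machines):
    """Sketch of the Smallest Residue probe: allocate on the machine left with
    the fewest free processors after the allocation."""
    candidates = [m for m in machines if m.free_cpus >= job.n_cpus]
    if not candidates:
        return None                 # the job stays in the probe's queue
    best = min(candidates, key=lambda m: m.free_cpus - job.n_cpus)
    best.free_cpus -= job.n_cpus
    return best

def contention_index(allocated_jobs, queued_jobs, available_cpus):
    # index = (Processors_r + Processors_q) / Processors_a
    requested = sum(j.n_cpus for j in allocated_jobs)
    waiting = sum(j.n_cpus for j in queued_jobs)
    return (requested + waiting) / available_cpus
```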

To evaluate the scheduler efficiency, we analyzed the algorithms exploited by the CNS to handle sub-machines. Figure 2 shows the percentages of new sub-machine definitions and expansions obtained in the simulations. When the index value is low (< 0.4) there is a high rate of new sub-machine definitions, because the classes have only a few resources assigned. When the index value is between 0.4 and 0.8 there is a higher sub-machine expansion rate, because the classes already own a large number of sub-machines and, at the same time, there are enough available processors on the farm machines. When the index value is greater than 0.8, the percentages of the expansion and new-definition processes tend to stabilize at values of about 30% and 40%, respectively. This means that, when the system is heavily loaded, for example when the index is equal to 2, in about 35% of cases the classes already have a sub-machine able to execute a submitted job, while in the other 65% of cases the CLSs ask the CNS to extend a sub-machine (25%) or to define a new sub-machine (40%).

Fig. 2: New sub-machine definitions and expansions.

The graph of Figure 3 shows the percentage of processors used by the running jobs. When the index is greater than 1, i.e. when the requested resources begin to be unavailable, the processor utilization approaches 100%. However, the shape of the curve clearly shows that in some cases there are free processors even when the index is greater than 1. In fact, even when the index is greater than 1.2 (i.e. when the estimated number of requested resources exceeds the available ones), the figure shows that some processors are not used. This happens because, even when the number of requests is much greater than the available resources, the latter may not be able to run any of the waiting jobs. However, it is worth pointing out that our scheduler is able to schedule jobs in a way that keeps the number of unused resources low.

Fig. 3: Processor usage.

We also investigated the degree of satisfaction of the non-functional job requirements. We evaluated the quality of service provided by the Control-Scheduler on the basis of the decisions it made to allocate resources to the classes. This analysis was conducted to assess the choices made in the following areas: (1) the job classification policies and the rank values assigned to the classes; (2) the resource stealing technique. Bad choices can cause long queuing times for some types of jobs, in particular those belonging to low ranked classes. To evaluate the satisfaction level of the resource demands we used the slowdown metric. It measures the ratio between the response time of a job (i.e. the time elapsed between its submission and its termination) and its execution time, and is computed as (T_w + T_e) / T_e, where T_w is the time that a job spends waiting to start and/or restart its execution, and T_e is the job execution time [9]. Figure 4(a) shows the average slowdown obtained executing jobs belonging to the following job classes: Class 0, i.e. jobs requiring a slowdown equal to 1; Class 3, i.e. sequential or parallel jobs requiring floating licenses; Class 4, i.e. parallel jobs not requiring floating licenses; and Class 5, i.e. sequential jobs not asking for floating licenses. In this evaluation, jobs requesting advanced reservation or a deadline were not considered because for such jobs the slowdown is not indicative: depending on their characteristics, such jobs can spend some time enqueued before being executed without affecting their performance. In our tests all the requests of advanced resource reservation were satisfied.
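For clarity, the slowdown metric amounts to the following one-liner; the example values are purely illustrative.

```python
def slowdown(t_wait, t_exec):
    """Slowdown = (T_w + T_e) / T_e: response time over execution time."""
    return (t_wait + t_exec) / t_exec

# Example: a job that waited 150 s and ran for 600 s has slowdown 1.25.
```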
Figure 4(a) shows that, when the resources are available (i.e. the index is below 1), all the jobs obtain a slowdown equal to 1. When the index is greater than 1, i.e. when the competition among jobs to access the available computational resources increases, jobs are forced to spend some time in the queues, resulting in an increase of the average job slowdown. It can be seen that in the conducted tests the requirement slowdown = 1 is satisfied even when the index reaches 1.6 (i.e. high system contention). The slowdown of jobs asking for floating software licenses remains under 1.2 even with high system contention, while the slowdown increases only by up to 20% for parallel jobs. The slowdown of the serial jobs is instead the worst: their completion time can increase by up to 80%. Figure 4(b) shows the percentage of jobs executed respecting their deadline. The results obtained by the proposed scheduler were compared with those obtained by running the Backfilling and Flexible backfilling algorithms under the same simulation conditions, i.e. with both the same machines and the same job streams used to evaluate the proposed scheduler. As expected, the lower the system contention (index below 1), the higher the percentage of jobs meeting their deadline, and all the schedulers are able to satisfy all deadline requests.

The proposed scheduler obtains better results than the other algorithms: it achieves a percentage of jobs respecting their deadline very close to 100%, even with high system contention (index around 1.5). As the system contention increases, the Flexible backfilling algorithm reaches a performance that is 16% lower than the one obtained by the proposed scheduler, while the Backfilling algorithm obtains a performance significantly lower than that of our scheduler.

Fig. 4: (a) Job slowdown; (b) Job deadline; (c) Slowdown of jobs requiring slowdown = 1.

In Figure 4(c) we show the results obtained from the execution of jobs requiring a slowdown value equal to 1. It can be seen that when the resources are no longer available the Backfilling and Flexible backfilling algorithms are not able to guarantee this QoS. The proposed class scheduler, by using the resource stealing technique, makes the needed resources available even when the system contention is high (index around 1.5). It is worth pointing out that the Flexible backfilling algorithm maintains the slowdown value within an acceptable level, offering in this test a performance comparable to the one obtained by the proposed scheduler.

6. Conclusion

In this paper we propose a new multi-criteria scheduler to dynamically schedule a continuous stream of batch jobs on large-scale, non-dedicated computing farms made of heterogeneous, single-processor or SMP machines linked by a low-latency, high-bandwidth network. The proposed solution aims at scheduling the arriving jobs while respecting several functional and non-functional job requirements and optimizing the hardware and software resource usage. Several configuration parameters allow the scheduler to be customized with respect to the goals of an installation. The scheduler was evaluated by simulations using different synthetically generated job streams. To conduct the evaluation, a technique to measure the system contention throughout a simulation was adopted. The scheduler was also compared with the Backfilling and Flexible backfilling schedulers. In the conducted tests, the proposed scheduler proved able to make good scheduling choices. As future work, we plan: (1) to enhance the current scheduler by refining the adopted advanced resource reservation technique and by managing jobs requiring co-allocation, i.e. jobs to be executed on more than one machine; (2) to introduce energy efficiency policies dispatching workloads to more energy-efficient machines; (3) to evaluate the scheduler when applied to computing platforms made of distributed computing farms; (4) to investigate the feasibility of different scheduling criteria to estimate the index.

7. Acknowledgment

This work has been supported by the projects CONTRAIL (EU-FP7) and S-CUBE (EU-FP7).

References

[1] G. Capannini, R. Baraglia, D. Puppin, L. Ricci, and M. Pasquali. A job scheduling framework for large computing farms. In SC, page 54, 2007.
[2] P.-F. Dutot, L. Eyraud, G. Mounié, and D. Trystram. Bi-criteria algorithm for scheduling jobs on cluster platforms. In Proceedings of the Sixteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '04), New York, NY, USA, 2004. ACM.
[3] H. El-Rewini, T. G. Lewis, and H. H. Ali. Task Scheduling in Parallel and Distributed Systems. PTR Prentice Hall, Englewood Cliffs, New Jersey, 1994.
[4] Y. Etsion and D. Tsafrir. A short survey of commercial cluster batch schedulers.
Technical Report 2005-13, School of Computer Science and Engineering, The Hebrew University of Jerusalem, May 2005.
[5] D. Feitelson, L. Rudolph, and U. Schwiegelshohn. Parallel job scheduling: a status report. In Job Scheduling Strategies for Parallel Processing. Springer, 2005.
[6] D. Klusáček, H. Rudová, R. Baraglia, M. Pasquali, and G. Capannini. Comparison of multi-criteria scheduling techniques. In Grid Computing: Achievements and Prospects. Springer, 2008.
[7] K. Kurowski, J. Nabrzyski, A. Oleksiak, and J. Weglarz. A multicriteria approach to two-level hierarchy scheduling in grids. Journal of Scheduling, 11, October 2008.
[8] K. Kurowski, J. Nabrzyski, A. Oleksiak, and J. Weglarz. Scheduling jobs on the grid: multicriteria approach. Computational Methods in Science and Technology, 12(2), 2006.
[9] A. Mu'alem and D. Feitelson. Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Transactions on Parallel and Distributed Systems, 12(6), 2001.
[10] U. Schwiegelshohn and R. Yahyapour. Analysis of first-come-first-serve parallel job scheduling. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 1998.
[11] M. Siddiqui, A. Villazón, and T. Fahringer. Grid capacity planning with negotiation-based advance reservation for optimized QoS. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, page 13. ACM, 2006.


Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications Comparison of Request Admission Based Performance Isolation Approaches in Multi-tenant SaaS Applications Rouven Kreb 1 and Manuel Loesch 2 1 SAP AG, Walldorf, Germany 2 FZI Research Center for Information

More information

Int. J. Advanced Networking and Applications 1367 Volume: 03, Issue: 05, Pages: 1367-1374 (2012)

Int. J. Advanced Networking and Applications 1367 Volume: 03, Issue: 05, Pages: 1367-1374 (2012) Int. J. Advanced Networking and Applications 1367 s to Improve Resource Utilization and Request Acceptance Rate in IaaS Cloud Scheduling Vivek Shrivastava International Institute of Professional Studies,

More information

Characterizing Task Usage Shapes in Google s Compute Clusters

Characterizing Task Usage Shapes in Google s Compute Clusters Characterizing Task Usage Shapes in Google s Compute Clusters Qi Zhang University of Waterloo qzhang@uwaterloo.ca Joseph L. Hellerstein Google Inc. jlh@google.com Raouf Boutaba University of Waterloo rboutaba@uwaterloo.ca

More information

An Approach to Load Balancing In Cloud Computing

An Approach to Load Balancing In Cloud Computing An Approach to Load Balancing In Cloud Computing Radha Ramani Malladi Visiting Faculty, Martins Academy, Bangalore, India ABSTRACT: Cloud computing is a structured model that defines computing services,

More information

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database WHITE PAPER Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive

More information

The Probabilistic Model of Cloud Computing

The Probabilistic Model of Cloud Computing A probabilistic multi-tenant model for virtual machine mapping in cloud systems Zhuoyao Wang, Majeed M. Hayat, Nasir Ghani, and Khaled B. Shaban Department of Electrical and Computer Engineering, University

More information

How To Use A Cloud For A Local Cluster

How To Use A Cloud For A Local Cluster Marcos Dias de Assunção 1,2, Alexandre di Costanzo 1 and Rajkumar Buyya 1 1 Department of Computer Science and Software Engineering 2 National ICT Australia (NICTA) Victoria Research Laboratory The University

More information

BRAESS-LIKE PARADOXES FOR NON-COOPERATIVE DYNAMIC LOAD BALANCING IN DISTRIBUTED COMPUTER SYSTEMS

BRAESS-LIKE PARADOXES FOR NON-COOPERATIVE DYNAMIC LOAD BALANCING IN DISTRIBUTED COMPUTER SYSTEMS GESJ: Computer Science and Telecommunications 21 No.3(26) BRAESS-LIKE PARADOXES FOR NON-COOPERATIVE DYNAMIC LOAD BALANCING IN DISTRIBUTED COMPUTER SYSTEMS Said Fathy El-Zoghdy Department of Computer Science,

More information

Improving Compute Farm Throughput in Electronic Design Automation (EDA) Solutions

Improving Compute Farm Throughput in Electronic Design Automation (EDA) Solutions Improving Compute Farm Throughput in Electronic Design Automation (EDA) Solutions System Throughput in the EDA Design Flow Abstract Functional verification of Silicon on Chip (SoC) designs can contribute

More information

Scheduling Allowance Adaptability in Load Balancing technique for Distributed Systems

Scheduling Allowance Adaptability in Load Balancing technique for Distributed Systems Scheduling Allowance Adaptability in Load Balancing technique for Distributed Systems G.Rajina #1, P.Nagaraju #2 #1 M.Tech, Computer Science Engineering, TallaPadmavathi Engineering College, Warangal,

More information

Evaluation of Job-Scheduling Strategies for Grid Computing

Evaluation of Job-Scheduling Strategies for Grid Computing Evaluation of Job-Scheduling Strategies for Grid Computing Volker Hamscher 1, Uwe Schwiegelshohn 1, Achim Streit 2, and Ramin Yahyapour 1 1 Computer Engineering Institute, University of Dortmund, 44221

More information

A Comparison of General Approaches to Multiprocessor Scheduling

A Comparison of General Approaches to Multiprocessor Scheduling A Comparison of General Approaches to Multiprocessor Scheduling Jing-Chiou Liou AT&T Laboratories Middletown, NJ 0778, USA jing@jolt.mt.att.com Michael A. Palis Department of Computer Science Rutgers University

More information

Design and Evaluation of Job Scheduling Strategies for Grid Computing

Design and Evaluation of Job Scheduling Strategies for Grid Computing Design and Evaluation of Job Scheduling Strategies for Grid Computing Doctorial Thesis Ramin Yahyapour Genehmigte Dissertation zur Erlangung des akademischen Grades eines Doktors an der Fakultät für Elektrotechnik

More information

Stream Processing on GPUs Using Distributed Multimedia Middleware

Stream Processing on GPUs Using Distributed Multimedia Middleware Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research

More information

Comparison of PBRR Scheduling Algorithm with Round Robin and Heuristic Priority Scheduling Algorithm in Virtual Cloud Environment

Comparison of PBRR Scheduling Algorithm with Round Robin and Heuristic Priority Scheduling Algorithm in Virtual Cloud Environment www.ijcsi.org 99 Comparison of PBRR Scheduling Algorithm with Round Robin and Heuristic Priority Scheduling Algorithm in Cloud Environment Er. Navreet Singh 1 1 Asst. Professor, Computer Science Department

More information

Cloud Management: Knowing is Half The Battle

Cloud Management: Knowing is Half The Battle Cloud Management: Knowing is Half The Battle Raouf BOUTABA David R. Cheriton School of Computer Science University of Waterloo Joint work with Qi Zhang, Faten Zhani (University of Waterloo) and Joseph

More information

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING

FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING Hussain Al-Asaad and Alireza Sarvi Department of Electrical & Computer Engineering University of California Davis, CA, U.S.A.

More information

Load Balancing on a Grid Using Data Characteristics

Load Balancing on a Grid Using Data Characteristics Load Balancing on a Grid Using Data Characteristics Jonathan White and Dale R. Thompson Computer Science and Computer Engineering Department University of Arkansas Fayetteville, AR 72701, USA {jlw09, drt}@uark.edu

More information

Optimizing Shared Resource Contention in HPC Clusters

Optimizing Shared Resource Contention in HPC Clusters Optimizing Shared Resource Contention in HPC Clusters Sergey Blagodurov Simon Fraser University Alexandra Fedorova Simon Fraser University Abstract Contention for shared resources in HPC clusters occurs

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 4, July-Aug 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 4, July-Aug 2014 RESEARCH ARTICLE An Efficient Service Broker Policy for Cloud Computing Environment Kunal Kishor 1, Vivek Thapar 2 Research Scholar 1, Assistant Professor 2 Department of Computer Science and Engineering,

More information

Power Management in Cloud Computing using Green Algorithm. -Kushal Mehta COP 6087 University of Central Florida

Power Management in Cloud Computing using Green Algorithm. -Kushal Mehta COP 6087 University of Central Florida Power Management in Cloud Computing using Green Algorithm -Kushal Mehta COP 6087 University of Central Florida Motivation Global warming is the greatest environmental challenge today which is caused by

More information

Running a Workflow on a PowerCenter Grid

Running a Workflow on a PowerCenter Grid Running a Workflow on a PowerCenter Grid 2010-2014 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)

More information

Introduction to Apache YARN Schedulers & Queues

Introduction to Apache YARN Schedulers & Queues Introduction to Apache YARN Schedulers & Queues In a nutshell, YARN was designed to address the many limitations (performance/scalability) embedded into Hadoop version 1 (MapReduce & HDFS). Some of the

More information

Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters

Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters Provisioning Spot Market Cloud Resources to Create Cost-Effective Virtual Clusters William Voorsluys, Saurabh Kumar Garg, and Rajkumar Buyya Cloud Computing and Distributed Systems (CLOUDS) Laboratory

More information

Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.

Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age. Volume 3, Issue 10, October 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Load Measurement

More information

Workload Characteristics of the DAS-2 Supercomputer

Workload Characteristics of the DAS-2 Supercomputer Workload Characteristics of the DAS-2 Supercomputer Hui Li Lex Wolters David Groep Leiden Institute of Advanced Computer National Institute for Nuclear and High Science (LIACS), Leiden University Energy

More information

Distributed and Scalable QoS Optimization for Dynamic Web Service Composition

Distributed and Scalable QoS Optimization for Dynamic Web Service Composition Distributed and Scalable QoS Optimization for Dynamic Web Service Composition Mohammad Alrifai L3S Research Center Leibniz University of Hannover, Germany alrifai@l3s.de Supervised by: Prof. Dr. tech.

More information

The International Journal Of Science & Technoledge (ISSN 2321 919X) www.theijst.com

The International Journal Of Science & Technoledge (ISSN 2321 919X) www.theijst.com THE INTERNATIONAL JOURNAL OF SCIENCE & TECHNOLEDGE Efficient Parallel Processing on Public Cloud Servers using Load Balancing Manjunath K. C. M.Tech IV Sem, Department of CSE, SEA College of Engineering

More information

Extended Round Robin Load Balancing in Cloud Computing

Extended Round Robin Load Balancing in Cloud Computing www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3 Issue 8 August, 2014 Page No. 7926-7931 Extended Round Robin Load Balancing in Cloud Computing Priyanka Gautam

More information

Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover

Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover 1 Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover Jie Xu, Member, IEEE, Yuming Jiang, Member, IEEE, and Andrew Perkis, Member, IEEE Abstract In this paper we investigate

More information

Adaptive Task Scheduling for Multi Job MapReduce

Adaptive Task Scheduling for Multi Job MapReduce Adaptive Task Scheduling for MultiJob MapReduce Environments Jordà Polo, David de Nadal, David Carrera, Yolanda Becerra, Vicenç Beltran, Jordi Torres and Eduard Ayguadé Barcelona Supercomputing Center

More information

Windows Server Performance Monitoring

Windows Server Performance Monitoring Spot server problems before they are noticed The system s really slow today! How often have you heard that? Finding the solution isn t so easy. The obvious questions to ask are why is it running slowly

More information

CHAPTER 4: SOFTWARE PART OF RTOS, THE SCHEDULER

CHAPTER 4: SOFTWARE PART OF RTOS, THE SCHEDULER CHAPTER 4: SOFTWARE PART OF RTOS, THE SCHEDULER To provide the transparency of the system the user space is implemented in software as Scheduler. Given the sketch of the architecture, a low overhead scheduler

More information

ABSTRACT. KEYWORDS: Cloud Computing, Load Balancing, Scheduling Algorithms, FCFS, Group-Based Scheduling Algorithm

ABSTRACT. KEYWORDS: Cloud Computing, Load Balancing, Scheduling Algorithms, FCFS, Group-Based Scheduling Algorithm A REVIEW OF THE LOAD BALANCING TECHNIQUES AT CLOUD SERVER Kiran Bala, Sahil Vashist, Rajwinder Singh, Gagandeep Singh Department of Computer Science & Engineering, Chandigarh Engineering College, Landran(Pb),

More information

Analysis of IP Network for different Quality of Service

Analysis of IP Network for different Quality of Service 2009 International Symposium on Computing, Communication, and Control (ISCCC 2009) Proc.of CSIT vol.1 (2011) (2011) IACSIT Press, Singapore Analysis of IP Network for different Quality of Service Ajith

More information

Resource Allocation Avoiding SLA Violations in Cloud Framework for SaaS

Resource Allocation Avoiding SLA Violations in Cloud Framework for SaaS Resource Allocation Avoiding SLA Violations in Cloud Framework for SaaS Shantanu Sasane Abhilash Bari Kaustubh Memane Aniket Pathak Prof. A. A.Deshmukh University of Pune University of Pune University

More information

Reverse Auction-based Resource Allocation Policy for Service Broker in Hybrid Cloud Environment

Reverse Auction-based Resource Allocation Policy for Service Broker in Hybrid Cloud Environment Reverse Auction-based Resource Allocation Policy for Service Broker in Hybrid Cloud Environment Sunghwan Moon, Jaekwon Kim, Taeyoung Kim, Jongsik Lee Department of Computer and Information Engineering,

More information

VSched: Mixing Batch And Interactive Virtual Machines Using Periodic Real-time Scheduling

VSched: Mixing Batch And Interactive Virtual Machines Using Periodic Real-time Scheduling VSched: Mixing Batch And Interactive Virtual Machines Using Periodic Real-time Scheduling Bin Lin Peter A. Dinda Prescience Lab Department of Electrical Engineering and Computer Science Northwestern University

More information