Resource Scheduling in Desktop Grid by Grid-JQA




The 3rd International Conference on Grid and Pervasive Computing - Workshops

L. Mohammad Khanli, Assistant professor, C.S. Dept., Tabriz University, Tabriz, Iran, l-khanli@tabrizu.ac.ir
M. Analoui, Assistant professor, C.E. Dept., IUST, Tehran, Iran, Analoui@iust.ac.ir

Abstract

In desktop grid computing, resource scheduling is an important issue. In this paper, we propose a QoS-based resource scheduling algorithm that finds the best match between tasks and resources while meeting QoS requests. We describe Grid-JQA, our proposed architecture supporting resource scheduling in desktop grid environments, and our current implementation of this architecture. In this work we propose an aggregation formula for the QoS parameters. The formula is a unitless combination of the parameters together with weighting factors. Three heuristic approaches have been designed and compared via simulations to match tasks. They take into account the QoS requested by the tasks and, at the same time, minimize the tasks' makespan as much as possible. Also, an optimum method based on the performance metric has been designed to compare the performance of the heuristics developed. We compare our work with the Min_Min, Max_Min and Sufferage heuristics. The results of a simulation are provided to evaluate the main idea of the paper.

1. Introduction

A resource manager is one of the most critical components of the grid middleware [] since it is responsible for resource management, which provides resource selection and job scheduling. Therefore, resource discovery, resource selection, and job scheduling, which influence computing performance, are important issues in grid computing. Grid services are often expected to meet some minimum levels of quality of service (QoS) for a desirable operation. Resource management can encompass not only a commitment to perform a task but also commitments to a level of performance or quality of service [3]. Thus appropriate mechanisms are needed for monitoring and regulating the usage of system resources to meet QoS requirements [, 3, 9].
In this paper, we propose a resource management service that automatically selects optimal resources and requests resource allocation. The set of optimal resources consists of the resources that guarantee to minimize the total execution time of a given application. We also propose a fault tolerance service that detects resource failures, deviations from required QoS levels, and excessive resource usage, and resolves detected failures. This paper is organized as follows: In Section 2, we describe previous works about resource management and fault tolerance services in grid computing. In Section 3, we explain the architecture of Grid-JQA. In Section 4, we simulate our proposed solution and compare our work with the Min_Min, Max_Min and Sufferage heuristics. Finally, we conclude the paper in Section 5.

2. Related works

In grid computing, there are two approaches for managing resources for job execution [4, 7]. One is that a user directly searches the resources for job execution using an information service and then requests a local resource manager to do resource allocation. The other is to use a resource manager, as is used in Condor-G [10] and Legion [12]. Condor-G [10] leverages software from Globus and Condor [16] to allow users to harness multi-domain resources. In Condor, the matchmaker uses a very generic matchmaking algorithm, called the Gang-Matching [16].

978-0-7695-3177-9/08 $25.00 © 2008 IEEE, DOI 10.1109/GPC.WORKSHOPS.2008.27

Previous resource management approaches

employed in Globus [2], Condor, and others [, 2, 7, 8] do not solve the problem of selecting optimal resources that occurs when the number of resources that satisfy the user's demand is much larger than the number of resources that the user needs. Also, grid applications, middleware, tools, or systems such as Globus [2], Condor-G, Nimrod-G [11], Ninf-G [6], and others [8] have either addressed fault tolerance issues without providing a generic mechanism for resolving failures, or have adopted ad hoc fault tolerance mechanisms which can be neither reused nor shared among applications. In Globus, a noticeable flaw is the lack of support for fault tolerance [2, 3]. To date, grid applications have either ignored failure issues or have implemented fault detection and response behavior completely within the application [5].

The support for fault tolerance has consisted mainly of fault detection services or a monitoring system. HBM [19] designs and implements a local monitor and a data collector for providing a fault detection service of processes and computers for applications developed with Globus. Although HBM does not provide a fault management service, it can detect limited failures, i.e., a process failure and a computer failure. While NWS (Network Weather Service) [20] monitors available network bandwidth, memory, CPU availability, free memory size, and free disk space size, it cannot provide a fault detection service or a fault management service. References [6, 8, 9, ] propose a failure detection service or a fault tolerance service in grids, but they do not provide a mechanism for handling the detected failure, and the problem of QoS is not addressed.

Therefore, in this paper, we propose a resource manager for selecting optimal resources and a fault manager for a fault tolerance service. Our resource manager automatically selects the set of optimal resources among the set of candidate resources and requests resource allocation, so it makes it convenient for a user to execute a job.
It also guarantees efficient and reliable job execution through a fault tolerance service. The proposed fault tolerance service detects failure events with a fault detector that monitors processes, processors, and networks, and resolves detected failures through job duplication.

3. The Architecture of Grid-JQA

The Grid Java based Quality of service management by Active database (Grid-JQA) is a framework that provides workflow management for quality of service on different types of resources, including networks, CPUs, and disks [13, 14, 15]. It also encourages Grid customers to specify their quality of service needs based on their actual requirements. The main goal of this system is to provide users with seamless access for submitting jobs to a pool of heterogeneous resources and, at the same time, to dynamically monitor the resource requirements for the execution of applications.

Figure 1 shows the architecture of the proposed Grid-JQA. The Active Grid Information Server (AGIS) and the fault detector are connected with the Globus Toolkit. In this architecture, the AGIS cooperates with a grid portal, an RSL parser, a fault detector and GRAM. A grid portal provides an interface for a user to launch an application that will utilize the resources and services provided by the grid.

For scalability of the architecture, we use multi-level AGIS. Multi-level AGIS are created by connecting AGIS hierarchically. The key to accomplishing this is Grid-JQA's inherent architecture, which allows an AGIS to behave like a resource towards a higher-level AGIS. A user who connects to a higher-level AGIS has access to the entire computing power of the Grid, whereas a client connected to a lower-level AGIS has access only to the computing power managed by that lower-level AGIS. In this fashion Grids can be scaled to an arbitrary number of levels.

To execute a job with Grid-JQA, a user describes a resource type, a resource condition, and the number of resources using RSL. RSL is the specification language used by the Globus Toolkit to describe configuration and service requirements [3].
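The multi-level arrangement described above, in which an AGIS presents itself as a resource to a higher-level AGIS, resembles a composite structure. The following is a minimal sketch under that assumption, not the Grid-JQA implementation: the class names `Resource` and `AGIS`, the `register`, `capacity` and `leaves` methods, and the single `cpu` capability are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Resource:
    """A leaf node: one machine advertising a single CPU capability (assumed)."""
    name: str
    cpu: float

    def capacity(self) -> float:
        return self.cpu

    def leaves(self) -> List["Resource"]:
        return [self]


@dataclass
class AGIS:
    """An information server that exposes the same interface as a resource,
    so it can be registered with a higher-level AGIS."""
    name: str
    children: List[Union["AGIS", Resource]] = field(default_factory=list)

    def register(self, child: Union["AGIS", Resource]) -> None:
        self.children.append(child)

    def capacity(self) -> float:
        # Computing power visible at this level: everything below it.
        return sum(child.capacity() for child in self.children)

    def leaves(self) -> List[Resource]:
        return [leaf for child in self.children for leaf in child.leaves()]


# A two-level hierarchy: the lower AGIS is registered like a resource.
lower = AGIS("lower")
lower.register(Resource("pc1", 10.0))
lower.register(Resource("pc2", 20.0))

upper = AGIS("upper")
upper.register(lower)
upper.register(Resource("pc3", 30.0))

print(lower.capacity(), upper.capacity())
print([r.name for r in upper.leaves()])
```

A client attached to `lower` sees only the two machines it manages, while a client attached to `upper` sees all three, which mirrors the access rule stated in the text.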
Then the user sends the RSL to Grid-JQA, and the RSL parser extracts the resource type and resource condition and sends them to an AGIS. The resources are processor, network, and memory. The client assigns a weight to each parameter that shows the importance of the parameter. Let us assume that a grid infrastructure consists of N tasks.

Figure 1 - The architecture of Grid-JQA (components: portal, RSL parser, upper and lower AGIS, fault detector, GRAM)

The request of task $T_i$ is shown by a vector of QoS parameters $q^{T_i}$, $i = 1, 2, \ldots, N$, and the weights for the parameters, as shown in equations (1) and (2).

$$q^{T_i} = \left(q_1^{T_i}, q_2^{T_i}, \ldots, q_k^{T_i}\right) \tag{1}$$

$$W = \left(w_1, w_2, \ldots, w_k\right), \qquad \sum_{j=1}^{k} w_j = 1 \tag{2}$$

Each weight is used to show the importance of each parameter. For example, if CPU is important for one task, the client will set 1 for the CPU weight and zero for the others. GRAM advertises resource level capabilities to AGIS. When the resource capabilities are changed, the fault detector informs the AGIS by a fault event. Let us assume that a Grid infrastructure consists of M resources. The capabilities of a Grid resource are expressed with the resource parameter vector $q^{R_i}$, $i = 1, 2, \ldots, M$, as it appears in equation (3).

$$q^{R_i} = \left(q_1^{R_i}, q_2^{R_i}, \ldots, q_k^{R_i}\right) \tag{3}$$

The elements $q_j^{R_i}$, $j = 1, 2, \ldots, k$, indicate independent capabilities of the $i$th resource that affect its performance. Note that $q_j^{R}$ and $q_j^{T}$ have the same unit, and the resource manager compares $q^{R}$ with $q^{T}$ for each task. If the resource provides the requirements needed for the task, it can be chosen as the best matched resource. We introduce a satisfy operator: $R \operatorname{satisfy} T$ means that the resource $R$ can satisfy the task $T$ and guarantee its QoS parameters. The satisfaction relation is defined in such a way that the memory appears in a separate term. This is because a shortage of memory blocks the execution, while an excess of memory gives no help. The other QoS parameters are aggregated. The aggregation is justified by the fact that CPU and bandwidth do not have the restriction mentioned for the memory.

$$R \operatorname{satisfy} T \iff \left(\sum_{j=1}^{k} w_j \frac{q_j^{R}}{q_j^{T}} \ge 1\right) \wedge \left(q_{mem}^{R} \ge q_{mem}^{T}\right), \qquad k = \text{the number of QoS parameters} \tag{4}$$

The solution proposed here is that we normalize the resource capabilities by the client requirements

($q_j^{R} / q_j^{T}$), and therefore the summation is possible even though the units of the parameters differ, such as byte, bps, and MFlops. When the resource capability exceeds the demand, $q_j^{R} / q_j^{T}$ is more than one. So a surplus in some QoS parameter helps the task. But a surplus in the memory parameter does not help the task, so we separate the memory parameter from the other parameters. Also, the client introduces a weight for each parameter to show the importance of the parameter. The weights range from 0 to 1 and the sum of all the weights is equal to one. We multiply the weight by $q_j^{R} / q_j^{T}$ as mentioned in (4). Finally, the best match resource will be the one that can provide the maximum value for $\sum_{j=1}^{k} w_j \, q_j^{R} / q_j^{T}$.

4. Simulations

In the simulation we use the following eight methods for matching:

1) The General method matches resources with tasks in a first come first service (FCFS) strategy (first task to first free resource) regardless of QoS parameters.

2) The Optimum method selects the best resources for tasks [3]. The best resource is the one that has the maximum value for $\sum_{j=1}^{k} w_j \, q_j^{R} / q_j^{T}$.

3) The threshold method, our proposed solution, uses the threshold in (4) instead of 1, rather than finding the maximum for each matching as the Optimum method does.

4) Dup_, our proposed solution that adds a new feature to the threshold method. This feature is duplicate execution of delayed tasks (i.e., tasks executed on weak resources).

5) The Wait method. In all other methods, if the AGIS does not find the proper resource, it will assign the best available resource to the task, so the task will not wait for the proper resource. In the Wait method, tasks wait until the proper resource is found.

6) The Min_Min heuristic considers the whole set of unmapped tasks. It finds the set of minimum completion times (MCT) corresponding to each unmapped task. It then selects the task with the overall minimum MCT from the set to be mapped next. It continues until all the unmapped tasks are mapped. Min_Min considers all unmapped tasks while MCT considers just one task. The intuition behind the Min_Min heuristic is that at each mapping step the current makespan increases the least.
7) The Max_Min heuristic is similar to Min_Min, the only difference being that after the set of MCTs is calculated, the task with the overall maximum MCT is selected next for mapping. The intuition is that, with Max_Min, long tasks can be overlapped with shorter tasks.

8) The Sufferage heuristic stores the sufferage value for each of the unmapped tasks. The sufferage value is the difference between the minimum completion time and the second minimum completion time. The task having the largest sufferage value is selected next for mapping.

In this simulation, we assume that the CPU cycle is from 10 to 60, and all resources have the same amount of memory and bandwidth. The number of tasks in the application is 20 and the CPU weight is 0.9. All tasks require the same amount of memory and bandwidth. We choose the CPU cycle requirement of the application in the range 10 to 60, which is similar to the range chosen for the resources' CPU cycle power. We do the simulation 6 times, and each time we consider a CPU cycle requirement of 10, 20, 30, 40, 50 or 60. For each CPU cycle requirement, the simulation is repeated multiple times and finally we use the average turnaround time. Figure 2 shows the result of the simulation for the 60 CPU cycle requirement with the resource number from 10 (half of the task number) to 60 (three times the task number). After this, in all figures, the horizontal axis indicates the number of resources, and the vertical axis indicates the turnaround time in msec.

Figure 2 - 60 CPU cycle request (legend: Optimum, Dup_, Wait, MinMin, MaxMin)

The simulation results show that:

1) Executing on a weak resource produces less turnaround time than waiting for a proper resource.

2) Dup_ has the minimum turnaround time. Because some tasks are executed on weak resources, duplication gives them the chance of executing on strong resources, so the turnaround time is decreased.

3) The Dup_ average turnaround time is 6.7% less than that of the Min_Min heuristic.

4) The Dup_ average turnaround time is 2.67% greater than that of the Max_Min and Sufferage heuristics.

Based on these assumptions and algorithms, we do the simulation. The results for average turnaround time are shown in Figures 3, 4, 5, 6 and 7. From the figures, we can see that the average turnaround time for the Dup_ method is much lower than that for the other practical methods, especially when the CPU request is high. By assigning the proper value to the threshold, Dup_ becomes the same as the Optimum method.

Figure 3 - 50 CPU cycle request (legend: Optimum, Dup_, Wait, MinMin, MaxMin)
Figure 4 - 40 CPU cycle request (legend: Optimum, Dup_, Wait, MinMin, MaxMin)
Figure 5 - 30 CPU cycle request (legend: Optimum, Dup_, Wait, MinMin, MaxMin)
Figure 6 - 20 CPU cycle request (legend: Optimum, Dup_, Wait, MinMin, MaxMin)
Figure 7 - 10 CPU cycle request

Comparing the above figures, we can find the following results:

1) Most of the time, the threshold method has less turnaround time than the Wait method, because executing on a weak resource is better than waiting for a proper resource.

2) Comparing the threshold method and the Optimum method shows that they are the same for strong requests. The difference between them appears for low requests, where the Optimum method has less turnaround time. But two points should be noted. First, the user has a low request, so the execution time is longer; if the user wants less execution time, it should require more capabilities. Second, in this simulation the threshold is one, but as explained before, the threshold can be changed dynamically in relation to environment changes, so the threshold method can be similar to the Optimum method.

3) Most of the time Dup_ has less turnaround time in comparison with the other methods. It also has less cost in comparison with retrying and checkpointing.

4) Dup_ has at least 45% improvement over the General method, which uses the first come first service (FCFS) strategy. So Dup_ is a suitable and reliable method for matching in a grid environment.

6) The Dup_ average turnaround time is 3.35% less than Min_Min, and the threshold method's average turnaround time is 6.% less than Min_Min. This is a very important result, because Min_Min has extra input.
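To make the comparison between the Optimum method and the threshold variant concrete, the following is a minimal sketch of the satisfaction rule in (4) and of the two selection strategies. It rests on several assumptions that are not stated in the paper: the `QoS` record, the function names, the bandwidth weight of 0.1, the sample values, and the first-fit scan order used by `threshold_match` are all hypothetical. Only the weighted ratio sum, the separate memory check, the CPU weight of 0.9, the default threshold of one, and the fallback to the best available resource come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class QoS:
    """Either a task's request or a resource's advertised capability."""
    cpu: float
    bandwidth: float
    memory: float


def aggregate(task: QoS, resource: QoS, weights: Dict[str, float]) -> float:
    """Weighted sum of capability/request ratios for the non-memory parameters."""
    return (weights["cpu"] * resource.cpu / task.cpu
            + weights["bandwidth"] * resource.bandwidth / task.bandwidth)


def satisfies(task: QoS, resource: QoS, weights: Dict[str, float],
              threshold: float = 1.0) -> bool:
    """Rule (4): memory is checked separately, the rest is aggregated."""
    return (resource.memory >= task.memory
            and aggregate(task, resource, weights) >= threshold)


def optimum_match(task: QoS, resources: List[QoS],
                  weights: Dict[str, float]) -> Optional[QoS]:
    """Pick the memory-feasible resource with the maximum aggregate value."""
    feasible = [r for r in resources if r.memory >= task.memory]
    return max(feasible, key=lambda r: aggregate(task, r, weights), default=None)


def threshold_match(task: QoS, resources: List[QoS], weights: Dict[str, float],
                    threshold: float = 1.0) -> Optional[QoS]:
    """Take the first resource that passes the threshold (assumed scan order);
    if none passes, fall back to the best available resource."""
    for r in resources:
        if satisfies(task, r, weights, threshold):
            return r
    return optimum_match(task, resources, weights)


# Illustrative values only.
weights = {"cpu": 0.9, "bandwidth": 0.1}
task = QoS(cpu=30.0, bandwidth=10.0, memory=4.0)
pool = [
    QoS(cpu=20.0, bandwidth=10.0, memory=8.0),  # too weak for threshold 1
    QoS(cpu=40.0, bandwidth=10.0, memory=8.0),  # first to pass the threshold
    QoS(cpu=60.0, bandwidth=10.0, memory=2.0),  # fast, but not enough memory
    QoS(cpu=60.0, bandwidth=10.0, memory=4.0),  # highest aggregate value
]

print(optimum_match(task, pool, weights))
print(threshold_match(task, pool, weights))
```

In this sketch the threshold variant stops at the first acceptable resource and so needs no global maximum, which is one way to read the trade-off the results describe: it can return a weaker resource than the Optimum method when requests are low, and raising the threshold moves its choice towards the Optimum one.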
The Min_Min heuristic considers the whole set of unmapped tasks. It finds the set of minimum completion times (MCT) corresponding to each unmapped task. It then selects the task with the overall minimum MCT from the set to be mapped next. It continues until all the unmapped tasks are mapped. But our heuristics only get the advertisement inputs and requirements.

5. Conclusions

In this paper, we propose a resource management service that provides optimal resource selection and a fault tolerance service. The contributions of this work are as follows:

(i) We propose a resource manager for optimal resource selection. The resource manager considers the requirements of the job and the resource capabilities. It selects the optimal resources that guarantee the optimal performance, where turnaround time is chosen as the metric for performance evaluation.

(ii) We present a fault management service to guarantee that the submitted jobs are completed reliably and efficiently. We perform the simulation to measure the performance improvement due to optimal resource selection and job duplication.

(iii) Three ways are proposed for resource selection: Optimum, the threshold method, and Dup_. With our algorithms, a single resource is decided automatically for any request when multiple available resources are found, so there is no need to ask the user to select the resource manually from a large list of available matching resources.

The simulation shows that the Dup_ average turnaround time is 3.35% less than Min_Min. The value of the threshold changes dynamically in relation to environmental changes such as the number of idle resources, whereas the existing mapping systems lack the ability of inexact matching. In the future, we plan to implement our resource manager as a part of the Globus Toolkit and carry out various experiments for measuring the efficiency of the resource manager and job duplication. Also, we will investigate ways of selecting the best value of the threshold.

References

[1] I. Foster, A. Roy, V. Sander, A quality of service architecture that combines resource reservation and application adaptation, 8th International Workshop on Quality of Service, 2000.
[2] I. Foster, C. Kesselman, Globus: a metacomputing infrastructure toolkit, Int. J. Supercomputer Appl. 11(2), 1997.
[3] I. Foster, C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, Los Altos, CA, 2004.
[4] J. Chen and Y. Yang, A Taxonomy of Grid Workflow Verification and Validation, Concurrency and Computation: Practice and Experience, Wiley, 2008, 20(4), 347-360.
[5] N.T. Anh, Integrating fault-tolerance techniques in grid applications, Ph.D. Dissertation, August 2000.
[6] Y. Tanaka, H. Nakada, S. Sekiguchi, T. Suzumura, S. Matsuoka, Ninf-G: a reference implementation of RPC-based programming middleware for grid computing, J. Grid Computing 1(1), 2003.
[7] J. Chen and Y. Yang, Adaptive Selection of Necessary and Sufficient Checkpoints for Dynamic Verification of Temporal Constraints in Grid Workflow Systems, ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2(2): Article 6, 2007.
[8] A. Iamnitchi, I. Foster, A problem-specific fault-tolerance mechanism for asynchronous, distributed systems, Proceedings of the 2000 International Conference on Parallel Processing, 2000.
[9] A. Waheed, W. Smith, J. George, J. Yan, An infrastructure for monitoring and management in computational grids, Proceedings of the 5th Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers, March 2000.
[10] J. Frey, I. Foster, M. Livny, T. Tannenbaum, S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids, University of Wisconsin, Madison, 2001.
[11] R. Buyya, D. Abramson, J. Giddy, Nimrod/G: an architecture of a resource management and scheduling system in a global computational grid, HPC Asia, May 2000.
[12] A. Grimshaw, W. Wulf, Legion - a view from 50,000 feet, Proceedings of 5th IEEE Symposium on High Performance Distributed Computing, 1996.
[13] L.M. Khanli, M. Analoui, An Approach to Grid Resource Selection and Fault Management Based on ECA Rules, Future Generation Computing System, 2007, doi:10.1016/j.future.2007.05.002.
[14] L.M. Khanli, M. Analoui, Grid-JQA a New Architecture for QoS-guaranteed Grid Computing System, Feb 15-17, PDP2006, France, 2006.
[15] L.M. Khanli, M. Analoui, Grid-JQA: Grid Java based Quality of service management by Active database, 4th Australian Symposium on Grid Computing and e-Research, AusGrid 2006.
[16] R. Raman, M. Livny, M. Solomon, Resource management through multilateral matchmaking, Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing, 2000.
[17] R.A. Moreno, Job scheduling and resource management techniques in dynamic grid environments, The Proceedings of the 1st European Across Grids Conference, 2002.
[18] L. Yang, J.M. Schopf, I. Foster, Conservative scheduling: using predicted variance to improve scheduling decisions in dynamic environments, The Proceedings of the ACM/IEEE SC2003 Conference, 2003.
[19] P. Stelling, I. Foster, C. Kesselman, C. Lee, G. von Laszewski, A fault detection service for wide area distributed computations, Proceedings of 7th IEEE Symposium on High Performance Distributed Computing, 1998.
[20] M. Swany, R. Wolski, Representing dynamic performance information in grid environments with the network weather service, 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid2002), Berlin, Germany, May 2002.