CHAPTER 5 IMPLEMENTATION OF THE PROPOSED GRID NETWORK MONITORING SYSTEM IN CRB

This chapter discusses the implementation of the proposed grid network monitoring system and its integration with the CARE Resource Broker (CRB) for best resource selection. It presents the Network Aware Resource Monitoring algorithm and resource selection with CRB integration, the mobile agent based automated deployment of the Network Aware Resource Monitoring service, and the network performance prediction model used to predict Grid network performance.

5.1 EXPERIMENTAL SETUP

The experimental setup has been realized in the CARE Research Laboratory to test the proposed Grid network monitoring system integrated with CRB. The proposed work has been developed as Grid services and deployed over GT4. The service interfaces are described in WSDL, and the services are deployed on the Grid Resources and on the Resource Broker. The proposed Grid network monitoring enables a network aware resource selection strategy that improves scheduling and the utilization of Grid resources, because it takes both resource and network performance estimates into account when selecting a suitable resource for job submission.

Figure 5.1 Experimental Setup

The experimental setup has three Grid resources, namely smscluster.care.mit.in, xencluster.care.mit.in and CAREcluster.care.mit.in, as shown in Figure 5.1. All Grid resources are configured as Beowulf clusters with 10, 6 and 5 nodes respectively. The configuration of the machines used for experimentation is as follows.

The smscluster consists of one head node and 9 computing nodes. The head node runs Red Hat Enterprise Linux (RHEL)-5.0 as its operating system, Globus Toolkit-4.0.7 as its grid middleware and Portable Batch System (PBS) as its local resource manager. It has 2 GB RAM, one hard disk (320 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit). The computing nodes run RHEL-5.0 and PBS MOM (machine oriented miniserver), with 2 GB RAM, one hard disk (250 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit).

The xencluster consists of one head node and 5 computing nodes. The head node runs RHEL-5.0 as its operating system, Globus Toolkit-4.0.7 as its grid middleware and PBS as its local resource manager. It has 2 GB RAM, one hard disk (320 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit). The computing nodes run RHEL-5.0 and PBS MOM (machine oriented miniserver), with 2 GB RAM, one hard disk (320 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit).

The CAREcluster consists of one head node and 4 computing nodes. The head node runs RHEL-4.0, Globus Toolkit-4.0.7 as its grid middleware and PBS as its local resource manager. It has 4 GB RAM, one hard disk (320 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit). The computing nodes run RHEL-4.0 and PBS MOM (machine oriented miniserver), with 4 GB RAM, one hard disk (320 GB, 7200 rpm, SATA), a 3.46 GHz processor and one network interface card (Broadcom NetXtreme Gigabit).

The CRB is installed on server hardware with 4 quad-core CPUs at 2000 MHz per processor, 16 GB RAM, RHEL-5.0, the Xen API and VMware API archives, and Globus Toolkit-4.0.7. The three clusters push their information to the CRB and thereby become Grid resources of the CRB. The Resource Monitoring service, Network Monitoring service and Network Aware Resource Monitoring service have been deployed on the head nodes of the physical resources and on the CRB.

5.2 IMPLEMENTATION MODEL

The proposed monitoring system is implemented and tested in the Grid Computing Laboratory of Anna University, Chennai. The implementation uses GT4.0.7 as the grid middleware and Java 1.6.0_16 as the Java runtime environment. The services are written as WSRF based Grid services and deployed in GT. The operational flow of the proposed monitoring system is as follows. The user submits the job to the CRB, specifying all resource requirements for its execution. Depending on the user's resource requirements, the services deployed in the Grid are invoked to select a suitable resource for job submission. Monitoring is done by the Grid service deployed in GT4 in conjunction with mobile agents created using the Aglets agent platform. Aglets start the monitoring tools on the Grid Resources by cloning. Another aglet kills all processes of the monitoring tools after the service is stopped.

5.2.1 Resource Monitoring Service

The Resource Monitoring service runs periodically and collects the resource information specific to each cluster and its compute nodes. The collected information is provided to the information repository and maintained in the host pool. The service generates a mobile agent that migrates from the Resource Broker to the Grid cluster Head (GH) and then to all compute nodes, starting the sensors deployed on all Grid Resources. For a given set of m Grid resources ($GR_1, GR_2, \ldots, GR_m$), there are n computing elements ($CE_1, CE_2, \ldots, CE_n$). The mobile agent migrates from the CARE Resource Broker to all Grid Resources and retrieves the free memory of all their computing elements. The Resource Cost Value (RCV) is then estimated for all Grid resources using Equation (4.6), discussed in the previous chapter, and all resource cost values are stored in the global archive. From the CRB's list of matched resources, the resource selector selects the resource with the highest RCV as the best resource. The resource selection algorithm based on resource monitoring is described below.

    while (there are unsubmitted jobs) {
        Update resource performance using RCV with the jobs scheduled in previous intervals;
        Select the resource for job submission which has the maximum RCV;
        foreach (unsubmitted job) {
            Match the job to a resource set that satisfies the requirements at the job level;
            Schedule the job;
        }
        do {
            Assign mapped jobs to each compute resource heuristically;
        } until (all jobs are submitted or no more jobs can be submitted);
        wait until the next scheduling event;
    }

5.2.2 Network Monitoring Service

The Network Monitoring service runs periodically and collects network metrics from sensors invoked by the mobile agents; the network cost function then measures the network performance. The service generates a mobile agent that migrates from the Resource Broker to the Grid cluster Head (GH) and then to all compute nodes, starting the sensors deployed on all Grid Resources. It retrieves the network metrics bandwidth, RTT, packet loss and jitter for all links between the Grid resources. For a given set of m Grid Resources ($GR_1, GR_2, \ldots, GR_m$), there are n Computing Elements ($CE_1, CE_2, \ldots, CE_n$).

The Network Monitoring service has a major impact in Data Grids, where executing an application or job requires large data transfers. The algorithm for resource selection based on network monitoring is described below.

    while (there are unsubmitted jobs) {
        Update network performance using NCV with the jobs scheduled in previous intervals;
        Select the resource for job submission which has the maximum NCV;
        foreach (unsubmitted job) {
            Match the job to a resource set that satisfies the requirements at the job level;
            Schedule the job;
        }
        do {
            Assign mapped jobs to each compute resource heuristically;
        } until (all jobs are submitted or no more jobs can be submitted);
        wait until the next scheduling event;
    }

The Network Cost Value (NCV) is estimated for all Grid resources with the Network Cost Function (NCF) of Equation (4.5), discussed in the previous chapter. All network cost values are then stored in the global archive for further prediction. The cost function $NCF_{RB,GS}$ measures the network performance of the link between the Resource Broker (RB) and a Grid Resource, also called Grid Cluster Head or Grid Site (GS).

$NCF_{RB,GS} = e^{\ldots}$    (5.1)

$NCF_{RBList} = (NCF_{RB,GS_1}, NCF_{RB,GS_2}, \ldots, NCF_{RB,GS_m})$    (5.2)

The cost functions for the links between each Grid cluster Head (GH), that is, the Grid Resource, and all its Computing Elements (CE) are then estimated to measure the network performance of those links and of the Grid Resource as a whole.

$NCF_{GH,CE} = e^{\ldots}$    (5.3)

$NCF_{CEList} = (NCF_{GH_1,CE_1}, NCF_{GH_1,CE_2}, \ldots, NCF_{GH_1,CE_n})$    (5.4)

$NCF_{GS} = \overline{NCF}_{CEList}$    (5.5)

$NCF_{GSList} = (NCF_{GS_1}, NCF_{GS_2}, \ldots, NCF_{GS_m})$    (5.6)

Here $\overline{NCF}_{CEList}$ denotes the average of the link costs in $NCF_{CEList}$, matching the averaging step of the selection algorithm in Section 5.2.3. The Network Cost Value is computed by the following expression.

$NCV_{GS_k} = (NCF_{RB,GS_k} + NCF_{GS_k}) / 2$    (5.7)

where $k = 1, 2, \ldots, m$ and m is the number of Grid sites or Grid Resources available in the Grid environment. The NCV lies in [0, 1] because every cost function lies in [0, 1].

5.2.3 Network Aware Grid Monitoring Service for Resource Selection

One of the Grid Resource Broker's tasks is to find a suitable node for a submitted job. Network parameters can influence scheduling decisions and help the resource broker select a suitable resource. The end-to-end path characteristics between the destination and each source have a major impact on both the measurement and the prediction of network performance. Accurate prediction of large file transfer performance needs measurements of available bandwidth, packet loss, RTT and jitter. As network characteristics are highly dynamic, each observation of the metrics must be stamped with timing information indicating when the observation was made.

The user submits the job to the Grid Resource Broker with a detailed specification of the requirements needed to execute it, and this job specification is the primary selection rule. The secondary selection rule combines the Resource Cost Value (RCV) and the Network Cost Value (NCV), both of which are computed for all Grid Resources in the Grid environment. The Compound Cost Value (CCV) is computed by combining RCV and NCV for each Grid Resource. The Network Aware Grid Monitoring Service identifies the Grid Site or Grid Resource with the highest CCV; that Grid Resource offers the best computing environment for the submitted job. The resource selection process based on resource and network monitoring is shown in Figure 5.2.

Figure 5.2 Network Aware Resource Selection Process in Grid
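As a minimal sketch of Equations (5.5) and (5.7), the Python fragment below combines per-link cost values into one NCV for a site. It assumes that the site-internal cost is the plain mean of the head-node-to-compute-node link costs, as the averaging step of the selection algorithm suggests. The function names and the sample numbers are hypothetical and are not taken from the testbed.

```python
from statistics import mean


def site_network_cost(ncf_gh_ce):
    """Mean NCF over the head-node-to-compute-node links of one site.

    Assumed reading of Equation (5.5); every input is expected in [0, 1].
    """
    if not ncf_gh_ce:
        raise ValueError("a site needs at least one compute-node link cost")
    return mean(ncf_gh_ce)


def network_cost_value(ncf_rb_gs, ncf_gh_ce):
    """NCV of one site: average of the broker-to-site link cost and the
    site-internal cost, following Equation (5.7)."""
    return (ncf_rb_gs + site_network_cost(ncf_gh_ce)) / 2.0


# Hypothetical NCF values, not measurements.
ncv = network_cost_value(0.8, [0.6, 0.7, 0.5])
print(round(ncv, 3))
```

Because both terms are averages of values in [0, 1], the result stays in [0, 1], which is the range property stated for the NCV.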

The CRB perspective of the proposed Network Aware Resource Monitoring System is shown in Figure 5.3.

Figure 5.3 CRB's perspective of the Proposed Network Aware Resource Monitoring

The user submits the job to the CARE Resource Broker (CRB). The CRB gathers information about the available computational resources through the Network Aware Grid Monitoring Service and the Global Archive. Resources that meet the specifications and minimum requirements, such as minimum free memory and network threshold, are considered suitable candidates for job execution. The CRB then creates the jobs according to the application description provided by the user. The scheduler within the broker decides where to submit a job based on resource availability and on the compound cost value, which is derived from the resource cost value and the network cost value. The scheduler dispatches the job to the selected remote computational resource. After the job has finished processing, the results are sent back to the Resource Broker, where the user submitted the job. This process is repeated until all jobs in the set have completed.

For a given set of m Grid resources ($GR_1, GR_2, \ldots, GR_m$), there are n computing elements ($CE_1, CE_2, \ldots, CE_n$). The RCV is computed using Equation (4.6) and the Network Cost Value using Equation (4.5) for all Grid resources. The average of these two cost values, which lies in [0, 1], is called the Compound Cost Value (CCV). For the given set of m Grid resources, the CCV is computed using the following expression.

$CCV_i = (NCV_i + RCV_i) / 2$    (5.8)

where $i = 1, 2, \ldots, m$ and m is the number of Grid sites or Grid Resources in the Grid environment. The Grid Resource with the highest CCV is selected as the best resource for submitting the job, as represented by the following expression.

$B_{GS} = \max_{1 \le k \le m} (CCV_k)$    (5.9)

The following code segment specifies the interface of the service that computes the CCV for all Grid Resources in the Grid environment to identify the best resource for the submitted job.

    <?xml version="1.0" encoding="UTF-8"?>
    <definitions name="CostService"
        targetNamespace="http://www.grid.software/namespaces/first/cost_service"
        xmlns="http://schemas.xmlsoap.org/wsdl/"
        xmlns:tns="http://www.grid.software/namespaces/first/cost_service"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <types>
        <xsd:schema
            targetNamespace="http://www.grid.software/namespaces/first/cost_service"
            xmlns:tns="http://www.grid.software/namespaces/first/cost_service"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
          <!-- Request element: takes no parameters -->
          <xsd:element name="getcostvalues">
            <xsd:complexType/>
          </xsd:element>
          <!-- Response element: carries the computed value as a string -->
          <xsd:element name="getcostvaluesresponse">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="Finalvalue" type="xsd:string"/>
              </xsd:sequence>
            </xsd:complexType>
          </xsd:element>
        </xsd:schema>
      </types>
      <message name="getcostvaluesinputmessage">
        <part name="parameters" element="tns:getcostvalues"/>
      </message>
      <message name="getcostvaluesoutputmessage">
        <part name="parameters" element="tns:getcostvaluesresponse"/>
      </message>
      <portType name="costporttype">
        <operation name="getcostvalues">
          <input message="tns:getcostvaluesinputmessage"/>
          <output message="tns:getcostvaluesoutputmessage"/>
        </operation>
      </portType>
    </definitions>

The algorithm for selecting the best resource from the CRB's list of matched resources using the Network Aware Resource Monitoring approach is described below.

    while there are unsubmitted jobs do
    begin
        S_j ← {R_i}; RCV_list ← ∅; NCV_list ← ∅; CCV_list ← ∅   // matched resources list from CRB for job j
        for all R in S_j do
            Update the resource information
            RCV_R ← avg(freemem) / max(freemem)
            RCV_list ← RCV_list ∪ {RCV_R}
        end
        for all R in S_j do
            Update the network metrics information
            NCF_{B,R} ← cost(link(gb, gs))        // cost of the link using Equation 5.1
            foreach r ∈ R do
                Update the network metrics information
                NCF_{R,r} ← cost(link(gs, ce))    // cost of the link using Equation 5.3
            end
            NCF_R ← avg{NCF_{R,r}, r ∈ R}
            NCV_R ← (NCF_{B,R} + NCF_R) / 2
            NCV_list ← NCV_list ∪ {NCV_R}
        end
        for all R in S_j do
            CCV_R ← (RCV_R + NCV_R) / 2
            CCV_list ← CCV_list ∪ {CCV_R}
        end
        B_GS ← max{CCV_list}
    end
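The final stage of the algorithm above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the deployed Grid service. RCV follows the avg(freemem) / max(freemem) step of the pseudocode, the NCV of each resource is taken as an already computed input, and the dictionary keys `free_mem_mb` and `ncv` as well as all numbers are hypothetical.

```python
def resource_cost_value(free_mem_mb):
    """RCV of one resource: avg(freemem) / max(freemem) over its computing elements."""
    if not free_mem_mb or max(free_mem_mb) <= 0:
        raise ValueError("need at least one node with positive free memory")
    return (sum(free_mem_mb) / len(free_mem_mb)) / max(free_mem_mb)


def select_best_resource(matched):
    """Return (name, CCV) of the matched resource with the highest CCV.

    `matched` maps a resource name to a dict with the hypothetical keys
    'free_mem_mb' (per-node free memory) and 'ncv' (a value in [0, 1]).
    """
    if not matched:
        raise ValueError("the matched resource list is empty")
    # CCV is the average of RCV and NCV, as in Equation (5.8).
    ccv = {
        name: (resource_cost_value(info["free_mem_mb"]) + info["ncv"]) / 2.0
        for name, info in matched.items()
    }
    best = max(ccv, key=ccv.get)  # resource attaining the maximum CCV
    return best, ccv[best]


# Illustrative numbers only; the cluster names come from Section 5.1.
matched = {
    "smscluster": {"free_mem_mb": [1024, 2048, 1024], "ncv": 0.70},
    "xencluster": {"free_mem_mb": [512, 2048], "ncv": 0.90},
    "CAREcluster": {"free_mem_mb": [4096, 4096], "ncv": 0.40},
}
best, best_ccv = select_best_resource(matched)
print(best, round(best_ccv, 4))
```

In this made-up input, CAREcluster has the highest RCV but a weak network value, so a resource with a better network path wins the selection. That is the behaviour the compound value is meant to produce.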

5.2.4 Job Monitoring Service

The Job Submission Description Language (JSDL) script contains a description of the job to be executed. The JSDL specification of the job request is transferred from the Resource Broker to the selected resource for execution. The job monitoring service is deployed in the CRB to monitor the status and progress of the submitted job. During job execution, its status and progress are tracked and reported to the user through the Resource Broker. After the job execution completes, the output is reported to the user. Other job attributes, such as the current directory of the job and its resource consumption, are also reported to the user during execution. The job monitoring process is shown in Figure 5.4.

    1. User submits the job
    2. Perform delegation of credentials for client
    3. Submit job to WS-GRAM service of remote node
    4. Send acknowledgement to client
    5. Store the credentials and notify pending status
    6. Submit the job to scheduler
    7. Notify the active status of submitted job
    8. Execute the job
    9. Notify cleanup status on completion
    10. Report the result to user

Figure 5.4 Job monitoring process
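The status notifications in the process above (pending, active, cleanup) can be tracked with a small ordered-state recorder. The Python sketch below is purely illustrative. The class name `JobMonitor`, the job identifier and the final state name "done" are assumptions, and the code does not call WS-GRAM.

```python
# Status order taken from the notifications in the job monitoring process;
# "done" is an assumed name for the state reached after cleanup.
STATUS_ORDER = ["pending", "active", "cleanup", "done"]


class JobMonitor:
    """Records status notifications for one job and rejects out-of-order ones."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.history = []

    def notify(self, status):
        if status not in STATUS_ORDER:
            raise ValueError("unknown status: %s" % status)
        # A status may only move forward; repeats and regressions are rejected.
        if self.history and STATUS_ORDER.index(status) <= STATUS_ORDER.index(self.history[-1]):
            raise ValueError("status %s arrived after %s" % (status, self.history[-1]))
        self.history.append(status)

    @property
    def finished(self):
        return bool(self.history) and self.history[-1] == "done"


monitor = JobMonitor("job-1")  # hypothetical job identifier
for status in ["pending", "active", "cleanup", "done"]:
    monitor.notify(status)
print(monitor.history, monitor.finished)
```

Keeping the full history rather than only the latest status lets the broker report progress to the user at any point during execution.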

5.3 IMPLEMENTATION OF AUTOMATED DEPLOYMENT OF NETWORK AWARE RESOURCE MONITORING SERVICE

Whenever a new resource joins the Grid, it registers itself in the registration database, which resides in the Resource Broker. Registration is done by sending the resource's IP address. The registration node maintains a database of the IP addresses of all Grid resources. The registration node then replies with the IP address of the resource where the service is located, i.e. the Resource Broker. After receiving the IP address of the Resource Broker, the newly joined resource invokes the deployment agent, which resides in the Resource Broker. The agent deploys the requested service on the new resource. The process of the mobile agent based automated deployment of the Network Aware Resource Monitoring service on a new resource is shown in Figure 5.5.

Figure 5.5 The Process of Automated Deployment of Proposed Monitoring Service

The deployment agent holds the details of which files are to be transferred and which commands are to be run for the deployment. The deployed services are used for selecting a suitable resource to which a job can be submitted.

GRAM services are used for secure job submission to various types of schedulers. The job monitoring module periodically updates the job status, and after the job completes the result is reported to the user.

5.4 NETWORK PERFORMANCE PREDICTION MODEL

Network metrics such as bandwidth, RTT, packet loss and jitter are measured on all end-to-end links in the Grid environment. The Network Cost Function value between the end-to-end nodes is stored, along with a time stamp, in the information repository, which provides data to the visualiser module. Predicting future performance is a complex activity in network management. It requires extensive observation of network status and identification of past patterns. Our predictor is linear and History-Based (HB), and it is shown in Figure 5.6. The model uses standard time series forecasting techniques to predict performance from a history of measurements of previous behaviour on the same path.

Figure 5.6 Network Performance Prediction Model

The non-seasonal Holt-Winters predictor is a variation of the Exponentially Weighted Moving Average (EWMA). It captures the trend in the underlying time series, if such a trend exists. It maintains a separate smoothing component $\hat{Y}^s_i$ and a trend component $\hat{Y}^t_i$, and it depends on two parameters $\alpha$ and $\beta$, both in (0, 1). The predicted value at time i is

$\hat{Y}^f_i = \hat{Y}^s_i + \hat{Y}^t_i$    (5.10)

where

$\hat{Y}^s_{i+1} = \alpha Y_i + (1 - \alpha) \hat{Y}^f_i$    (5.11)

and

$\hat{Y}^t_{i+1} = \beta (\hat{Y}^s_{i+1} - \hat{Y}^s_i) + (1 - \beta) \hat{Y}^t_i$    (5.12)

The initial values are $\hat{Y}^s_0 = Y_0$ and $\hat{Y}^t_0 = Y_1 - Y_0$ respectively, assuming that the time series starts at i = 0. This HB Holt-Winters prediction suits the proposed design, and its predictions largely match the actual measurements made with the proposed network monitoring system. The predicted and measured values are compared in the next section as evidence of this accuracy. The predicted results will be used to tune network performance for effective monitoring of the network, which addresses resource utilization issues in the Grid.
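A minimal Python sketch of a non-seasonal Holt-Winters predictor is given below to make the recurrence concrete. It assumes the standard form of the method: a smoothing component updated with weight alpha, a trend component updated with weight beta, and the initial values Y_0 and Y_1 - Y_0 stated above. The function name and the sample series are hypothetical, and the sample values are not measurements from the monitoring system.

```python
def holt_winters_next(series, alpha, beta):
    """Forecast the next value of `series` with non-seasonal Holt-Winters.

    smooth and trend start at Y_0 and Y_1 - Y_0; each observation Y_i then
    updates them using the forecast f_i = smooth + trend made before Y_i.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations to initialise the trend")
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must both lie in (0, 1)")

    smooth = series[0]
    trend = series[1] - series[0]
    for y in series:
        forecast = smooth + trend                              # predicted value at time i
        new_smooth = alpha * y + (1.0 - alpha) * forecast      # smoothing update
        trend = beta * (new_smooth - smooth) + (1.0 - beta) * trend  # trend update
        smooth = new_smooth
    return smooth + trend                                      # forecast for the next step


# Hypothetical, slowly rising cost values with time stamps omitted.
history = [0.60, 0.62, 0.65, 0.66, 0.70]
print(round(holt_winters_next(history, alpha=0.5, beta=0.3), 4))
```

Larger alpha makes the smoothing component follow recent measurements more closely, and larger beta makes the trend react faster to a change in slope. Both would have to be tuned against measured history for a given path.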