CHAPTER 6 MAJOR RESULTS AND CONCLUSIONS

The thesis has demonstrated the proposed scheduling algorithms, together with the heuristic intensive weightage factors and the parameters α and β, and their impact on the performance of the algorithms. The proposed ACO algorithms take both static and dynamic attributes of the resources into consideration before a resource is chosen to execute a task. In this way, the selection of a resource can effectively avoid being influenced by fluctuations in resource performance. The algorithms work in such a way that the parameters adjust automatically based on the resource status and the behavior of the network.

Extensive simulations have been carried out to demonstrate the effectiveness of the proposed scheduling techniques. The main objective of the experiments is to submit jobs of different sizes at different intervals, under various resource loads and fluctuating network bandwidth, with the expectation that the overall execution time is reduced. The simulation results of the adaptive scheduling algorithms are presented and discussed. The network API of the simulation tool was used to provide dynamic information about the status of the network.
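As a minimal sketch of how an ACO scheduler might weigh pheromone against heuristic information when choosing a resource, the following Python example applies the standard ACO random-proportional rule. The function names, the sample pheromone and heuristic values, and the reading of α and β as the pheromone and heuristic exponents are assumptions made for illustration; the thesis's own heuristic equation and intensive weightage factors are not reproduced here.

```python
import random

def selection_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Standard ACO random-proportional rule (illustrative only).

    pheromone[j] and heuristic[j] are positive scores for resource j;
    alpha and beta weight the pheromone trail and the heuristic value.
    """
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]

def choose_resource(pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Pick a resource index at random, in proportion to its probability."""
    probs = selection_probabilities(pheromone, heuristic, alpha, beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical scores for three resources (not taken from the thesis).
pheromone = [1.0, 1.0, 2.0]
heuristic = [0.5, 1.0, 1.0]
probs = selection_probabilities(pheromone, heuristic)
chosen = choose_resource(pheromone, heuristic)
```

With these sample values the third resource receives the highest probability, because it has the strongest pheromone trail and a heuristic value that is not penalized by β. An adaptive variant would recompute the heuristic values, and possibly α and β, whenever the resource or network status changes.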

The algorithm is strengthened through the following methodologies:
1. The algorithm is designed so that the parameters adjust automatically, through a parameter-adjusting procedure, based on the dynamic behavior of the resources and the network.
2. Intensive weightage factors are added to the heuristic equation to make good scheduling decisions and to prioritize the QoS factors.

In chapter 3, the proposed adaptive QoS-guided ACO algorithm for data-intensive grid scheduling is evaluated against conventional ACO algorithms that do not consider intensive weightage factors or a parameter-adjusting procedure. The results indicated that considerable optimization could be achieved by dynamically optimizing the key elements: intensive weightage factors, heuristic value and pheromone intensity.

The modified scheduling algorithm presented in chapter 4 provides a solution that combines both application-centric and system-centric benefits. Using two metrics, namely economic cost and makespan, the performance of the proposed algorithm was compared against an application-centric algorithm proposed by Venugopal et al. (2005) and a system-centric algorithm proposed by Zhao et al. (2006).
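The two comparison metrics named above can be made concrete with a small Python sketch. It assumes independent tasks, resources that all become available at time zero, tasks on the same resource running one after another, and a flat price per second; the tuple layout, function names and sample numbers are hypothetical and are not taken from the thesis experiments.

```python
def makespan(assignments):
    """Latest finish time over all resources.

    Each assignment is (resource, runtime_seconds, price_per_second).
    Tasks mapped to the same resource are assumed to run sequentially.
    """
    busy = {}
    for resource, runtime, _price in assignments:
        busy[resource] = busy.get(resource, 0.0) + runtime
    return max(busy.values()) if busy else 0.0

def economic_cost(assignments):
    """Total amount the consumer pays: runtime times price, summed over tasks."""
    return sum(runtime * price for _resource, runtime, price in assignments)

# Hypothetical schedule: two tasks on r1, one task on r2.
schedule = [("r1", 10.0, 2.0), ("r1", 5.0, 2.0), ("r2", 12.0, 1.0)]

print(makespan(schedule))       # 15.0 (r1 is busy for 10 + 5 seconds)
print(economic_cost(schedule))  # 42.0 (20 + 10 + 12)
```

The example shows why the two metrics can pull in different directions: moving a task from r1 to the cheaper r2 would lower the cost but could lengthen the makespan.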

Experimental results indicated that the adaptive ACO algorithm of chapter 4 showed better efficiency and reliability even under unreliable resource and network conditions. These results formed the basis for further investigation.

In chapter 5, three metrics, namely economic cost, resource utilization and revenue for the providers, are used to compare the performance of the proposed economic-based adaptive QoS-guided ACO algorithm with the one proposed in chapter 4. It is experimentally shown that the economic-based approach could decrease cost, increase revenues and maximize utilization. A comparison of the three algorithms in terms of different QoS metrics, namely makespan, reliability, control, economic cost, revenue for the providers and resource utilization, is shown in Table 6.1.
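For the provider-side metrics used in chapter 5, one common way to compute them is sketched below in Python. The definitions are assumptions for illustration: revenue is taken as runtime times a flat price per second, and utilization as busy time divided by an observation horizon. The names and sample numbers are hypothetical, and the thesis may define these metrics differently.

```python
def provider_revenue(assignments):
    """Revenue per resource: sum of runtime times price for the tasks it ran."""
    revenue = {}
    for resource, runtime, price in assignments:
        revenue[resource] = revenue.get(resource, 0.0) + runtime * price
    return revenue

def resource_utilization(assignments, horizon):
    """Fraction of the observation horizon (in seconds) each resource was busy."""
    busy = {}
    for resource, runtime, _price in assignments:
        busy[resource] = busy.get(resource, 0.0) + runtime
    return {resource: t / horizon for resource, t in busy.items()}

# Hypothetical schedule: (resource, runtime in seconds, price per second).
schedule = [("r1", 10.0, 2.0), ("r1", 5.0, 2.0), ("r2", 12.0, 1.0)]

revenue = provider_revenue(schedule)
utilization = resource_utilization(schedule, horizon=15.0)
```

Here r1 earns 30.0 and is busy for the whole 15-second horizon, while r2 earns 12.0 and is busy for 12 of the 15 seconds.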


The major findings of the study are as follows:
1. The proposed adaptive scheduling algorithm effectively utilizes the ACO approach and offers an improvement of 10–18% in reducing the makespan when the number of jobs and congestion rates are dynamically varied.
2. The ACO-based algorithm combines both application-centric and system-centric QoS. It also provides adaptive solutions to the scheduling problem, in which the algorithm and its parameters are used to make scheduling decisions according to the dynamic behavior of the resource and network performance.
3. The economic approach is found to offer an adaptive solution in which resource providers and consumers can take autonomous scheduling decisions, and both parties get sufficient incentives: the cost of an application is reduced for the consumer and the revenue of the provider is increased.

6.1 CONCLUSION

This thesis began by characterizing and categorizing the different aspects of a data grid. Data grids have several unique features, such as the presence of applications with heavy data and computing requirements, geographically distributed and heterogeneous resources under different administrative domains, and a large number of users who share resources and collaborate with each other. In the introduction to the thesis, the motivation, objectives and scope of the work are presented and the challenges in grid scheduling are described. The architecture of the scheduling process is briefly discussed. Further, the fundamental components of a data grid, such as data transport mechanisms, data replication systems, and resource allocation and job scheduling, are discussed.

In the literature review (chapter 2), several existing scheduling algorithms are discussed from different perspectives, such as static versus dynamic policies, objective functions, application models, QoS constraints, and strategies to deal with the dynamic behavior of resources.

Chapter 3 introduces a new class of ACO heuristic algorithm to tackle the dynamic and unpredictable characteristics of the grid and the complex nature of the scheduling problem. A formal description of the algorithm is presented there.

An ACO algorithm for scheduling data-intensive applications with various QoS requirements is dealt with in chapter 4. Further, the thesis focuses on system-centric and application-centric QoS requirements for data-intensive applications.

Chapter 5 highlights the economic-based ACO algorithm for data-intensive grid scheduling. The strengths and weaknesses of an economic model and the evaluation of the proposed economic model are presented in that chapter.

In chapter 6, the results are reviewed in view of the objectives set forth for the work.

The results of the simulation experiments are presented, and it is demonstrated how the proposed adaptive scheduling algorithms could be used to maximize the objective functions.

6.2 FUTURE WORK

This thesis enhances the understanding of data-intensive grid environments and contributes to their advancement in a few ways. Principally, it deals with the scheduling of applications that require multiple datasets, each replicated on multiple data repositories on the grid. However, this thesis has only explored ACO scheduling algorithms within the space of scheduling sets of independent tasks. It would be interesting to investigate the applicability of the matching heuristics to other techniques and task models, such as Genetic Algorithms (GAs) and Directed Acyclic Graphs (DAGs); DAGs are used to model workflows and process-oriented parallel applications. An immediate follow-up would be to implement the matching heuristics within DAG scheduling algorithms.

This thesis has investigated the properties that are unique to data grids. Currently, the utility of data grids is limited to scientific collaborations. However, some of the tools developed within data grids may be applicable to areas outside scientific computing, such as enterprises, with similar requirements for resource sharing and data access. This would require taking into account stricter reliability and security standards. Another challenge would be to extend existing data grid techniques to work with enterprise technologies such as databases.

Present-day data grids are based on the notion of sharing resources within virtual organizations. However, as the dependence on data grids increases, there will be higher demands for reliability and resource sharing. Service providers may not be able to fulfill these without investing in the infrastructure. Service consumers will require QoS guarantees enforced through Service Level Agreements (SLAs). Therefore, a wider exploration of the economic aspects of data grids requires investigation of the utility functions of the participants, SLAs and market mechanisms.