Design and Analysis of a Load Balancing Strategy in Data Grids
Xiao Qin
Department of Computer Science, New Mexico Institute of Mining and Technology, Socorro, NM 87801, [email protected]

Article published in Future Generation Computer Systems: The Int'l Journal of Grid Computing, vol. 23, no. 1, pp , Jan

Abstract
Developing Data Grids has increasingly become a major concern to make Grids attractive for a wide range of data-intensive applications. Storage subsystems are most likely to be a performance bottleneck in Data Grids and, therefore, the focus of this paper is to design and evaluate a data-aware load-balancing strategy to improve the global usage of storage resources in Data Grids. We build a model to estimate the response time of a job running at a local or remote site. In light of this model, we can calculate slowdowns imposed on jobs in a Data Grid environment. Next, we propose a load-balancing strategy that balances the load of a Data Grid in such a way that computation and storage resources in each site are simultaneously well utilized. We conduct experiments using a simulated Data Grid to analyze the performance of the proposed strategy. Experimental results confirm that our load-balancing strategy can achieve high performance for data-intensive jobs in Data Grid environments.

Key words: load balancing, data-intensive, Data Grids, performance evaluation

1 Introduction
In the last decade, Data Grids have increasingly become popular for a wide range of scientific and commercial applications [1]. A Data Grid can be envisioned as a collection of geographically dispersed computing and storage resources that provide a diversity of services [2][3] to fit the needs of data-intensive
applications like simulation analysis tools [4] and high-energy physics applications [5][6]. The Data Grid integrates data archives stored in an array of distributed sites into a single virtual data management system [7]. Due to a widening performance gap between CPU and storage devices [8], the impact of storage resources in a Data Grid on data-intensive applications becomes highly significant. As such, it is extremely important to develop load-balancing schemes that make effective use of global storage resources as well as CPU and memory.

In an effort to improve the performance of storage resources in Data Grids, we propose in this paper a Data-aware Load-Balancing strategy (or DLB for short) for Data Grids. Our scheme aims at balancing the load of a Data Grid in such a way that computation and storage resources at each site are simultaneously well utilized for a variety of data-intensive applications. After building a model to estimate the response time of a job running at a local or remote site, we conduct experiments using a simulated Data Grid to confirm that our approach can achieve high performance for data-intensive jobs running on a resource-sharing Data Grid.

The rest of the paper is organized as follows. In the section that follows, related work in the literature is briefly reviewed. Section 3 describes the system model of a Data Grid. In Section 4, we propose the data-aware load-balancing strategy. Section 5 evaluates the performance of the new strategy by comparing it with existing approaches. The paper concludes with Section 6.

2 Related Work
A handful of previous studies are geared towards leveraging data replication techniques to achieve high performance in Data Grids [3][9][10]. Existing data replication approaches in Data Grids can be divided into four camps, namely, (1) architecture and management for data replication, (2) data replication placement, (3) data replication selection, and (4) data consistency [7].
Load balancing and scheduling play a critical role in achieving high utilization of resources in Data Grids [3][11]. A large body of work has been done in the area of load balancing [12]. For example, many load-balancing policies were developed to achieve high performance by improving utilization of CPU [13][14][15][16] or memory resources [17][18]. Most existing load-balancing strategies are inadequate for Data Grids supporting data-intensive applications, since the existing schemes ignore the issue of load balancing among storage resources.

Several research efforts have focused on the design of cache and buffer techniques to optimize the performance of storage systems [19][20][21]. Existing cache and buffer techniques are orthogonal to our load-balancing strategies, meaning that our strategies can achieve extra performance improvements when applied in concert with the cache and buffer schemes.

Much attention has been paid to the development of load-balancing policies for disk systems [22][23]. Although previous disk techniques can improve storage performance by fully utilizing hard drives, these approaches are unable to effectively handle complex workload conditions where data-intensive applications share resources with other types of applications. Unlike the existing disk load-balancing schemes, our proposed strategy takes into account both computation and storage resources.

3 System Model
In this study we model a Data Grid as a collection of $n$ sites, each of which is comprised of computational facilities and a storage subsystem. Let $G = \{S_1, S_2, \ldots, S_n\}$ denote a set of sites, where each site $S_i$ is modeled as a vector with five parameters $S_i = (u_i, b_i, m_i, C_i, D_i)$, where $u_i$ is the number of computational nodes, $b_i$ is the bandwidth of the storage subsystem, $m_i$ is the memory capacity, $C_i$ is a set of jobs allocated to $S_i$, and $D_i$ is an array of datasets accessed and generated by data-intensive applications.

[Figure: Site 1 through Site N, connected by a network; each site receives jobs through a load manager and contains a local scheduler, a storage manager, computers, and a storage subsystem.]
Fig. 1. Block diagram of the system model of a Data Grid.

Fig. 1 shows the block diagram of the system model of a Data Grid. Each site in the Data Grid has a load manager that accommodates submitted jobs. The load manager aims at tracking global load information by periodically exchanging load status with other load managers in the system. Depending on load balancing policies employed in the Data Grid, the load manager may
execute locally submitted jobs or transfer them to other sites. If the local site decides to have a job executed by a remote site, the job will be dispatched to the remote site through a network. If the local site is not overloaded, the job will be accommodated and handled by a local scheduler. The functionalities of storage managers include monitoring local datasets and handling dataset requests issued by local schedulers [24].

A data-intensive job can be characterized by a set of parameters, i.e., $J_j = (q_j, d_j, A_j, tc_j, td_j, y_j)$, where $q_j$ is the number of requested nodes, $d_j$ is the size of required datasets, $A_j$ represents a set of sites storing the datasets, $tc_j$ is the required computational time, $td_j$ is the required time spent in data accesses, and $y_j$ is the memory requirement of $J_j$. $td_j$ is expressed by $d_j / b_k$, where $d_j$ is the amount of accessed data and $b_k$ is the bandwidth of the $k$th dedicated storage subsystem.

The load manager makes use of the following three equations to quantify the computational, memory, and data access load of the local site. Note that Eqs. 1 and 3 take into account heterogeneities in computation and data accesses.

\[ L^C_i = \frac{\max_{k=1}^{n}(p_k)}{p_i} \sum_{J_j \in C_i} (q_j \, tc_j), \tag{1} \]
\[ L^M_i = \frac{\max_{k=1}^{n}(m_k)}{m_i} \sum_{J_j \in C_i} u_j, \tag{2} \]
\[ L^D_i = \frac{\max_{k=1}^{n}(b_k)}{b_i} \sum_{J_j \in C_i} d_j. \tag{3} \]

An important metric used to evaluate the performance of load-balancing strategies is slowdown, which is measured by the ratio between the time to execute a job in a resource-shared setting and the time to execute the same job in a dedicated environment. The slowdown of job $J_j$ executing on site $S_i$ can be expressed by Eq. 4,

\[ s_{i,j} = \frac{w_{i,j} + tc_i + td_j \sum_{k=1}^{p_i} (pd_{i,k} \, k) + tp_{i,j} + tn_j \sum_{k=1}^{p_i} (pn_{i,k} \, k)}{tc_j + td_j + tp_j}, \tag{4} \]

where $w_{i,j}$ = the waiting time of $J_j$ at $S_i$, $p_i$ = the number of additional jobs running at $S_i$, $pd_{i,k}$ = the probability that $k$ jobs will access data at $S_i$ at the same time, $pn_{i,k}$ = the probability that $k$ jobs at $S_i$ will communicate at the same time, $tn_j$ = the data communication time, $tp_{i,j}$ = the paging time of $J_j$ at $S_i$, and $tp_j$ = the paging time of $J_j$ in a dedicated environment. $tp_{i,j}$ can be expressed as $\rho_{i,j} \, tc_j \, t_p$, where $\rho_{i,j}$ is the page fault rate of $J_j$ at $S_i$ and $t_p$ is the paging overhead. Similarly, $tp_j$ is given by $\rho_j \, tc_j \, t_p$, where $\rho_j$ is the page fault rate in the dedicated environment. Given $m$ jobs submitted to a Data Grid for execution, we design a load-balancing strategy to minimize the mean slowdown, which is given by $\frac{1}{m} \sum_{i=1}^{m} s_i$.

4 Load Balancing in Data Grids
In this section we present a Data-aware Load-Balancing (DLB) strategy for Data Grids. In addition to allocating data-intensive jobs in a way that balances computing resources, the proposed strategy strives to balance data access load in a Data Grid. Our load-balancing strategy is outlined in Figure 2 below.

The DLB load-balancing strategy:
Input: A job $J_j$ submitted to a local site $S_i$;
begin
    if $L^D_i = \max_{k=1}^{n} \{L^D_k\}$ then
        choose a site $S_g \in A_j$ where $L^D_g = \min_{S_k \in A_j} \{L^D_k\}$;
    else $g = i$;
    if $\sum_{J_k \in C_i} y_k > m_i$ then
        choose a site $S_h$, where $L^M_h = \min_{k=1}^{n} \{L^M_k\}$;
    else choose a site $S_h$, where $L^C_h = \min_{k=1}^{n} \{L^C_k\}$;
    estimate the response time $R(J_j, S_i)$ of $J_j$ on $S_i$;
    estimate the response time $R(J_j, S_h, S_g)$ of $J_j$ on $S_h$ with data access on $S_g$;
    if $R(J_j, S_i) \le R(J_j, S_h, S_g)$ then
        $g = i$; $h = i$;
    endif
    notify $S_g$ to handle data accesses of $J_j$;
    allocate $J_j$ to $S_h$;
    for $S_k \in A_j$, $S_k \ne S_g$ do
        notify the storage manager of $S_k$ to maintain data consistency;
    endfor
end;
Fig. 2. The data-aware load balancing strategy (DLB)

Given a job $J_j$ submitted to the local site $S_i$, the DLB strategy gives the highest priority to storage resources. This is because jobs running under data-intensive workload conditions are likely to experience waiting time on data accesses. As such, when the data access load in the system is unbalanced, DLB notifies the best candidate site from list $A_j$ to handle the data accesses of $J_j$. The candidate site must have the lightest load in terms of data access. After choosing the best candidate site to deal with data access, DLB balances global memory usage if the memory load of $S_i$ exceeds its available memory amount.
Thus, the memory resources are balanced by allocating $J_j$ to a site with the lightest memory load. When data access and memory loads are well balanced, DLB strives to improve the usage of global computing resources by evenly distributing computational load. In other words, in this case DLB migrates $J_j$ to a remote site with the lightest computational load across the entire Data Grid. Next, the DLB strategy checks whether the response time of $J_j$ on the remote site is less than that of $J_j$ on the local site. In doing so, DLB guarantees that each remote execution can improve the performance of the Data Grid. After $J_j$ is allocated to the target site, the corresponding storage managers are required to maintain data consistency for $J_j$ during the course of its execution.

Let $R(J_j, S_i)$ denote the response time of job $J_j$ on site $S_i$. Let $R(J_j, S_h, S_g)$ be the response time of $J_j$ on site $S_h$ with data access on site $S_g$. The DLB strategy leverages the values of $R(J_j, S_i)$ and $R(J_j, S_h, S_g)$ to determine whether having $J_j$ remotely executed can improve the performance of the Data Grid. $R(J_j, S_i)$ can be obtained as:

\[ R(J_j, S_i) = w_{i,j} + tc_i + td_j \sum_{k=1}^{p_i} (pd_{i,k} \, k) + tp_{i,j}. \tag{5} \]

The four terms on the right-hand side of Eq. 5 represent the waiting time, computational time, data access time, and paging time, respectively. The following variables are necessary to facilitate the derivation of $R(J_j, S_h, S_g)$, which can be written as Eq. 6. $\theta$ = the fixed cost of remote execution, $b_{hg}$ = the bandwidth of the network link between $S_h$ and $S_g$, $p_{hg}$ = the number of jobs sharing the link between $S_h$ and $S_g$, and $pn_{hg,k}$ = the probability that $k$ jobs will use the link between $S_h$ and $S_g$ at the same time.

\[ R(J_j, S_h, S_g) = \begin{cases} w_{i,h} + tc_i + td_j \sum_{k=1}^{p_h} (pd_{h,k} \, k) + tp_{h,j} + \theta & \text{if } h = g, \\ w_{i,h} + tc_i + td_j \sum_{k=1}^{p_h} (pd_{h,k} \, k) + tp_{h,j} + \theta + \frac{d_j}{b_{hg}} \sum_{k=1}^{p_{hg}} (pn_{hg,k} \, k) & \text{if } h \ne g. \end{cases} \tag{6} \]

The first five terms of both the upper and the bottom line of Eq. 6 characterize the waiting time, computational time, data access time, paging time, and fixed cost of remote execution. The last term of the bottom line of Eq. 6 is the time spent on transmitting data over the network link between $S_h$ and $S_g$.

Let $\tau_{i,k}$ be the earliest available time of the $k$th computational node at site $S_i$, meaning that the node is used without any waste prior to $\tau_{i,k}$. We assume that $\tau_{i,1} \le \tau_{i,2} \le \ldots \le \tau_{i,u_i}$. The waiting time $w_{i,j}$ of $J_j$ at $S_i$ is approximated by $w_{i,j} = \tau_{i,q_j}$. It is worth noting that $R(J_j, S_i)$ and $R(J_j, S_h, S_g)$ are computed by the DLB strategy at runtime. Since Eqs. 5 and 6 are cheap to evaluate, they can be obtained without imposing much overhead on a load manager.
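The decision procedure of Fig. 2, together with the response-time estimates of Eqs. 5 and 6, can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Estimate`, `contention`, `local_response`, `remote_response`, `choose_sites`, and `dlb_decide` are hypothetical, the load indices are assumed to be available as dictionaries keyed by site id, and the probability lists `pd` and `pn` are assumed to be given (their derivation follows in the text).

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Per-site time estimates for one job (hypothetical container)."""
    wait: float     # w: waiting time
    compute: float  # tc: computational time
    data: float     # td: data access time in a dedicated setting
    paging: float   # tp: paging time


def contention(probs):
    """Return sum over k of k * probs[k-1], the factor applied to td or d / b_hg."""
    return sum(k * p for k, p in enumerate(probs, start=1))


def local_response(e, pd):
    """Eq. 5: waiting + computation + contended data access + paging."""
    return e.wait + e.compute + e.data * contention(pd) + e.paging


def remote_response(e, pd, theta, same_site, d=0.0, b_hg=1.0, pn=()):
    """Eq. 6: the Eq. 5 terms at the remote site plus the fixed cost theta,
    plus the network transfer term when h != g."""
    r = e.wait + e.compute + e.data * contention(pd) + e.paging + theta
    if not same_site:
        r += d / b_hg * contention(pn)
    return r


def choose_sites(i, replicas, LD, LM, LC, memory_overloaded):
    """Candidate selection of Fig. 2; returns (h, g) before the response-time check.

    LD, LM, LC map site id -> data, memory, and computational load index.
    """
    # Storage first: redirect data accesses only if the local site is the most loaded.
    g = min(replicas, key=lambda k: LD[k]) if LD[i] == max(LD.values()) else i
    # Then memory if the local site is over-committed, otherwise computation.
    loads = LM if memory_overloaded else LC
    h = min(loads, key=loads.get)
    return h, g


def dlb_decide(i, h, g, r_local, r_remote):
    """Final check of Fig. 2: stay local unless the remote estimate is strictly smaller."""
    return (i, i) if r_local <= r_remote else (h, g)
```

With a local estimate of 19.0 time units and a remote estimate of 18.5, `dlb_decide` keeps the remote pair `(h, g)`; if the local estimate were no larger than the remote one, the job and its data accesses would both stay on site `i`.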
Response times $R(J_j, S_i)$ and $R(J_j, S_h, S_g)$ derived above rely on the probability $pd_{i,k}$ that $k$ jobs are accessing the storage resources at site $S_i$ at the same time. In what follows, we derive the probability $pd_{i,k}$. Let $\alpha_v$ be the probability that job $J_v$ is storing or retrieving data. $\alpha_v$ is measured by the percentage of time spent in accessing the storage subsystem. Thus, we have

\[ \alpha_v = \frac{td_v}{tc_v + td_v + tn_v}. \tag{7} \]

The probability that $J_v$ is using computational and communication resources can be expressed as $1 - \alpha_v$. Let $\Phi_m = \{J_{\gamma_1}, J_{\gamma_2}, \ldots, J_{\gamma_m}\}$ be $m$ jobs accessing the storage subsystem of site $S_i$ during the execution of job $J_j$. To facilitate the derivation of $pd_{i,k}$, we have to calculate the probability that $k - 1$ jobs in $\Phi_m$ are using the storage resource at the same time. As such, we first introduce an array of subsets of $\Phi_m$, i.e., $\Phi_{m,k,1}, \Phi_{m,k,2}, \ldots, \Phi_{m,k,f}$, where $f = \binom{m}{k-1}$, each subset contains $k - 1$ jobs in set $\Phi_m$, and no two subsets are identical. Let $p_{m,k,x}$ be the probability that jobs in $\Phi_{m,k,x}$ are accessing data at the same time while jobs in $\bar{\Phi}_{m,k,x} = \Phi_m - \Phi_{m,k,x}$ are either computing or communicating. Given a subset $\Phi_{m,k,x}$ ($1 \le x \le f$), we can obtain the probability $p_{m,k,x}$ as

\[ p_{m,k,x} = \prod_{J_v \in \Phi_{m,k,x}} \alpha_v \prod_{J_v \in \bar{\Phi}_{m,k,x}} (1 - \alpha_v). \tag{8} \]

Based on the above probability for each subset of jobs, one can estimate the probability $pd_{i,k}$ using Eq. 9 below.

\[ pd_{i,k} = \sum_{x=1}^{f} p_{m,k,x} = \sum_{x=1}^{f} \Big( \prod_{J_v \in \Phi_{m,k,x}} \alpha_v \prod_{J_v \in \bar{\Phi}_{m,k,x}} (1 - \alpha_v) \Big) = \sum_{x=1}^{f} \Big( \prod_{J_v \in \Phi_{m,k,x}} \frac{td_v}{tc_v + td_v + tn_v} \prod_{J_v \in \bar{\Phi}_{m,k,x}} \Big( 1 - \frac{td_v}{tc_v + td_v + tn_v} \Big) \Big). \tag{9} \]

For a special case where jobs in $\Phi_m$ are identical (i.e., $\forall J_v \in \Phi_m : \alpha_v = \alpha$), $pd_{i,k}$ can be simplified as

\[ pd_{i,k} = \binom{m}{k-1} \alpha^{k-1} (1 - \alpha)^{m-k+1}. \tag{10} \]
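As a rough illustration of Eqs. 7 to 10, the sketch below enumerates the subsets of size $k - 1$ directly and compares the result with the closed form for identical jobs. The function names are hypothetical, and the product form implicitly treats the jobs' access behavior as independent, which is the assumption the equations themselves appear to make. Enumeration costs $\binom{m}{k-1}$ products, so it is only meant for small $m$.

```python
from itertools import combinations
from math import comb, prod


def access_probability(td, tc, tn):
    """Eq. 7: fraction of a job's time spent accessing the storage subsystem."""
    return td / (tc + td + tn)


def pd_general(alphas, k):
    """Eqs. 8-9: probability that exactly k - 1 of the m jobs access storage at once.

    alphas holds one alpha_v per job in the set Phi_m.
    """
    m = len(alphas)
    total = 0.0
    for subset in combinations(range(m), k - 1):
        chosen = set(subset)
        # Jobs in the subset are accessing data; all others are not (Eq. 8).
        total += prod(alphas[v] if v in chosen else 1.0 - alphas[v] for v in range(m))
    return total


def pd_identical(m, alpha, k):
    """Eq. 10: closed form when every job has the same alpha."""
    return comb(m, k - 1) * alpha ** (k - 1) * (1.0 - alpha) ** (m - k + 1)
```

For three identical jobs with $\alpha = 0.5$ and $k = 2$, both functions give 0.375, and summing `pd_general` over $k = 1, \ldots, m + 1$ gives 1, as expected of a probability distribution over the number of concurrently accessing jobs.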
Now we are positioned to derive the probability $pn_{hg,k}$ that $k$ jobs will use the link between $S_h$ and $S_g$ at the same time. Let $\beta_v$ be the probability that job $J_v$ tries to communicate. We can obtain $\beta_v$ from the percentage of time spent in communication. Thus,

\[ \beta_v = \frac{tn_v}{tc_v + td_v + tn_v}. \tag{11} \]

Let $\Phi'_m = \{J_{\gamma'_1}, J_{\gamma'_2}, \ldots, J_{\gamma'_m}\}$ be $m$ jobs communicating through the link between $S_h$ and $S_g$. To determine the probability that $k - 1$ jobs in $\Phi'_m$ will try to communicate at the same time, we define $f$ subsets of $\Phi'_m$, i.e., $\Phi'_{m,k,1}, \Phi'_{m,k,2}, \ldots, \Phi'_{m,k,f}$, where $f = \binom{m}{k-1}$. Each subset contains $k - 1$ jobs in $\Phi'_m$, and no two subsets are identical. We denote by $p'_{m,k,x}$ the probability that jobs in $\Phi'_{m,k,x}$ are communicating at the same time while the other jobs in $\bar{\Phi}'_{m,k,x} = \Phi'_m - \Phi'_{m,k,x}$ are either computing or accessing data. Thus, one can obtain $p'_{m,k,x}$ as

\[ p'_{m,k,x} = \prod_{J_v \in \Phi'_{m,k,x}} \beta_v \prod_{J_v \in \bar{\Phi}'_{m,k,x}} (1 - \beta_v). \tag{12} \]

In light of Eq. 12, we quantify the probability $pn_{hg,k}$ as follows.

\[ pn_{hg,k} = \sum_{x=1}^{f} p'_{m,k,x} = \sum_{x=1}^{f} \Big( \prod_{J_v \in \Phi'_{m,k,x}} \beta_v \prod_{J_v \in \bar{\Phi}'_{m,k,x}} (1 - \beta_v) \Big) = \sum_{x=1}^{f} \Big( \prod_{J_v \in \Phi'_{m,k,x}} \frac{tn_v}{tc_v + td_v + tn_v} \prod_{J_v \in \bar{\Phi}'_{m,k,x}} \Big( 1 - \frac{tn_v}{tc_v + td_v + tn_v} \Big) \Big). \tag{13} \]

If all the jobs in $\Phi'_m$ are identical in terms of communication patterns (i.e., $\forall J_v \in \Phi'_m : \beta_v = \beta$), $pn_{hg,k}$ in Eq. 6 can be expressed as

\[ pn_{hg,k} = \binom{m}{k-1} \beta^{k-1} (1 - \beta)^{m-k+1}. \tag{14} \]
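To show how Eq. 14 feeds the network term of Eq. 6, here is a small Python sketch for the identical-jobs case. The function names are hypothetical, and the sketch assumes the summation runs over $k = 1, \ldots, m + 1$ (the job itself plus up to $m$ other jobs on the link), which is an interpretation rather than something the text states outright. Under that assumption the contention factor equals one plus the mean of a binomial distribution, that is $1 + m\beta$.

```python
from math import comb


def communication_probability(tn, tc, td):
    """Eq. 11: fraction of a job's time spent communicating."""
    return tn / (tc + td + tn)


def pn_identical(m, beta, k):
    """Eq. 14: probability that k - 1 of m identical jobs use the link at once."""
    return comb(m, k - 1) * beta ** (k - 1) * (1.0 - beta) ** (m - k + 1)


def transfer_time(d, b_hg, m, beta):
    """Last term of Eq. 6 for identical jobs: (d / b_hg) * sum_k k * pn_{hg,k}.

    The sum over k = 1..m+1 is an assumed upper limit (see the lead-in).
    """
    factor = sum(k * pn_identical(m, beta, k) for k in range(1, m + 2))
    return d / b_hg * factor
```

For example, with four other jobs that each communicate a quarter of the time, the factor is $1 + 4 \times 0.25 = 2$, so moving 100 units of data over a link of 50 units per second is estimated at about 4 seconds instead of 2.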
5 Performance Evaluation
Now we evaluate the performance of the load-balancing strategy presented in Section 4 by conducting trace-driven simulations. We experimentally compare our DLB strategy with two existing schemes: CM [25][26] and WAL [27]. The CM policy is focused on effective usage of global CPU and memory resources, whereas the WAL scheme balances load using a weighted average of required resource loads to quantify load at each site.

We simulated a Data Grid of six sites by extending a distributed system simulator developed by Harchol-Balter and Downey [14]. In the simulated Data Grid, each site consists of 128 computational nodes and a storage subsystem. We modified the traces used in [14][26] by adding storage access requests to each job, whose number of requested nodes is randomly generated with a uniform distribution in [1, 64]. The sizes of files accessed by jobs are generated uniformly at random from 10MB to 100MB. Each file has three replicas evenly distributed among the sites in the Data Grid. When any one of the disks in a storage subsystem is full, the least recently used files are automatically removed by the storage manager.

[Plot: mean slowdown of CM, WAL, and DLB versus the percentage of time spent on data accessing (%).]
Fig. 3. Mean slowdowns as a function of the percentage of time spent on data accessing.

Fig. 3 plots slowdowns of the three load-balancing strategies when the percentage of time spent on data accessing is varied from 50% to 85%. To fairly compare the performance of the three strategies, we generate traces with a good mix of memory-intensive and data-intensive jobs. Fig. 3 shows that regardless of which load-balancing strategy is employed, the mean slowdowns increase with the increasing percentage of time spent on accessing storage subsystems. This result is expected because high data access load leads to longer waiting times on storage resources due to a high utilization of storage subsystems.
The result further shows that the proposed DLB strategy substantially outperforms the two existing load-balancing schemes. For example, DLB reduces the slowdowns by up to 24.9% and 29.2% (with averages of 21.1% and 22.4%). This is because DLB improves the global usage of the storage resources, which dominate the Data Grid's performance in cases where the data access load is high. In addition, we measured the overall response times of the jobs. Our experimental results show that DLB is able to significantly reduce response times of data-intensive jobs. Due to the space limit, we omit the results for the response times.

[Plot: mean slowdown of CM, WAL, and DLB versus network bandwidth (Mbps).]
Fig. 4. Mean slowdowns as a function of the network bandwidth.

Network bandwidth plays a critical role in the overall performance of a Data Grid. Fig. 4 reveals the impact of network bandwidth on the mean slowdowns of the three evaluated load-balancing schemes. In this experiment, we vary the network bandwidth from 10Mbps to 2Gbps. The first observation made from Fig. 4 is that the mean slowdown decreases as the network bandwidth increases. We attribute this result to the fact that a Data Grid with high network bandwidth incurs low load-balancing overhead, which in turn makes it possible for data-intensive jobs to benefit substantially from load balancing. A second observation drawn from Fig. 4 is that the DLB strategy is more sensitive to the network bandwidth than the other two alternatives. The reason behind this result is that the DLB strategy leverages the high-bandwidth network to noticeably reduce remote data access cost, which contributes a significant fraction of the response times of data-intensive jobs. The implication of this result is that our DLB load-balancing strategy can greatly improve the overall performance of a Data Grid with high-speed network interconnections.
6 Conclusion
In the past five years, Data Grids have increasingly become an attractive computing platform for a variety of data-intensive applications. In recognition that storage subsystems of Data Grids are most likely to be a performance bottleneck, we developed a data-aware load-balancing strategy to improve the global usage of storage resources in Data Grids. We built a model to estimate the response time of a job running at a local or remote site. With this model in place, we can calculate slowdowns imposed on jobs in a Data Grid environment. The proposed strategy aims at balancing load among sites of a Data Grid so as to improve the global usage of CPU, memory, and storage resources. To evaluate the effectiveness of our load-balancing strategy, we compared it with two existing approaches that ignore storage resources in the process of load balancing. Our empirical results confirm that the proposed strategy can achieve high performance for data-intensive jobs under diverse workload conditions.

There are, of course, several open issues that need to be addressed in load balancing for Data Grids. First, the performance of our load-balancing strategy in Data Grids depends to some extent on data movements. Reducing the overhead incurred by data movements can ultimately improve the performance of Data Grids. We plan to investigate a new approach that moves data at a time when source and destination sites are lightly loaded. It is expected that the new approach can achieve improved performance without dramatically affecting response times of data-intensive jobs running on Data Grids. Second, data placement in the context of load balancing in Data Grids has received little attention in the past years. In future work we will develop a framework that seamlessly integrates data placement techniques with our data-aware load-balancing strategy.
7 Acknowledgments
The work reported in this paper was supported in part by the New Mexico Institute of Mining and Technology under Grant and by Intel Corporation under Grant

References
[1] B. Tierney, W. Johnston, J. Lee, M. Thompson, A data intensive distributed computing architecture for grid applications, Future Generation Computer Systems 16 (5) (2000)
[2] J.-P. Martin-Flatin, P. V.-B. Primet, High-speed network and services for data-intensive grids: The datatag project, Future Generation Computer Systems 21 (4) (2005)
[3] M. Tang, B.-S. Lee, X. Tang, C.-K. Yeo, The impact of data replication on job scheduling performance in the data grid, Future Generation Computer Systems 22 (3) (2006)
[4] A. Breckenridge, L. Pierson, S. Sanielevici, J. Welling, R. Keller, U. Woessner, J. Schulze, Distributed, on-demand, data-intensive and collaborative simulation analysis, Future Generation Computer Systems 19 (6) (2003)
[5] M. S. Pérez, J. Carretero, F. García, J. M. Peña, V. Robles, Mapfs: A flexible multiagent parallel file system for clusters, Future Generation Computer Systems 22 (5) (2006)
[6] W. Allcock, et al., Grid-enabled particle physics event analysis: experiences using a 10 gb, high-latency network for a high-energy physics application, Future Generation Computer Systems 19 (6) (2003)
[7] X. Qin, H. Jiang, Data grids: Supporting data-intensive applications in wide area networks, in: High Performance Computing: Paradigm and Infrastructure,
[8] J. Hennessy, D. Patterson, in: Computer Architecture: A Quantitative Approach,
[9] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets, in: Proc. Network Storage Symp.,
[10] A. Samar, H. Stockinger, Grid data management pilot (gdmp): a tool for wide area replication, in: Proc. Int'l Conf. Applied Informatics,
[11] K. Ranganathan, I. Foster, Decoupling computation and data scheduling in distributed data-intensive applications, in: Proc. 11th IEEE Int'l Conf. High Performance Distributed Computing, 2002, p
[12] J. Cao, D. P. Spooner, S. A. Jarvis, G. R. Nudd, Grid load balancing using intelligent agents, Future Generation Computer Systems 21 (1) (2005)
[13] F. Zhang, A. Mawardi, E. Santos, R. Pitchumani, L. E. Achenie, Examination of load-balancing methods to improve efficiency of a composite materials manufacturing process simulation under uncertainty using distributed computing, Future Generation Computer Systems 22 (5) (2006)
[14] M. Harchol-Balter, A. Downey, Exploiting process lifetime distributions for load balancing, ACM Transactions on Computer Systems 15 (3) (1997)
[15] C. Hui, S. Chanson, Improved strategies for dynamic load sharing, IEEE Concurrency 7 (3).
[16] P. D. Michailidis, K. G. Margaritis, Performance evaluation of load balancing strategies for approximate string matching application on an mpi cluster of heterogeneous workstations, Future Generation Computer Systems 19 (7) (2003)
[17] A. Acharya, S. Setia, Availability and utility of idle memory in workstation clusters, in: Proc. ACM SIGMETRICS Conf. Measuring and Modeling of Computer Systems,
[18] G. Voelker, Managing server load in global memory systems, in: Proc. ACM SIGMETRICS Conf. Measuring and Modeling of Computer Systems,
[19] X. Ma, M. Winslett, J. Lee, S. Yu, Faster collective output through active buffering, in: Proc. Int'l Symp. on Parallel and Distributed Processing,
[20] X. Qin, H. Jiang, Y. Zhu, D. Swanson, Dynamic load balancing for I/O- and memory-intensive workload in clusters using a feedback control mechanism, in: Proc. 9th Int'l Euro-Par Conf. Parallel Processing, Klagenfurt, Austria,
[21] B. Forney, A. C. Arpaci-Dusseau, R. H. Arpaci-Dusseau, Storage-aware caching: Revisiting caching for heterogeneous storage systems, in: Proc. Symp. File and Storage Technology, Monterey, California, USA,
[22] P. Scheuermann, G. Weikum, P. Zabback, Data partitioning and load balancing in parallel disk systems, The VLDB Journal (1998)
[23] L. Lee, P. Scheuermann, R. Vingralek, File assignment in parallel I/O systems with minimal variance of service time, IEEE Trans. Computers 49 (2) (2000)
[24] T. Xie, X. Qin, Saha: A scheduling algorithm for security-sensitive jobs on data grids, in: Proc. IEEE/ACM 6th Int'l Symp. Cluster Computing and the Grid,
[25] L. Xiao, X. Zhang, Y. Qu, Effective load sharing on heterogeneous networks of workstations, in: Proc. Int'l Symp. Parallel and Distributed Processing,
[26] X. Zhang, Y. Qu, L. Xiao, Improving distributed workload performance by sharing both cpu and memory resources, in: Proc. 20th Int'l Conf. Distributed Computing Systems,
[27] M. Surdeanu, D. Moldovan, S. Harabagiu, Performance analysis of a distributed question/answering system, IEEE Trans. on Parallel and Distributed Systems 13 (6) (2006)
A Hybrid Load Balancing Policy underlying Cloud Computing Environment
A Hybrid Load Balancing Policy underlying Cloud Computing Environment S.C. WANG, S.C. TSENG, S.S. WANG*, K.Q. YAN* Chaoyang University of Technology 168, Jifeng E. Rd., Wufeng District, Taichung 41349
Index Terms : Load rebalance, distributed file systems, clouds, movement cost, load imbalance, chunk.
Load Rebalancing for Distributed File Systems in Clouds. Smita Salunkhe, S. S. Sannakki Department of Computer Science and Engineering KLS Gogte Institute of Technology, Belgaum, Karnataka, India Affiliated
Fair Scheduling Algorithm with Dynamic Load Balancing Using In Grid Computing
Research Inventy: International Journal Of Engineering And Science Vol.2, Issue 10 (April 2013), Pp 53-57 Issn(e): 2278-4721, Issn(p):2319-6483, Www.Researchinventy.Com Fair Scheduling Algorithm with Dynamic
Energy Efficient MapReduce
Energy Efficient MapReduce Motivation: Energy consumption is an important aspect of datacenters efficiency, the total power consumption in the united states has doubled from 2000 to 2005, representing
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment Panagiotis D. Michailidis and Konstantinos G. Margaritis Parallel and Distributed
Presentation of Multi Level Data Replication Distributed Decision Making Strategy for High Priority Tasks in Real Time Data Grids
Presentation of Multi Level Data Replication Distributed Decision Making Strategy for High Priority Tasks in Real Time Data Grids Naghmeh Esmaieli [email protected] Mahdi Jafari [email protected]
Keywords Load balancing, Dispatcher, Distributed Cluster Server, Static Load balancing, Dynamic Load balancing.
Volume 5, Issue 7, July 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Hybrid Algorithm
Optimal Service Pricing for a Cloud Cache
Optimal Service Pricing for a Cloud Cache K.SRAVANTHI Department of Computer Science & Engineering (M.Tech.) Sindura College of Engineering and Technology Ramagundam,Telangana G.LAKSHMI Asst. Professor,
MEASURING PERFORMANCE OF DYNAMIC LOAD BALANCING ALGORITHMS IN DISTRIBUTED COMPUTING APPLICATIONS
MEASURING PERFORMANCE OF DYNAMIC LOAD BALANCING ALGORITHMS IN DISTRIBUTED COMPUTING APPLICATIONS Priyesh Kanungo 1 Professor and Senior Systems Engineer (Computer Centre), School of Computer Science and
IMPROVED PROXIMITY AWARE LOAD BALANCING FOR HETEROGENEOUS NODES
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 2 Issue 6 June, 2013 Page No. 1914-1919 IMPROVED PROXIMITY AWARE LOAD BALANCING FOR HETEROGENEOUS NODES Ms.
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Ensuring Reliability and High Availability in Cloud by Employing a Fault Tolerance Enabled Load Balancing Algorithm G.Gayathri [1], N.Prabakaran [2] Department of Computer
SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE SYSTEM IN CLOUD
International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol-1, Iss.-3, JUNE 2014, 54-58 IIST SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE
A Study on the Application of Existing Load Balancing Algorithms for Large, Dynamic, Heterogeneous Distributed Systems
A Study on the Application of Existing Load Balancing Algorithms for Large, Dynamic, Heterogeneous Distributed Systems RUPAM MUKHOPADHYAY, DIBYAJYOTI GHOSH AND NANDINI MUKHERJEE Department of Computer
Exploring RAID Configurations
Exploring RAID Configurations J. Ryan Fishel Florida State University August 6, 2008 Abstract To address the limits of today s slow mechanical disks, we explored a number of data layouts to improve RAID
Dynamic Adaptive Feedback of Load Balancing Strategy
Journal of Information & Computational Science 8: 10 (2011) 1901 1908 Available at http://www.joics.com Dynamic Adaptive Feedback of Load Balancing Strategy Hongbin Wang a,b, Zhiyi Fang a,, Shuang Cui
Overlapping Data Transfer With Application Execution on Clusters
Overlapping Data Transfer With Application Execution on Clusters Karen L. Reid and Michael Stumm [email protected] [email protected] Department of Computer Science Department of Electrical and Computer
How To Balance In Cloud Computing
A Review on Load Balancing Algorithms in Cloud Hareesh M J Dept. of CSE, RSET, Kochi hareeshmjoseph@ gmail.com John P Martin Dept. of CSE, RSET, Kochi [email protected] Yedhu Sastri Dept. of IT, RSET,
Characterizing Task Usage Shapes in Google s Compute Clusters
Characterizing Task Usage Shapes in Google s Compute Clusters Qi Zhang 1, Joseph L. Hellerstein 2, Raouf Boutaba 1 1 University of Waterloo, 2 Google Inc. Introduction Cloud computing is becoming a key
Load Balancing in Distributed Web Server Systems With Partial Document Replication
Load Balancing in Distributed Web Server Systems With Partial Document Replication Ling Zhuo, Cho-Li Wang and Francis C. M. Lau Department of Computer Science and Information Systems The University of
Fault-Tolerant Framework for Load Balancing System
Fault-Tolerant Framework for Load Balancing System Y. K. LIU, L.M. CHENG, L.L.CHENG Department of Electronic Engineering City University of Hong Kong Tat Chee Avenue, Kowloon, Hong Kong SAR HONG KONG Abstract:
Evaluating HDFS I/O Performance on Virtualized Systems
Evaluating HDFS I/O Performance on Virtualized Systems Xin Tang [email protected] University of Wisconsin-Madison Department of Computer Sciences Abstract Hadoop as a Service (HaaS) has received increasing
A novel load balancing algorithm for computational grid
International Journal of Computational Intelligence Techniques, ISSN: 0976-0466 & E-ISSN: 0976-0474 Volume 1, Issue 1, 2010, PP-20-26 A novel load balancing algorithm for computational grid Saravanakumar
A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems*
A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* Junho Jang, Saeyoung Han, Sungyong Park, and Jihoon Yang Department of Computer Science and Interdisciplinary Program
A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing
A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing Liang-Teh Lee, Kang-Yuan Liu, Hui-Yang Huang and Chia-Ying Tseng Department of Computer Science and Engineering,
Scheduling using Optimization Decomposition in Wireless Network with Time Performance Analysis
Scheduling using Optimization Decomposition in Wireless Network with Time Performance Analysis Aparna.C 1, Kavitha.V.kakade 2 M.E Student, Department of Computer Science and Engineering, Sri Shakthi Institute
GridFTP: A Data Transfer Protocol for the Grid
GridFTP: A Data Transfer Protocol for the Grid Grid Forum Data Working Group on GridFTP Bill Allcock, Lee Liming, Steven Tuecke ANL Ann Chervenak USC/ISI Introduction In Grid environments,
Load Balancing in Structured Peer to Peer Systems
Load Balancing in Structured Peer to Peer Systems DR.K.P.KALIYAMURTHIE 1, D.PARAMESWARI 2 Professor and Head, Dept. of IT, Bharath University, Chennai-600 073 1 Asst. Prof. (SG), Dept. of Computer Applications,
Load Balancing in Structured Peer to Peer Systems
Load Balancing in Structured Peer to Peer Systems Dr.K.P.Kaliyamurthie 1, D.Parameswari 2 1.Professor and Head, Dept. of IT, Bharath University, Chennai-600 073. 2.Asst. Prof.(SG), Dept. of Computer Applications,
Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems
2015 IEEE International Conference on Big Data (Big Data) Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems Guoxin Liu and Haiying Shen and Haoyu Wang Department of Electrical
Hadoop Cluster Applications
Hadoop Overview Data analytics has become a key element of the business decision process over the last decade. Classic reporting on a dataset stored in a database was sufficient until recently, but yesterday
Efficient Data Replication Scheme based on Hadoop Distributed File System
, pp. 177-186 http://dx.doi.org/10.14257/ijseia.2015.9.12.16 Efficient Data Replication Scheme based on Hadoop Distributed File System Jungha Lee 1, Jaehwa Chung 2 and Daewon Lee 3* 1 Division of Supercomputing,
IMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND SCIENCE IMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION N.Vijaya Sunder Sagar 1, M.Dileep Kumar 2, M.Nagesh 3, Lunavath Gandhi
Resource Allocation Schemes for Gang Scheduling
Resource Allocation Schemes for Gang Scheduling B. B. Zhou School of Computing and Mathematics Deakin University Geelong, VIC 327, Australia D. Walsh R. P. Brent Department of Computer Science Australian
Use of Agent-Based Service Discovery for Resource Management in Metacomputing Environment
In Proceedings of 7th International Euro-Par Conference, Manchester, UK, Lecture Notes in Computer Science 2150, Springer Verlag, August 2001, pp. 882-886. Use of Agent-Based Service Discovery for Resource
Proxy-Assisted Periodic Broadcast for Video Streaming with Multiple Servers
1 Proxy-Assisted Periodic Broadcast for Video Streaming with Multiple Servers Ewa Kusmierek and David H.C. Du Digital Technology Center and Department of Computer Science and Engineering University of
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture He Huang, Shanshan Li, Xiaodong Yi, Feng Zhang, Xiangke Liao and Pan Dong School of Computer Science National
RESEARCH PAPER International Journal of Recent Trends in Engineering, Vol 1, No. 1, May 2009
An Algorithm for Dynamic Load Balancing in Distributed Systems with Multiple Supporting Nodes by Exploiting the Interrupt Service Parveen Jain 1, Daya Gupta 2 1,2 Delhi College of Engineering, New Delhi,
Locality Based Protocol for MultiWriter Replication systems
Locality Based Protocol for MultiWriter Replication systems Lei Gao Department of Computer Science The University of Texas at Austin [email protected] One of the challenging problems in building replication
Self-Compressive Approach for Distributed System Monitoring
Self-Compressive Approach for Distributed System Monitoring Akshada T Bhondave Dr D.Y Patil COE Computer Department, Pune University, India Santoshkumar Biradar Assistant Prof. Dr D.Y Patil COE, Computer
Advanced Load Balancing Mechanism on Mixed Batch and Transactional Workloads
Advanced Load Balancing Mechanism on Mixed Batch and Transactional Workloads G. Suganthi (Member, IEEE), K. N. Vimal Shankar, Department of Computer Science and Engineering, V.S.B. Engineering College,
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 Load Balancing Heterogeneous Request in DHT-based P2P Systems Mrs. Yogita A. Dalvi Dr. R. Shankar Mr. Atesh
CHAPTER 5 WLDMA: A NEW LOAD BALANCING STRATEGY FOR WAN ENVIRONMENT
81 CHAPTER 5 WLDMA: A NEW LOAD BALANCING STRATEGY FOR WAN ENVIRONMENT 5.1 INTRODUCTION Distributed Web servers on the Internet require high scalability and availability to provide efficient services to
Efficient DNS based Load Balancing for Bursty Web Application Traffic
ISSN Volume 1, No.1, September-October 2012 International Journal of Science and Applied Information Technology Available Online at http://warse.org/pdfs/ijmcis01112012.pdf the Internet. However, this trend leads to sudden burst of
A Data Grid Model for Combining Teleradiology and PACS Operations
MEDICAL IMAGING TECHNOLOGY Vol.25 No.1 January 2007 7 Special Issue Paper / Telemedicine and Image Communication A Data Grid Model for Combining Teleradiology and PACS Operations H.K. HUANG *, Brent J. LIU *, Zheng ZHOU *, Jorge DOCUMET
A Locality Enhanced Scheduling Method for Multiple MapReduce Jobs In a Workflow Application
2012 International Conference on Information and Computer Applications (ICICA 2012) IPCSIT vol. 24 (2012) (2012) IACSIT Press, Singapore A Locality Enhanced Scheduling Method for Multiple MapReduce Jobs
Real Time Network Server Monitoring using Smartphone with Dynamic Load Balancing
www.ijcsi.org 227 Real Time Network Server Monitoring using Smartphone with Dynamic Load Balancing Dhuha Basheer Abdullah 1, Zeena Abdulgafar Thanoon 2, 1 Computer Science Department, Mosul University,
A Dynamic Approach for Load Balancing using Clusters
A Dynamic Approach for Load Balancing using Clusters ShwetaRajani 1, RenuBagoria 2 Computer Science 1,2,Global Technical Campus, Jaipur 1,JaganNath University, Jaipur 2 Email: [email protected] 1
Comparison on Different Load Balancing Algorithms of Peer to Peer Networks
Comparison on Different Load Balancing Algorithms of Peer to Peer Networks K.N.Sirisha *, S.Bhagya Rekha M.Tech,Software Engineering Noble college of Engineering & Technology for Women Web Technologies
Improved Hybrid Dynamic Load Balancing Algorithm for Distributed Environment
International Journal of Scientific and Research Publications, Volume 3, Issue 3, March 2013 1 Improved Hybrid Dynamic Load Balancing Algorithm for Distributed Environment UrjashreePatil*, RajashreeShedge**
Load Distribution in Large Scale Network Monitoring Infrastructures
Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu
A Novel Switch Mechanism for Load Balancing in Public Cloud
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) A Novel Switch Mechanism for Load Balancing in Public Cloud Kalathoti Rambabu 1, M. Chandra Sekhar 2 1 M. Tech (CSE), MVR College
Rackspace Cloud Databases and Container-based Virtualization
Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many
Load Balancing on a Grid Using Data Characteristics
Load Balancing on a Grid Using Data Characteristics Jonathan White and Dale R. Thompson Computer Science and Computer Engineering Department University of Arkansas Fayetteville, AR 72701, USA {jlw09, drt}@uark.edu
How To Improve Performance On A Single Chip Computer
RAID: Redundant Arrays of Inexpensive Disks. This discussion is based on the paper: A Case for Redundant Arrays of Inexpensive Disks (RAID), David A. Patterson, Garth Gibson, and Randy H. Katz, In Proceedings
How To Compare Load Sharing And Job Scheduling In A Network Of Workstations
A COMPARISON OF LOAD SHARING AND JOB SCHEDULING IN A NETWORK OF WORKSTATIONS HELEN D. KARATZA Department of Informatics Aristotle University of Thessaloniki 546 Thessaloniki, GREECE Email: [email protected]
MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT
MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT 1 SARIKA K B, 2 S SUBASREE 1 Department of Computer Science, Nehru College of Engineering and Research Centre, Thrissur, Kerala 2 Professor and Head,
Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services
Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services Ms. M. Subha #1, Mr. K. Saravanan *2 # Student, * Assistant Professor Department of Computer Science and Engineering Regional
Journal of Theoretical and Applied Information Technology 20th July 2015. Vol.77. No.2 2005-2015 JATIT & LLS. All rights reserved.
EFFICIENT LOAD BALANCING USING ANT COLONY OPTIMIZATION MOHAMMAD H. NADIMI-SHAHRAKI, ELNAZ SHAFIGH FARD, FARAMARZ SAFI Department of Computer Engineering, Najafabad branch, Islamic Azad University, Najafabad,
DECENTRALIZED LOAD BALANCING IN HETEROGENEOUS SYSTEMS USING DIFFUSION APPROACH
DECENTRALIZED LOAD BALANCING IN HETEROGENEOUS SYSTEMS USING DIFFUSION APPROACH P.Neelakantan Department of Computer Science & Engineering, SVCET, Chittoor [email protected] ABSTRACT The grid
Implementing Parameterized Dynamic Load Balancing Algorithm Using CPU and Memory
Implementing Parameterized Dynamic Load Balancing Algorithm Using CPU and Memory Pradip Wawge 1, Pritish Tijare 2 Master of Engineering, Information Technology, Sipna college of Engineering, Amravati, Maharashtra,
Dynamic Load Balancing in a Network of Workstations
Dynamic Load Balancing in a Network of Workstations 95.515F Research Report By: Shahzad Malik (219762) November 29, 2000 Table of Contents 1 Introduction 3 2 Load Balancing 4 2.1 Static Load Balancing
Performance Comparison of Assignment Policies on Cluster-based E-Commerce Servers
Performance Comparison of Assignment Policies on Cluster-based E-Commerce Servers Victoria Ungureanu Department of MSIS Rutgers University, 180 University Ave. Newark, NJ 07102 USA Benjamin Melamed Department
Characterizing Task Usage Shapes in Google's Compute Clusters
Characterizing Task Usage Shapes in Google's Compute Clusters Qi Zhang University of Waterloo [email protected] Joseph L. Hellerstein Google Inc. [email protected] Raouf Boutaba University of Waterloo [email protected]
A Review of Customized Dynamic Load Balancing for a Network of Workstations
A Review of Customized Dynamic Load Balancing for a Network of Workstations Taken from work done by: Mohammed Javeed Zaki, Wei Li, Srinivasan Parthasarathy Computer Science Department, University of Rochester
Figure 1. The cloud scales: Amazon EC2 growth [2].
- Chung-Cheng Li and Kuochen Wang Department of Computer Science National Chiao Tung University Hsinchu, Taiwan 300 [email protected], [email protected] Abstract One of the most important issues
Load Balancing Algorithm Based on Services
Journal of Information & Computational Science 10:11 (2013) 3305-3312 July 20, 2013 Available at http://www.joics.com Load Balancing Algorithm Based on Services Yufang Zhang a, Qinlei Wei a,, Ying Zhao
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Stan Posey, MSc and Bill Loewe, PhD Panasas Inc., Fremont, CA, USA Paul Calleja, PhD University of Cambridge,
IDENTIFYING AND OPTIMIZING DATA DUPLICATION BY EFFICIENT MEMORY ALLOCATION IN REPOSITORY BY SINGLE INSTANCE STORAGE
IDENTIFYING AND OPTIMIZING DATA DUPLICATION BY EFFICIENT MEMORY ALLOCATION IN REPOSITORY BY SINGLE INSTANCE STORAGE 1 M.PRADEEP RAJA, 2 R.C SANTHOSH KUMAR, 3 P.KIRUTHIGA, 4 V. LOGESHWARI 1,2,3 Student,
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, [email protected] Assistant Professor, Information
Energy aware RAID Configuration for Large Storage Systems
Energy aware RAID Configuration for Large Storage Systems Norifumi Nishikawa [email protected] Miyuki Nakano [email protected] Masaru Kitsuregawa [email protected] Abstract
Public Cloud Partition Balancing and the Game Theory
Statistics Analysis for Cloud Partitioning using Load Balancing Model in Public Cloud V. DIVYASRI 1, M.THANIGAVEL 2, T. SUJILATHA 3 1, 2 M. Tech (CSE) GKCE, SULLURPETA, INDIA [email protected] [email protected]
AUTOMATED AND ADAPTIVE DOWNLOAD SERVICE USING P2P APPROACH IN CLOUD
IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 2, Issue 4, Apr 2014, 63-68 Impact Journals AUTOMATED AND ADAPTIVE DOWNLOAD
Dynamic Multi-User Load Balancing in Distributed Systems
Dynamic Multi-User Load Balancing in Distributed Systems Satish Penmatsa and Anthony T. Chronopoulos The University of Texas at San Antonio Dept. of Computer Science One UTSA Circle, San Antonio, Texas
Research on Job Scheduling Algorithm in Hadoop
Journal of Computational Information Systems 7: 6 () 5769-5775 Available at http://www.jofcis.com Research on Job Scheduling Algorithm in Hadoop Yang XIA, Lei WANG, Qiang ZHAO, Gongxuan ZHANG School of
Understanding the Benefits of IBM SPSS Statistics Server
IBM SPSS Statistics Server Understanding the Benefits of IBM SPSS Statistics Server Contents: 1 Introduction 2 Performance 101: Understanding the drivers of better performance 3 Why performance is faster
The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform
The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, Ko-Chin Chang, and Wei-Tsong Lee Abstract Virtualization platform solutions
System Models for Distributed and Cloud Computing
System Models for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Classification of Distributed Computing Systems
IMPROVED FAIR SCHEDULING ALGORITHM FOR TASKTRACKER IN HADOOP MAP-REDUCE
IMPROVED FAIR SCHEDULING ALGORITHM FOR TASKTRACKER IN HADOOP MAP-REDUCE Mr. Santhosh S 1, Mr. Hemanth Kumar G 2 1 PG Scholar, 2 Asst. Professor, Dept. Of Computer Science & Engg, NMAMIT, (India) ABSTRACT
Distributed Framework for Data Mining As a Service on Private Cloud
RESEARCH ARTICLE OPEN ACCESS Distributed Framework for Data Mining As a Service on Private Cloud Shraddha Masih *, Sanjay Tanwani** *Research Scholar & Associate Professor, School of Computer Science &
Energy Efficient Load Balancing among Heterogeneous Nodes of Wireless Sensor Network
Energy Efficient Load Balancing among Heterogeneous Nodes of Wireless Sensor Network Chandrakant N Bangalore, India [email protected] Abstract Energy efficient load balancing in a Wireless Sensor
Path Selection Methods for Localized Quality of Service Routing
Path Selection Methods for Localized Quality of Service Routing Xin Yuan and Arif Saifee Department of Computer Science, Florida State University, Tallahassee, FL Abstract Localized Quality of Service
Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging
Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging
Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.
Volume 3, Issue 10, October 2013 ISSN: 2277-128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Load Measurement
