An Optimized Load-balancing Scheduling Method Based on the WLC Algorithm for Cloud Data Centers


Journal of Computational Information Systems 9: 7 (23)

Lianying ZHOU, Xingping CUI, Shuyue WU
School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 223, China

Corresponding author: Xingping CUI.
Copyright 23 Binary Information Press. DOI:.2733/jcis653. September, 23

Abstract

The WLC (weighted least-connection) scheduling algorithm is widely adopted in cloud computing systems. However, it has two shortcomings: first, once the weight of a server has been set, it is not easy to modify it on the fly; second, the number of tasks connected to a server does not by itself indicate the server's load accurately. In this paper we propose the DWLC (dual weighted least-connection) scheduling algorithm, an improvement of the WLC scheduling algorithm that overcomes both shortcomings. The DWLC algorithm adopts a more reasonable dynamic strategy to determine the weight of each server, and it also considers the differences between tasks in order to reflect the real-time load of each server more accurately. The detailed process and parameter design of the DWLC algorithm are given in this paper. We also simulated the DWLC algorithm on the open-source CloudSim simulation platform. The simulation results show that the DWLC algorithm achieves a better load balancing degree and higher system efficiency, and thus better satisfies the requirements of cloud data centers.

Keywords: Load Balance; Resource Scheduling Algorithm; WLC; DWLC

1 Introduction

With the development of cloud computing technology [1], data centers [2] have improved considerably. There are millions of servers or PCs in a data center, and large amounts of resources are distributed across these servers. How to schedule these resources so that the servers cooperate well, while users receive efficient and reasonably priced services, is worth discussing. Because cloud computing is still young, resource management strategies [3, 4] in cloud computing systems are not yet mature, and many aspects deserve further study.

Because user agents select physical nodes unpredictably and physical nodes differ in processing capacity, the resource utilization of data center servers is usually uneven, which leads to an imbalance among servers in accepting user requests. That is, some servers are idle all the time while others are always busy. As a consequence, nodes with low resource utilization waste resources while nodes with high resource utilization stay overloaded, and the overall performance of the cluster eventually declines. How to balance the load of physical nodes in data centers has therefore become a problem that urgently needs to be addressed [5, 6].

To solve this problem, effective load balancing techniques must be adopted, and the key point is to employ a proper load balancing algorithm. With an effective load balancing algorithm, user requests can be reasonably assigned to cloud servers. The servers then share tasks more evenly, which increases the processing capacity and quality of service of the entire data center. Research on load balancing scheduling algorithms has therefore become a hotspot [7].

2 Some Existing Load Balancing Scheduling Algorithms

2.1 Brief introduction to some existing load balancing scheduling algorithms

The current popular load balancing algorithms fall into two basic categories: static algorithms and dynamic ones. Static load balancing (SLB) algorithms, such as the RR (round-robin) and WRR (weighted round-robin) scheduling algorithms, schedule tasks using preset strategies without considering the real-time load of the back-end servers. Dynamic load balancing (DLB) algorithms, such as the LC (least-connection) and WLC scheduling algorithms, distribute user requests according to the dynamic load of the servers. However, DLB scheduling algorithms distribute all the newly arrived requests to the server with the fewest requests.
If many requests arrive within a short period, these algorithms therefore reduce the load balancing degree. The following are several widely used load balancing scheduling algorithms [8].

(1) RR scheduling algorithm
This algorithm assumes that the processing performance of all servers is the same and assigns newly arrived requests to the servers in rotation. It is simple but does not suit the situation where the servers differ in processing performance.

(2) WRR scheduling algorithm
In this algorithm, different weights denote the processing capacities of different servers. The number of requests assigned to each server is proportional to its weight, so that servers with better processing capacities process more requests.

(3) LC scheduling algorithm
This algorithm assumes that the processing capabilities of all servers are the same and assigns the newly arrived request to the server with the fewest connections. However, the system performance is not ideal when the processing capabilities of the servers differ.

(4) WLC scheduling algorithm [9]
This algorithm is developed from the LC scheduling algorithm. It is the default scheduling algorithm of LVS (Linux Virtual Server). The main idea is that the processing capability of each server is represented by a corresponding weight, and the load of a server is indicated by the number of connections to that server. When a new request arrives, the algorithm computes the ratio of each server's current number of connections to its weight and assigns the request to the server with the smallest ratio. This algorithm is suitable for the situation where the processing capabilities of the servers differ.

Suppose there is a group of servers $S = \{S_0, S_1, \ldots, S_{n-1}\}$. $W(S_i)$ represents the weight of server $S_i$; its default value is 1. $C(S_i)$ represents the number of connections currently connected to server $S_i$. $C_{sum} = \sum C(S_i)$ $(i = 0, 1, \ldots, n-1)$ represents the total number of connections currently connected to all the servers. The newly arrived request is assigned to the server $S_m$ that satisfies

$$\frac{C(S_m)/C_{sum}}{W(S_m)} = \min\left\{ \frac{C(S_i)/C_{sum}}{W(S_i)} \right\} \quad (i = 0, 1, \ldots, n-1),$$

where $W(S_i)$ is not zero. $C_{sum}$ is a constant within one round, so the condition can be simplified to

$$\frac{C(S_m)}{W(S_m)} = \min\left\{ \frac{C(S_i)}{W(S_i)} \right\} \quad (i = 0, 1, \ldots, n-1),$$

where $W(S_i)$ is not zero.

Division is computationally much more expensive than multiplication, and floating-point division is not allowed in the Linux kernel. To achieve better performance, the comparison $C(S_m)/W(S_m) > C(S_i)/W(S_i)$ is therefore rewritten as $C(S_m) \cdot W(S_i) > C(S_i) \cdot W(S_m)$, under the assumption that the weight of a server is greater than zero. Meanwhile, the algorithm must never schedule a server whose weight is zero. The detailed algorithm is as follows.

    for (m = 0; m < n; m++) {
        if (W(Sm) > 0) {
            /* Sm is the first usable server; look for a better one after it */
            for (i = m + 1; i < n; i++) {
                if (C(Sm) * W(Si) > C(Si) * W(Sm))
                    m = i;
            }
            return Sm;
        }
    }
    return NULL;

2.2 Deficiency of the WLC scheduling algorithm

The WLC scheduling algorithm accounts for the processing capacity of each server through a corresponding weight.
It can achieve a higher load balancing degree than the LC algorithm. However, it still has the following shortcomings.

(1) The weight of each server is not set reasonably and accurately. In most cases, the weight of a server is preset by the system administrator based on the hardware configuration and the administrator's personal experience. It is not adjusted dynamically according to the actual load of the server, and therefore cannot reflect the real-time processing capability of the server during the scheduling process. As time goes on, some servers may become overloaded while others stay idle. This causes load imbalance in the system and decreases system performance.

(2) The number of connections alone cannot accurately indicate the real-time load of a server. The number of connections reflects the load of a server to some extent; however, it is not accurate enough when the time and resource demands of different tasks vary. For example, in a WWW service, suppose that two servers hold the same number of connections at a certain time, but one server handles multimedia video transfer requests while the other handles plain-text web pages. The actual loads of the two servers are obviously different even though they have the same number of connections. Therefore, using only the number of connections to represent the load of a server is incomplete.

3 Improvement of WLC: the DWLC Scheduling Algorithm

3.1 The main idea of the DWLC scheduling algorithm

To achieve a better load balancing degree, it is necessary to improve the algorithm described above. The performance of each server is considered comprehensively so that server weights are obtained dynamically. Meanwhile, the weight of each task is determined according to its complexity. Consequently, the scheduler can obtain the real-time processing capacity and load status of each server more accurately, and then select the most appropriate server. The main idea of the improved algorithm is as follows.

(1) By adopting real-time information to compute the weight of each server dynamically, the real-time processing capacity of each server can be evaluated more accurately.
Usually, the processing capacity of a server can be measured comprehensively by several indexes: CPU type, number of CPUs, memory idle rate, CPU idle rate, remaining network bandwidth, the number of processors and so on. To avoid adding so much computation overhead that the scheduler becomes a new bottleneck, the improved algorithm uses only the two most important parameters, namely the CPU idle rate and the memory idle rate, to describe the server weight. The scheduler collects the CPU idle rate and memory idle rate of each server and computes the weight of each server whenever a request is waiting to be assigned.

(2) The tasks are assigned different weights according to their complexity. In this paper, we divide the tasks into four types for simplicity. The more complex a task is, the higher the weight it is assigned. The real-time load of a server is the total weight of all the tasks it is processing at that time. The scheduler calculates the real-time load of each server whenever a request is waiting to be assigned.

(3) When a new request arrives, the scheduler calculates the ratio of each server's real-time load to its weight and assigns the request to the server with the minimum ratio, to avoid load imbalance between servers.
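The difference between counting connections and weighting tasks, as described in point (2), can be illustrated with a small sketch. The task weights (1, 2, 3, 4), the function names and the two task mixes are hypothetical values chosen for illustration; the paper does not state concrete task weights.

```python
# Hypothetical weights for the four task types (assumed values, not from the paper).
TASK_WEIGHTS = (1, 2, 3, 4)


def connection_load(task_counts):
    """Load as WLC sees it: the plain number of connections."""
    return sum(task_counts)


def weighted_load(task_counts, task_weights=TASK_WEIGHTS):
    """Load as DWLC sees it: the total weight of the tasks being processed."""
    return sum(count * weight for count, weight in zip(task_counts, task_weights))


# Two servers with the same number of connections but different task mixes.
text_server = (10, 0, 0, 0)    # ten tasks of the lightest type
video_server = (0, 0, 0, 10)   # ten tasks of the heaviest type

print(connection_load(text_server), connection_load(video_server))  # 10 10
print(weighted_load(text_server), weighted_load(video_server))      # 10 40
```

Under these assumed weights the connection count cannot tell the two servers apart, while the weighted load differs by a factor of four.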
3.2 Implementation of the DWLC algorithm

Suppose that there is a group of servers $S = \{S_0, S_1, \ldots, S_{n-1}\}$. The CPU idle rate, the memory idle rate and the weight of server $S_i$ are denoted $V_c(S_i)$, $V_m(S_i)$ and $W(S_i)$ respectively. A higher weight indicates a stronger processing capability. When a node fails, we set the weight of that node to zero. We introduce the following function to express the weight of server $S_i$:

$$W(S_i) = k_1 V_c(S_i) + k_2 V_m(S_i), \quad k_1 + k_2 = 1,\ V_c(S_i) \in (0, 1),\ V_m(S_i) \in (0, 1)$$

It is obvious that $k_1$ and $k_2$ cannot be zero at the same time. Besides, it is very unlikely in real systems that the CPU and the memory are fully loaded at the same time, so we can reasonably assume that the CPU idle rate and the memory idle rate of server $S_i$ cannot be zero at the same time either. Therefore, the weight of a normally working server calculated with this formula cannot be zero, and when a server's weight is zero it is safe to infer that the server is down.

As the weight function shows, the values of $k_1$ and $k_2$ represent the relative importance of the CPU idle rate and the memory idle rate. Since the CPU idle rate is more important than the memory idle rate, $k_1$ should be larger than $k_2$. Here, we set $k_1 : k_2$ to the golden ratio, namely 0.618:0.382. Taking into account the complexity of floating-point calculation, the approximate value 0.6:0.4 is adopted, namely $k_1 = 0.6$, $k_2 = 0.4$. The complete weight calculation function is as follows:

$$W(S_i) = 0.6\, V_c(S_i) + 0.4\, V_m(S_i), \quad V_c(S_i) \in (0, 1),\ V_m(S_i) \in (0, 1)$$

We call this value the capability weight of a server.

Suppose that there are four kinds of tasks $M = \{M_1, M_2, M_3, M_4\}$, whose weights are $P = \{P_1, P_2, P_3, P_4\}$ respectively according to their complexity. The more complex a task is, the higher the weight it is assigned.
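As a rough illustration of the capability weight, the sketch below evaluates $W = 0.6 V_c + 0.4 V_m$ for given idle rates. The function name, the use of integer percentages and the scaling of the result by 1000 are assumptions made for this example; integer scaling is one possible way to sidestep floating-point arithmetic, not something the paper prescribes.

```python
# k1 = 0.6 and k2 = 0.4 from the paper, each scaled by 10 to stay in integers.
K1, K2 = 6, 4


def capability_weight(cpu_idle_pct, mem_idle_pct, alive=True):
    """Return 1000 * W for idle rates given as integer percentages (0..100).

    W = 0.6 * Vc + 0.4 * Vm with Vc = cpu_idle_pct / 100 and Vm = mem_idle_pct / 100,
    so 1000 * W = 6 * cpu_idle_pct + 4 * mem_idle_pct.
    A failed node gets weight 0 so that it is never scheduled.
    """
    if not alive:
        return 0
    if not (0 <= cpu_idle_pct <= 100 and 0 <= mem_idle_pct <= 100):
        raise ValueError("idle rates must be percentages between 0 and 100")
    return K1 * cpu_idle_pct + K2 * mem_idle_pct


print(capability_weight(50, 25))               # 400, i.e. W = 0.4
print(capability_weight(80, 90, alive=False))  # 0, the node is down
```

Because only the ratio between server weights matters in the comparison, the common factor of 1000 does not change which server is selected.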
$C(S_i)$ represents the number of connections currently connected to server $S_i$, and $C_{ij}$ represents the number of tasks of kind $j$ that server $S_i$ is processing. $M$ is the task to be scheduled. The sum of the weights of all the tasks on server $S_i$ is $\sum_{j=1}^{4} C_{ij} P_j$; we call this value the task weight on server $S_i$.

For a server, a small task weight indicates a small real-time load, and a large server weight indicates a strong processing capability. Therefore, a newly arrived request is always assigned to the server with the minimum ratio of task weight to server weight. More specifically, the new connection request is sent to the server $S_m$ that satisfies the following condition:

$$\frac{\sum_{j=1}^{4} C_{mj} P_j}{W(S_m)} = \min\left\{ \frac{\sum_{j=1}^{4} C_{ij} P_j}{W(S_i)} \right\} \quad (i = 0, 1, \ldots, n-1)$$

While scanning the servers, the current candidate $S_m$ is therefore replaced by $S_i$ whenever

$$\frac{\sum_{j=1}^{4} C_{ij} P_j}{W(S_i)} < \frac{\sum_{j=1}^{4} C_{mj} P_j}{W(S_m)} \quad (i = 0, 1, \ldots, n-1)$$

Division is computationally much more expensive than multiplication, and floating-point division is not allowed in the Linux kernel. Besides, the weight of a working server cannot be zero. To achieve better performance, we therefore rewrite the comparison as:

$$\left( \sum_{j=1}^{4} C_{ij} P_j \right) W(S_m) < \left( \sum_{j=1}^{4} C_{mj} P_j \right) W(S_i) \quad (i = 0, 1, \ldots, n-1)$$
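The selection rule above can be sketched as a small runnable function. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the task weights (1, 2, 3, 4), the scaled integer server weights and the sample task counts are all hypothetical.

```python
def task_weight(counts, task_weights):
    """Task weight on one server: sum over task kinds of C_ij * P_j."""
    return sum(c * p for c, p in zip(counts, task_weights))


def select_server(server_weights, task_counts, task_weights):
    """Return the index of the server with the smallest task-weight / server-weight
    ratio, skipping servers whose weight is zero; return None if none is usable.

    The ratios are compared by cross-multiplication, so no division is needed.
    """
    best = None
    for i, weight in enumerate(server_weights):
        if weight <= 0:
            continue  # a zero weight marks a failed node
        if best is None:
            best = i
            continue
        load_i = task_weight(task_counts[i], task_weights)
        load_best = task_weight(task_counts[best], task_weights)
        # load_i / W(S_i) < load_best / W(S_best), rewritten without division
        if load_i * server_weights[best] < load_best * weight:
            best = i
    return best


P = (1, 2, 3, 4)                 # assumed weights of the four task kinds
weights = [400, 800, 0]          # assumed capability weights; server 2 is down
counts = [[2, 0, 0, 1], [1, 1, 1, 1], [0, 0, 0, 0]]

chosen = select_server(weights, counts, P)
print(chosen)                    # 1, since 10/800 < 6/400

# Record the newly assigned task (here of kind M4) on the chosen server.
counts[chosen][3] += 1
```

In this example server 0 carries a task weight of 6 and server 1 a task weight of 10, but server 1 has twice the capability weight, so its ratio is smaller and it receives the request.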
Furthermore, the algorithm must ensure that a server is not scheduled when its weight is zero. The detailed algorithm is as follows.

    for (m = 0; m < n; m++) {
        if (W(Sm) > 0) {
            /* Sm is the first usable server; look for a better one after it */
            for (i = m + 1; i < n; i++) {
                if ((sum_{j=1..4} C_ij * P_j) * W(Sm) < (sum_{j=1..4} C_mj * P_j) * W(Si))
                    m = i;
            }
            /* count the new task M on the selected server by its kind */
            if (M == M1) Cm1++;
            if (M == M2) Cm2++;
            if (M == M3) Cm3++;
            if (M == M4) Cm4++;
            return Sm;
        }
    }
    return NULL;

The weight of server $S_i$ is $W(S_i) = 0.6\, V_c(S_i) + 0.4\, V_m(S_i)$, and the weight of server $S_m$ is $W(S_m) = 0.6\, V_c(S_m) + 0.4\, V_m(S_m)$. The flowchart of the DWLC scheduling algorithm is shown in Fig. 1.

4 Simulation and Performance Analysis

4.1 Simulation tools

We use the open-source platform CloudSim [10] to simulate the proposed algorithm and compare its performance with that of the existing scheduling algorithms.

4.2 Design of simulation experiment

We simulated three scheduling algorithms, namely the LC, WLC and DWLC scheduling algorithms, in three groups with different numbers of tasks. The groups contain 5, 5 and 5 tasks respectively. All the tasks are generated randomly with various sizes. There are 5 servers in each group. The three algorithms are compared on the basis of the simulation results. The mean value is the average task completion time over all the servers in the group and represents the system efficiency, while the standard deviation of the completion times represents the load balancing degree of the system: the smaller the standard deviation, the better balanced the load.

4.3 Simulation results and analysis

(1) 5 tasks

We simulated the above three algorithms on 5 randomly generated tasks. The performance comparison is shown in Fig. 2.
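The two metrics defined in Section 4.2 can be computed as in the following sketch. The per-server completion times are made-up numbers, not simulation results, and the use of the population standard deviation is an assumption, since the paper does not say which variant it uses.

```python
import statistics

# Hypothetical task completion times (in seconds) of eight servers.
unbalanced = [2, 4, 4, 4, 5, 5, 7, 9]
balanced = [5, 5, 5, 5, 5, 5, 5, 5]


def efficiency_and_balance(completion_times):
    """Return (mean, population standard deviation) of per-server completion times.

    A lower mean indicates higher system efficiency; a lower standard deviation
    indicates a better load balancing degree.
    """
    return statistics.mean(completion_times), statistics.pstdev(completion_times)


mean_u, std_u = efficiency_and_balance(unbalanced)
mean_b, std_b = efficiency_and_balance(balanced)
print(mean_u, std_u)
print(mean_b, std_b)
```

Both sample schedules have the same mean completion time, so only the standard deviation distinguishes the evenly loaded group from the uneven one.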
[Fig. 1: The flowchart of the DWLC scheduling algorithm. Branch condition: $(\sum_{j=1}^{4} C_{ij} P_j)\, W(S_m) < (\sum_{j=1}^{4} C_{mj} P_j)\, W(S_i)$]

As Fig. 2 shows, the load balancing degree of the DWLC scheduling algorithm is the best, followed by that of the WLC scheduling algorithm, and the load balancing degree of the LC scheduling algorithm is the worst. We further compared the mean values and standard deviations of the three algorithms. The result is shown in Fig. 3.

[Fig. 2: The performance of these three algorithms with 5 tasks. Panels: Least-Connection Scheduling Algorithm, Weighted Least-Connection Scheduling Algorithm, Dual Weighted Least-Connection Scheduling Algorithm]

[Fig. 3: Comparison of the mean values and standard deviations for these three algorithms with 5 tasks. Series: Least-Connection, Weighted Least-Connection, Dual Weighted Least-Connection; axis: Task Completion Time (s)]

As Fig. 3 shows, the DWLC scheduling algorithm achieves higher efficiency than the LC and WLC scheduling algorithms. Its standard deviation is also the smallest, which shows that its load balancing degree is the best, followed by that of the WLC scheduling algorithm. The standard deviation of the LC scheduling algorithm is the highest, indicating an apparent imbalance among the servers.

(2) 5 tasks

We simulated the above three algorithms with the number of tasks increased to 5. The performance comparison is shown in Fig. 4.

[Fig. 4: The performance of these three algorithms with 5 tasks. Panels: Least-Connection Scheduling Algorithm, Weighted Least-Connection Scheduling Algorithm, Dual Weighted Least-Connection Scheduling Algorithm]

As Fig. 4 shows, when the cloud computing servers receive 5 randomly generated tasks of different weights, the system efficiency of the DWLC scheduling algorithm is the best of the three algorithms. Its standard deviation is the smallest, and its advantage over the other two algorithms is more obvious than in the case of 5 tasks. The WLC scheduling algorithm comes second, so the load balancing degree across the servers is high for these two algorithms. The standard deviation of the LC scheduling algorithm is high, indicating an apparent imbalance among the servers. We also compared the mean values and standard deviations of the three algorithms. The result is shown in Fig. 5.

[Fig. 5: Comparison of the mean values and standard deviations for these three algorithms with 5 tasks. Series: Least-Connection, Weighted Least-Connection, Dual Weighted Least-Connection; axis: Task Completion Time (s)]

As Fig. 5 shows, the standard deviation of the DWLC scheduling algorithm is the smallest, which shows that its load balancing degree is the best, followed by that of the WLC scheduling algorithm. The figure also shows that the DWLC scheduling algorithm achieves high efficiency, and that its efficiency advantage over the other two algorithms grows as the number of tasks increases.

(3) 5 tasks

We then simulated the above three algorithms with the number of tasks increased to 5. The performance comparison is shown in Fig. 6.

As Fig. 6 shows, when the cloud computing servers receive 5 randomly generated tasks of different weights, the load balancing degree of the DWLC scheduling algorithm is the best, and its advantage is more obvious than in the case of 5 tasks: the task completion times of the servers differ very little. The WLC scheduling algorithm comes second, and the load balancing degree of the LC scheduling algorithm is the worst. The mean values and standard deviations of the three algorithms are compared in Fig. 7.

As Fig. 7 shows, the standard deviation of the DWLC scheduling algorithm is the smallest, which shows that its load balancing degree is the best, followed by that of the WLC scheduling algorithm. The standard deviation of the LC scheduling algorithm is still the highest, and the imbalance among the servers is more obvious than in the first two groups. The figure also shows that the DWLC scheduling algorithm still achieves the best system efficiency.

To sum up, the DWLC scheduling algorithm performs better regardless of whether the number of tasks is small or large. Both its load balancing degree and its system efficiency improve considerably on those of the WLC and LC scheduling algorithms.
[Fig. 6: The performance of these three algorithms with 5 tasks. Panels: Least-Connection Scheduling Algorithm, Weighted Least-Connection Scheduling Algorithm, Dual Weighted Least-Connection Scheduling Algorithm]

[Fig. 7: Comparison of the mean values and standard deviations for these three algorithms with 5 tasks. Series: Least-Connection, Weighted Least-Connection, Dual Weighted Least-Connection; axis: Task Completion Time (s)]

5 Conclusions

This paper reviewed the existing scheduling algorithms in cloud data centers and proposed the DWLC scheduling algorithm, an improvement of the WLC algorithm. Compared with the WLC algorithm, the DWLC algorithm adopts a more reasonable dynamic assignment strategy to determine the weight of each server. It takes the weight differences of both servers and tasks into consideration and is therefore a dual weighted scheduling algorithm. The DWLC scheduling algorithm makes it possible to achieve load balancing and high efficiency even in a system where both the cloud servers and the tasks are diverse. We also simulated the improved algorithm on the open-source CloudSim simulation platform and compared the performance of the LC, WLC and DWLC scheduling algorithms. The analysis shows that the DWLC algorithm achieves a better load balancing degree and higher system efficiency than the existing scheduling algorithms, and thus better satisfies the requirements of cloud data centers.

References

[1] Peng LIU. Cloud Computing [M], Second Edition. Beijing: Electronic Industry Press, 2.
[2] Xiaoqian LIU. Research on Data Center Structure and Scheduling Mechanism in Cloud Computing [D]. Hefei: University of Science and Technology of China, 2.
[3] Wenhong TIAN, Yong ZHAO, Yuanliang ZHONG et al. Dynamic and Integrated Load-Balancing Scheduling Algorithm for Cloud Data Centers [J]. China Communications, 2, (6):
[4] Wenhong TIAN, Yong ZHAO. Cloud Computing: Resource scheduling management [M]. Beijing: National Defense Industry Press, 2.
[5] Yiqiu FANG, Daohong TANG, Junwei GE. Energy-aware Schedule Strategy Based on Dynamic Migration of Virtual Machines in Cloud Computing [J]. Journal of Computational Information Systems, 22, 8():
[6] A. SINGH, M. KORUPOLU, D. MOHAPATRA. Server-Storage Virtualization: Integration and Load Balancing in Data Centers, in Proceedings of the 28 ACM/IEEE Conference on Supercomputing (28), pp. 2.
[7] Xiu MIAO. Design and Load Balancing of Mobile IPTV Based on Cloud Computing Platform [D]. Beijing: Beijing University of Posts and Telecommunications, 2.
[8] Chuang KA. A New Dynamic Loading Balance Algorithm Based On LVS Cluster [D]. Ocean University of China, 28.
[9] Song WE. Load Balancing of LVS [EB/OL]. [28].
[10] Buyya R., Ranjan R., Calheiros R. N. Modeling and Simulation of Scalable Cloud Computing Environments and the CloudSim Toolkit: Challenges and Opportunities [C]. Proc. of International Conference on High Performance Computing & Simulation. Kochi, India: [s. n.], 29: .