Efficient On-Demand Operations in Dynamic Distributed Infrastructures
Steven Y. Ko and Indranil Gupta
Department of Computer Science
University of Illinois at Urbana-Champaign
{sko,

ABSTRACT

In a large-scale distributed infrastructure, users and administrators typically desire to perform on-demand operations that act upon the most up-to-date state of the infrastructure. These on-demand operations range from monitoring the up-to-date machine properties in the infrastructure to making Grid scheduling decisions for different tasks based on the current status of the infrastructure. However, the scale and dynamism present in the operating environment make it challenging to support these operations efficiently. This paper discusses several on-demand operations that we have been studying, the challenges associated with them, and how to meet those challenges. Specifically, we build techniques for 1) on-demand group monitoring that allows users and administrators of an infrastructure to query and aggregate the up-to-date state of the machines (e.g., CPU utilization) in a group or multiple groups, 2) an on-demand Grid scheduling strategy that makes scheduling decisions based on the current availability of compute nodes, and 3) an on-demand Grid computation strategy that chooses the best algorithm for the current input data set among multiple available algorithms. We also present our ongoing work.

1. INTRODUCTION

The success of the Internet has made large-scale distributed infrastructures part of daily life. Typical web users interact with data centers daily through web-based applications. Scientific communities use various computational Grids to perform their experiments. Moreover, cloud computing initiatives by industry leaders and academics, such as the Cloud Computing Testbed at the University of Illinois [6], Amazon [1], Google [10], IBM [11], and HP [12], suggest that this trend will likely continue for years to come.
We believe that one important class of operations in these large-scale infrastructures is on-demand operations, necessitated either by the needs of users and administrators of the infrastructures or by the characteristics of the workloads running on them. Broadly, on-demand operations are operations that act upon the most up-to-date state of the infrastructure. For example, a data center administrator typically desires an on-demand monitoring capability for the data center. The ability to query the up-to-date state (e.g., CPU utilization, software versions, etc.) of the entire data center on demand allows the administrator to understand the inner workings of the data center for troubleshooting, and to make well-informed decisions for management tasks such as capacity planning and patch management. Another example is on-demand scheduling in computational Grids, where scheduling decisions often need to reflect the most current resource availability, such as the availability of compute nodes. Thus, supporting these on-demand operations efficiently is both desirable and necessary.

The challenges in supporting such operations fall into two areas - scale and dynamism - and each challenge appears in a variety of flavors unique to the specific on-demand operation at hand. For example, in on-demand monitoring, scale comes not only from the physical number of machines in the infrastructure but also from the number of attributes to monitor; in on-demand scheduling, it also comes from the number of tasks to process. Dynamism comes not only from failures of services and machines, but also from changes in resource availability, usage, workload, etc.

In this paper, we briefly examine some of our work on the design and implementation of on-demand operations. We discuss some examples of on-demand operations, the different kinds of dynamism and scale associated with them, and how to address the challenges. We also discuss our ongoing work. We believe that exploring on-demand operations reveals the many faces of dynamism and scale present in large-scale infrastructures. By doing so, we can revisit well-known aspects of dynamism and scale, and also uncover new aspects of the two characteristics that were previously overlooked. Ultimately, the collective knowledge accumulated from this exploration can form the basis of reasoning when designing new large-scale infrastructures or services.

2. OUR WORK SO FAR

In this section, we present a brief summary of the on-demand operations that we have been working on.
2.1 On-Demand Group Monitoring

Several distributed aggregation systems have been proposed in the past to monitor large-scale infrastructures such as data centers [26, 31, 23]. These systems provide a query interface for aggregating machine properties such as memory utilization and CPU utilization. Aggregation queries play a key role because the summary view of the infrastructure often yields more insight than the individual view of each machine.

We argue that such a monitoring system should be 1) group-based, as users and administrators typically desire to monitor groups of machines - these groups are specified implicitly and could be either static (e.g., web servers, databases, etc.) or dynamic (e.g., machines with CPU utilization > 50%), and 2) on-demand, as the users should be able to query the most up-to-date data within a group.

Although previous systems have developed techniques to support aggregation queries, they target global aggregation: each machine in the whole infrastructure receives and answers every aggregation query, wasting bandwidth if a query actually targets only a group of machines. There is currently no distributed technique for aggregating data from a group (or multiple groups) in a bandwidth-efficient manner without contacting all machines in the infrastructure.

This on-demand group aggregation adds a new dimension - the number of groups - to the challenges of scale, alongside the number of machines and the number of attributes to monitor. This new dimension has been recognized in the context of multicast [3]. However, since aggregation employs different techniques such as in-network processing, we need a solution tailored to work in harmony with aggregation techniques. We have built a system called Moara that builds per-group aggregation trees to address these challenges [19].
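To make the contrast between global and group aggregation concrete, the following is a minimal Python sketch of in-network aggregation over a tree, with and without pruning of subtrees that contain no group member. It is an illustration under stated assumptions and not Moara's protocol: the `Node` class, the `has_member` helper (a centralized oracle standing in for state that a real system would maintain in a distributed way), and the example tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """One machine in a hypothetical aggregation tree."""
    name: str
    attrs: Dict[str, float]
    children: List["Node"] = field(default_factory=list)


def has_member(node: Node, in_group: Callable[[Node], bool]) -> bool:
    """True if the subtree rooted at `node` contains any group member.

    Assumption: computed here by a recursive scan; a real system would
    keep this knowledge as per-node state instead of recomputing it.
    """
    return in_group(node) or any(has_member(c, in_group) for c in node.children)


def aggregate(node: Node, attr: str, in_group: Callable[[Node], bool],
              prune: bool) -> Tuple[float, int, int]:
    """Return (sum, count, nodes_contacted) for `attr` over group members.

    Partial (sum, count) pairs are merged at each interior node, so raw
    values are never shipped to the root. With prune=False the query
    visits every node (global flooding); with prune=True it skips
    subtrees that contain no group member.
    """
    total = node.attrs[attr] if in_group(node) else 0.0
    count = 1 if in_group(node) else 0
    contacted = 1
    for child in node.children:
        if prune and not has_member(child, in_group):
            continue
        s, c, n = aggregate(child, attr, in_group, prune)
        total, count, contacted = total + s, count + c, contacted + n
    return total, count, contacted


# Example tree of seven machines; the group is "CPU utilization > 50%".
root = Node("r", {"cpu": 10.0}, [
    Node("a", {"cpu": 60.0}, [Node("a1", {"cpu": 80.0}), Node("a2", {"cpu": 30.0})]),
    Node("b", {"cpu": 20.0}, [Node("b1", {"cpu": 40.0}), Node("b2", {"cpu": 50.0})]),
])
busy = lambda n: n.attrs["cpu"] > 50

flooded = aggregate(root, "cpu", busy, prune=False)
pruned = aggregate(root, "cpu", busy, prune=True)
print("flood :", flooded)
print("pruned:", pruned)
```

Both calls compute the same sum and count over the two group members, but the pruned query contacts three nodes instead of all seven, which is the bandwidth saving that motivates group-based aggregation.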
Moara builds a group tree that spans the members of the group, so that a query does not have to flood the whole infrastructure as some global aggregation schemes do. Aggregation is also done over the group tree, resulting in a faster query response time than global aggregation schemes. In addition, multiple groups can share one tree to ensure scalability in terms of the number of groups.

The challenge of dynamism comes from two sources: group churn and workload. More specifically, the size and composition of each group can change over time due to machines joining and leaving as well as changes in the attribute-value space. In addition, a user may need to aggregate data from different groups at different times, so we can expect the query rate and target to vary significantly over time. Thus, a group aggregation system has to adapt to group churn and to changes in the user workload.

This dynamism in group and workload poses an interesting trade-off in terms of bandwidth usage. On the one hand, we can save per-query bandwidth by using group trees. On the other hand, group churn requires us to maintain each group tree, which itself costs bandwidth. Thus, if the query rate for a group is high enough, it might be better to maintain the group tree to save per-query bandwidth. However, if the group churn rate is high, it might be better to flood each query and not maintain the group tree, because the maintenance cost might be too high. Considering this trade-off, Moara includes a distributed maintenance protocol that adapts to the current group query rate and group churn rate. Our experiments have shown that Moara reduces bandwidth usage compared to both the scheme that floods the whole infrastructure and the scheme that maintains group trees all the time.
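The trade-off between maintaining a group tree and flooding can be written down as a back-of-the-envelope cost comparison. The sketch below is a simplified, centralized model under assumed message costs; it is not Moara's distributed maintenance protocol. The function names, the one-message-per-node query cost, and the `maint_cost_per_change` parameter are all assumptions made for illustration.

```python
def flood_cost(query_rate, total_nodes):
    """Messages per unit time when every query floods all nodes.

    Assumption: one message per node per query.
    """
    return query_rate * total_nodes


def tree_cost(query_rate, churn_rate, group_size, maint_cost_per_change=1.0):
    """Messages per unit time when a group tree is kept up to date.

    Assumption: a query touches only the group members, and each
    membership change costs `maint_cost_per_change` messages.
    """
    return query_rate * group_size + churn_rate * maint_cost_per_change


def should_maintain_tree(query_rate, churn_rate, total_nodes, group_size,
                         maint_cost_per_change=1.0):
    """Maintain the tree only if it is cheaper than flooding each query."""
    return (tree_cost(query_rate, churn_rate, group_size, maint_cost_per_change)
            < flood_cost(query_rate, total_nodes))


# Frequently queried, slowly changing group: keeping the tree pays off.
print(should_maintain_tree(query_rate=10, churn_rate=5,
                           total_nodes=1000, group_size=50))
# Rarely queried, rapidly changing group: flooding is cheaper.
print(should_maintain_tree(query_rate=0.01, churn_rate=100,
                           total_nodes=1000, group_size=50))
```

In this model the decision flips once the churn-driven maintenance traffic exceeds the per-query savings, which is the qualitative behavior described above; the actual thresholds depend on costs this sketch does not model.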
Moara, like any distributed system subject to dynamism in its environment, faces the CAP dilemma [4], which states that it is difficult to provide both strong consistency guarantees and high availability in failure-prone distributed settings. Moara navigates this dilemma by preferring high availability and scalability while providing eventual consistency guarantees on aggregation results. This philosophy is in line with that of existing aggregation systems such as Astrolabe [26] and SDIMS [31]. Moara could also allow the use of the metrics proposed by Jain et al. [15, 16] to track the imprecision of query results.

2.2 On-Demand Scheduling

The current trend in both academia and industry shows that the workload in computational Grids is shifting from computation-oriented tasks to data-intensive tasks. Many scientific communities such as Astronomy, Earth Science, and Biophysics already write data-intensive applications that access and generate multi-TB data [25]. Also, NSF has announced the CluE initiative [7] to explore data-intensive computing with HP, Intel, and Yahoo's Cloud Computing Testbed as well as Google-IBM clusters.

We argue that, due to the dynamism present in the environment, any scheduling algorithm for data-intensive tasks has to be on-demand, i.e., it has to make scheduling decisions based on the most up-to-date state, such as the availability of compute nodes (or "workers") and the characteristics of the current input data. We have developed scheduling algorithms that address the dynamism of resources [18] and the dynamism of workload [17]. We elaborate on each below.

First, we have developed a series of on-demand worker-centric scheduling algorithms for data-intensive tasks [18], which consider the availability of workers as the primary criterion. Our algorithms assign a task to a worker only when the worker can start executing the task immediately.
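The rule just stated, that a task is bound to a worker only at the moment the worker can start it, can be illustrated with a small simulation. This is a minimal sketch, not the algorithms of [18]: it ignores data locality and file transfers entirely, and the function names, the FIFO task order, and the round-robin baseline used for contrast are assumptions chosen for the example.

```python
import heapq
from collections import deque


def worker_centric_schedule(task_durations, num_workers):
    """Simulate a scheduler that assigns a task only to an idle worker.

    Returns (assignments, makespan), where each assignment is
    (task_id, worker_id, start_time).
    """
    pending = deque(enumerate(task_durations))
    # Min-heap of (time_worker_becomes_idle, worker_id).
    idle_at = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(idle_at)
    assignments = []
    makespan = 0.0
    while pending:
        now, worker = heapq.heappop(idle_at)    # next worker to become idle
        task_id, duration = pending.popleft()   # decision is made only now
        assignments.append((task_id, worker, now))
        finish = now + duration
        makespan = max(makespan, finish)
        heapq.heappush(idle_at, (finish, worker))
    return assignments, makespan


def round_robin_makespan(task_durations, num_workers):
    """Baseline for contrast: bind every task to a worker queue up front."""
    loads = [0.0] * num_workers
    for i, duration in enumerate(task_durations):
        loads[i % num_workers] += duration
    return max(loads)


durations = [4, 1, 1, 1]
assignments, makespan = worker_centric_schedule(durations, num_workers=2)
print(assignments)
print("worker-centric makespan:", makespan)
print("round-robin makespan   :", round_robin_makespan(durations, 2))
```

With these four tasks, the up-front round-robin assignment queues a short task behind the long one, while the pull-style scheduler hands all short tasks to the worker that keeps becoming idle.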
We term this approach worker-centric scheduling, as opposed to task-centric scheduling (e.g., [29, 5, 25]), where a scheduler assigns a task to a worker without considering whether or not the worker can start executing the task immediately after the assignment.

Worker-centric scheduling is important because it avoids two inherent problems of task-centric scheduling. First, task-centric scheduling might overload some workers with tasks, because the scheduler is agnostic to the number of tasks already assigned to each worker. Second, conditions at a worker when a task is scheduled may differ from the conditions at the worker when the task is executed, because each task usually waits in the worker's task queue for a while. Worker-centric scheduling avoids these two problems, since each worker executes the newly assigned task immediately. Our experiments show that the proposed worker-centric algorithms can achieve worker utilization above 90% and outperform the state-of-the-art task-centric algorithm by up to roughly 20% in terms of task completion time.

Second, in order to address the dynamism in the input workload, we have developed HoneyAdapt, a peer-to-peer system in which workers choose the best algorithm on demand as the input data changes [17]. This is possible because there are often multiple algorithms that perform differently depending on the input data. For example, there are many sorting algorithms (e.g., quick sort, bubble sort, etc.) that perform differently depending on how the given input data is organized. Our technique is based on the foraging behavior of honeybees, which adaptively choose the best-quality honey among multiple sources. Our experiments with sorting algorithms show that our technique can achieve up to roughly a 40% reduction in sorting time.

3. RELATED WORK

On-Demand Monitoring: MON [20] is among the first systems that support on-demand operations for large-scale infrastructures. It supports propagation of one-shot management commands. In PlanetLab, there are other management tools that support on-demand operations, such as vxargs and PSSH [24]. Most commercial monitoring and management systems such as HP OpenView, IBM Tivoli, and CA Unicenter are also centralized in nature. Thus, these tools do not address scalability and expressive queries simultaneously.

Several academic efforts propose distributed systems for aggregating data in large systems. Astrolabe [26] provides a generic aggregation abstraction, but uses a single static tree and hence has limited scalability with the number of metrics. SDIMS [31] constructs multiple trees for scalability with the number of metrics, but assumes a single group consisting of the entire system.
PIER [13] supports recursive SQL-style queries, but does not leverage in-network aggregation. Huebsch et al. [14] present a way to optimize global aggregation queries, while Moara optimizes multiple group-based aggregation trees. Seaweed [23] focuses on dealing with data unavailability, which is different from Moara's focus. MON [20] supports one-shot queries and constructs query trees on demand, but does not support expressive queries. Finally, Ganglia [21] uses a single hierarchical tree, but collects all data without in-network aggregation.

On-Demand Scheduling: Spatial Clustering [22] creates a task workflow based on the spatial relationship of files in the input data set. It improves data reuse and reduces file transfers by clustering together tasks with high input-set overlap. Tasks in a cluster are then assigned to workers that reside at the same site. The large reduction in file transfers significantly decreases execution time. Two drawbacks of this approach are that (1) it cannot handle new jobs arriving asynchronously and (2) it is application-specific.

Storage Affinity [29] also addresses file reuse for data-intensive applications. The algorithm computes a data affinity value for each task and each site, according to the input set of the task and the data currently stored at the site's networked storage. In this way it finds the task+site pair with the largest data affinity (common bytes), which is chosen as the next schedule. To address inefficient CPU assignments, the authors propose replicating tasks, also based on storage affinity. The algorithm shows improved makespan and good data reuse, especially when compared to the XSufferage [5] scheduling heuristic.

Decoupling data scheduling from task scheduling was proposed by Ranganathan et al. [25]. The work evaluates four simple task scheduling mechanisms and three simple data scheduling mechanisms. The best results are obtained when a task is scheduled to a site that already has a good part of its input data in place, combined with proactive replication of a popular input data set to a random or least-loaded site. Note that execution time improves with the data replication schemes due to the data set popularity distribution, which was set to be geometric; not all Grid applications will have the same characteristic.

A few Grid scheduling algorithms can be considered on-demand scheduling, since they make scheduling decisions based on the current status of the Grid. Examples include a pull-based scheduler proposed by Viswanathan et al. [30] and theoretical work by Rosenberg et al. [27, 28].

4. ONGOING WORK AND CONCLUSION

We are currently working on a new on-demand operation and continuing to improve our current systems.

On-Demand Log Processing: On-demand log processing refers to the ability to process the most up-to-date log entries, which are distributed across machines. Although there are many applications that generate and process logs [2, 8, 9, 32], the approaches are typically centralized: every log is collected from the machine that generates it and then processed by a centralized algorithm. This centralized approach limits the capability of on-demand log processing because of the time involved in collection. We are working on a distributed approach that tries to minimize the overall processing time by intelligently deciding when and how many log entries to ship.

State Management for Moara: Since Moara maintains a large amount of state related to groups, it is important to have a mechanism to garbage-collect this state. We are investigating several policies for this purpose. Some possibilities include: 1) garbage-collecting each state after a timeout expires, 2) keeping only the last k groups queried, and 3) garbage-collecting the least frequently queried group.
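The three candidate garbage-collection policies listed above can be sketched over a small per-group state table. This is a hypothetical illustration rather than Moara's design: the `GroupStateTable` class, its method names, and the choice to track only a last-query time and a query count per group are assumptions made for the example.

```python
from collections import OrderedDict


class GroupStateTable:
    """Hypothetical per-node table of group state with three GC policies."""

    def __init__(self):
        # group -> (last_query_time, query_count), least recently queried first
        self._state = OrderedDict()

    def record_query(self, group, now):
        """Note a query for `group`, moving it to the most-recent position."""
        _, count = self._state.pop(group, (now, 0))
        self._state[group] = (now, count + 1)

    def groups(self):
        return list(self._state)

    def gc_timeout(self, now, timeout):
        """Policy 1: drop state not queried within the last `timeout`."""
        expired = [g for g, (t, _) in self._state.items() if now - t > timeout]
        for g in expired:
            del self._state[g]
        return expired

    def gc_keep_last_k(self, k):
        """Policy 2: keep only the k most recently queried groups."""
        evicted = []
        while len(self._state) > k:
            group, _ = self._state.popitem(last=False)
            evicted.append(group)
        return evicted

    def gc_least_frequent(self):
        """Policy 3: drop the group with the fewest recorded queries."""
        if not self._state:
            return None
        victim = min(self._state, key=lambda g: self._state[g][1])
        del self._state[victim]
        return victim


table = GroupStateTable()
for group, when in [("web", 0), ("db", 1), ("web", 2), ("hot", 3)]:
    table.record_query(group, when)

print(table.gc_timeout(now=10, timeout=8))  # groups idle for more than 8 time units
print(table.gc_keep_last_k(1))              # all but the most recently queried group
print(table.groups())
```

The policies differ in what they need to remember: the timeout and last-k policies only need recency, while the least-frequently-queried policy also needs a counter per group.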
Incorporating Grid Scheduling Algorithms into Cloud Computing Testbed (CCT): We are also working to incorporate HoneyAdapt and our scheduling strategies into CCT.

In conclusion, we believe that on-demand operations reveal various aspects of the dynamism and scale of large-scale infrastructures such as data centers. Thus, supporting on-demand operations is necessary not only from the perspective of users and administrators, but also from the perspective of designers and architects of the infrastructure and the services running over it.

5. REFERENCES

[1] Amazon Web Services. http://
[2] P. Bahl, R. Chandra, A. Greenberg, S. Kandula, D. A. Maltz, and M. Zhang. Towards Highly Reliable Enterprise Network Services via Inference of Multi-level Dependencies. In Proceedings of ACM SIGCOMM.
[3] M. Balakrishnan, K. Birman, A. Phanishayee, and S. Pleisch. Ricochet: Lateral Error Correction for Time-Critical Multicast. In Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation (NSDI).
[4] E. Brewer. Towards Robust Distributed Systems (Invited Talk). In Proceedings of the 9th ACM Symposium on Principles of Distributed Computing, July.
[5] H. Casanova, D. Zagorodnov, F. Berman, and A. Legrand. Heuristics for Scheduling Parameter Sweep Applications in Grid Environments. In Proceedings of the 9th Heterogeneous Computing Workshop.
[6] HP, Intel, Yahoo in cloud tie-up. http://news.bbc.co.uk/2/hi/technology/ stm.
[7] NSF Cluster Exploratory (CluE) program.
[8] R. Fonseca, G. Porter, R. H. Katz, S. Shenker, and I. Stoica. X-Trace: A Pervasive Network Tracing Framework. In Proceedings of the 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI).
[9] D. Geels, G. Altekar, P. Maniatis, T. Roscoe, and I. Stoica. Friday: Global Comprehension for Distributed Replay. In Proceedings of the 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI).
[10] Google App Engine.
[11] Google and IBM Join in Cloud Computing Research. 08cloud.html.
[12] HP Labs Dynamic Cloud Services.
[13] R. Huebsch, B. Chun, J. M. Hellerstein, B. T. Loo, P. Maniatis, T. Roscoe, S. Shenker, I. Stoica, and A. R. Yumerefendi. The Architecture of PIER: an Internet-Scale Query Processor. In Proceedings of the Conference on Innovative Data Systems Research (CIDR).
[14] R. Huebsch, M. Garofalakis, J. M. Hellerstein, and I. Stoica. Sharing Aggregate Computation for Distributed Queries. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD).
[15] N. Jain, D. Kit, P. Mahajan, P. Yalagandula, M. Dahlin, and Y. Zhang. PRISM: Precision-Integrated Scalable Monitoring (extended). Technical Report TR-06-22, UT Austin Department of Computer Sciences, March.
[16] N. Jain, D. Kit, P. Mahajan, P. Yalagandula, M. Dahlin, and Y. Zhang. STAR: Self Tuning Aggregation for Scalable Monitoring. In Proceedings of the 33rd International Conference on Very Large Databases (VLDB), June.
[17] S. Y. Ko, I. Gupta, and Y. Jo. Novel Mathematics-Inspired Algorithms for Self-Adaptive Peer-to-Peer Computing. ACM Transactions on Autonomous and Adaptive Systems (TAAS), to appear.
[18] S. Y. Ko, R. Morales, and I. Gupta. New Worker-Centric Scheduling Strategies for Data-Intensive Grid Applications. In Proceedings of the ACM/IFIP/USENIX 8th International Middleware Conference (Middleware).
[19] S. Y. Ko, P. Yalagandula, I. Gupta, V. Talwar, D. Milojicic, and S. Iyer. Moara: Flexible and Scalable Group-Based Querying System. In Proceedings of the 9th ACM/IFIP/USENIX Middleware, to appear.
[20] J. Liang, S. Y. Ko, I. Gupta, and K. Nahrstedt. MON: On-demand Overlays for Distributed System Management. In Proceedings of the 2nd USENIX Workshop on Real, Large Distributed Systems (WORLDS).
[21] M. L. Massie, B. N. Chun, and D. E. Culler. The Ganglia Distributed Monitoring System: Design, Implementation and Experience. Parallel Computing, 30(7), July.
[22] L. Meyer, J. Annis, M. Mattoso, M. Wilde, and I. Foster. Planning Spatial Workflows to Optimize Grid Performance. Technical report, GriPhyN.
[23] D. Narayanan, A. Donnelly, R. Mortier, and A. Rowstron. Delay Aware Querying with Seaweed. In Proceedings of the 32nd Very Large Data Bases Conference (VLDB).
[24] PlanetLab: Contributed Software. Planetlab/ContributedSoftware.
[25] K. Ranganathan and I. Foster. Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications. In Proceedings of the 11th International Symposium on High Performance Distributed Computing (HPDC-11).
[26] R. V. Renesse, K. P. Birman, and W. Vogels. Astrolabe: A Robust and Scalable Technology for Distributed System Monitoring, Management, and Data Mining. ACM Transactions on Computer Systems, 21(2), May.
[27] A. L. Rosenberg. On Scheduling Mesh-Structured Computations for Internet-Based Computing. IEEE Transactions on Computers, 53(9), September.
[28] A. L. Rosenberg and M. Yurkewych. Guidelines for Scheduling Some Common Computation-Dags for Internet-Based Computing. IEEE Transactions on Computers, 54(4), April.
[29] E. Santos-Neto, W. Cirne, F. V. Brasileiro, and A. Lima. Exploiting Replication and Data Reuse to Efficiently Schedule Data-Intensive Applications on Grids. In Proceedings of the 10th International Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP).
[30] S. Viswanathan, B. Veeravalli, D. Yu, and T. G. Robertazzi. Design and Analysis of a Dynamic Scheduling Strategy with Resource Estimation for Large-Scale Grid Systems. In Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing (Grid 2004).
[31] P. Yalagandula and M. Dahlin. A Scalable Distributed Information Management System. In Proceedings of ACM SIGCOMM.
[32] Y. Zhang and V. Paxson. Detecting Stepping Stones. In Proceedings of the 9th USENIX Security Symposium, 2000.
Self-Compressive Approach for Distributed System Monitoring
Self-Compressive Approach for Distributed System Monitoring Akshada T Bhondave Dr D.Y Patil COE Computer Department, Pune University, India Santoshkumar Biradar Assistant Prof. Dr D.Y Patil COE, Computer
Research Statement for Henri Casanova
Research Statement for Henri Casanova Advances in networking technology have made it possible to deploy distributed scientific applications on platforms that aggregate large numbers of diverse and distant
Performance Evaluation of the Illinois Cloud Computing Testbed
Performance Evaluation of the Illinois Cloud Computing Testbed Ahmed Khurshid, Abdullah Al-Nayeem, and Indranil Gupta Department of Computer Science University of Illinois at Urbana-Champaign Abstract.
The Case for a Hybrid P2P Search Infrastructure
The Case for a Hybrid P2P Search Infrastructure Boon Thau Loo Ryan Huebsch Ion Stoica Joseph M. Hellerstein University of California at Berkeley Intel Research Berkeley boonloo, huebsch, istoica, jmh @cs.berkeley.edu
Keywords Distributed Computing, On Demand Resources, Cloud Computing, Virtualization, Server Consolidation, Load Balancing
Volume 5, Issue 1, January 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Survey on Load
Locality Based Protocol for MultiWriter Replication systems
Locality Based Protocol for MultiWriter Replication systems Lei Gao Department of Computer Science The University of Texas at Austin [email protected] One of the challenging problems in building replication
Energy Constrained Resource Scheduling for Cloud Environment
Energy Constrained Resource Scheduling for Cloud Environment 1 R.Selvi, 2 S.Russia, 3 V.K.Anitha 1 2 nd Year M.E.(Software Engineering), 2 Assistant Professor Department of IT KSR Institute for Engineering
Load Balancing in Distributed Data Base and Distributed Computing System
Load Balancing in Distributed Data Base and Distributed Computing System Lovely Arya Research Scholar Dravidian University KUPPAM, ANDHRA PRADESH Abstract With a distributed system, data can be located
Towards Lightweight Logging and Replay of Embedded, Distributed Systems
Towards Lightweight Logging and Replay of Embedded, Distributed Systems (Invited Paper) Salvatore Tomaselli and Olaf Landsiedel Computer Science and Engineering Chalmers University of Technology, Sweden
Paolo Costa [email protected]
joint work with Ant Rowstron, Austin Donnelly, and Greg O Shea (MSR Cambridge) Hussam Abu-Libdeh, Simon Schubert (Interns) Paolo Costa [email protected] Paolo Costa CamCube - Rethinking the Data Center
Characterizing Task Usage Shapes in Google s Compute Clusters
Characterizing Task Usage Shapes in Google s Compute Clusters Qi Zhang University of Waterloo [email protected] Joseph L. Hellerstein Google Inc. [email protected] Raouf Boutaba University of Waterloo [email protected]
Dynamic Resource Pricing on Federated Clouds
Dynamic Resource Pricing on Federated Clouds Marian Mihailescu and Yong Meng Teo Department of Computer Science National University of Singapore Computing 1, 13 Computing Drive, Singapore 117417 Email:
Index Terms : Load rebalance, distributed file systems, clouds, movement cost, load imbalance, chunk.
Load Rebalancing for Distributed File Systems in Clouds. Smita Salunkhe, S. S. Sannakki Department of Computer Science and Engineering KLS Gogte Institute of Technology, Belgaum, Karnataka, India Affiliated
Load Distribution in Large Scale Network Monitoring Infrastructures
Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu
Cloud Management: Knowing is Half The Battle
Cloud Management: Knowing is Half The Battle Raouf BOUTABA David R. Cheriton School of Computer Science University of Waterloo Joint work with Qi Zhang, Faten Zhani (University of Waterloo) and Joseph
Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus
Elastic Cloud Computing in the Open Cirrus Testbed implemented via Eucalyptus International Symposium on Grid Computing 2009 (Taipei) Christian Baun The cooperation of and Universität Karlsruhe (TH) Agenda
SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE SYSTEM IN CLOUD
International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol-1, Iss.-3, JUNE 2014, 54-58 IIST SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE
Concept and Project Objectives
3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the
Hadoop Fair Sojourn Protocol Based Job Scheduling On Hadoop
Hadoop Fair Sojourn Protocol Based Job Scheduling On Hadoop Mr. Dileep Kumar J. S. 1 Mr. Madhusudhan Reddy G. R. 2 1 Department of Computer Science, M.S. Engineering College, Bangalore, Karnataka 2 Assistant
Multi-level Metadata Management Scheme for Cloud Storage System
, pp.231-240 http://dx.doi.org/10.14257/ijmue.2014.9.1.22 Multi-level Metadata Management Scheme for Cloud Storage System Jin San Kong 1, Min Ja Kim 2, Wan Yeon Lee 3, Chuck Yoo 2 and Young Woong Ko 1
Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud
Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud Gunho Lee, Byung-Gon Chun, Randy H. Katz University of California, Berkeley, Yahoo! Research Abstract Data analytics are key applications
IMPROVED FAIR SCHEDULING ALGORITHM FOR TASKTRACKER IN HADOOP MAP-REDUCE
IMPROVED FAIR SCHEDULING ALGORITHM FOR TASKTRACKER IN HADOOP MAP-REDUCE Mr. Santhosh S 1, Mr. Hemanth Kumar G 2 1 PG Scholor, 2 Asst. Professor, Dept. Of Computer Science & Engg, NMAMIT, (India) ABSTRACT
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 Load Balancing Heterogeneous Request in DHT-based P2P Systems Mrs. Yogita A. Dalvi Dr. R. Shankar Mr. Atesh
ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS
CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant
S 3 : A Scalable Sensing Service for Monitoring Large
S 3 : A Scalable Sensing Service for Monitoring Large Networked Systems Praveen Yalagandula, Puneet Sharma, Sujata Banerjee, Sujoy Basu and Sung-Ju Lee HP Labs, Palo Alto, CA ABSTRACT Efficiently operating
A Novel Switch Mechanism for Load Balancing in Public Cloud
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) A Novel Switch Mechanism for Load Balancing in Public Cloud Kalathoti Rambabu 1, M. Chandra Sekhar 2 1 M. Tech (CSE), MVR College
Improving Data Availability through Dynamic Model-Driven Replication in Large Peer-to-Peer Communities
Improving Data Availability through Dynamic Model-Driven Replication in Large Peer-to-Peer Communities Kavitha Ranganathan, Adriana Iamnitchi, Ian Foster Department of Computer Science, The University
A SURVEY ON MAPREDUCE IN CLOUD COMPUTING
A SURVEY ON MAPREDUCE IN CLOUD COMPUTING Dr.M.Newlin Rajkumar 1, S.Balachandar 2, Dr.V.Venkatesakumar 3, T.Mahadevan 4 1 Asst. Prof, Dept. of CSE,Anna University Regional Centre, Coimbatore, [email protected]
Using Proxies to Accelerate Cloud Applications
Using Proxies to Accelerate Cloud Applications Jon Weissman and Siddharth Ramakrishnan Department of Computer Science and Engineering University of Minnesota, Twin Cities Abstract A rich cloud ecosystem
NetworkCloudSim: Modelling Parallel Applications in Cloud Simulations
2011 Fourth IEEE International Conference on Utility and Cloud Computing NetworkCloudSim: Modelling Parallel Applications in Cloud Simulations Saurabh Kumar Garg and Rajkumar Buyya Cloud Computing and
Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load
Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load Pooja.B. Jewargi Prof. Jyoti.Patil Department of computer science and engineering,
IMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND SCIENCE IMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION N.Vijaya Sunder Sagar 1, M.Dileep Kumar 2, M.Nagesh 3, Lunavath Gandhi
From Internet Data Centers to Data Centers in the Cloud
From Internet Data Centers to Data Centers in the Cloud This case study is a short extract from a keynote address given to the Doctoral Symposium at Middleware 2009 by Lucy Cherkasova of HP Research Labs
Exploring Graph Analytics for Cloud Troubleshooting
Exploring Graph Analytics for Cloud Troubleshooting Chengwei Wang, Karsten Schwan, Brian Laub, Mukil Kesavan, and Ada Gavrilovska, Georgia Institute of Technology https://www.usenix.org/conference/icac14/technical-sessions/presentation/wang
Exploiting Cloud Heterogeneity for Optimized Cost/Performance MapReduce Processing
Exploiting Cloud Heterogeneity for Optimized Cost/Performance MapReduce Processing Zhuoyao Zhang University of Pennsylvania, USA [email protected] Ludmila Cherkasova Hewlett-Packard Labs, USA [email protected]
Title: Scalable, Extensible, and Safe Monitoring of GENI Clusters Project Number: 1723
Title: Scalable, Extensible, and Safe Monitoring of GENI Clusters Project Number: 1723 Authors: Sonia Fahmy, Department of Computer Science, Purdue University Address: 305 N. University St., West Lafayette,
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu
MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT
MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT 1 SARIKA K B, 2 S SUBASREE 1 Department of Computer Science, Nehru College of Engineering and Research Centre, Thrissur, Kerala 2 Professor and Head,
REVIEW OF CLOUD TESTING, TYPES, CHALLENGES AND FUTURE SCOPE
http:// REVIEW OF CLOUD TESTING, TYPES, CHALLENGES AND FUTURE SCOPE 1 Bhumika Maurya, 2 Chandraprabha and 3 Rashmi Patil 1,2 Research Scholar, SRMS CET, Bareilly. (India) 3 Assistant Professor, SRMS CET,
GeoGrid Project and Experiences with Hadoop
GeoGrid Project and Experiences with Hadoop Gong Zhang and Ling Liu Distributed Data Intensive Systems Lab (DiSL) Center for Experimental Computer Systems Research (CERCS) Georgia Institute of Technology
Building Platform as a Service for Scientific Applications
Building Platform as a Service for Scientific Applications Moustafa AbdelBaky [email protected] Rutgers Discovery Informatics Institute (RDI²) The NSF Cloud and Autonomic Computing Center Department
A Survey Study on Monitoring Service for Grid
A Survey Study on Monitoring Service for Grid Erkang You [email protected] ABSTRACT Grid is a distributed system that integrates heterogeneous systems into a single transparent computer, aiming to provide
Cloud Scale Resource Management: Challenges and Techniques
Cloud Scale Resource Management: Challenges and Techniques Ajay Gulati [email protected] Ganesha Shanmuganathan [email protected] Anne Holler [email protected] Irfan Ahmad [email protected] Abstract Managing
S06: Open-Source Stack for Cloud Computing
S06: Open-Source Stack for Cloud Computing Milind Bhandarkar Yahoo! Richard Gass Intel Michael Kozuch Intel Michael Ryan Intel 1 Agenda Sessions: (A) Introduction 8.30-9.00 (B) Hadoop 9.00-10.00 Break
Krunal Patel Department of Information Technology A.D.I.T. Engineering College (G.T.U.) India. Fig. 1 P2P Network
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Secure Peer-to-Peer
Rackspace Cloud Databases and Container-based Virtualization
Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many
Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction
Vol. 3 Issue 1, January-2014, pp: (1-5), Impact Factor: 1.252, Available online at: www.erpublications.com Performance evaluation of cloud application with constant data center configuration and variable
Research Statement Immanuel Trummer www.itrummer.org
Research Statement Immanuel Trummer www.itrummer.org We are collecting data at unprecedented rates. This data contains valuable insights, but we need complex analytics to extract them. My research focuses
An Efficient Hybrid P2P MMOG Cloud Architecture for Dynamic Load Management. Ginhung Wang, Kuochen Wang
An Efficient Hybrid MMOG Cloud Architecture for Dynamic Load Management Ginhung Wang, Kuochen Wang Abstract- In recent years, massively multiplayer online games (MMOGs) become more and more popular.
Stability of QOS. Avinash Varadarajan, Subhransu Maji {avinash,smaji}@cs.berkeley.edu
Stability of QOS Avinash Varadarajan, Subhransu Maji {avinash,smaji}@cs.berkeley.edu Abstract Given a choice between two services, rest of the things being equal, it is natural to prefer the one with more
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM Dr.S. DHANALAKSHMI 1, R. ANUPRIYA 2 1 Prof & Head, 2 Research Scholar Computer Science and Applications, Vivekanandha College of Arts and Sciences
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, [email protected] Assistant Professor, Information
1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India
1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto
Operating System Multilevel Load Balancing
Operating System Multilevel Load Balancing M. Corrêa, A. Zorzo Faculty of Informatics - PUCRS Porto Alegre, Brazil {mcorrea, zorzo}@inf.pucrs.br R. Scheer HP Brazil R&D Porto Alegre, Brazil [email protected]
Service allocation in Cloud Environment: A Migration Approach
Service allocation in Cloud Environment: A Migration Approach Pardeep Vashist 1, Arti Dhounchak 2 M.Tech Pursuing, Assistant Professor R.N.C.E.T. Panipat, B.I.T. Sonepat, Sonipat, Pin no.131001 1 [email protected],
Scalable Run-Time Correlation Engine for Monitoring in a Cloud Computing Environment
Scalable Run-Time Correlation Engine for Monitoring in a Cloud Computing Environment Miao Wang Performance Engineering Lab University College Dublin Ireland [email protected] John Murphy Performance
A Proposed Framework for Ranking and Reservation of Cloud Services Based on Quality of Service
II,III A Proposed Framework for Ranking and Reservation of Cloud Services Based on Quality of Service I Samir.m.zaid, II Hazem.m.elbakry, III Islam.m.abdelhady I Dept. of Geology, Faculty of Sciences,
Manjrasoft Market Oriented Cloud Computing Platform
Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.
Characterizing Task Usage Shapes in Google's Compute Clusters
Characterizing Task Usage Shapes in Google's Compute Clusters Qi Zhang 1, Joseph L. Hellerstein 2, Raouf Boutaba 1 1 University of Waterloo, 2 Google Inc. Introduction Cloud computing is becoming a key
POSIX and Object Distributed Storage Systems
POSIX and Object Distributed Storage Systems Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph by Michael Poat, Dr. Jerome
3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India
3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing
ORACLE DATABASE 10G ENTERPRISE EDITION
ORACLE DATABASE 10G ENTERPRISE EDITION OVERVIEW Oracle Database 10g Enterprise Edition is ideal for enterprises that ENTERPRISE EDITION For enterprises of any size For databases up to 8 Exabytes in size.
A NEW FULLY DECENTRALIZED SCALABLE PEER-TO-PEER GIS ARCHITECTURE
A NEW FULLY DECENTRALIZED SCALABLE PEER-TO-PEER GIS ARCHITECTURE S.H.L. Liang Department of Geomatics Engineering, University of Calgary, Calgary, Alberta, CANADA T2N 1N4 [email protected] Commission
Using Peer to Peer Dynamic Querying in Grid Information Services
Using Peer to Peer Dynamic Querying in Grid Information Services Domenico Talia and Paolo Trunfio DEIS University of Calabria HPC 2008 July 2, 2008 Cetraro, Italy Using P2P for Large scale Grid Information
Fair Scheduling Algorithm with Dynamic Load Balancing Using In Grid Computing
Research Inventy: International Journal Of Engineering And Science Vol.2, Issue 10 (April 2013), Pp 53-57 Issn(e): 2278-4721, Issn(p):2319-6483, Www.Researchinventy.Com Fair Scheduling Algorithm with Dynamic
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture He Huang, Shanshan Li, Xiaodong Yi, Feng Zhang, Xiangke Liao and Pan Dong School of Computer Science National
Self-Adapting Load Balancing for DNS
Self-Adapting Load Balancing for DNS Jörg Jung, Simon Kiertscher, Sebastian Menski, and Bettina Schnor University of Potsdam Institute of Computer Science Operating Systems and Distributed Systems Before
From mini-clouds to Cloud Computing
From mini-clouds to Cloud Computing Boris Mejías, Peter Van Roy Université catholique de Louvain Belgium {boris.mejias peter.vanroy}@uclouvain.be Abstract Cloud computing has many definitions with different
PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM
PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM Akmal Basha 1 Krishna Sagar 2 1 PG Student,Department of Computer Science and Engineering, Madanapalle Institute of Technology & Science, India. 2 Associate
AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK
Abstract AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK Mrs. Amandeep Kaur, Assistant Professor, Department of Computer Application, Apeejay Institute of Management, Ramamandi, Jalandhar-144001, Punjab,
Towards a Load Balancing in a Three-level Cloud Computing Network
Towards a Load Balancing in a Three-level Cloud Computing Network Shu-Ching Wang, Kuo-Qin Yan * (Corresponding author), Wen-Pin Liao and Shun-Sheng Wang Chaoyang University of Technology Taiwan, R.O.C.
SCHEDULING IN CLOUD COMPUTING
SCHEDULING IN CLOUD COMPUTING Lipsa Tripathy, Rasmi Ranjan Patra CSA,CPGS,OUAT,Bhubaneswar,Odisha Abstract Cloud computing is an emerging technology. It process huge amount of data so scheduling mechanism
Data Consistency on Private Cloud Storage System
Volume, Issue, May-June 202 ISSN 2278-6856 Data Consistency on Private Cloud Storage System Yin yein Aye University of Computer Studies,Yangon [email protected] Abstract: Cloud computing paradigm
Keywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.
Volume 3, Issue 10, October 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Load Measurement
Massive Cloud Auditing using Data Mining on Hadoop
Massive Cloud Auditing using Data Mining on Hadoop Prof. Sachin Shetty CyberBAT Team, AFRL/RIGD AFRL VFRP Tennessee State University Outline Massive Cloud Auditing Traffic Characterization Distributed
International journal of Engineering Research-Online A Peer Reviewed International Journal Articles available online http://www.ijoer.
RESEARCH ARTICLE ISSN: 2321-7758 GLOBAL LOAD DISTRIBUTION USING SKIP GRAPH, BATON AND CHORD J.K.JEEVITHA, B.KARTHIKA* Information Technology,PSNA College of Engineering & Technology, Dindigul, India Article
The IBM Cognos Platform
The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent
Varalakshmi.T #1, Arul Murugan.R #2 # Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam
A Survey on P2P File Sharing Systems Using Proximity-aware interest Clustering Varalakshmi.T #1, Arul Murugan.R #2 # Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam
RevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
