Resource Management for Scientific Application in Hybrid Cloud Computing Environments. Simon Ostermann


Resource Management for Scientific Application in Hybrid Cloud Computing Environments

Dissertation by Simon Ostermann

submitted to the Faculty of Mathematics, Computer Science and Physics of the University of Innsbruck in partial fulfillment of the requirements for the degree of doctor of science

Advisor: Assoz.-Prof. Priv.-Doz. Dr. Radu Prodan, Institute of Computer Science

Innsbruck, 17 April 2012


Certificate of authorship/originality

I certify that the work in this thesis has not previously been submitted for a degree, nor has it been submitted as part of the requirements for a degree, except as fully acknowledged within the text. I also certify that the thesis has been written by me. Any help that I have received in my research work and in the preparation of the thesis itself has been acknowledged. In addition, I certify that all information sources and literature used are indicated in the thesis.

Simon Ostermann, Innsbruck, 17 April 2012


Abstract

Cloud computing is an emerging commercial infrastructure paradigm that promises to eliminate the need for maintaining expensive computing hardware. Nevertheless, the potential of using Cloud computing infrastructure to support computational and data-intensive scientific applications has not yet been sufficiently addressed. This thesis closes this gap by researching an architecture and techniques for the performance- and cost-efficient execution of scientific applications on Cloud computing infrastructures, organized in five chapters.

First, we investigate the suitability of the workflow paradigm for programming scientific applications, following from its success on related distributed computing infrastructures such as computational Grids. We present case studies for modeling two applications from the astrophysics field as scientific workflow applications to be run with improved performance on multiple leased Cloud resources. We further analyze the workflow traces collected over the last three years of research in the Austrian Grid and classify them according to their different structural and performance characteristics for later evaluation purposes.

Second, we investigate the problem of provisioning and management of Cloud resources for large-scale scientific workflows whose computational requirements are not met by the available free Grid resources. For this purpose, we propose an extended architecture comprising new services that allow using Cloud resources in an integrated manner with minimal application interface changes: resource management for virtualized hardware, software catalogues for machine images, and integrated security and authentication features. The evaluation of the proposed architecture indicates that using Cloud resources for scientific applications is a viable choice, and that execution times can be significantly reduced by acquiring additional on-demand Cloud resources.

Third, there is currently a lack of models for understanding the performance offered by existing Cloud computing infrastructures, which is required for scheduling and running scientific workflow applications. We perform an empirical evaluation of the performance of four commercial Cloud computing services, including Amazon EC2, using different benchmarks for single resource instances and virtual clusters. We compare the performance characteristics and cost models of Clouds to other scientific computing platforms such as production parallel computers and computational Grids. The results show that certain resource types offered by Cloud providers have high potential for speeding up the execution of loosely coupled parallel applications such as scientific workflows, especially for short deadlines.

Fourth, to address the lack of scalable simulators to support Cloud computing research, we developed GroudSim, a Grid and Cloud simulation toolkit for scientific computing based on a scalable, simulation-independent discrete-event engine. GroudSim provides a comprehensive set of features for complex simulation scenarios, from simple job executions on leased computing resources to file transfers, calculation of costs, and background load on resources. We illustrate real scenarios of using this simulation toolkit to accelerate the evaluation of various optimized resource provisioning techniques by a factor of 700 compared to real execution, with no resource cost expenses.

Finally, we address the problem of dynamic provisioning of Cloud resources to large-scale scientific workflows with respect to four important aspects: (1) when to extend the Grid infrastructure with Cloud resources, (2) the amount of Cloud resources to be provisioned, (3) when to move computation from the Cloud back to the Grid, and (4) when to release Cloud resources that are no longer necessary. Then, we address the NP-complete problem of scheduling scientific workflows on heterogeneous Cloud resources by proposing an extension to the dynamic critical path scheduling algorithm that deals with the general resource leasing model encountered in today's commercial Clouds. We analyze the availability of the cheaper but unreliable Spot instances and study their potential to compensate for the unavailability of Grid resources for large workflow executions. Experimental results demonstrate that Spot instances represent a 60% cheaper but equally reliable alternative to Standard instances, provided that a correct user bid is made.

Acknowledgements

I thank my advisor, Professor Radu Prodan, for the invaluable support and guidance throughout my Ph.D. years. His advice and example left a strong positive imprint on my formation as a researcher. I thank Professor Thomas Fahringer for the opportunity of working in his research group at the University of Innsbruck and for his continued support and confidence in my abilities. I thank the present and past members of the Distributed and Parallel Systems group, especially Kassian Plankensteiner, Vlad Nae and Simone Pellegrini, who helped me in my research efforts. I thank my bachelor and master students Daniel Bodner, Georg Kraler, Christian Hollaus and Markus Brejla for implementing tools needed to fulfill these research ideas, and Alexandru Iosup for our pleasant and fruitful collaboration.

I thank my family, Renate, Gerhard, Felix, Maria, Regina and Theres, for offering support and creating an environment which enabled and often motivated my studies, and for the weekend lunches, which were always one of my week's culinary and social highlights.

Finally, I would like to express my strong bond to Tyrol and its beautiful mountains. The time spent hiking and climbing here helped me regenerate, and conquering Tyrol's unique mountain tops gave me the strength to get through the most demanding times of my work.


Contents

1 Introduction
   1.1 Motivation
      1.1.1 Scientific Workflows
      1.1.2 Resource Management
      1.1.3 Cloud Performance
      1.1.4 Simulation
      1.1.5 Resource Provisioning and Scheduling
   1.2 Goals
      1.2.1 Scientific Workflows
      1.2.2 Resource Management
      1.2.3 Cloud Performance
      1.2.4 Simulation
      1.2.5 Resource Provisioning and Scheduling
      1.2.6 Summary
   1.3 Outline

2 Model
   2.1 Workflows
      2.1.1 Activities
      2.1.2 Structures
   2.2 Grid Computing
      2.2.1 Austrian Grid
      2.2.2 Globus Toolkit
   2.3 Cloud Computing
      2.3.1 Scientific View
      2.3.2 Market View
      2.3.3 Virtualization
      2.3.4 Cloud Types
      2.3.5 Amazon Elastic Compute Cloud
      2.3.6 Eucalyptus
   2.4 ASKALON
      2.4.1 Execution Engine
      2.4.2 Scheduling
      2.4.3 Resource Management
      2.4.4 Performance Prediction
      2.4.5 Performance Analysis
   2.5 Summary

3 Scientific Workflows: Design and Analysis
   3.1 Montage
      3.1.1 Design
      3.1.2 Evaluation
   3.2 Grasil
      3.2.1 Design
      3.2.2 Evaluation
   3.3 Wien2k
   3.4 Invmod
   3.5 MeteoAG
   3.6 Workflow Characteristics
   3.7 Workflow-based Grid Workload Analysis
      3.7.1 Results
   3.8 Related Work
   3.9 Future Work
   3.10 Summary

4 Architecture
   4.1 Cloud Computing Survey
   4.2 Taxonomy
      4.2.1 Service Type
      4.2.2 Resource Deployment
      4.2.3 Hardware
      4.2.4 Runtime Tuning
      4.2.5 Security
      4.2.6 Business Model
      4.2.7 Middleware
      4.2.8 Performance
   4.3 Resource Management Architecture
      4.3.1 ASKALON Resource Management
      4.3.2 Cloud Resource Management
      4.3.3 Image Catalogue
      4.3.4 Security
   4.4 Cloud-based Workflow Execution
   4.5 Related Work
   4.6 Conclusions and Future Work
   4.7 Summary

5 Cloud Performance Analysis
   5.1 Introduction
   5.2 Benchmark Design
   5.3 Many Task Computing
      5.3.1 Method and Experimental Setup
      5.3.2 Results
   5.4 Cloud Performance Evaluation
      5.4.1 Method
      5.4.2 Experimental Setup
      5.4.3 Results
   5.5 Clouds versus other Infrastructures
      5.5.1 Method
      5.5.2 Experimental Setup
      5.5.3 Results
   5.6 Related Work
   5.7 Conclusions and Future Work
   5.8 Summary

6 Simulation Toolkit
   6.1 GroudSim
      6.1.1 Discrete-event Simulation
      6.1.2 Entities
      6.1.3 Jobs
      6.1.4 File Transfer
      6.1.5 Cost
      6.1.6 Tracing
      6.1.7 Probability Distributions
      6.1.8 Failures
      6.1.9 Background Load
      6.1.10 Evaluation
   6.2 ASKALON Integration
      6.2.1 Evaluation
   6.3 Simulation Times
      6.3.1 Sequential Grid Sites
      6.3.2 Parallel Grid Sites
      6.3.3 Simulation versus Execution
   6.4 Related Work
   6.5 Conclusions
   6.6 Summary

7 Resource Provisioning and Scheduling
   7.1 Optimized Cloud Provisioning
      7.1.1 Cloud Start
      7.1.2 Instance Size
      7.1.3 Grid Rescheduling
      7.1.4 Cloud Stop
   7.2 Provisioning Evaluation
   7.3 Scheduling
      7.3.1 Resource Model
   7.4 Dynamic Critical Path Algorithm
   7.5 Spot Price Analysis
   7.6 Dynamic Critical Path for Clouds
      7.6.1 DCP-C Algorithm
      7.6.2 Rescheduling
      7.6.3 Cloud Choice
      7.6.4 Prescheduling
   7.7 DCP-C Evaluation
      7.7.1 Wien2k
      7.7.2 Invmod
   7.8 Related Work
   7.9 Conclusion
   7.10 Summary

8 Conclusions
   8.1 Scientific Workflows
   8.2 Resource Management
   8.3 Cloud Performance
   8.4 Simulation Toolkit
   8.5 Resource Provisioning and Scheduling

List of Figures
List of Tables
Bibliography


Chapter 1

Introduction

1.1 Motivation

Scientific computing requires an ever-increasing number of resources to deliver results for growing problem sizes in a reasonable timeframe. A few years ago, supercomputers were the only way to obtain enough computational power for such compute-intensive tasks. In the last decade, while the largest research projects were able to afford expensive supercomputers, other projects were forced to opt for cheaper resources such as commodity clusters or the more modern and challenging computational Grids [52]. While aggregating a potentially unbounded number of computational resources to serve highly demanding scientific applications, computational Grids suffer from serious problems related to reliability, fulfillment of Quality of Service (QoS) guarantees, and automation of software deployment and installation processes, which makes their use rather tedious and accessible only to computing specialists. Moreover, while an enormous amount of funding has been invested by national and international agencies to build large-scale computational Grids, operational costs and ultimately hardware depreciation are significant barriers to their daily and long-term maintenance.

Today, a new research direction coined by the term Cloud computing proposes an alternative by which resources are no longer hosted by the researcher's computational facilities, but leased from large specialized data centers only when needed. Compared to traditional parallel and distributed environments such as Grids, computational Clouds present at least four advantages that make them attractive to scientific computing researchers. First, Clouds promote the concept of leasing remote resources rather than buying one's own hardware, which frees institutions from permanent maintenance costs and eliminates the

burden of hardware depreciation following Moore's law. Second, Clouds eliminate the physical overhead cost of adding new hardware, such as compute nodes to clusters or supercomputers, and the financial burden of permanently over-provisioning occasionally needed resources. Through the new concept of scaling-by-credit-card, Clouds promise to immediately scale an infrastructure up or down according to temporal needs in a cost-effective fashion. Third, the concept of hardware virtualization can represent a significant breakthrough in enhancing resource utilization. Additionally, the automatic and scalable deployment of complex scientific software, which today remains a tedious and manual process requiring the intervention of skillful computer scientists, can be simplified when virtualization is used. Fourth, the provisioning of resources through business relationships obliges specialized data center companies to offer a certain degree of QoS encapsulated in Service Level Agreements (SLA), which significantly increases the reliability and the fulfillment of user expectations.

Despite the existence of many vendors that, similar to Grid computing, aggregate a potentially unbounded number of compute resources, Cloud computing remains a domain dominated by business applications (e.g. Web hosting, database servers) whose suitability for scientific computing remains largely unexplored. The way resources are offered and advertised by Cloud providers opens several questions about hardware, software and performance in general. As only little research has been done in this area, the use of Clouds for scientific computing is the major contribution of this thesis to the field of computer science.

1.1.1 Scientific Workflows

An important class of applications, which has been largely ignored so far when dealing with effective parallelization for Cloud resources, are workflow applications [90]. Workflows have a strong impact on application development in industry [54], commerce [63] and science [149] on desktop, server and parallel computing infrastructures, accelerating and simplifying programming by allowing programmers to focus on the composition of existing legacy programs to create larger and more powerful applications. Workflows have emerged as an easier way to formalize and structure data analysis, to execute the necessary computations on computing resources, to collect information about the derived results and, if necessary, to repeat the analysis.

Researchers across many disciplines such as life sciences, physics, astronomy, ecology, meteorology, neuroscience or computational chemistry create and use ever-increasing amounts of often highly complex data, and rely more and more on computationally intensive modeling, simulations and statistical analysis. Scientific workflows [149] have become a key paradigm for managing such complex tasks and have emerged as a unifying mechanism for handling scientific data. Similarly, industry and commerce have been using workflow technology for a long time to describe and manage business and industry processes, and to define flows of work and data that have high business value to companies [63]. In short, workflows encapsulate the essence of describing scientific, industrial, and business user expertise through logical flows of data and work, modeled and described in the form of workflow activities, which are mapped onto a concrete computing infrastructure with the goal of managing the data processing in an automated and scalable way. With so many driving forces at work, it is clear that workflows are here to stay and will play a major role in the future Information Technology (IT) strategies of business and scientific organizations, both large and small.

A large variety of workflows has been created and is being used in production in numerous areas of science [149], industry [54] and business [63]. Many of these workflows are highly data- and/or computation-intensive, presenting great potential for taking advantage of today's Cloud computing resources. Although a plethora of techniques and tools exists to manage and execute workflows on sequential and distributed computing infrastructures such as Grids [149], workflow applications have not yet entered the domain of Cloud computing and lack effective tools and support for programming, parallelization and optimization on Cloud infrastructures. These observations result in the following motivational question we try to answer with our research in the area of scientific workflows: are scientific workflow applications well suited for execution on Cloud environments?

1.1.2 Resource Management

In the last decade, Grid computing gained high popularity in the field of scientific computing through the idea of distributed resource sharing among institutions and scientists. Scientific computing is traditionally a high-utilization workload, with production

Grids often running at over 80% utilization [66] (generating high and often unpredictable latencies), and with smaller national Grids offering a rather limited amount of high-performance resources. Running large-scale simulations in such overloaded Grid environments often becomes latency-bound or suffers from well-known Grid reliability problems [33]. Despite the existence of several integrated environments for transparent programming and high-performance use of Grid infrastructures for scientific applications [176], there are still no results published in the community that report on extending them to enjoy the benefits offered by Cloud computing. While there are several early efforts that investigate the appropriateness of Clouds for scientific computing, they are either limited to simulations [36], do not address the highly successful workflow paradigm [10], or do not attempt to extend Grids with Clouds as a hybrid combined platform for scientific computing.

Additional open questions have to be answered in the field of resource management and provisioning. Grid resources offer monitoring services that allow resource managers to keep an overview of the current system state in a transparent way. Clouds, on the other hand, do not offer a standardized representation of their available hardware and billing models, which makes it difficult to automatically parse and compare their resources. The existing providers only offer commercial information, mostly distributed over multiple Web pages, which is time-consuming to gather. To handle this problem, a resource manager is needed that can handle the information about the offered Cloud hardware in a unified and centralized manner.

Clouds also have advantages compared to Grid hardware. For example, local resource manager systems, which are used on Grid systems to allow multiple users to share the resources, result in sometimes large and unpredictable queueing overheads. Clouds, on the other hand, do not target the resource-sharing use case and eliminate this queueing overhead by not having a resource management system by default. Instead, Cloud systems introduce a provisioning overhead, which is the time spent processing a resource request and making the resource available to the user, and which does not exist in Grid systems. The different advantages and disadvantages of Grid and Cloud systems could be combined into a hybrid system that uses both types of resources while trying to eliminate as much overhead as possible. When no Grid resources are available, the provisioning

overhead of Clouds might take less time than waiting for Grid resources to become available. We ask the research question: can the resource pool of a Grid be extended with Cloud computing resources, and does a scientist benefit from using this hybrid approach?

1.1.3 Cloud Performance

When talking about scientific usage of commercial Clouds, the expenses required for their usage are an important factor. Typically, existing Infrastructure as a Service (IaaS)-based Cloud providers offer classes of resources described in fuzzy terms, whose suitability for scientific applications is unclear. For example, Amazon Elastic Compute Cloud (EC2) advertises the so-called Elastic Compute Unit (ECU) of its resources as the equivalent of a gigahertz 2007 Opteron processor, and the I/O performance of the associated storage systems with fuzzy "medium" and "high" values. On the other hand, the virtual machines to be executed on IaaS-based Clouds are often cross-compiled on local compatible architectures (e.g. an AMD Opteron image executed on a Xeon processor) and may exhibit significant losses in performance when executed on unknown Cloud resources. This uncertainty, combined with the fuzzy terms in which Cloud resources are described, makes the execution of applications a black-box approach that is hard to understand and predict, requiring different models than the ones used for parallel computers or computational Grids.

Although a few benchmarking efforts have been conducted to evaluate the raw performance of existing academic and commercial Clouds, there is a lack of performance models to support schedulers in taking optimal mapping decisions. Without such knowledge, the risk of spending money without gaining performance is considerably high. In our research we take both cost and performance metrics into account and show more precise results compared to related work, which does not take this speed into account or uses billing intervals that are not offered by any Cloud provider right now. The majority of Cloud providers only offer hourly billing, and simulating an environment with billing based on the seconds used, as done in related work, oversimplifies the billing model of Clouds.
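To illustrate why the billing interval matters, the following minimal Java sketch contrasts hourly billing with the idealized per-second billing assumed in some related work. The instance price is a made-up example value, not a quote of any provider's tariff.

    public class BillingExample {
        // Hypothetical price: 0.36 USD per instance hour (example value only).
        static final double HOURLY_RATE = 0.36;

        // Hourly billing: every started hour is charged in full.
        static double hourlyCost(int runtimeSeconds) {
            int hoursStarted = (runtimeSeconds + 3599) / 3600; // ceiling division
            return hoursStarted * HOURLY_RATE;
        }

        // Idealized per-second billing, as assumed in some related work.
        static double perSecondCost(int runtimeSeconds) {
            return runtimeSeconds * (HOURLY_RATE / 3600.0);
        }

        public static void main(String[] args) {
            int runtime = 61 * 60; // a job running just over one hour
            System.out.printf("hourly:     %.4f USD%n", hourlyCost(runtime));    // 0.7200 (two hours charged)
            System.out.printf("per-second: %.4f USD%n", perSecondCost(runtime)); // 0.3660
        }
    }

For a job that runs just over an hour, the hourly model nearly doubles the cost, which is why evaluations based on per-second billing can be misleading.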

The main question we are interested in when evaluating the benchmarks executed on the Cloud is: is the performance delivered by the different Cloud providers suitable for scientific computing, and what do we have to pay for it?

1.1.4 Simulation

Commercial Cloud resources are not offered for free to customers or scientists, especially when needed in larger amounts. These costs become a problem when trying to evaluate scientific approaches on this resource type, as such an evaluation might need several thousand hours of usage. To handle this problem and to speed up the evaluation process, we investigated the available simulation frameworks that support Grid and Cloud environments.

ASKALON [47] is a software middleware that eases the use of distributed Grid and Cloud resources by providing high-level abstractions for programming complex scientific workflow applications through a textual XML representation or a graphical UML-based diagram. In addition, different middleware services support the user with sophisticated mechanisms for transparent scheduling and execution of the applications on the underlying hardware resources. Besides the execution of applications in real Grid and Cloud environments as supported by ASKALON, simulation is an alternative technique to analyze a real-world model, which has several important advantages: it delivers fast results, it allows for reproducibility, it saves costs, and it enables the investigation of scenarios that cannot be easily reproduced in reality. The number of experiments can be increased significantly when using simulation, and the resources used can be specified more flexibly. To support such simulations, there is the need for a simulation framework that supports Grid and Cloud resources and workflow applications. To let researchers benefit the most from the simulation system, an integration into the existing execution framework would be the desired solution. This leads to the research question in the field of simulation, which we try to answer in this thesis: is it possible to integrate a simulation framework into an execution framework to allow seamless simulation and execution?

1.1.5 Resource Provisioning and Scheduling

The scientific community is highly interested in the field of Cloud computing, characterized by the leasing of computation, storage, message queues, databases, and other raw resources from specialized providers under certain Quality of Service and Service Level Agreements (usually a certain resource uptime for a certain price). Extending Grid infrastructures with on-demand Cloud resources appears to be a promising way to improve executions of scientific applications that do not have sufficient Grid resources available for their computational demand. However, Cloud resources have different characteristics than Grid resources, which have to be taken into account when deciding about their provisioning. In the worst case, when the wrong decisions are taken, the application execution might take longer with Cloud resources, which has to be avoided.

Scheduling scientific workflow applications to heterogeneous resources is known to be an NP-complete problem, meaning that a mapping that is optimal in terms of execution time cannot be calculated in polynomial time. For each part of the workflow, a resource must be chosen for its execution. We require a heuristic scheduling method that produces mappings close to the optimum in a reasonable time. By restricting the general scheduling problem with additional constraints that better represent our use case, the problem can be simplified, which reduces the complexity. We therefore need an algorithm that performs well when resource availability changes frequently, as is the case when additional Cloud resources can be started and stopped at any time.

Besides the Standard instances rented at a fixed price per hour, Amazon offers the possibility to bid on unused resources called Spot instances and rent them at variable prices with no guaranteed reliability. The user demand influences and determines the Spot instance market price, which is in most cases lower than the standard one. However, when this price rises above the user's bid, the access is terminated and the resources are claimed back by Amazon. Therefore, the availability of such Spot instances is limited, and they deliver a cheaper but possibly unreliable environment. We investigate the usage of Cloud resources and propose optimizations to raise efficiency and thereby lower the overall cost that has to be spent on resources. We extend an existing scheduling algorithm to handle the new resource type and add several optimizations to lower the overall resource cost.
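The Spot termination rule described above fits in a few lines; the following minimal Java sketch, with made-up prices and a hypothetical class name, only illustrates the mechanism that makes Spot instances unreliable.

    public class SpotInstanceExample {
        private final double userBid; // USD per hour the user is willing to pay

        public SpotInstanceExample(double userBid) {
            this.userBid = userBid;
        }

        // The instance keeps running only while the market price stays at or
        // below the bid; the user is charged the market price, not the bid.
        boolean keepsRunning(double marketPrice) {
            return marketPrice <= userBid;
        }

        public static void main(String[] args) {
            SpotInstanceExample spot = new SpotInstanceExample(0.12);
            double[] priceHistory = {0.08, 0.09, 0.11, 0.13}; // example market prices
            for (double price : priceHistory) {
                System.out.printf("price %.2f -> %s%n", price,
                        spot.keepsRunning(price) ? "running" : "terminated by provider");
            }
        }
    }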

The motivation in terms of provisioning and scheduling can be summarized in the short question: how can we reduce the workflow execution time and minimize the overall resource cost in a hybrid Grid and Cloud execution environment?

1.2 Goals

We define the following goals matching the motivational thoughts that led to this thesis.

1.2.1 Scientific Workflows

We want to fulfill two goals in the field of scientific workflows. First, we want to take sequential scientific applications and decompose them into smaller application parts, which can then be put together in a workflow-shaped application. This transformation allows efficient execution of the workflow on distributed systems such as Grids and Clouds. We have an existing set of workflow applications available that can be reused for the planned research, but additional scientific applications will help to prove the usability of the workflow paradigm for scientific applications. Second, we aim to analyze execution logs of existing workflow applications in the Austrian Grid and to observe the typical execution parameters and workflow sizes. Based on this study, we will be able to select a set of reference workflows and sizes to be used for our later evaluations.

1.2.2 Resource Management

We see a high potential in combining these two resource types to increase performance for workflow executions. There is a need for an infrastructure that allows the execution of workflows on conventional Grid resources, supplemented on demand with additional Cloud resources if necessary. Of particular interest are the extensions to the resource management service needed to consider Cloud resources, comprising new Cloud management, software (image) deployment, and security components. We plan to seamlessly integrate the Cloud resource class into the existing scientific Grid workflow environment available to the scientists of the distributed and parallel computing group at the University of Innsbruck. The changes should not affect the

end user beyond the need to store his access credentials for the Cloud in our system if he plans to use this resource class. The credential management is required to allow accounting based on the personal account with a Cloud provider. Executions will be possible on Grid resources as before, but if the results are needed faster, additional Cloud resources will be usable, provided the required login credentials are given, to speed up executions. Experimental results using a real-world application in the Austrian Grid environment, extended with our own academic Cloud constructed as a private Cloud with virtualization, will be used to verify the usefulness of the presented integration.

1.2.3 Cloud Performance

Cloud providers sell their offers mostly as black boxes to the end user. Performance values are hard to find and no guarantees are given about the speed, but mostly only about reachability and availability. We will analyze the resources that can be rented from four different Cloud providers and will investigate the raw performance that is available to the end user. Using well-known micro-benchmarks and application kernels, we evaluate the performance of four commercial Cloud computing services that can be used for scientific computing, among them the Amazon Elastic Compute Cloud (EC2), the largest commercial computing Cloud in production. We compare the performance of Clouds with scientific computing alternatives such as Grids and parallel production infrastructures. Our comparison uses trace-based simulation and the empirical performance results from our Cloud performance evaluation.

1.2.4 Simulation

The execution of scientific workflows is time-consuming and resource-intensive. When optimizing such a workflow execution, several hundreds of executions are needed to evaluate the optimization impact, and more runs are needed to verify that there are no side effects on the system. To allow a faster evaluation process, we plan to develop an event-based simulation framework in Java that provides improved scalability compared to other related approaches. Simulation reduces the workflow execution time significantly by reducing the execution time of tasks to zero. The resulting

system allows running several hundreds of workflows within minutes without the need for available hardware from the Grid or Cloud. This development will be useful to all researchers working with our execution environment and will dramatically speed up development and debugging processes, which should result in better validation of the proposed and developed features. Experimental setups will be more flexible in a simulated environment, and error handling can be analyzed better than when using real environments, where it is not always possible to produce failures on demand to test the fault-tolerance aspects of new developments.

1.2.5 Resource Provisioning and Scheduling

When Grid resources, which are mostly available for free to scientists in Austria, are extended with Cloud resources that are charged on an hourly basis, the additional cost is an important factor that needs to be considered. We plan to develop a scheduling mechanism that uses additional paid resources only when they have a positive effect on the overall execution time and price ratio. We study the problem of dynamically provisioning additional Cloud resources to large-scale scientific workflows running in Grid infrastructures with respect to four important aspects: (1) Cloud start, representing when it is sensible to extend the Grid infrastructure with Cloud resources; (2) instance size, quantifying the amount of Cloud resources that shall be provisioned; (3) Grid rescheduling, indicating when to move computation from the Cloud back to the Grid if new fast resources become available; and (4) Cloud stop, meaning when it is sensible to release Cloud resources that are no longer necessary, considering their hourly payment interval. We analyze the impact of these four aspects with respect to the overall execution time, the overall cost, as well as the cost per unit of saved time for using Cloud resources.

We plan to research a scheduling algorithm that is potentially well suited for the new challenges that arise when using Cloud resources. Once a good candidate is selected, we plan to implement and extend this algorithm within the existing workflow environment with advanced provisioning optimizations. As workflow scheduling is an NP-complete problem, a scheduler with reasonable complexity, one that can schedule a workflow within seconds or minutes, will not give optimal results. We aim for a heuristic that is well suited for the workflow applications running on hybrid Grid and Cloud resources.
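Viewed from an implementation angle, the four provisioning aspects above are four decision hooks that a provisioning loop consults. The following minimal Java sketch, with entirely hypothetical type and method names, shows one way such hooks could be organized; it is not the thesis' actual interface.

    /** Placeholder types standing in for the scheduler's actual state objects. */
    final class WorkflowState { /* remaining activities, Grid queue estimates, deadline, ... */ }
    final class CloudInstance { /* instance id, start time, hourly rate, ... */ }

    /** Hypothetical decision hooks for the four provisioning aspects. */
    interface ProvisioningPolicy {
        boolean shouldStartCloudResources(WorkflowState state);      // (1) Cloud start
        int instancesToProvision(WorkflowState state);               // (2) instance size
        boolean shouldMoveBackToGrid(WorkflowState state);           // (3) Grid rescheduling
        boolean shouldReleaseInstance(WorkflowState state,
                                      CloudInstance instance);       // (4) Cloud stop
    }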

1.2.6 Summary

These five goals aim to research a workflow system that supports Cloud resources and Grid environments. Simulation of executions is one of the planned features and will allow fast and extensive evaluation of developments in the workflow system; using this simulator will help evaluate all further goals. Workflow executions can benefit from a sophisticated scheduling mechanism that optimizes the resource mapping for execution time and resource cost. The end user wants to decrease the execution time of his applications at minimal economic cost for leased resources. The benchmark analysis of the different Cloud providers will help to understand how much performance a user gets for his investment from the different Cloud providers. This knowledge can significantly improve the resource selection process when using Cloud resources in scientific workflows. The overall goal is a seamless integration of the new Cloud resources into an existing Grid workflow environment. The optimization needed to make this non-trivial combination a useful alternative to existing Grid-only executions is one of the key research goals, even though it is not directly visible to end users. Therefore, the advantages need to be proven with a detailed evaluation of each optimization.

1.3 Outline

In Chapter 2 we introduce the model for the research presented in this thesis; the important terms and technologies are defined and explained. We continue in Chapter 3 with the introduction and development of the scientific workflows we used for our evaluation process in this thesis. The chapter continues with the analysis of workflow characteristics collected from historical execution traces in the Austrian Grid. Chapter 4 starts with a classification of the Infrastructure as a Service Cloud providers according to eight criteria derived from a market analysis. This information is then used to explain the technical integration of this resource class into the ASKALON system. A detailed performance analysis of four Cloud providers is presented in Chapter 5, where different benchmarks are executed and evaluated, combined with a comparison of workload traces from different cluster systems. In Chapter 6, we introduce a combined Grid and Cloud simulator, which is more

scalable than other available simulators. The simulator is integrated into the ASKALON system to allow simulations from within the regular execution environment. We develop and evaluate four provisioning optimizations in the first half of Chapter 7; in the second part we present the extension of a scheduling algorithm that is optimized for Cloud resources with variable pricing. Chapter 8 concludes the thesis.

Chapter 2

Model

This chapter defines the terminology needed for understanding the presented topics. The connections between the components are explained in detail, giving a model of the environment and the conditions these studies rely on. Key topics of this chapter are workflows, the hardware and software used for workflow execution, and the environment for the experimental evaluation. General terminology is defined for the scope of this work to clarify the scientific ground this work builds on.

2.1 Workflows

This section introduces workflows in general and, in particular, the scientific workflow applications this thesis focuses on. After the introduction of the general workflow model, detailed information about the applications used throughout this thesis is presented.

Definition 1. A workflow consists of a sequence of concatenated steps. Emphasis is on the flow paradigm, where each step follows the precedent without delay or gap and ends just before the subsequent step may begin.

This simple case of a workflow can be extended to the more concrete workflow application, where each step is a fine-grained part of the overall application. Additionally, data dependencies can exist between these application steps, resulting in a transfer delay between steps and leading to a more precise definition.

Definition 2. A workflow application is a software application which automates, at least to some degree, a process or processes. The processes are usually business-related, but may be any process that requires a series of steps that can be automated via software.

The workflows this thesis focuses on are a subset of the general workflow application and rely on a more technical perspective:

Definition 3. A scientific workflow application is a computationally intensive software application, which might take a long time period to be executed. The workflow structure allows its execution in a distributed fashion to speed up the overall execution.

Processes in general need input parameters and files, and produce output files. As a basis for this workflow structure, a graph representation is used. For our workflows we use the model introduced by Coffman and Graham [31]. Formally, a workflow is a directed graph G in which nodes are computational tasks, which we call activities, and the directed edges represent communication; we denote the number of nodes and edges by N and E, respectively. To begin an activity, all its predecessor activities and all communication with that activity as the sink must be finished. Workflows in general are often based on directed acyclic graphs (DAG), but we rely on a richer representation model that allows loops, which are important for complex scientific applications. More details about the applications and their workflow representations are given in Chapter 3. As a result of possible distributed executions, the file dependencies of such an application are very important: files need to be copied to each location where tasks of a workflow are executed, and after completion the results have to be gathered. The workflows are then no longer built from generic steps but are a composition of tasks we call activities.
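To make the graph model concrete, the following minimal Java sketch shows a workflow as a directed graph of activities in which an activity becomes ready only once all its predecessors, and the communication coming from them, have finished. The class names are hypothetical illustrations, not actual ASKALON types.

    import java.util.ArrayList;
    import java.util.List;

    /** A workflow as a directed graph: nodes are activities, edges are communications. */
    class Activity {
        final String name;
        final List<Activity> predecessors = new ArrayList<>();
        boolean finished = false; // covers both the computation and its outgoing transfers here

        Activity(String name) { this.name = name; }

        /** An activity may begin only when all predecessor activities have finished. */
        boolean isReady() {
            return predecessors.stream().allMatch(p -> p.finished);
        }
    }

    public class WorkflowGraphExample {
        public static void main(String[] args) {
            Activity extract = new Activity("extract");
            Activity analyze = new Activity("analyze");
            analyze.predecessors.add(extract); // edge: extract -> analyze (N = 2, E = 1)

            System.out.println(analyze.isReady()); // false: predecessor not finished
            extract.finished = true;
            System.out.println(analyze.isReady()); // true
        }
    }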

2.1.1 Activities

The building blocks of workflows are called activities. In this section we define the different flavors of activities that are encountered throughout this thesis.

Definition 4. An atomic activity is a single task that cannot be split into smaller activities. It represents an activity that applies a specified transformation/calculation on a given input and produces output.

A scientific workflow application is a graph of connected atomic activities. These activities represent the application parts that the workflow consists of. Each activity is of a special activity type, which represents the functionality of such an atomic activity by:

- Input ports representing the files and parameters needed for the execution of the activity.
- Output ports representing the files generated by an activity.
- The activity name and possible semantic information about this activity.

Definition 5. An activity type is an abstraction of an application type defined by an activity name and an arbitrary number of input and output ports. These ports might be of different types.

The ports of such an activity type might be of one of the following types:

- agwl:file, representing a file that is required for execution when used in an input port, and a resulting file when used in an output port.
- xs:integer, representing an integer number.
- xs:float, representing a float number.
- xs:boolean, representing the values true or false.
- xs:string, representing a string value.
- agwl:collection, representing one or multiple of the other port types. This structure is comparable to a vector.

This concept of activity types represents an abstraction from the real applications that are executed on physical hardware, which are stored and represented by activity deployments. These deployments are comparable to concrete application installations and consist of:

- The file names that this application expects for its inputs and outputs.
- The underlying service architecture for execution, which may be a Web Service, a Globus service, or simple SSH job submission routines.
- The location of the deployment, given by server, service URI or file system location.
- Information on the invocation of this deployment via command line or URI parameters needed for the execution.

Definition 6. An activity deployment is a concrete installation matching an activity type. The deployment contains all the information needed to execute the application it represents.

One or multiple of these activities are connected with control flow and data flow dependencies.
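As an illustration of the type/deployment split, here is a minimal Java sketch. The port kinds mirror the AGWL port types listed above, while all class and field names (and the example host) are chosen only for this example and are not the system's actual data model.

    import java.util.Map;

    /** Port kinds, mirroring the AGWL port types listed above. */
    enum PortType { FILE, INTEGER, FLOAT, BOOLEAN, STRING, COLLECTION }

    /** Abstract description of an application: name plus typed input/output ports. */
    record ActivityType(String name,
                        Map<String, PortType> inputPorts,
                        Map<String, PortType> outputPorts) { }

    /** Concrete installation of an activity type on some host. */
    record ActivityDeployment(ActivityType type,
                              String host,          // server, service URI or file system location
                              String invocation) {  // command line used to invoke the installation
    }

    public class ActivityExample {
        public static void main(String[] args) {
            ActivityType convert = new ActivityType("convert",
                    Map.of("image", PortType.FILE, "quality", PortType.INTEGER),
                    Map.of("result", PortType.FILE));
            ActivityDeployment dep = new ActivityDeployment(convert,
                    "karwendel.dps.uibk.ac.at", "/opt/convert/bin/convert");
            System.out.println(dep.type().name() + " deployed on " + dep.host());
        }
    }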

To allow more complex workflow structures, a workflow language is used that supports different high-level structures to build workflows.

2.1.2 Structures

Workflows can be simple structured lists of atomic activities, which should be executed one after the other. More interesting for parallel computing are workflows with a higher complexity in their structure, as in compound activities. To allow easier creation of complex workflow structures, the Abstract Grid Workflow Language (AGWL) [48] used here supports multiple compound activities, which can have an arbitrary number of atomic activities in their body:

- for has in its body a set of activities which are executed an arbitrary number of times sequentially, similar to a classical for loop.
- parallelfor is similar to for, but the iterations of the loop are all executed in parallel.
- foreach is a structure that needs a collection as an input and then executes the activities defined in its body for each element of the collection in sequence.
- parallelforeach is similar to foreach, but all the elements of the collection are processed in parallel.
- while is an unbounded for loop with a stop condition.
- if allows optional workflow activities or alternative workflow structures, allowing different atomic activities on the if and else branches.
- fork defines parallel sections in the workflow and allows two or more activities to be executed beside each other.
- DAG stands for Directed Acyclic Graph and allows sections with the dependencies known from this graph structure. Here dependencies might be less strict than with the fork construct.

With these structures it is easy to build workflows that allow distributed execution on Grid and/or Cloud environments. The scientific workflows used for the experiments and evaluation are presented in Chapter 3 and show examples of most of these compound activities.
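The compound constructs map naturally onto nested node types. The following minimal Java sketch covers only two of the constructs and uses hypothetical class names; it merely illustrates how a parallelfor body can be evaluated concurrently, not how the actual execution engine works.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    /** Hypothetical node model covering two of the AGWL compound constructs. */
    interface Node { void run() throws Exception; }

    /** Atomic activity: a single task. */
    record Atomic(String name) implements Node {
        public void run() { System.out.println("executing " + name); }
    }

    /** parallelfor: all iterations of the body are executed in parallel. */
    record ParallelFor(int iterations, Node body) implements Node {
        public void run() throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(iterations);
            for (int i = 0; i < iterations; i++) {
                pool.submit(() -> { body.run(); return null; });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

    public class StructuresExample {
        public static void main(String[] args) throws Exception {
            new ParallelFor(4, new Atomic("render-tile")).run();
        }
    }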

2.2 Grid Computing

A computer can be defined as a machine with one or many CPUs, memory and a hard drive connected to a main board. This unit is empowered by an operating system that allows easier use of the given hardware. When multiple of these computers, mostly of the same hardware, are interconnected with fast networks, the resulting system is called a cluster. Clusters can also be built from standard desktop computers, as in commodity clusters, i.e. Beowulf clusters. When special hardware is used to achieve better performance, these clusters are built into server cases hosted in racks in server rooms, having redundant power supplies and interconnections with high-speed networks.

In the 1990s, scientists at different universities had access to their local cluster systems, which were not fast enough to fulfill their requirements at peak usage, while for an undeniable amount of time the clusters were not used that much. Scientists therefore developed the idea of combining their cluster with the clusters of other universities into a bigger system, which is called the Grid. This system is globally distributed and heterogeneous in terms of hardware and operating systems. The administrators of the different domains of such a Grid need to agree on a set of standard software that needs to be installed on all systems that participate in such a Grid, to allow uniform access to all the resources for scientists.

Back in 1998, Carl Kesselman and Ian Foster attempted a definition in the book [52], which includes the following: "A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." Later on, a list was developed with three aspects a system has to fulfill to be a Grid:

- Computing resources are not administered centrally.
- Open standards are used.
- Nontrivial quality of service is achieved.

From a computer scientist's point of view, the Grid is an infrastructure that offers resources which might be used for computation, storage or other forms of sensors. Grids are used to share resources across administrative domains and allow uniform access to different systems with central authentication. For the work presented in this thesis, we had access to the Austrian Grid system, which is explained in more detail in the upcoming subsection.

2.2.1 Austrian Grid

"The Austrian Grid is a nationwide initiative to establish Grid computing in Austria. It combines Austria's leading researchers in advanced computing technologies with well-recognized partners in Grid-dependent application areas." [160]

The Austrian Grid was started in 2004 as a three-year project and was continued with a second project starting in 2007. Different universities from Vorarlberg to Vienna contributed to this national project with their hardware and created a shared hardware pool with a fluctuating cluster count, reaching a peak of 1372 cores in total. After the end of the second project, the infrastructure was kept functional to allow scientists to continue their work. The hardware available within the Grid has a high diversity, ranging from desktop clusters built from unused computer rooms of the universities to expensive shared-memory systems with up to 768 cores.

Table 2.1 shows a snapshot of the Austrian Grid system from the Monitoring & Discovery System (MDS), which is installed to give a machine-readable overview of the usable resources. The value for RAM is given in megabytes and per computing node, which does not show the total amount of memory available for traditional cluster systems. Additionally, two benchmark values are presented, where SI00 shows the result of the SpecInt2000 benchmark and SF00 that of the SpecFloat2000 benchmark (single CPU). The local resource management system (LRM) indicates which scheduling system is used locally within each system. This information is important when submitting jobs directly to the cluster, bypassing the Grid middleware used, or when adding additional parameters that should be passed to the LRM.

Site        Master                     CPUs  Free  RAM  SI00  SF00  LRM
altix-uibk  altix1.uibk.ac.at          -     -     -    -     -     torque
jku         lilli.edvz.uni-linz.ac.at  -     -     -    -     -     -
dps-prod    karwendel.dps.uibk.ac.at   -     -     -    -     -     sge
jku         alex.jku.austriangrid.at   -     -     -    -     -     pbs
leo1        login.leo1.uibk.ac.at      -     -     -    -     -     sge
Sum

Table 2.1: Austrian Grid resources snapshot.

2.2.2 Globus Toolkit

"Globus is open source Grid software that addresses the most challenging problems in distributed resource sharing. The Globus Toolkit is a fundamental enabling technology for building Grids that allow distributed computing power, storage resources, scientific instruments, and other tools to be shared securely across corporate, institutional, and geographic boundaries." [152]

The Austrian Grid uses Globus as middleware to allow uniform login to the different clusters available to scientists all over Austria and cooperating countries. Job submission is done using the Globus job manager called GRAM, which is built on top of the local resource management systems of the different clusters and offers a uniform interface to the different resource management flavors. The monitoring of the resources is done using MDS, and file transfers are supported by the GridFTP protocol, which is

an optimized file transfer protocol that allows multi-streaming and buffer adjustments to maximize the throughput of transfers.

2.3 Cloud Computing

Cloud computing is a hyped buzzword in IT at the current point in time. Industry and most companies want to label their products with Cloud computing to be covered by the media attention currently given to this term. Most Internet businesses, from one-person start-ups to global players, want to promote their services as Cloud-based to be part of the hype. This trend results in a growing area of IT that wants to be covered by the term Cloud.

2.3.1 Scientific View

Scientific computing requires an ever-increasing number of resources to deliver results for ever-growing problem sizes in a reasonable time frame. In the last decade, while the largest research projects were able to afford (access to) expensive supercomputers, many projects were forced to opt for cheaper resources such as commodity clusters [147, 151] and Grids [52]. Cloud computing proposes an alternative in which resources are no longer hosted by the researchers' computational facilities, but are leased from big data centers only when needed, in a pay-per-use fashion.

From a scientific point of view, the most popular interpretation of Cloud computing is Infrastructure as a Service (IaaS), which provides generic means for hosting and provisioning of access to raw computing infrastructure and its operating software. IaaS is typically provided by data centers renting modern hardware facilities to customers that only pay for what they effectively use, which frees them from the burden of hardware maintenance and depreciation. IaaS is characterized by the concept of resource virtualization, which allows a customer to deploy and run his own guest operating system on top of the virtualization software (e.g. [29]) offered by the provider. Virtualization in IaaS is also a key step towards distributed, automatic, and scalable deployment, installation, and maintenance of software. More information about IaaS and virtualization is given in the following sections.

To deploy a guest operating system presenting to the user another abstract and higher-level emulated platform, the user creates a virtual machine image, in short image. In order to use a Cloud resource, the user needs to copy and boot an image on top of it, called a virtual machine instance, in short instance. After an instance has been started on a Cloud resource [6], we say that the resource has been provisioned and can be used. If a resource is no longer necessary, it must be released such that the user no longer pays for its use. Commercial Cloud providers typically offer customers a selection of resource classes or instance types with different characteristics including CPU type, number of cores, memory, hard disk, and I/O performance.

The Cloud computing paradigm holds great promise for the performance-hungry scientific computing community: Clouds can be a cheap alternative to supercomputers and specialized clusters, a much more reliable platform than Grids, and a much more scalable platform than the largest of commodity clusters. Clouds also promise to scale by credit card, that is, to scale up instantly and temporarily within the limitations imposed only by the available financial resources, as opposed to the physical limitations of adding nodes to clusters or even supercomputers and to the administrative burden of over-provisioning resources. Moreover, through the use of resource management such as Condor [151], Clouds offer good support for bags-of-tasks, which currently constitute the dominant Grid application type [72].
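The image/instance lifecycle described above (provision, use, release) can be summarized in a few lines of Java. The provider interface and its method names are hypothetical and sketch the concept only; they do not correspond to any real Cloud API.

    /** Hypothetical IaaS operations matching the lifecycle described above. */
    interface CloudProvider {
        String provision(String imageId, String instanceType); // boot an image, returns an instance id
        void execute(String instanceId, String command);        // use the provisioned resource
        void release(String instanceId);                        // stop paying for the resource
    }

    public class LifecycleExample {
        public static void main(String[] args) {
            CloudProvider cloud = new FakeProvider();
            // Provision: boot an image on a chosen instance type.
            String id = cloud.provision("scientific-image-1", "large");
            // Use: the resource is now provisioned and billed, typically per started hour.
            cloud.execute(id, "run-workflow-activity");
            // Release: return the resource so no further hours are charged.
            cloud.release(id);
        }
    }

    /** Toy in-memory stand-in so the example runs without any real provider. */
    class FakeProvider implements CloudProvider {
        public String provision(String imageId, String type) {
            System.out.println("booting " + imageId + " as " + type);
            return "i-0001";
        }
        public void execute(String id, String cmd) { System.out.println(id + ": " + cmd); }
        public void release(String id) { System.out.println("released " + id); }
    }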

2.3.2 Market View

In general, the term Cloud computing is defined more broadly than the pay-as-you-go section of this large field that we are interested in. The search for a common definition brings up multiple different interpretations, of which we show three to visualize the diversity of the general definitions:

- "Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity Grid." from en.wikipedia.org/wiki/cloud_computing

- "A self-service environment for the creation of highly-scalable applications, with the immediate availability of compute power and granular levels of billing." from blog.labslice.com/2011/03/taxonomy-of-amazon-cloud.html

- "Using word processing, spread sheet or programs that are installed somewhere other than on the computer upon which you are currently typing. Simply put, the applications live in a Cloud on the Internet rather than being installed on your computer hard drive." from susanmaus.com/marketing-dictionary

Resulting from this lack of a general definition of the term Cloud computing, the work in [166] tried to categorize the field of Cloud computing more scientifically. The research done in [57] in 2009 conducted a survey resulting in multiple possible definitions of the term Cloud computing, collected from experts in the field of computer science. These definitions were checked for common features contained in definitions from multiple experts, and the following list of Cloud characteristics was derived:

- Virtualization: shield the user from the hardware via a virtualization layer.
- Pay per use: no monthly contracts are required and the user only pays for the real resource usage (mostly on an hourly basis).
- User friendliness: easy-to-use interfaces; graphical control portals are often available to control and use the Cloud.
- Internet centric: services are located on the Internet and not on local servers. The user sometimes does not even know where his data or service is stored.
- Variety of resources: a provider has different offers performance-wise, which fit the different needs of customers with diverse requirements.
- Automatic adaptation: Clouds should be evolving and dynamic systems that make it easy to adapt to new paradigms and trends.
- Scalability: when more power is needed due to overloading of a service, the system should offer scalability of the used services in an easily adoptable way or automatically.
- Resource optimization: the hardware should be used at the best possible efficiency by using optimized software and resource- and time-sharing methods.
- Service SLAs: Clouds should offer contracts that specify what service is guaranteed and what compensations are applied when the service quality is below the specified limit.

37 The goal of this thesis is not to define the term Cloud computing. We aim to use products that are promoted under the term Cloud computing for scientific workflow executions. In the following subsections we introduce a short categorization of Cloud types and define the part this research is focusing on Virtualization Servers can be virtualized to raise efficiency by putting together multiple applications on a single hardware server. For instance, a mail server with average 5% load, a SQL server with 10% load and a Web server with 15% load could be put together using three virtual machines with a total load of 30% in average on a single server saving the investment for the other two servers. The advantage of such an optimization is the lower hardware cost and better resource utilization but the disadvantage is that on peak usage the reserves for each service are shared and chances to overload the overall system are higher. Assuming a equal resource sharing each service can have 33.3% server utilization at maximum compared to 100% when executed on a dedicated server. Virtualization can be used to lower over provisioning when sharing resources for different applications. From this development, the virtualization evolved to be used to virtualize not only shared resources to improve utilization, but also to allow strict resource separation by creating multiple virtual servers on one big server infrastructure. This newly created virtual dedicated servers behave like physical servers with only a part of the computational power of the host. Scientific software, compared to commercial mainstream products, is often hard to install and use. In most cases special requirements have to be fulfilled to install them as discussed in [20]. The time needed to install specific required compiler versions, libraries and software on each host used is higher than the time spent with the creation of one virtual machine image, which can be started by virtualization software on any hardware supported. The two most common virtualization environments are VMWare [165] and Xen [29], but there exist other solutions used by smaller communities such as KVM, Virtual Box, vsersers, OpenVZ, and Qemu. The largest part of the scientific community, including Amazon, has chosen Xen as their virtualization platform as it is open source and, therefore, can be freely used and 23

adapted to various needs, if required.

Figure 2.1: Xen paravirtualization compared to a micro-kernel architecture.

Figure 2.1 shows the paravirtualization architecture implemented by Xen. The advantage is the thinner layer between the hardware and the guest operating system in the software stack, which ensures the best possible performance. If the operating system and hardware support paravirtualization, the guest operating system can directly access hardware resources like CPU and memory. The device drivers are moved to a different layer of the stack to separate them from the guest.

The advantages of virtualization are rather important when using heterogeneous computing resources, as is common in Grid and Cloud computing. A virtual machine image only needs to be created once and can then be used on all machines having the same virtualization software installed. While the performance reachable with this approach depends on the software and hardware used, Xen claims to add an overhead of only a few percent [106] in the best case. However, in most cases, when comparing to hardware-optimized binaries and libraries, this performance loss may dramatically increase. Using machines with similar architecture and speed, and a machine image optimized for this main architecture, may result in a small total virtualization overhead and comparable performance with less compilation and optimization effort. For example, when running a virtual machine image on an AMD Opteron host environment, the used libraries and binaries should have been compiled to use the 3DNow! extension that this architecture offers, but such an image will then not be executable on Intel Xeon architectures.

2.3.4 Cloud Types

We can classify the field of Cloud computing that we are interested in into four categories. Many Cloud computing companies distinguish themselves by the type of services they offer. At the highest level, we observe three main directions (see Figure 2.2) and one area covering all the other services that do not fall into the first three directions.

Figure 2.2: Service type taxonomy (IaaS, SaaS, PaaS, and specialized services such as Web hosting and file hosting).

Infrastructure as a Service (IaaS)

IaaS provides generic functionality for hosting and provisioning of access to raw computing infrastructure and its operating middleware software. IaaS is typically provided by data centers that rent modern hardware facilities to customers, who are freed from the burden of their maintenance and deprecation. IaaS is characterized by the concept of resource virtualization, which allows a customer to deploy and run their own guest operating system on top of the virtualization software offered by the provider. Virtualization in IaaS is a key step towards distributed, automatic, and scalable deployment, installation, and maintenance of software. An example of this service type is described in Section 2.3.5.

Software as a Service (SaaS)

SaaS is the second category of Cloud services, defining a new model of software deployment where an application is hosted as a service and provided to customers across the Internet, with no need to install and run it on the customer's own computer. In SaaS, the hosting is done transparently by the service provider (usually the same as the developer), which eliminates the hosting intermediary (and its underlying functionality) between itself and the customer. SaaS is a more restrictive model than IaaS, as it constrains customers to using an existing set of services rather than deploying their own. Software companies adopt this model to give access to their software on a pay-as-you-go basis rather than selling licenses. This makes the software cheaper to use at the beginning but might increase the tool cost for the customer in a long-term usage scenario. Examples of SaaS providers are Google Docs, Microsoft Office 365 and Adobe Photoshop Express.

Platform as a Service (PaaS)

PaaS, also known as Cloudware, is the third category that brings IaaS and SaaS one step further by providing all facilities and APIs to support the complete life cycle of building and delivering Web applications and services (including design, development, testing, deployment, and hosting), with no more need for tedious software downloads and installations. PaaS is a relatively new and immature concept that still needs to gain community acceptance and support before being surveyed in detail. Examples of PaaS providers are Google AppEngine, Microsoft Azure and Heroku.

Specialized hosting services

Besides these three main categories, we introduce a fourth category of specialized hosting services that are closely related to, or claim to support, Cloud computing, although they offer significantly restricted or specialized functionality. We see two successful representatives of this category on the market:

1. Web hosting environments act as intermediaries between service providers and customers by renting packages for hosting Web sites, comprising Web servers, FTP and SSH access, storage space, and various software capabilities such as Perl, PHP, Python, or Ruby.

There are three main aspects invoked by Web hosting companies which connect them to Clouds: (i) virtualization of resources (although not exposed to the users) for improved management of time-sharing resources, (ii) automatic scaling of the provisioned resources to cope with dynamic client load with guaranteed Quality of Service (QoS) (see Section 4.2.4), and (iii) business models inspired from utility computing (see Section 4.2.6);

2. File hosting environments offer a virtual and persistent storage system where customers can safely save their data at a certain price with guaranteed QoS delivery.

3. Everything else is of minor interest for the scientific community. Therefore, we did not concentrate on providers that claim to offer Cloud computing but do not fit into any of the previous categories.

2.3.5 Amazon Elastic Compute Cloud

"Amazon Elastic Compute Cloud (Amazon EC2) is a Web service that provides resizable compute capacity in the Cloud. It is designed to make Web-scale computing easier for developers. Amazon EC2's simple Web service interface allows you to obtain and configure capacity with minimal friction. It provides complete control of computing resources and lets one run on Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing one to quickly scale capacity, both up and down, as the computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build failure resilient applications and isolate themselves from common failure scenarios." [6]

EC2 is an IaaS provider with a wide range of options for the customer to choose from. It is one of the biggest providers known and plays the role of the market leader. As internal numbers are rarely published, these observations rely on third-party companies that try to estimate how many resources such a provider has and how they are used. Figure 2.3 shows such an analysis from [137] of how many of the top Web sites on the Internet have IP addresses associated with Cloud providers.

Figure 2.3: Estimation of the market share of different Cloud providers [137].

Amazon is in first place, closely followed by Rackspace Cloud Servers. It is also remarkable that, of the total sites analyzed, nearly 1.8% of the Web pages producing the most traffic use Cloud technologies. This shows the importance of Cloud technology in the Web hosting market.

There are several terms that are important when talking about EC2 and IaaS Cloud computing:

Definition 7. An instance image is a file containing the operating system and possibly additional applications, which can be started in a Cloud. The image might be optimized for special architectures and has a specified size. When started, the instance image is copied, as the original file is not changed during execution. After termination, all changes to such an image are lost if they are not stored separately.

Definition 8. An instance type is a hardware configuration, which can be backed by physical resources or a virtual environment. This type has characteristics which might be defined in detail or in a broader way; the CPU or memory of such an instance might be shared or dedicated.

Definition 9. An instance is a Cloud resource which runs a specific instance image on a matching instance type. The user has root rights on this instance and can log into it using the SSH protocol with public/private key authentication once the startup process is finished.
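To make these three terms concrete, the following sketch starts one instance through the EC2 API using the Python boto bindings that were current at the time of this research; the AMI identifier, key-pair name and region are placeholders, not real resources:

```python
import time

import boto.ec2

# Credentials are assumed to come from the environment or boto configuration.
conn = boto.ec2.connect_to_region("us-east-1")

reservation = conn.run_instances(
    "ami-00000000",            # instance image (Definition 7), placeholder id
    instance_type="m1.small",  # instance type (Definition 8)
    key_name="my-keypair",     # key pair used for SSH root login
)
instance = reservation.instances[0]    # the instance (Definition 9)

while instance.update() != "running":  # poll until the startup has finished
    time.sleep(5)
print(instance.public_dns_name)        # e.g. ssh -i my-keypair.pem root@<dns>
```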

2.3.6 Eucalyptus

Eucalyptus [111] is a university project that tries to mimic the feature set offered by Amazon EC2 and S3 by implementing the same underlying API, and additionally provides a Web portal for user management. A public test Cloud is also provided. This software bundle allows users to install their own EC2-compatible Cloud on local hardware for development, test or production environments. We installed Eucalyptus on two local servers with a total of 12 cores, allowing us to start up to 12 virtual machines.

2.4 ASKALON

"The goal of ASKALON is to simplify the development and optimization of applications that can harness the power of Grid computing. The ASKALON project crafts a novel environment based on new innovative tools, services, and methodologies to make Grid application development and optimization for real applications an everyday practice." [157]

The ASKALON Grid middleware [47] can execute, via its workflow execution engine, workflows specified in AGWL. During execution, this abstract specification is instantiated, that is, the tasks are annotated with details concerning the resources used. ASKALON's workflow execution engine features a fine-grained event system, which is implemented as a WSRF service and allows event forwarding even through NATs or firewalls. An overview of this and other Grid workflow systems can be found in [176]. ASKALON is a service-oriented architecture that consists of several independent services, which communicate with each other to give scientists a simple environment for executing workflow applications on parallel and distributed systems such as clusters, Grids and Clouds. For the composition of a workflow, ASKALON provides a graphical user interface, as shown in Figure 2.4, through which application developers can conveniently assemble activities, define their execution order, decide which activities can be executed in parallel, and monitor them. Figure 2.5 shows the architecture of ASKALON at the state where the presented research started.

Figure 2.4: User interface of ASKALON.

In the following subsections, we give more details about the main services of ASKALON, which were used and extended to produce the research presented in this thesis.

2.4.1 Execution Engine

The execution engine is the entry point to the ASKALON services. The user submits a workflow for execution in a Web service call with an embedded AGWL description of the workflow to execute. In the year 2008, the ASKALON workflow execution engine evolved in two major steps. The first version, DEE [41], focused on functionality and hard-coded optimizations. DEE's primary shortcomings were the internal loop unrolling from the workflow specification and the complete scheduling at the start of the execution. To improve scalability and adaptability to highly dynamic Grid environments, the second-generation engine Execution Engine 2 (EE2) was developed [128]. EE2 internally uses a structure that is kept close to the AGWL specification and scales better for the execution of large workflows. Each job that is ready for execution is dynamically sent to the best available Grid site at that moment.
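The contrast between DEE's complete up-front scheduling and EE2's dynamic dispatch can be illustrated with a minimal sketch; the scheduler and submit callables below are hypothetical stand-ins for the Scheduler service and job submission, not ASKALON interfaces:

```python
from collections import deque

def execute(workflow, scheduler, submit):
    """Dynamically dispatch ready activities, one mapping decision at a time.

    workflow: list of (activity, set_of_dependencies) pairs forming a DAG;
    submission is treated as synchronous for brevity.
    """
    done = set()
    queue = deque(workflow)
    while queue:
        activity, deps = queue.popleft()
        if deps <= done:                # ready: all predecessors have finished
            site = scheduler(activity)  # late, per-activity mapping decision
            submit(activity, site)
            done.add(activity)
        else:
            queue.append((activity, deps))  # not ready yet, retry later

# Example: B depends on A; both end up on a (made-up) Grid site.
execute([("B", {"A"}), ("A", set())],
        scheduler=lambda a: "grid-site-1",
        submit=lambda a, s: print(f"run {a} on {s}"))
```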

Figure 2.5: Original ASKALON architecture with no Cloud support [157].

To decide where an activity should be executed, EE2 asks the Scheduler and, once a decision is taken, submits the activity to the specified resource. The engine is responsible for the correct execution of file transfers and activities that follow the control and data flow of the workflow. Optimizations for better workflow performance and stability are also part of this service, but they are not covered in this thesis.

2.4.2 Scheduling

The scheduler is the component that has to decide on which host an activity should be executed. As the scheduling of a workflow on heterogeneous resources is known to be NP-complete [156], it is not possible to deliver the best solution for scheduling problems in a reasonable time. The scheduler gets a request from EE2 to map one activity to a resource. Currently there are different implementations of this scheduler using different methods:

JIT is a Just-In-Time (JIT) scheduler that looks at single activities only and tries to find the resource with the minimum completion time for the current activity. The scheduler does not take dependencies or other activities into account when taking this decision. The advantage is the fast decision-making and the up-to-
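The minimum-completion-time choice that the JIT scheduler makes for a single activity can be sketched as follows; the resource model (queue wait time plus a relative speed factor) and all numbers are made up for illustration:

```python
def jit_map(activity_runtime, resources):
    """Pick the resource with the earliest estimated completion time for one
    activity, ignoring dependencies and all other activities (JIT behaviour).

    resources: {name: (queue_wait_s, speed_factor)}, a hypothetical model.
    """
    def completion_time(name):
        wait, speed = resources[name]
        return wait + activity_runtime / speed
    return min(resources, key=completion_time)

# Illustrative numbers: a busy Grid site versus a slightly slower Cloud VM.
resources = {"grid-site-a": (120.0, 1.0), "cloud-vm-b": (15.0, 0.8)}
print(jit_map(300.0, resources))  # cloud-vm-b: 15 + 300/0.8 = 390 < 420
```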
