Int. J. Computational Science and Engineering, Vol. 1, No. 1/1,


Improving Data Transfer Performance of Web Service Workflows in the Cloud Environment

Donglai Zhang*
School of Computer Science, University of Adelaide, Adelaide 5000, Australia
donglai.zhang@adelaide.edu.au
*Corresponding author

Paul Coddington
eResearch SA and University of Adelaide, Thebarton 5031, Australia
paul.coddington@ersa.edu.au

Andrew Wendelborn
School of Computer Science, University of Adelaide, Adelaide 5000, Australia
andrew@cs.adelaide.edu.au

Abstract: Web Service Data Forwarding (WSDF) is a framework for centralized web service workflows, in which the intermediate result from a previous service is treated as a resource of the composite service and can be used directly by its subsequent service, without being sent back to the centralized control centre. To improve the data transfer performance of web service workflows in the cloud environment, we carried out a test of the WSDF framework in the ScienceCloud, provided by the Nimbus cloud infrastructure. The experiment showed that, in the cloud environment, the WSDF framework has a significant performance advantage over the normal web service framework for workflows with large data transfers, and that the performance improvement agrees with the expected theoretical value.

Keywords: web service workflow; WSRF; stateful; data transfer; cloud.

Reference to this paper should be made as follows: Zhang, D., Coddington, P. and Wendelborn, A.L. (2012) 'Improving Data Transfer Performance of Web Service Workflows in the Cloud Environment', Int. J. Computational Science and Engineering, Vol. 1, Nos. 1/1.

Biographical notes: Andrew L. Wendelborn received his PhD degree in computer science from the University of Adelaide. He was a software engineer for ICL in Australia and the UK. He commenced PhD studies in 1978, took up a position in the Computer Science Department at the University of Adelaide in 1985, and is currently a Senior Lecturer.
His research interests are programming models and applications in cloud and grid computing, e-Research and data intensive computing, parallel functional programming, and reflective computing. He is a member of ACM and the IEEE Computer Society, and active on several conference committees.

Donglai Zhang is currently a PhD candidate at the Computer Science Department of the University of Adelaide. He graduated as a Master of Computer Science from the University of Adelaide. After that, he worked as a programmer at the South Australian Partnership for Advanced Computing (SAPAC). He then commenced his PhD study. His research interests are scientific workflows and applications in cloud and grid environments, and data transfer and management in distributed environments.

Dr. Paul Coddington is Deputy Director of eResearch SA, a South Australian eResearch service provider. He has a PhD in computational physics from the University of Southampton. He subsequently worked at Caltech, Syracuse University and the University of Adelaide on research and development projects focusing on high-performance and distributed computing and the Web, particularly their application to a variety of scientific problems, including the development of online scientific data repositories.

1 Introduction

E-Science aims to serve scientists from various disciplines in their research work. For researchers, an ideal scientific workflow automatically carries out a complete set of data processing steps for their research at the click of a run button (Atkinson et al., 2007; Bramley et al., 2006). For example, the data can be generated by instruments located at remote sites and processed by a sequential series of data analysis steps. The final result is then returned to the user or stored in a location where it can be accessed by collaborators from different organizations.

The increasing size of the data being processed in workflows has led to significant overhead for data transfer between different services within a workflow, especially when the services composed into a workflow are web services. Research work focuses on both centralized and decentralized workflow systems. In a centralized workflow, it is relatively easy for the user to control the workflow, but the data generated by each service needs to be sent back to the centralized control point before it is forwarded to the next service provider as input, which increases both time and resource (network bandwidth, CPU and memory) consumption. A decentralized workflow, on the other hand, can share the data directly between the different services in the workflow. However, it is often hard to control the whole workflow in a decentralized manner.
To overcome the efficiency problems in the centralized workflow model, in our previous work (Zhang et al., 2011) we introduced the WSDF framework to improve data transfer. A single web service is recognized as a stateful web service if the state of a specific service instance can be kept between invocations. In the WSDF framework, we define the concept of a stateful workflow, by which we mean that the intermediate data is preserved between successive services in the same workflow (Zhang et al., 2011). In a stateful workflow, all atomic services need to be stateful and all intermediate data is shared directly between atomic services, as happens in the decentralized workflow model; this is discussed further in section 4.2.

Within a WSDF workflow, when the client invokes a service, the data for processing is passed to the service together with the resource forwarding information, which specifies where to forward the result data after processing. After the current service processes the input data, it forwards the result data to the next web service according to the resource forwarding information, and the data is saved on the successor service as a WSRF (Web Services Resource Framework) resource. The resource is given a unique resource reference, represented by an Endpoint Reference (EPR)¹, which is sent back to the workflow engine (client) and used by the client later to invoke the next service in the workflow.

In previous work (Zhang et al., 2011), we used a simulated environment to test the feasibility of the WSDF framework, and the time used for intermediate result transfer between different services was significantly reduced. The emerging cloud technology provides another platform for hosting the web services involved in a workflow. Cloud computing provides IT related services via network connections.
First of all, cloud computing is another distributed computing paradigm, in which the service provider offers large scale, scalable computational resources (e.g. virtual machines) and storage capacity (Foster et al., 2005; Geelan, 1991). Different types of cloud services are provided: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS).

The cloud has been widely applied as a platform for workflow execution. Research has been carried out to verify the cost and effectiveness of cloud computing platforms for scientific workflows (Deelman et al. and Marcellin, 2008; Kondo et al., 2009; Hoffa et al., 2008). According to Hoffa et al. (2008), workflows can suffer from wide-area communications, particularly when the data transfer time constitutes a large part of the overall computation time. Without WSDF, centralized workflows in the cloud suffer the same shortcoming as workflows in a standard distributed environment: the data needs to go via the workflow engine, which is often located remotely from the cloud, e.g. on the user's desktop. By using WSDF in a cloud environment, this shortcoming can be eliminated.

Consider the following scenario: a large set of data is generated at the user's experiment site and is to be processed by a workflow. To take advantage of the computational power provided by a cloud platform, e.g. Amazon EC2², the scientists carefully select, before the experiment, the services to be used in the workflow that will process the scientific data. The services are built into images and stored in the cloud. Before the experiment starts, these images are instantiated on virtual machines according to the expected resource requirements, such as CPU, memory size and network connections.

Therefore, the services are initiated in the cloud. By controlling a centralized workflow whose workflow engine sits on a desktop machine, the scientists initiate the generation of the data from the scientific instrument(s), and the generated data is processed by the services within the cloud. Finally, the result data is sent back to the workflow engine on the client side.

Within a cloud environment, data sharing between consecutive services is even better than in other distributed environments. In an IaaS cloud³,⁴,⁵, a basic IaaS type cloud service allows the client to acquire the necessary computational resources, from a single Linux virtual machine to a virtual cluster of over a hundred nodes, so the data sharing can be completed without leaving the cloud. For the different processing components within the workflow, users can submit an image that contains the service; by instantiating the image as a virtual machine, the user can share the software within the cloud environment. The client is still able to control the whole centralized workflow from his or her desktop.

In this environment, the composite service is hosted within the cloud. At the beginning of the workflow, the data is sent to the service hosted within the cloud environment. After the web service processes the input data, the output data is forwarded to the next web service in the workflow. As the intermediate data does not need to be sent back to the workflow engine, bandwidth and execution time are saved. Furthermore, as the data is forwarded between web services within the cloud, the web services can be configured in a local network.
If they are in the same cloud, they can be hosted within the same data centre, physically close together and connected by a high bandwidth network.

This article is organized as follows. In section two, we review the WSDF framework and our previous work; in section three, we compare the cloud environment with the previous distributed environment for hosting web service workflows under the WSDF framework; in section four, the experiment environment is introduced; in section five, we explain the model and the equations used to calculate the expected performance improvement, give the results of our experiments, and compare the performance improvement of the WSDF framework in the cloud (i.e. the ScienceCloud⁶) and in a normal distributed environment; in section six, we review the related work; finally, we present the conclusion and discuss future work.

2 Web Service Data Forwarding (WSDF) Framework

With the development of information technology, more and more scientific research utilizes workflows as a powerful tool. This work is often data oriented and therefore involves large scale data processing. The data transfer speed between the different components of a workflow is one of the key aspects that affects the overall performance of a scientific workflow.

Figure 1 Services One to Six in a remote distributed environment, grouped into composite services

There are two kinds of workflow according to the control point: centralized and distributed. In most real world cases, users want to have full control of the running workflow, which is natural for security and convenience reasons. One well known centralized workflow system is Kepler⁷ (Ludascher et al., 2006). To save data transfer time while giving the user full control of the workflow, we introduced the WSDF framework (Zhang et al., 2011).
2.1 WSDF Model Assumptions

The primary target of the WSDF framework is to address the data sharing issue within a web service workflow, as the current web service framework is based on the client-server model. The WSDF framework is based on the following assumptions:

- Participants (e.g. researchers in a common research project) share computational resources and data to solve a large scale, collaborative task.
- The service workflow is composed of web services, and the output of the current service can be directly consumed by the successor service.

2.2 The Stateful workflow

With normal web service workflows, a centralized workflow engine works as the client of the different web services: it sends a request to invoke each web service in the workflow. The intermediate result from each participant service is sent back to its requester, the workflow engine, even if the result will be used by the successor service. On the other hand, the services can be seen as a composite service, as shown in Figure 1. To the workflow engine, the combination of different services can be seen as a composite service on the remote side. If the workflow engine provides the initial data and does not need the intermediate data, then the composite service as a whole is as simple as a normal web service: it gets input, processes the data and sends back the result. To achieve this, the intermediate data needs to be stored on the composite service side and sent to the next service. We define a workflow that provides this functionality as a stateful workflow.

In a stateful workflow, each web service is a stateful web service. The de facto standard for stateful web services is the Web Services Resource Framework (Zhang et al., 2011). Successive web services share the intermediate data without sending it back to the workflow engine; instead, the data is stored as a resource in the individual web service, and a reference to it is created and returned to the workflow engine. In this way, the state of the whole workflow is saved and stored on the composite service side. In our design, the workflow engine has complete control of the workflow execution and uses the returned resource reference to invoke the next service in the composite service.

2.3 Data Forwarding between stateful web services

Within the composite service, a stateful web service that is invoked (the current service) not only processes the data forwarded to it, but also needs to store the result data on the composite service side as a resource and return the resource reference. To allow data sharing between the current service and the successor service, either a push or a pull mechanism can be applied. We implemented the push model, as we suppose that the workflow engine already knows which service will be invoked next and will need the output of the current service.
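The push model can be illustrated with a small in-memory sketch. This is not the WSDF implementation: the class `StatefulService`, its `store` and `invoke` methods and the function `run_workflow` are hypothetical names chosen for illustration, plain strings stand in for EPRs, and no SOAP or WSRF messaging is involved. The sketch only shows that intermediate results stay on the service side while the engine handles references.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Union

_ids = itertools.count(1)  # source of unique resource numbers (illustrative only)


@dataclass
class StatefulService:
    """Hypothetical stand-in for a stateful web service holding resources."""
    name: str
    process: Callable[[bytes], bytes]
    resources: Dict[str, bytes] = field(default_factory=dict)

    def store(self, data: bytes) -> str:
        """Save data as a resource and return a reference (stand-in for an EPR)."""
        ref = f"{self.name}/resource-{next(_ids)}"
        self.resources[ref] = data
        return ref

    def invoke(self, ref: str,
               forward_to: Optional["StatefulService"] = None) -> Union[str, bytes]:
        """Process a stored resource, then push the result or return it."""
        result = self.process(self.resources[ref])
        if forward_to is not None:
            # Push model: the result goes straight to the successor;
            # only the new reference travels back to the engine.
            return forward_to.store(result)
        return result  # last service: the data itself goes back


def run_workflow(services: List[StatefulService], data: bytes) -> bytes:
    """Engine side: keeps control of every step but only handles references."""
    ref = services[0].store(data)  # initial upload from the engine
    for current, successor in zip(services, services[1:]):
        ref = current.invoke(ref, forward_to=successor)
    return services[-1].invoke(ref)
```

In this sketch the engine still decides the order of invocation, which mirrors the centralized control described above, while the payload crosses the engine boundary only at the first and last step.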
In the push model, the current service needs to forward the result to the successor service, so the workflow engine must send to the current service both the data (the initial data if it is the first service in the workflow, or any extra input data) and information about the successor service (the forwarding information). To distinguish the normal data parameters from the resource forwarding information, we introduced a specific namespace, wsdf (Zhang et al., 2011). When the current service is invoked, the normal data and the resource forwarding information, which is under the wsdf namespace, are both sent to that service.

3 Distributed Environment Comparison

The cooperation between participants in a collaborative research project often involves building a workflow to process scientific data. Each participant can provide software, data and/or host the services. To support these services, there are two general approaches:

- Each software provider hosts their programs/services at their local site. Any user can access these services by composing them into a workflow. This is normally hard for services that require non-trivial resources to run: the software providers are often unable to predict and supply enough computational, storage and network resources to host these services, and more system administration effort is needed to monitor and administer the servers.
- Software and services are all hosted by a regional computational centre or a computer centre for a specific research discipline. These service providers can deliver better outcomes than the first approach, as they often have better network connections, more powerful computational facilities and larger storage capacity. All the servers run under a single administrative domain, which is another great advantage over separate administrations.

The second approach has shown significant advantages over the first one.
As the organization and management of the services is maintained within a single administrative scope, the availability of the services and the interconnection between them can be greatly improved. Different services can be connected within a local network, typically Ethernet, which can save a significant amount of data transfer time, particularly when using the WSDF framework.

However, the second approach also has problems, and under these circumstances the cloud infrastructure is a better alternative.

The administration is not completely automatic. Human intervention and communication is necessary to provide service hosting. For example, a service provider asks the system administrator to maintain the service by providing a Linux box that has the necessary service image installed. A system administrator will take care of the machine, including providing the necessary network configuration and firewall settings. With very large data storage, which possibly exceeds the capacity of a single machine, it may take some time for system administrators to provide sufficient storage; this may require procuring and installing additional disks, and could introduce uncertainty as well as delay into the deployment of the whole workflow. The cloud, on the other hand, is built to essentially eliminate these problems: resource instantiation and allocation are controlled by resource management software to achieve the highest efficiency. These services cover almost all requests that a workflow might make. Furthermore, within a cloud it is not necessary to talk to a system administrator, since the provisioning of resources is all automated.

Resources are limited within a specific organization. The service provider has a reasonable amount of resources, in terms of machines and network connections, for example, to provide services. But cloud providers such as Amazon³ and Apple⁸ have hundreds or even thousands of servers in their resource pools, a more stable power supply, better network connections and more system administration expertise, so the cloud often provides a better solution for these kinds of services.

We utilize the cloud infrastructure provided by ScienceCloud to test the performance of WSDF workflows and to see how effective this new framework can be within the cloud environment.

Figure 2 (a) Web service workflow and (b) WSDF workflow in the cloud environment, each with three services; (a) shows combined data and control flow, (b) shows data flow and control flow separately

4 Facilities and Experiments Setting

4.1 Facilities

We use ScienceCloud as our cloud platform to carry out data transfer experiments on scientific workflows. ScienceCloud is an instance of the Nimbus⁹ cloud management software, provided by the University of Chicago. It provides free access to its computational resources to the academic community. In line with the cloud configuration requirements, we use the default configuration file provided by ScienceCloud. We initialize three virtual machines: vm01, vm02 and vm03. Each of these virtual machines is allocated 2 CPU cores and 3 gigabytes of memory. The firewall settings allow the three virtual machines to communicate with each other directly. Using iperf to test the connection between the virtual machines, we found an average of about 910 Mbits/sec. The network connection between the cloud virtual machines and the client desktop at the University of Adelaide was also tested, and the average speed measured by iperf with its default configuration was 54.0 Mbits/sec.
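To give a feel for what these two bandwidths mean for a single transfer, the following sketch converts a file size into an idealized transfer time. It is a back-of-the-envelope estimate only: it assumes 1 MB = 10^6 bytes and 1 Mbit = 10^6 bits, and it ignores latency, protocol overhead and SOAP encoding, none of which are modelled here. The function name `transfer_seconds` is illustrative.

```python
def transfer_seconds(size_bytes: float, bandwidth_mbits: float) -> float:
    """Idealized time to move size_bytes over a link of bandwidth_mbits (Mbit/s)."""
    return size_bytes * 8 / (bandwidth_mbits * 1_000_000)


CLIENT_TO_CLOUD = 54.0  # Mbit/s, client desktop to cloud VM (iperf measurement)
INSIDE_CLOUD = 910.0    # Mbit/s, between cloud VMs (iperf measurement)

for size_mb in (1, 10, 100):
    size = size_mb * 1_000_000
    print(f"{size_mb:>4} MB: "
          f"client-cloud {transfer_seconds(size, CLIENT_TO_CLOUD):6.2f} s, "
          f"inside cloud {transfer_seconds(size, INSIDE_CLOUD):6.2f} s")
```

Under these assumptions a transfer inside the cloud is roughly 17 times faster than the same transfer between the client and the cloud (910 / 54.0), which is the gap the WSDF framework exploits.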
Based on a simple cloud image, hello-world, which is provided by ScienceCloud⁶, we built an image for our workflow experiments. The new image is named wsdfhello-world and its size is 10 gigabytes. When the experiment was carried out, ScienceCloud did not provide a data storage service as Amazon⁴ does. To avoid extra overhead that could be introduced by a data storage service, we enlarged the hello-world image to 10 gigabytes. The new image was submitted to the cloud and saved into the user's repository using the client side tool provided by the Nimbus cloud. On the client side, a Linux box is used as the workflow engine.

4.2 Setting

Figure 2 shows the relationship between the normal web service workflow and a WSDF workflow in a cloud environment. It illustrates that within a normal web service workflow, for any web service hosted by the cloud, all input/output data and control information (overlapped with the data flow) must pass between the workflow engine on the client and the cloud. This is not efficient, as the cloud provider can be far from its client. Within a WSDF workflow, on the other hand, only the initial data and the final result need to be transferred between the client and the cloud. Because the network connection for data transfer between services within the cloud is very efficient, and the WSDF framework provides direct data transfer between participants in the web service workflow, it is much more efficient to apply a WSDF workflow in a cloud environment.

We use the workflows we built for the previous experiment (Zhang et al., 2009) to test performance in the cloud environment. Every workflow has two versions: a WSDF version and a normal web service version. The WSDF workflow uses WSDF services as its service providers, and the normal web service workflow uses normal web services as its service providers.
Each workflow is made up of a set of instances of the same service. A WSDF service provides create, setattachasresource and convert operations. The convert operation is a functional operation that takes the content of a .bmp image file as input and changes the color of its pixels: red to green, green to blue and blue to red. The create operation creates an EPR for a service instance on this service, and the reference is sent back to the client. The setattachasresource operation sets the attachment of the request as a resource to be processed by the convert operation. The convert operation of a normal web service also consumes an input file (.bmp format) and changes the color of each pixel in the file; the updated content is returned. In the cloud environment, services run on two different web service servers: a normal web service server or a WSDF web service server.

Figure 3 shows the steps in a WSDF workflow. In this figure, there are three services involved in the workflow. For each of them, first, a request to create an EPR is sent to the service and the created EPR is returned (steps 1, 5 and 9); then the data to be processed is sent to the service and saved as a resource referenced by the EPR created in the previous step (steps 2, 6 and 10); after that, a convert request is sent to the service to process the saved resource (steps 3, 7 and 11); finally, the result is sent back to the workflow engine (step 13).

Figure 3 Steps in a WSDF Framework, showing control flow and data flow between three services (A, B and C): Step 1 Create EPR; Step 2 Set Resource; Step 3 Invoke Convert Operation; Step 4 Process saved Resource; Step 5 Create Resource Instance; Step 6 Set Resource; Step 7 Invoke Convert Operation; Step 8 Process saved Resource; Step 9 Create EPR; Step 10 Set Resource; Step 11 Invoke Convert Operation; Step 12 Process saved Resource; Step 13 Return Processed Content

5 Experiments In the Cloud

We carried out tests within the cloud and give performance results for both the normal web service workflow and the WSDF workflow. We also give the expected performance improvement for the WSDF workflow in the cloud environment according to the formula derived in our previous work (Zhang, D., 2011), and verify that the performance improvement of WSDF workflows meets our expectation.

5.1 Total Time Consuming

Figure 4 shows the total time consumed by the web service workflow and the WSDF workflow. From this figure we can see that, with different file sizes and different numbers of web services involved, the WSDF workflow always has a significant advantage over the normal web service workflow. This is the same result as in the experiment we ran in the simulated distributed environment.
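The convert operation described in section 4.2 can be illustrated as follows. This is a sketch under assumptions rather than the service code: it works on a list of (red, green, blue) tuples instead of parsing a .bmp file, and it reads "red to green, green to blue and blue to red" as moving each channel value to the next channel, which is one plausible interpretation of the description. The name `convert` matches the operation name, but the signature is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def convert(pixels: List[Pixel]) -> List[Pixel]:
    """Rotate colour channels: the red value moves to green,
    green moves to blue, and blue moves to red (assumed reading)."""
    return [(b, r, g) for (r, g, b) in pixels]


# A pure red pixel becomes pure green under this interpretation.
print(convert([(255, 0, 0)]))
```

Because the operation is a fixed permutation of three channels, applying it three times returns the original image, which makes it convenient for chaining several instances of the same service in a test workflow.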
5.2 Performance Improvement Expectation

In our previous work, we built equations to represent the theoretical time saving on data transfer for a WSDF workflow. We define the percentage of time saved by WSDF to be:

\[ P = \frac{T - T'}{T} \times 100 \tag{1} \]

In equation (1), $T$ is the overall transfer time for the normal web service workflow and $T'$ is the overall transfer time for the WSDF workflow. A derivation of the expected theoretical values of $T$, $T'$ and $P$ is given in (Zhang et al., 2011). For services hosted in a cloud, we can assume that the bandwidths between the client and all the services are the same (represented by $BW_{C,S}$) and that the bandwidths between all services are the same ($BW_{S,S}$). Based on these assumptions, the performance improvement is given by:

\[ P = \frac{\sum_{i=1}^{n-1}\left(\frac{DO_i + DI_{i+1}}{BW_{C,S}} - \frac{DO_i}{BW_{S,S}}\right)}{\sum_{i=1}^{n}\frac{DI_i + DO_i}{BW_{C,S}}} \times 100 \tag{2} \]

Within a workflow, the output data $DO_i$ ($1 \le i \le n-1$) of one service is often used as the input data $DI_{i+1}$ of the next service. If we use $DO_i$ to replace $DI_{i+1}$, then equation (2) can be simplified to:

\[ P = \frac{\sum_{i=1}^{n-1}\left(\frac{2\,DO_i}{BW_{C,S}} - \frac{DO_i}{BW_{S,S}}\right)}{\sum_{i=1}^{n}\frac{DI_i + DO_i}{BW_{C,S}}} \times 100 \tag{3} \]

In our experiment, the WSDF workflow is built from $n$ instances of services, where the input data $DI_i$ and the output data $DO_i$ ($1 \le i \le n$) have the same size, which is represented by $D$. In this case, equation (3) becomes:

\[ P = \frac{D\sum_{i=1}^{n-1}\left(\frac{2}{BW_{C,S}} - \frac{1}{BW_{S,S}}\right)}{D\sum_{i=1}^{n}\frac{2}{BW_{C,S}}} \times 100 \tag{4} \]

Equation (4) can be further simplified to:

\[ P = \frac{n-1}{n}\left(1 - 0.5\,\frac{BW_{C,S}}{BW_{S,S}}\right) \times 100 \tag{5} \]

We measured the network bandwidth on the cloud instances that we obtained. The bandwidth between the client and the servers was 54.0 Mbits/sec, and the bandwidth between different servers was 910 Mbits/sec. If there are 6 services in total in the workflow, then by equation (5) the theoretical result is:

\[ P = \frac{6-1}{6}\left(1 - 0.5 \times \frac{54.0}{910}\right) \times 100 \approx 81\% \tag{6} \]

In the following section, we compare this expected data transfer performance improvement with the real performance improvement measured in the experiments, to verify whether the practical result agrees with our proposed theory.

Figure 4 Total time consuming in cloud for different file sizes and different number of web services for both WSDF and normal web services (panels (a) and (b))

5.3 Performance Improvement Analysis

Our interest is not limited to the general trend of performance improvement from applying the WSDF workflow. We also analyse the performance impact of the file size and of the number of web services involved in the workflow. The time consumption of a workflow can be classified into two categories: functional processing time and I/O time. The WSDF workflow provides the same computational functionality, which consumes the same amount of time, but it saves time in the data transfer (I/O) part. We therefore compare the performance improvement after eliminating the processing time from the total time consumed by both workflows.

Figures 5, 6 and 8 show the performance of the WSDF workflow versus the normal web service workflow. Each workflow is also run with different numbers of web services. The BST (Basic Service Time) represents the time used by the web service for functional processing of the input data.

In Figure 5, the experiment is based on files with sizes ranging from 100K bytes to 1M bytes. As we can see from this figure, the BST makes up a very small part of the total, as the data size is very small. The majority of the time used by both workflows is for transferring data between participants in the workflow. According to the results, when there are 3 web services involved, the 100K bytes file gets a 14% time saving on data transfer, the 500K bytes file gets a 30.72% improvement, and for a 1M bytes input file the saving in transfer time reaches 41.58%. This shows the following trend: the larger the file, the higher the improvement.
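The expected saving from section 5.2 is easy to evaluate numerically. The sketch below implements the simplified expression P = (n-1)/n × (1 - 0.5 × BW_C,S / BW_S,S) × 100 for n services with equal input and output sizes, and cross-checks it against the transfer times it is derived from. The function names are illustrative, and the model covers transfer time only: it ignores the resource creation and management cost that affects small files.

```python
def expected_saving(n: int, bw_cs: float, bw_ss: float) -> float:
    """Expected percentage of transfer time saved by WSDF for n services
    with equal data sizes (simplified expression, equation (5))."""
    return (n - 1) / n * (1 - 0.5 * bw_cs / bw_ss) * 100


def saving_from_times(n: int, size: float, bw_cs: float, bw_ss: float) -> float:
    """Same quantity computed from the transfer times themselves."""
    t_normal = n * 2 * size / bw_cs                          # every input and output crosses the client link
    t_wsdf = 2 * size / bw_cs + (n - 1) * size / bw_ss       # first input, last output, n-1 forwards
    return (t_normal - t_wsdf) / t_normal * 100


BW_CS = 54.0   # Mbit/s, client to cloud (measured)
BW_SS = 910.0  # Mbit/s, between cloud VMs (measured)

for n in (3, 6, 9, 12):
    print(f"n = {n:2d}: expected saving {expected_saving(n, BW_CS, BW_SS):.2f}%")
```

For n = 6 this evaluates to about 80.9%, matching the 81% quoted in section 5.2. The factor (n-1)/n also explains why the saving grows with the number of services while staying below 100%.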

Figure 5 Comparison of performance of WSDF and normal web services in the cloud for small file sizes (panels (a), (b) and (c))

Figure 6 Comparison of performance of WSDF and normal web services in the cloud for medium file sizes (panels (a), (b) and (c))

As we have seen, in a WSDF workflow a WSDF service manages the result generated by the current service and forwards it to the successor service, where it is saved as a resource. This means the successor service needs to create a resource reference for the result and save it, which involves extra resource management cost. With small files, the time cost of resource creation and management is steady and relatively high compared with the total time consumption. As the input file size increases, the proportion of time used in resource management decreases quickly, so the overall performance improvement becomes significant.

In Figure 6, the performance improvement of WSDF in the cloud environment is given for medium size files. For a workflow with 3 services, the performance improvement is 58.71% with a 5M bytes file, 60.33% with a 10M bytes file and about 57.79% with a 50M bytes file. This means that the gain from further increasing the file size is relatively small and the improvement becomes reasonably stable; this also applies to the larger files shown in Figure 8.

The other factor that affects the performance improvement of a WSDF workflow is the number of services involved in the workflow. Figure 2 (a) and (b) each show three web services hosted in the cloud that are invoked in a workflow. In Figure 2 (a), there are three data transfers between the client and the servers, all of them bidirectional. In Figure 2 (b), the cloud hosts three WSDF services, and in total there are two unidirectional data transfers between the services and the client.
For a workflow with three service invocations, the WSDF workflow therefore needs two one-way data transfers between the client and the cloud, whereas a normal web service workflow needs six. If more services are involved in a workflow, a WSDF workflow still needs only two one-way transfers to and from the client; only the number of data transfers between services, which are all hosted in the cloud, increases. For normal web services, in contrast, the number of data transfers between client and server grows in proportion to the number of services. Consequently, the more services a WSDF workflow involves, the greater its performance improvement. For example, as shown in Figure 6, with a file size of 10M bytes the time saving is 58.71% when three services are involved, and 78.44%, 81.97% and 85.09% when 6, 9 and 12 services are involved, respectively. In our experiments, the same trend also appears with other file sizes.

As discussed, the time saved by using WSDF increases as more of the data transfer happens between web services inside the cloud. Here we assume that the web services hosted in the cloud are located within a single data centre; ideally, they are physically no more than a few racks apart.

Figure 7: WSDF Performance Improvement in Normal Distributed Environment, panels (a) to (c).

Figure 8: WSDF vs. Normal Web Service Performance in Cloud (with large files), panels (a) to (c).

5.4 WSDF performance improvement comparison

In our previous work (Zhang et al., 2009), we ran the WSDF performance tests within a simulated wide-area network environment using the WANem simulation software (Note 10). Some of the results are shown in Figure 7. These results were obtained with WANem configured to provide a network performance of 100 Mbits/sec between the web services (under the assumption that the services were all on a local area network connected by 100 Mbits/sec Fast Ethernet), and with the simulated network between the client and the services configured to match measured intercontinental (Australia to the USA) latencies and bandwidths. Hence this simulated network is a close match to the real network environment of our ScienceCloud experiments. The only significant difference is that the bandwidth we measured between the services in the ScienceCloud was considerably higher than in our simulation: 941 Mbits/sec rather than 100 Mbits/sec.

The performance of the WSDF workflows in the cloud is moderately better than that measured in the simulated distributed environment, although the difference in data transfer performance improvement is small. For example, with a 5M byte input file and 3, 6, 9 and 12 services, WSDF achieves performance improvements over normal web services of 56.21%, 72.51%, 77.40% and 80.85% in the simulated environment, while the same WSDF workflow in the cloud achieves 60.33%, 76.37%, 82.35% and 83.25%. For a workflow with 6 services and file sizes of 5M, 10M, 50M and 100M bytes, the performance improvements in the normal distributed environment are 72.51%, 72.68%, 68.27% and 68.30%; in the cloud, the same WSDF workflow achieves 76.37%, 78.44%, 79.37% and 79.94%. These two groups of data imply two facts: first, the WSDF performance improvement in a cloud environment is close to the theoretical result from Section 5.2, where the expected time saving is about 81% for a workflow of 6 services; second, the measured performance in the cloud was slightly better than the simulated performance.
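The two groups of figures quoted above can be laid side by side. The short Python sketch below only restates the percentages reported in this section and computes the gap, in percentage points, between the cloud and the simulated environment; the variable and function names are hypothetical and chosen for illustration.

```python
# Performance improvements (%) of WSDF over normal web services,
# as reported in this section.
SERVICE_COUNTS = (3, 6, 9, 12)
SIMULATED_5M = (56.21, 72.51, 77.40, 80.85)   # 5M byte file, simulated WAN
CLOUD_5M = (60.33, 76.37, 82.35, 83.25)       # 5M byte file, cloud

FILE_SIZES_MBYTES = (5, 10, 50, 100)
SIMULATED_6_SERVICES = (72.51, 72.68, 68.27, 68.30)
CLOUD_6_SERVICES = (76.37, 78.44, 79.37, 79.94)


def gaps(cloud, simulated):
    """Return cloud minus simulated improvement, in percentage points."""
    return [round(c - s, 2) for c, s in zip(cloud, simulated)]


if __name__ == "__main__":
    for n, gap in zip(SERVICE_COUNTS, gaps(CLOUD_5M, SIMULATED_5M)):
        print(f"{n} services, 5M bytes: {gap:.2f} points")
    for size, gap in zip(FILE_SIZES_MBYTES,
                         gaps(CLOUD_6_SERVICES, SIMULATED_6_SERVICES)):
        print(f"6 services, {size}M bytes: {gap:.2f} points")
```

For the 5M byte file the gap stays between roughly 2.4 and 5 percentage points; for six services it grows from about 3.9 points at 5M bytes to about 11.6 points at 100M bytes.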
We believe the cloud performs slightly better because the ScienceCloud provided faster network connections between the WSDF services (910 Mbits/sec) than the value used in the simulated environment (100 Mbits/sec), which was based on the measured bandwidths of real networks.

6 Related work

Scientific workflows often involve processing large data sets and are often data flow oriented. Within a centralized workflow model, data sharing between different web services can often become the bottleneck of a workflow (Barker et al., 2009). Some research in this area suggests using a decentralized workflow model to solve this problem (Barker et al., 2008). Any workflow has control flow as well as data flow aspects; the relationships between them are explored in (Liu et al., 2005).

In (Barker et al., 2008), the authors describe algorithms that convert a workflow into smaller units that run on different servers and communicate directly with one another. The expected benefit of this approach is that the central point of workflow orchestration no longer becomes a bottleneck, which significantly improves overall throughput. The approach is especially well suited to data-driven workflows. However, the algorithm also makes the whole system more complex, whereas our approach does not change the workflow.

In (Walter et al., 2006), a proxy model is suggested and a hybrid architecture is built. Here a proxy is defined as a piece of middleware closely coupled to a functional service as a gateway. It delegates the invocation of the functional service, manages input/output data storage and is responsible for sending the result data between workflow components. This work points out several research issues in the data sharing problem between web services in a centralized workflow, such as the storage, forwarding and retrieval of result data.
Compared with our approach, the drawback of the proxy model is that it addresses the data sharing problem at the application level rather than at the server level, which leaves programmers to maintain these services themselves.

7 Conclusion and Future Work

In our previous work, we proposed the WSDF framework to allow direct data sharing between consecutive web services within a web service workflow. We also built prototype workflows and a simulation environment to verify the performance improvement. Building on that work, we carried out similar experiments in a cloud-based environment. The experiments show that the measured improvement in this particular cloud environment was even greater than that obtained in a similar experiment in a normal distributed environment.

However, the advantage of the cloud is more significant in terms of management and stability. In most circumstances, for a user of IT resources who has basic knowledge of cloud infrastructure, hosting disk images in a research centre rather than in the cloud is more time consuming: it involves locating hardware, installing the proper software, setting up all services, configuring firewalls and handling other system administration work. Computational power, network connections and data storage cannot be expanded as quickly as in the cloud. Moreover, normal centre-based services need person-to-person communication (for example, talking face to face) to set up the services, which is inefficient and less predictable. The cloud allows users to specify the machine instances they need, and all configuration can be carried out by the users directly. Finally, all these functions are exposed to users through web or web service interfaces using the APIs provided by the cloud provider, so all of the work can be automated.
In our experiments, we compared the performance of a WSDF workflow with that of a normal web service workflow in a remotely provided cloud environment, and achieved a significant performance improvement. The percentage improvement from using WSDF is very similar across the different environments. However, as the cloud is often more efficient in system administration, network connection and the provision of storage capacity, a cloud environment is often a better choice for running workflows based on the WSDF framework.

Another advantage of the cloud is that cloud providers also offer data storage services to their users. For example, Amazon provides a data storage service (Note 4) that can be used together with its EC2 service. We carried out our experiments without using such services, as the data sets used in the workflows are relatively small and can be saved on the disk image. In the real world, a scientific workflow may process large data sets, e.g. on the terabyte or even petabyte scale, so this functionality is vital. Most normal service providers, such as small computing centres, cannot provide this level of data storage in a convenient way.

Acknowledgment

The staff members who maintain the ScienceCloud gave us extensive support in running our experiments.

References

Atkinson, I., et al. (2007) Developing CIMA-based cyberinfrastructure for remote access to scientific instruments and collaborative e-research, Australian Symposium on Grid Computing and Research, Conferences in Research and Practice in Information Technology, Australian Computer Society, Australia, 2007, Vol. 68, pp.

Ludascher, B. et al. (2006) Scientific workflow management and the Kepler system: Research Articles, Concurrent Computing: Practice and Experience, Vol. 18, No. 10, pp.

Zhang, D., Coddington, P. and Wendelborn, A.L. (2011) Web Services workflow with result data forwarding as resources, Future Generation Computer Systems, Vol. 27, pp.

Hoffa, C., Mehta, G., Freeman, T., Deelman, E., Keahey, K., Berriman, B. and Good, J. (2008) On the use of cloud computing for scientific workflows, IEEE Fourth International Conference on eScience, 2008, pp.

Bramley, A., Chiu, K., Devadithya, T., Gupta, N., Hart, C., Huffman, J.C., Huffman, K., Ma, Y. and Mcmullen, D.F. (2006) Instrument monitoring, data sharing and archiving using common instrument middleware architecture, Journal of Chemical Information and Modeling, Vol. 46, No. 3, pp.

Deelman, E., Singh, G., Livny, M., Berriman, B. and Good, J. (2008) The cost of doing science on the cloud: the montage example, Proceedings of the 2008 ACM/IEEE conference on Supercomputing, Austin, Texas, 2008, pp. 50:1-50:12.

Kondo, D., Javadi, B., Malecot, P., Cappello, F. and Anderson, D. (2009) Cost-benefit analysis of cloud computing versus desktop grids, Parallel Distributed Processing, 2009, IEEE International Symposium on, pp.

Geelan, J. (2011) Twenty-one experts define cloud computing.

Foster, I., Zhao, Y., Raicu, I. and Lu, S. (2005) Cloud Computing and Grid Computing 360-Degree Compared, Grid Computing Environments Workshop, GCE 08, pp.

Zhang, D., Coddington, P. and Wendelborn, A.L. (2011) Technical Report: Web Services workflow with result data forwarding as resources.

Barker, A., Besana, P., Robertson, D. and Weissman, J.B. The benefits of service choreography for data-intensive computing, Proceedings of the 7th international workshop on Challenges of large applications in distributed environments, CLADE 08, pp.

Chafle, G., Chandra, S., Mann, V. and Nanda, M.G. Orchestrating composite Web services under data flow constraints, Web Services, ICWS Proc. IEEE Int. Conference on, Vol. 1, pp.

Liu, D., Law, K.H. and Wiederhold, G. Dataflow Distribution in FICAS Composition Infrastructure, In Proceedings of the 15th International Conference on Parallel and Distributed Computing Systems.

Barker, A., Weissman, J.B. and van Hemert, J.I. Orchestrating Data-Centric Workflows, Proceedings of the 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid, pp.

Walter, B., Ion, C. and Boi, F. Orchestrating Data-Centric Workflows, ICWS 06: Proc. of the IEEE Int. Conference on Web Services, pp.

Note
1 Oasis open homepage, icloud
9 Nimbus
10 WANem: The Wide Area Network emulator


Infrastructure as a Service (IaaS)

Infrastructure as a Service (IaaS) Infrastructure as a Service (IaaS) (ENCS 691K Chapter 4) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ References 1. R. Moreno et al.,

More information

Using Proxies to Accelerate Cloud Applications

Using Proxies to Accelerate Cloud Applications Using Proxies to Accelerate Cloud Applications Jon Weissman and Siddharth Ramakrishnan Department of Computer Science and Engineering University of Minnesota, Twin Cities Abstract A rich cloud ecosystem

More information

Is Cloud Computing the Solution for Brazilian Researchers? *

Is Cloud Computing the Solution for Brazilian Researchers? * Is Cloud Computing the Solution for Brazilian Researchers? * Daniel de Oliveira Federal University of Rio de Janeiro UFRJ Rio de Janeiro, Brazil Eduardo Ogasawara Federal University of Rio de Janeiro Federal

More information

On the Performance-cost Tradeoff for Workflow Scheduling in Hybrid Clouds

On the Performance-cost Tradeoff for Workflow Scheduling in Hybrid Clouds On the Performance-cost Tradeoff for Workflow Scheduling in Hybrid Clouds Thiago A. L. Genez, Luiz F. Bittencourt, Edmundo R. M. Madeira Institute of Computing University of Campinas UNICAMP Av. Albert

More information

Saving Mobile Battery Over Cloud Using Image Processing

Saving Mobile Battery Over Cloud Using Image Processing Saving Mobile Battery Over Cloud Using Image Processing Khandekar Dipendra J. Student PDEA S College of Engineering,Manjari (BK) Pune Maharasthra Phadatare Dnyanesh J. Student PDEA S College of Engineering,Manjari

More information

How To Understand Cloud Computing

How To Understand Cloud Computing Overview of Cloud Computing (ENCS 691K Chapter 1) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ Overview of Cloud Computing Towards a definition

More information

Cloud-pilot.doc 12-12-2010 SA1 Marcus Hardt, Marcin Plociennik, Ahmad Hammad, Bartek Palak E U F O R I A

Cloud-pilot.doc 12-12-2010 SA1 Marcus Hardt, Marcin Plociennik, Ahmad Hammad, Bartek Palak E U F O R I A Identifier: Date: Activity: Authors: Status: Link: Cloud-pilot.doc 12-12-2010 SA1 Marcus Hardt, Marcin Plociennik, Ahmad Hammad, Bartek Palak E U F O R I A J O I N T A C T I O N ( S A 1, J R A 3 ) F I

More information

Cloud Computing. Adam Barker

Cloud Computing. Adam Barker Cloud Computing Adam Barker 1 Overview Introduction to Cloud computing Enabling technologies Different types of cloud: IaaS, PaaS and SaaS Cloud terminology Interacting with a cloud: management consoles

More information

SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION

SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION Kirandeep Kaur Khushdeep Kaur Research Scholar Assistant Professor, Department Of Cse, Bhai Maha Singh College Of Engineering, Bhai Maha Singh

More information

Research on Operation Management under the Environment of Cloud Computing Data Center

Research on Operation Management under the Environment of Cloud Computing Data Center , pp.185-192 http://dx.doi.org/10.14257/ijdta.2015.8.2.17 Research on Operation Management under the Environment of Cloud Computing Data Center Wei Bai and Wenli Geng Computer and information engineering

More information

Graduated Student: José O. Nogueras Colón Adviser: Yahya M. Masalmah, Ph.D.

Graduated Student: José O. Nogueras Colón Adviser: Yahya M. Masalmah, Ph.D. Graduated Student: José O. Nogueras Colón Adviser: Yahya M. Masalmah, Ph.D. Introduction Problem Statement Objectives Hyperspectral Imagery Background Grid Computing Desktop Grids DG Advantages Green Desktop

More information

PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE

PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE Sudha M 1, Harish G M 2, Nandan A 3, Usha J 4 1 Department of MCA, R V College of Engineering, Bangalore : 560059, India sudha.mooki@gmail.com 2 Department

More information

Permanent Link: http://espace.library.curtin.edu.au/r?func=dbin-jump-full&local_base=gen01-era02&object_id=154091

Permanent Link: http://espace.library.curtin.edu.au/r?func=dbin-jump-full&local_base=gen01-era02&object_id=154091 Citation: Alhamad, Mohammed and Dillon, Tharam S. and Wu, Chen and Chang, Elizabeth. 2010. Response time for cloud computing providers, in Kotsis, G. and Taniar, D. and Pardede, E. and Saleh, I. and Khalil,

More information

Figure 1. The cloud scales: Amazon EC2 growth [2].

Figure 1. The cloud scales: Amazon EC2 growth [2]. - Chung-Cheng Li and Kuochen Wang Department of Computer Science National Chiao Tung University Hsinchu, Taiwan 300 shinji10343@hotmail.com, kwang@cs.nctu.edu.tw Abstract One of the most important issues

More information

Data Centers and Cloud Computing. Data Centers. MGHPCC Data Center. Inside a Data Center

Data Centers and Cloud Computing. Data Centers. MGHPCC Data Center. Inside a Data Center Data Centers and Cloud Computing Intro. to Data centers Virtualization Basics Intro. to Cloud Computing Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises

More information

OW2 Meeting Towards Building a Cloud Platform for Service Oriented Software Development

OW2 Meeting Towards Building a Cloud Platform for Service Oriented Software Development Towards Building a Cloud Platform for Service Oriented Software Development Hailong Sun (Assistant Professor) Beihang University Sept. 21, 2010 Motivation Existing work The state of the art Conclusion

More information

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste Which infrastructure Which infrastructure for my computation? Stefano Cozzini Democrito and SISSA/eLAB - Trieste Agenda Introduction:! E-infrastructure and computing infrastructures! What is available

More information

CloudAnalyst: A CloudSim-based Visual Modeller for Analysing Cloud Computing Environments and Applications

CloudAnalyst: A CloudSim-based Visual Modeller for Analysing Cloud Computing Environments and Applications CloudAnalyst: A CloudSim-based Visual Modeller for Analysing Cloud Computing Environments and Applications Bhathiya Wickremasinghe 1, Rodrigo N. Calheiros 2, and Rajkumar Buyya 1 1 The Cloud Computing

More information

Masters Project Proposal

Masters Project Proposal Masters Project Proposal Virtual Machine Storage Performance Using SR-IOV by Michael J. Kopps Committee Members and Signatures Approved By Date Advisor: Dr. Jia Rao Committee Member: Dr. Xiabo Zhou Committee

More information

Distribution transparency. Degree of transparency. Openness of distributed systems

Distribution transparency. Degree of transparency. Openness of distributed systems Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science steen@cs.vu.nl Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed

More information

Real-Time Analysis of CDN in an Academic Institute: A Simulation Study

Real-Time Analysis of CDN in an Academic Institute: A Simulation Study Journal of Algorithms & Computational Technology Vol. 6 No. 3 483 Real-Time Analysis of CDN in an Academic Institute: A Simulation Study N. Ramachandran * and P. Sivaprakasam + *Indian Institute of Management

More information

How To Balance In Cloud Computing

How To Balance In Cloud Computing A Review on Load Balancing Algorithms in Cloud Hareesh M J Dept. of CSE, RSET, Kochi hareeshmjoseph@ gmail.com John P Martin Dept. of CSE, RSET, Kochi johnpm12@gmail.com Yedhu Sastri Dept. of IT, RSET,

More information

Testing Network Virtualization For Data Center and Cloud VERYX TECHNOLOGIES

Testing Network Virtualization For Data Center and Cloud VERYX TECHNOLOGIES Testing Network Virtualization For Data Center and Cloud VERYX TECHNOLOGIES Table of Contents Introduction... 1 Network Virtualization Overview... 1 Network Virtualization Key Requirements to be validated...

More information

International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 An Efficient Approach for Load Balancing in Cloud Environment Balasundaram Ananthakrishnan Abstract Cloud computing

More information

Ad hoc Cloud Computing

Ad hoc Cloud Computing Ad hoc Cloud Computing Gary A. McGilvary, Adam Barker, Malcolm Atkinson Edinburgh Data-Intensive Research Group, School of Informatics, The University of Edinburgh Email: gary.mcgilvary@ed.ac.uk, mpa@staffmail.ed.ac.uk

More information

INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD

INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD M.Rajeswari 1, M.Savuri Raja 2, M.Suganthy 3 1 Master of Technology, Department of Computer Science & Engineering, Dr. S.J.S Paul Memorial

More information

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms Volume 1, Issue 1 ISSN: 2320-5288 International Journal of Engineering Technology & Management Research Journal homepage: www.ijetmr.org Analysis and Research of Cloud Computing System to Comparison of

More information

Exploring Resource Provisioning Cost Models in Cloud Computing

Exploring Resource Provisioning Cost Models in Cloud Computing Exploring Resource Provisioning Cost Models in Cloud Computing P.Aradhya #1, K.Shivaranjani *2 #1 M.Tech, CSE, SR Engineering College, Warangal, Andhra Pradesh, India # Assistant Professor, Department

More information

Auto-Scaling Model for Cloud Computing System

Auto-Scaling Model for Cloud Computing System Auto-Scaling Model for Cloud Computing System Che-Lun Hung 1*, Yu-Chen Hu 2 and Kuan-Ching Li 3 1 Dept. of Computer Science & Communication Engineering, Providence University 2 Dept. of Computer Science

More information

- An Essential Building Block for Stable and Reliable Compute Clusters

- An Essential Building Block for Stable and Reliable Compute Clusters Ferdinand Geier ParTec Cluster Competence Center GmbH, V. 1.4, March 2005 Cluster Middleware - An Essential Building Block for Stable and Reliable Compute Clusters Contents: Compute Clusters a Real Alternative

More information

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or

More information

Minimal Cost Data Sets Storage in the Cloud

Minimal Cost Data Sets Storage in the Cloud Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.1091

More information