CLOUD COMPUTING AND SOFTWARE-AS-A-SERVICE: WEAKNESSES, STRENGTHS AND PROPOSED ARCHITECTURES
MARIA VASILEIADI
Master of Science in Networking and Data Communications
THESIS
Thesis Title
CLOUD COMPUTING AND SOFTWARE-AS-A-SERVICE: WEAKNESSES, STRENGTHS AND PROPOSED ARCHITECTURES

Dissertation submitted for the Degree of Master of Science in Networking and Data Communications

By MARIA VASILEIADI
SUPERVISOR DIONISIOS ADAMOPOULOS

KINGSTON UNIVERSITY, FACULTY OF COMPUTING, INFORMATION SYSTEMS & MATHEMATICS
TEI OF PIRAEUS, DEPARTMENTS OF ELECTRONICS AND AUTOMATION
JULY 2010
1. Introduction
2 Cloud Computing
2.1 Why Now?
2.2 Cloud Computing Providers
2.3 Cloud Computing Categories and Providers
2.4 A Few Words on Grid Computing
2.5 Cloud computing vs. Grid
2.6 Security Considerations
3 Eucalyptus, an Open Cloud Computing Architecture
3.1 Node Controller
3.2 Cluster Controller
3.3 Virtual Network Overlay
3.4 Storage Controller (Walrus)
3.5 Cloud Controller
4 Software as a Service (SaaS)
4.1 Web Services
4.2 SaaS Definition
4.3 Advantages of SaaS
5 Service-Oriented Architecture (SOA)
5.1 SOA Roles
5.2 SOA Operations
5.3 SOA Stack
6 Comparing SaaS and SOA
6.1 SaaS vs SOA
6.1 A Financial Perspective
Conclusions
References

Figure 1 NIST Cloud computing definition
Figure 2 (Marinos 2009) Cloud computing architectures
Figure 3 Eucalyptus Hierarchical Structure
Figure 4 Walrus Service
Figure 5 Cloud Controller
Figure 6 Web Services Architecture
Figure 7 SaaS adoption by Enterprise
Figure 8 SOA Roles and Operations
Figure 9 SOA Stack
Figure 10 Similarities and differences of SOA and SaaS
Abstract

This project sets out to examine in depth the current state of the art of Cloud computing and to provide clear definitions of what Cloud computing is, together with the concepts related to it. It explains how architectures such as Grids and clusters can be used to implement Cloud computing, and what Cloud computing contributes on top of them in order to deliver integrated solutions. It then discusses Eucalyptus, an open source Cloud computing implementation that is currently available. It goes on to identify the pros and cons of Cloud computing, SOA and SaaS, and discusses how Web Services can be used to facilitate SOA and SaaS. Finally, special emphasis is given to the differences and similarities between SaaS and SOA, seen from a couple of different perspectives.
1. Introduction

Although Cloud computing has begun to evolve and spread widely, and research is under way on whether it will constitute the next killer application, it still has no definition jointly accepted by the IT community. (Wang 2008) attempts a definition, although one based on the writer's own experience: "A computing Cloud is a set of network enabled services, providing scalable, QoS guaranteed, normally personalized, inexpensive computing platforms on demand, which could be accessed in a simple and pervasive way." (Wang 2008).

Prior to cloud computing, the Grid was a well-known architecture able to deliver distributed services. According to (Mc Evoy 2008), Grids are systems that satisfy the following criteria: they coordinate resources that are not subject to centralized control; they use standard, open, general-purpose protocols and interfaces; and they deliver non-trivial qualities of service. They often present additional features such as single sign-on, an information service for discovery and monitoring of resources, and smart job scheduling. It follows that cloud computing is service oriented, whereas Grids emphasise building distributed systems that are transparent to the end user. Hence one can say that the two technologies go hand in hand.

Closely related to cloud computing is the term Software as a Service (SaaS). By this we mean applications hosted as services and delivered to the end user across the Internet on demand. SaaS allows users to run applications that are not installed locally on their computers, and hence makes software maintenance less complicated and time consuming. Furthermore, since end users are charged for those services on a per-use basis, the cost of purchasing software and licences is reduced. The Application Service Provider (ASP) model (Wang 2008) can be seen as an early attempt to implement SaaS.
In that model, users subscribe to software hosted on third-party servers, which is then delivered over the Internet. Microsoft has also made a first attempt to adopt SaaS with Software + Services, driven by the idea of combining local software and Internet services that interact with each other to deliver the required results. Chrome, a web browser provided by Google, is a software implementation of SaaS: beyond the traditional web browsing experience, it goes a step further and introduces a new desktop through which applications can be reached whether they reside locally or remotely.

It is worth mentioning at this point that SaaS is sometimes confused with SOA, though in reality they are two completely different concepts. By SOA (Laplante 2008) we mean a set of components (interfaces, protocols) that allow various services to interact with each other.

Chapter 2 goes in depth into the theory of Cloud computing and sets out the reasons Cloud computing has become popular and the parties involved in the process. It presents the ancestor of Cloud computing (the Grid) in a few words, compares the two technologies, and closes with the security considerations that arise with the use of Cloud computing. Chapter 3 presents the main components of Eucalyptus, an open source Cloud computing architecture. Chapter 4 discusses SaaS in detail. Chapter 5 gives the definition of SOA and discusses its architecture, and finally Chapter 6 presents the similarities and differences of SOA and SaaS.
2 Cloud Computing

Going a step beyond the definition that (Wang 2008) gave, NIST (Mell 2009) also provides a definition of cloud computing. In general, Cloud computing is defined as a pool of shared computing resources that a user can provision and release without needing to communicate those needs to the service provider. From that point of view, a Cloud is composed of five essential characteristics, three service models and four deployment models.

Figure 1 NIST Cloud computing definition

The five essential characteristics are the following:

On-demand self-service. A user can plan and provision the computing resources, such as storage, network capabilities and server time, that are needed to perform a task successfully. Those resources are assigned automatically, and no human interaction with the service provider is required.

Broad network access. All the computing resources are available over the network and can be accessed through standardised mechanisms, which are capable of serving
requests from heterogeneous thin or thick client platforms, such as PDAs and laptops.

Resource pooling. Each Cloud Provider (CP) has a pool of available resources that is used to serve multiple users through a multi-tenancy scheme. Those physical and/or virtual resources are dynamically allocated and released according to user demand. There is a sense of location independence, in that the user in general neither controls nor knows the location of the provided resources at any given time. The user should nevertheless be able to specify location at a higher level of abstraction, for example by country, province or datacenter. The available resources range from storage, processing, memory and network bandwidth to virtual machines.

Rapid elasticity. It is essential for the Cloud computing mechanism to be able to scale its resources in and out automatically and rapidly. In that way Cloud users get the impression that the Cloud has unlimited capabilities, and hence that they can purchase any amount of computing resources at any time.

Measured service. A cloud automatically controls and optimises the amount of resources currently used by services, using a mechanism that estimates the needs of a service according to its type (storage, processing, network bandwidth, number of active user accounts). In that way Cloud resources can be monitored, controlled and reported, providing transparency for both the CP and the user of each service.

There are three service models in cloud computing:

Software as a service (SaaS): applications provided to users by third-party providers over the Internet. They are available to the end user on demand through thin clients. The most popular implementations at this time are Salesforce CRM and Google Docs. Note that the user has no control over the
Cloud infrastructure; the user is only offered the option to adjust a few application-specific settings. More on SaaS can be found in chapter 4.

Platform as a service (PaaS): developers are enabled to build applications through APIs. The available platforms provide the user with development tools, configuration management and deployment platforms. Implementations of this category of Cloud computing include Microsoft Azure, Force.com and Google App Engine. Users have control over the deployed applications as well as the environment that hosts them.

Infrastructure as a service (IaaS): provides virtual machines and other abstracted hardware and operating systems, which may be controlled through a service API. Well-known implementations include Amazon EC2 and S3 and Rackspace Cloud. In this service model the user can have control over the OS, network and storage.

Figure 2 (Marinos 2009) Cloud computing architectures

Figure 2 above gives an overview of the Cloud computing architecture and of where vendors, developers and end users reside in it.

Finally, the four deployment models are described below:

Private cloud. An organization makes use of a Cloud computing infrastructure that is operated exclusively for it. This infrastructure can be managed either by the
organization itself or by a third party. The Cloud computing resources can be located on premise or off premise.

Community cloud. The Cloud computing infrastructure is shared among numerous parties. It is structured to serve communities with similar concerns, such as security requirements and policies. This model can be managed either by the organizations that share it or, again, by a third party. As with private clouds, the Cloud resources may exist on or off premise.

Public cloud. The resources of the Cloud infrastructure are available to the general public or to large industry groups, and are administered by organizations that act as Cloud providers and sell cloud services.

Hybrid cloud. In this deployment model the cloud infrastructure is a composition of two or more of the private, community or public models. The component clouds are still considered separate entities, but they are tied together using standard technologies or proprietary protocols that allow data and applications to be transferred among them.

From a hardware perspective, the following features are new in Cloud computing:

The impression that unlimited computing resources exist, which reduces the need for users to plan system provisioning in advance.

The elimination of an up-front commitment by Cloud users, which allows companies to start small and increase hardware resources only when their needs increase.

Pay-as-you-go charging for all computing resources (e.g., software licences, processors, storage and so on).
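The effect of pay-as-you-go charging combined with elasticity can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the demand profile and the price per server-hour are hypothetical, and it assumes, purely for simplicity, that a server provisioned up front costs the same per hour as a rented one. It compares paying only for the servers used in each hour with provisioning enough servers for the peak hour and keeping them for the whole day.

```python
def pay_as_you_go_cost(hourly_demand, price_cents):
    """Cost when servers are rented hour by hour to match demand."""
    return sum(hourly_demand) * price_cents


def peak_provisioned_cost(hourly_demand, price_cents):
    """Cost when capacity for the busiest hour is kept for every hour."""
    return max(hourly_demand) * len(hourly_demand) * price_cents


# Hypothetical demand: 10 servers for 20 hours, 100 servers for a 4-hour peak.
demand = [10] * 20 + [100] * 4
PRICE_CENTS = 10  # hypothetical price per server-hour, in cents

elastic_cents = pay_as_you_go_cost(demand, PRICE_CENTS)
fixed_cents = peak_provisioned_cost(demand, PRICE_CENTS)

print(f"pay-as-you-go:    {elastic_cents / 100:.2f} dollars per day")
print(f"peak provisioned: {fixed_cents / 100:.2f} dollars per day")
```

With this demand profile the peak-provisioned option pays for 2400 server-hours while only 600 are actually used, which is the kind of over-provisioning that starting small and scaling on demand avoids.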
2.1 Why Now?

Although the creation and operation of large-scale datacenters has contributed a great deal to the development of Cloud computing, the reasons that really helped Cloud computing to dominate the market are today's technological evolution and contemporary business models. It is believed that new applications will arise with Cloud computing, while some of the applications in use today will also take advantage of it, contributing in that way to its further spread. Such applications are:

Mobile interactive applications. It is believed that the future of computing belongs to applications that respond in real time. Such services are better served by Cloud computing infrastructures, not only because they have to be available for as long as possible but also because they use huge amounts of data, which are easy to store in large datacenters.

Parallel data processing. Cloud computing provides a unique opportunity for data processing applications that use terabytes of data and require several hours to complete. If an application can be run in parallel, users can take advantage of the provider's charging scheme, since using hundreds of computers for a short period costs the same as using just a few computers for a long period. Hadoop and Google MapReduce allow users to implement such applications without knowing the details of how tasks are distributed to the hundreds of processors of the Cloud.

Commercial applications. The latest versions of commercial applications such as Matlab and Mathematica allow Cloud computing to be used for calculations that would otherwise carry a large computational cost. In this case, though, the cost of transferring the data and the cost of using the Cloud nodes should be weighed against the time saved.
Applications that require short response times. Some applications that could take advantage of the parallelism that Cloud computing offers are not able to do so because of the data transfer cost and the high response time. Such applications are not expected to make use of Cloud computing, at least not until the response time and the cost of transferring data decrease.

2.2 Cloud Computing Providers

Cloud computing customers seem easy to find, but who can become a Cloud computing provider, and why? Such facilities are costly to create and maintain, which is why only companies such as Google and Microsoft have their own Cloud computing facilities. At the same time, providers have to develop the software that manages the hardware and helps to protect their investment from attacks. This is the main reason why a company that wants to become a Cloud computing provider needs to invest not only in datacenters but also in the right software, and needs the know-how to maintain those facilities. Under those conditions, the reasons that can lead a company to become a Cloud computing provider are:

Profit. Though $0.10 per hour seems a small price, it is estimated that large datacenters can buy hardware, internet connections and power at one fifth of the price offered to medium-size datacenters. Moreover, the cost of developing, installing and maintaining the necessary software pays off because of the large number of computers.

Current investment amplification. Adding Cloud computing services on top of the existing infrastructure can become a new source of income at low cost, which helps to recover earlier investments in datacenters. For example, many of the technologies that Amazon Web Services uses were initially created to cover internal functions.
Going against the flow. At times companies want to offer their clients alternative solutions. For example, Google with Google Apps provides its clients with dynamic scalability and load balancing.

Customer loyalty. IT firms that want to keep customer loyalty high offer Cloud computing services on top of the services they already provide.

2.3 Cloud Computing Categories and Providers

Each application requires a specific model for computation and for storing its data, and distributed applications also need a communication model to be defined. In Cloud computing the use of Virtual Machines (VMs) is necessary, since the computational resource management that takes place in the cloud must be transparent to the end user.

One way to distinguish among today's utility computing offerings is by the amount of control they give the user over the computational resources. Ranked by that criterion, EC2 would lead the list. EC2 is very close to the hardware and therefore allows users to control even the kernel of the operating system (OS) that they use. There is no limit on the number of applications a user can run, but on the other hand auto-scaling and rollback capabilities are not offered.

Google AppEngine (Google 2010), on the contrary, does not target general-purpose applications but focuses on web applications. In this case auto-scaling is offered by Google.

Microsoft Azure (Microsoft 2010) stands somewhere in the middle. Like EC2 (Amazon 2010), it can serve general-purpose applications and allows users to choose the programming
language they will use, but not the OS. Libraries can be auto-scaled up to a point, and network adjustment is also possible.

Table 1 below summarises the main characteristics of the three utility computing platforms discussed above.

Amazon Web Services
  Virtual Machine: x86 architecture on top of Xen VM; scalability of computational resources
  Storage Model: varies from block store (EBS) to key/blob store
  Network Model: supports security groups; can isolate network outages; flexible with IP addresses

Google AppEngine
  Virtual Machine: predefined framework and application structure; scaling of computational resources and storage resources
  Storage Model: Megastore/BigTable
  Network Model: fixed network topology capable of supporting web applications; scaling is dynamic and transparent to the developer

Microsoft Azure
  Virtual Machine: Microsoft Common Language Runtime (CLR) VM; load balancing
  Storage Model: SQL data services; Azure Storage Service
  Network Model: dynamic, depending on the description of the application components

Table 1 Summary of the three most popular utility computing platforms
2.4 A Few Words on Grid Computing

Grids (Wang 2008) started off in the mid-90s. The Grid has its origins in high performance distributed systems, whose purpose was to share distributed resources in order to execute tasks remotely and so make it possible to solve large-scale computational problems. The following features characterise the Grid infrastructure:

It is decentralized, spans geographically distributed sites, and lacks central control.

It is composed of heterogeneous resources (hardware/software configurations, access interfaces and management policies).

The development of new middleware was necessary as Grid computing evolved. Some known examples are Unicore, Globus Toolkit, WSRF and gLite. These provide all the needed functionality, such as resource management, security control, and monitoring and discovery.

The main idea behind Grid computing is to provide its users with dependable, consistent, pervasive and inexpensive access to high-end computational capabilities. However, developers who lack experience of implementing applications for Grid computing can find it difficult to adjust to its logic. Moreover, with Grid computing it is not easy to guarantee performance at all times.

2.5 Cloud computing vs. Grid

In contrast with the characteristics of Grid computing described above, Cloud computing by definition provides user-centric functionality and services so that users can build customized computing environments. Cloud computing operates like a central compute server with a single access point. Cloud infrastructures can span several
computing centres, as with Google and Amazon, and in general contain homogeneous resources operated under central control (Wang 2008).

The middleware or operating system that Cloud computing uses is not yet fully developed, and accepted standards have not yet emerged. Moreover, several topics remain open for researchers, including distributed virtual machine management, Cloud service orchestration and distributed storage management.

In addition to the above, Cloud computing can provide QoS to users, giving them easy, pervasive and uninterrupted access to their resources at any time with guaranteed quality.

Table 2 Grid vs. Cloud characteristics (Vaquero 2009)
To conclude, Grid computing laid strong foundations with the development of middleware and infrastructure that allow a pleasant application experience, but Cloud computing overtook it thanks to its ability to provide QoS and a customisable computing environment upon user request. It follows from the above that building Cloud computing on top of the stability that Grid computing offers through its infrastructure and middleware is a wise option that will provide users with satisfying services.

2.6 Security Considerations

Among others, (ENISA 2009) produced a report that identifies and lists the most important categories of Cloud computing security considerations.

Loss of governance: by using a Cloud infrastructure, the user allows the Cloud Provider (CP) to hold control over numerous resources (including sensitive information). If that is combined with an SLA that gives no commitment on information confidentiality, major security threats arise.

Lock-in: at this stage of cloud computing, portability is not assured. Hence there is no guarantee that users can, when they wish, move data and applications from one CP to another or back to an in-house environment. For that reason the user depends on the specific CP.

Isolation failure: as resources (e.g. storage, memory) in Cloud computing are shared among many users, there is always the risk that the resource allocation mechanism fails to separate those resources. This is called a guest-hopping attack. Though for now attacks of this kind are not widespread and are more difficult to carry out than attacks on traditional OSs, we cannot rely on that remaining true in the future.
Compliance risks: migrating to Cloud computing may put at risk investments made in order to comply with industry standards and regulations when:

the CP cannot provide evidence of its own compliance with the relevant requirements;

the CP does not permit audit by the cloud customer (CC).

Finally, note that not all public Cloud computing infrastructures are able to support a given level of compliance (e.g., PCI DSS).

Management interface compromise: since the customer management interfaces are accessible remotely and give access to large portions of users' resources, the risk of exposing those resources through the interfaces is increased. Web browser vulnerabilities add to this security risk.

Data protection: data protection is as important to users as it is to CPs. From the perspective of users, who hold the role of data controller, it is not always easy to check the data handling practices of the CP in order to be sure that the data is handled in a lawful manner. Without this assurance, CPs might face complaints or even legal action. For that reason many CPs are open about how they handle data and also publish certifications such as SAS70.

Incomplete data deletion: when data is deleted from a storage device, its complete deletion is not guaranteed. Furthermore, in Cloud computing the deletion of data might not be carried out in a timely manner because of resource sharing. Another issue is that multiple copies of the data might exist for backup reasons. This poses a higher risk to the customer than dedicated hardware does.

Malicious insider: damage may also be caused by a malicious insider. For that reason defining roles is a necessity when implementing Cloud
computing architectures. Such roles may include CP system administrators and managed security service providers.
3 Eucalyptus, an Open Cloud Computing Architecture

Eucalyptus (Eucalyptus 2009) is an open source application whose main purpose is to offer cloud computing services over computer clusters. The Eucalyptus interface is compatible with that of Amazon's EC2 and S3. It started as a project at the University of California, Santa Barbara, and is today supported by Eucalyptus Systems, a company founded by the creators of Eucalyptus.

The architecture of Eucalyptus is simple and flexible and follows a hierarchical design. Eucalyptus allows users to control a VM using an emulation of the EC2 interface. So far it supports VMs that run on the Xen and KVM hypervisors, and support for VMware and others is expected soon. Each of the four Eucalyptus subsystems is designed as a distinct web service. Figure 3 provides a graphical representation of the four subsystems.

Figure 3 Eucalyptus Hierarchical Structure
3.1 Node Controller

The Node Controller (NC) runs on each node that is destined to run VM instances. The NC controls the software on its node according to the commands it receives from the Cluster Controller (CC). The NC collects information about the resources of the node on which it resides and also controls the VM instances on that node. Finally, it sends the collected information to the CC when requested.

3.2 Cluster Controller

The Cluster Controller (CC) runs on the front-end computers of a cluster, or on any computer that has a network connection both to the nodes that run the NC and to the computers that run the Cloud Controller (CLC). Many of the functions that the CC performs are similar to those of the NC. The responsibilities of the CC are, first, to route incoming requests to the appropriate NC in order to start an instance and to control the virtual network layer, and secondly, to collect and report information about a group of NCs. When the CC receives a set of instances to run, it communicates with each of the NCs to find out the availability of their resources. The CC then sends the command to an NC that has the resources available to host the instance. In addition, knowing the resource availability of the NCs, the CC estimates the number of simultaneous instances that the NCs can run and reports this number to the Cloud Controller (CLC).
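The interaction between the CC and the NCs described above can be sketched in a few lines. This is not the Eucalyptus implementation or its API: the class and method names are hypothetical, resources are reduced to cores and memory, and a first-fit placement policy is assumed because the text only says that the instance goes to an NC with the appropriate resources available.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VmType:
    """Resources requested by one instance (hypothetical model)."""
    cores: int
    memory_mb: int


@dataclass
class NodeController:
    """Tracks the free resources and running instances of one node."""
    name: str
    free_cores: int
    free_memory_mb: int
    instances: list = field(default_factory=list)

    def describe_resources(self):
        return self.free_cores, self.free_memory_mb

    def start_instance(self, instance_id, vm_type):
        self.free_cores -= vm_type.cores
        self.free_memory_mb -= vm_type.memory_mb
        self.instances.append(instance_id)


class ClusterController:
    """Routes instance requests to NCs and reports aggregate capacity."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def schedule(self, instance_id, vm_type):
        # First fit (an assumption): ask each NC in turn for its resources
        # and use the first one that can host the instance.
        for node in self.nodes:
            cores, memory = node.describe_resources()
            if cores >= vm_type.cores and memory >= vm_type.memory_mb:
                node.start_instance(instance_id, vm_type)
                return node.name
        return None  # no NC can host the instance

    def capacity(self, vm_type):
        # Number of further instances of this type the cluster could run,
        # which is the figure the CC would report to the CLC.
        total = 0
        for node in self.nodes:
            cores, memory = node.describe_resources()
            total += min(cores // vm_type.cores, memory // vm_type.memory_mb)
        return total


small = VmType(cores=1, memory_mb=512)
cc = ClusterController([
    NodeController("nc-1", free_cores=2, free_memory_mb=1024),
    NodeController("nc-2", free_cores=4, free_memory_mb=4096),
])
placements = [cc.schedule(f"i-{n}", small) for n in range(3)]
print(placements, cc.capacity(small))
```

A real scheduler would also release resources when instances terminate and might prefer other policies (for example spreading load across nodes); those aspects are left out to keep the sketch short.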
3.3 Virtual Network Overlay

All VMs within the Eucalyptus architecture are connected to each other, while at least one VM also has an internet connection so that the owner can connect to it and interact with it. In Eucalyptus the CC decides when a network interface instance starts and stops in four different ways, which are defined by the system administrator.

The first case is when the interfaces of the VM are bridged to the real network of the node, allowing the administrator to control the DHCP traffic of the virtual network in the same way as if this traffic did not originate from Eucalyptus.

The second choice allows the administrator to define pairs of IP addresses and MAC addresses. In this case each new instance takes such a pair and releases it when the instance terminates.

These two options maximize network performance when all the VMs run within the same cluster, but there is no isolation among the networks of the VMs.

The third choice manages and controls the VM network to its fullest extent, by isolating the traffic among the VMs and also introducing firewalls among the VM groups. At the same time it allows dynamic allocation of IP addresses to each VM when it starts operating. Each VM group network is a different VLAN, so each VM group that belongs to a subnet is isolated from the rest. In this case the CC also operates as a packet-filtering router among the different VM subnets.

3.4 Storage Controller (Walrus)

Walrus is a data storage service used by Eucalyptus. Walrus makes use of web service technologies such as Axis and Mule, and that makes it compatible with Amazon's S3. Walrus uses HTTP to implement the REST and SOAP interfaces. Walrus provides two functions:
It transfers data to and from the Cloud and the instances that have started on the nodes.

It operates as a storage service for the VM instances.

Similarly to S3, Walrus supports both parallel and serial data transfer. In order to favour the parallelisation of tasks, Walrus does not guarantee the integrity of the transferred data; hence the user is responsible for verifying it.

Figure 4 Walrus Service

In addition to the above, Walrus also operates as a service for storing and administering the VM images. The file system of each VM, as well as the kernel of the OS and the memory, are encapsulated in packages and transferred using the EC2 tools that Amazon provides. These tools compress the image, encrypt it and fragment it into pieces that are described by the image descriptor, or manifest. Walrus has to verify the integrity of the images. When an NC asks for an image before instantiating it, it sends Walrus a transfer request, which is verified using the appropriate certificates. The image integrity is then verified, and the image is decrypted and finally sent to the node. To improve performance, Walrus keeps the decrypted images in a cache for a period of time or until the manifest of the image is rewritten.
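The bundling and verification steps described above (compress the image, split it into parts listed in a manifest, and check integrity before use) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the dictionary used as a manifest are hypothetical and do not reproduce the format of the real EC2 tools, SHA-256 is chosen here simply as a convenient digest, and the encryption and certificate checks are left out because they would need a cryptography library.

```python
import hashlib
import zlib


def bundle_image(image: bytes, part_size: int):
    """Compress an image, split it into parts and describe them in a manifest."""
    compressed = zlib.compress(image)
    parts = [compressed[i:i + part_size]
             for i in range(0, len(compressed), part_size)]
    manifest = {
        # Digest of the original image, checked after reassembly.
        "image_sha256": hashlib.sha256(image).hexdigest(),
        # One digest per part, checked before reassembly.
        "part_sha256": [hashlib.sha256(p).hexdigest() for p in parts],
    }
    return parts, manifest


def assemble_image(parts, manifest) -> bytes:
    """Verify every part against the manifest, then rebuild the image."""
    if len(parts) != len(manifest["part_sha256"]):
        raise ValueError("manifest lists a different number of parts")
    for index, (part, expected) in enumerate(zip(parts, manifest["part_sha256"])):
        if hashlib.sha256(part).hexdigest() != expected:
            raise ValueError(f"part {index} failed its integrity check")
    image = zlib.decompress(b"".join(parts))
    if hashlib.sha256(image).hexdigest() != manifest["image_sha256"]:
        raise ValueError("assembled image failed its integrity check")
    return image


# Stand-in for a VM image; a real image would be read from disk.
original = bytes(range(256)) * 64
parts, manifest = bundle_image(original, part_size=64)
restored = assemble_image(parts, manifest)
print(len(parts), "parts, restored correctly:", restored == original)
```

The same per-part digest check is one way a user could verify ordinary data transfers, for which, as noted above, the user rather than Walrus is responsible.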
3.5 Cloud Controller

The Eucalyptus components discussed earlier are administered by the Cloud Controller (CLC). The CLC is a collection of web services, which can be categorised as follows:

Resource Services. These services are responsible for allocating the resources of the system to various functions and allow users to control the settings of the VMs. They also keep track of the components and the virtual resources of the system.

Data Services. These control the data of the system and of the users, and furthermore provide a configurable environment in which the user can submit requests for resource allocation.

Interface Services. These handle protocol conversion and the authentication of users through the appropriate interfaces, and provide the tools for system administration.

Figure 5 Cloud Controller

The resource services process the user requests that require control over the VMs and communicate with the CC in order to allocate and release the physical resources. A record of the System Resources State (SRS) is kept in order to contact the CC and verify whether or not a request can be fulfilled based on a Service Level Agreement
(SLA). The SRS control has two steps. When user requests arrive, the SRS information is used to decide whether the request will be accepted or not. The creation of the VM then involves reserving the resources in the SRS and forwarding the request for the creation of the VM, followed by confirmation of the state of the resources in the SRS if the request succeeds, or cancellation of the reservation if an error occurs. Later on, the SRS monitors the state of the resources and is responsible for implementing changes on the running VMs. The SRS information changes because of events that happen in the system and is evaluated against the current SLA. Eucalyptus allows users to choose the cluster on which to start their VMs by defining the availability zone of the Cloud they want to use, but it is also possible to leave this choice to Eucalyptus.

The data services control the creation, modification and storage of system and user information. Users connect to those services to obtain information about the available resources and to configure parameters related to the VMs and the network. The resource services interact with the data services to resolve the parameters that users enter. The data services are dynamic in nature. For example, a user is able to change the firewall settings whether or not a VM is in use. As a result, the services that control the network and the integrity of the data must change the state of a VM group after a user request.

Finally, the interface services offer a web interface to the users and the administrators of the cloud. Using a web browser, users can apply for access to the cloud, download the encryption certificates and query the system about resource availability. The administrator can additionally administer the user accounts and supervise the availability and functionality of the system components.
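The two-step SRS control described above (an admission decision, followed by reservation, VM creation, and then confirmation or cancellation) can be sketched as follows. This is a minimal illustration, not Eucalyptus code: the class, the function names and the returned status strings are hypothetical, resources are reduced to a single count of cores, and the SLA check is simplified to "enough free cores".

```python
class SystemResourceState:
    """Hypothetical SRS: tracks reserved and committed cores."""

    def __init__(self, total_cores):
        self.total_cores = total_cores
        self.reserved = 0    # held for VMs that are being created
        self.committed = 0   # used by running VMs

    def available(self):
        return self.total_cores - self.reserved - self.committed

    def can_accept(self, cores):
        return cores <= self.available()

    def reserve(self, cores):
        if not self.can_accept(cores):
            raise RuntimeError("insufficient resources")
        self.reserved += cores

    def commit(self, cores):
        # The VM was created: the reservation becomes real usage.
        self.reserved -= cores
        self.committed += cores

    def cancel(self, cores):
        # The VM could not be created: give the reservation back.
        self.reserved -= cores


def handle_request(srs, cores, create_vm):
    # Step 1: admission control based on the current SRS information.
    if not srs.can_accept(cores):
        return "rejected"
    # Step 2: reserve, forward the creation request, then commit or cancel.
    srs.reserve(cores)
    try:
        create_vm(cores)
    except Exception:
        srs.cancel(cores)
        return "failed"
    srs.commit(cores)
    return "running"


def create_ok(cores):
    """Stand-in for a successful VM creation on the cluster."""


def create_broken(cores):
    """Stand-in for a creation request that fails."""
    raise RuntimeError("node unreachable")


srs = SystemResourceState(total_cores=4)
results = [
    handle_request(srs, 2, create_ok),
    handle_request(srs, 2, create_broken),
    handle_request(srs, 3, create_ok),
    handle_request(srs, 2, create_ok),
]
print(results, "free cores:", srs.available())
```

Keeping reserved and committed resources separate means that a failed creation leaves the SRS exactly as it was before the request, so later requests are judged against accurate information.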
4 Software as a Service (SaaS)

A brief introduction to the main idea behind SaaS was given in chapter 2. Before going into detail on both SaaS and SOA, it is useful to describe briefly what is meant by the term Web Services.

4.1 Web Services

The features that SOA and SaaS offer in order to add functionality to applications are still conceptual models. The most popular enabler of the functionality that those models promise is currently Web services technology (Ferris 2003). Web services are programmable web applications with standard interface descriptions that provide universal accessibility through standard communication protocols. They provide an integrated set of XML-based, ad hoc, industry-standard languages and protocols that support service description (using the Web Services Description Language, WSDL), publication and discovery, and transport.

Figure 6 Web Services Architecture
Put simply, Web services provide a set of standards that allow us to implement SaaS and SOA.

4.2 SaaS Definition

Cloud Computing refers both to the applications that are delivered as services and to the hardware that delivers them. The services themselves are referred to as Software as a Service (SaaS) (Armbrus 2009). In addition, the SaaS delivery model (Turner 2003) separates ownership of the software from its use. The owner of the software is the CP, who hosts it and allows on-demand use via the Internet (e.g. Public Clouds) or via an intranet (e.g. Private Clouds). This allows the software to be delivered on a per-use basis. Currently the most popular SaaS product is the Salesforce.com tool for customer relationship management.

Note that not all applications can benefit from SaaS. Figure 7 shows which applications can take advantage of a SaaS architecture, which are expected to move towards SaaS within the next three years, and which cannot benefit from it at all.

SaaS characteristics include (Traudt 2005):

Network-based access to, and management of, commercially available software.

Activities managed from central locations rather than at each customer's site, enabling customers to access applications remotely via the Web.

Application delivery typically closer to a one-to-many model (single instance, multi-tenant architecture) than to a one-to-one model, including architecture, pricing, partnering, and management characteristics.

Centralized feature updating, which obviates the need for end-users to download patches and upgrades.
Frequent integration into a larger network of communicating software, either as part of a mashup or as a plug-in to a platform as a service.

Figure 7 SaaS adoption by Enterprise

4.3 Advantages of SaaS

The SaaS platform is used to eliminate problems that arise from the constant need for software licences. SaaS tries to take current industry standards a step further by offering significant advantages to both users and software developers. Moreover, SaaS encourages growth and innovation and creates new opportunities for selling and using software. The primary benefits of SaaS are listed below, from both the user and the provider perspective.
User Benefits

Lower Cost of Ownership

Because software is paid for only for the time it is used, users do not have to pay large amounts of money for software licences in advance. There is also no need for hardware infrastructure, so the user spends far less on hardware, maintenance, and administration. That gives end users the chance to access resources they would otherwise never be able to use. In addition, because of the way pricing is implemented in SaaS, the user knows the cost of the software at any time.

Focus on Core Competency

The SaaS platform frees users from numerous administrative tasks (installation and maintenance of software). The user is no longer obliged to dedicate IT resources to ensuring that applications function properly. Beyond the cost savings, the SaaS approach allows users to remain focused on their goals and to use those resources in more strategic areas.

Access Anywhere

A core user advantage of SaaS is the availability of resources from anywhere. With SaaS, users can access the applications they want at any time, given only a PC and an Internet connection. They need not be in their office, or even connected to the corporate network through a VPN or any other method. This improves the user experience and makes it easier for users to get their work done. Additionally, users can take immediate advantage of the features and functionality of an application simply by launching a browser.

Freedom of Choice

SaaS is multitenant and pay-as-you-go by nature, which makes it more flexible in the technological choices it allows. Users can at any time select from a variety of applications, and can stop using those they are no longer interested in.
Enterprises can thereby avoid paying for applications that are used by just a few users or, in some cases, by a single user. All of the above results in the development of better software applications. Finally, in order to retain their customers, vendors must be able to adapt easily to customer needs and wants.

New Application Types

SaaS lowers the barriers to first-time use of software. That makes it easier to develop applications with an incidental use model. Such a model allows the development of collaborative environments in which some users rarely access a given application, yet that application is important to the overall experience.

Faster Product Cycles

With SaaS, development and testing cycles are far more frequent than those of typical releases, although each release introduces fewer new features. The advantage of such an approach is that bug fixes are released faster and users have time to adjust to the new features, which increases productivity on the user's side. This was not the case under the previous model. Another advantage is that users do not have to continuously upgrade their software to the latest version. The process is transparent to users, who benefit from the new features and fixes of an application each time they access the software.

Provider Benefits

Increased Total Available Market

This can be considered the number one reason why a software vendor would want to embrace the SaaS model. Products delivered in this way attract a far larger audience than those delivered in the classical way. This is because users experience a lower cost of ownership for both hardware and software, and because users who lack the IT skills to support the necessary infrastructure are attracted by the integrated solutions offered to them.
Another benefit derives from the fact that the decision to purchase a SaaS application is usually taken at department level rather than at enterprise level. That contributes to shorter sales cycles.
Enhanced Competitive Differentiation

The ability to deliver software to customers via the SaaS platform strengthens a provider's differentiation from its competitors. It also creates opportunities for companies that are new to the market and not yet established to compete effectively with larger providers. The lower cost of using an application under SaaS, compared with the traditional license model, is an attractive selling point. Looking to the future, software companies that persist with the traditional distribution model will come under increasing pressure from their competitors to move towards the SaaS platform. Providers that remain behind will eventually find it difficult to catch up, since the software industry will continue to evolve rapidly.

Lower Development Costs & Quicker Time-to-Market

The SaaS platform reduces the development cost of applications. This is mainly a result of the shorter testing time that an application needs compared with traditional applications, which in turn stems from a combination of factors: agile software development, small and frequent releases, and the fact that applications are developed for deployment on a specific hardware infrastructure. As a result, testing is performed on smaller releases, and those releases are tested on the actual hardware on which they will run when they become available to customers. In traditional schemes this was not possible, since each client installed the application on a different hardware platform. The complexity, duration and cost of the testing phase are therefore reduced. Hence the overall software development cost is lower and the time-to-market shorter.
Predictable MRR Revenue

Software providers that rely on traditional methods of software delivery produce a major release every 12 to 18 months in order to initiate a new revenue stream from its sale or from upgrades. That puts the provider under a lot of pressure to hit a milestone (from both a financial and a development point of view) set for a specific date. In the SaaS architecture this financial target typically takes the form of Monthly Recurring Revenue (MRR). This short-term target makes the expected revenue easier to predict and less tied to the development schedule of the next release of the software. Measuring MRR also reduces the pressure on the software provider to discount at the end of the quarter.

Improved Customer Relationships

SaaS can improve the relationship between software providers and users. Under the traditional software delivery model, providers bore less responsibility for the installation, performance, maintenance and correct operation of their software at the client's site. Once the software was sold, it was exclusively the user's responsibility to make it work. The SaaS platform strengthens the relationship between providers and users and gives providers the opportunity to have more satisfied customers.
5 Service-Oriented Architecture (SOA)

In a SOA model (Laplante 2008), the constituent components of the software system are reusable services. Those services interact with one another via well-defined interfaces and communication protocols. This architectural strategy goes hand in glove with software applications that are close to business objects, and helps to create an abstraction layer. Vendors that invest in SOA and offer products that enable it include Oracle (Web Services Manager), HP (Systinet Registry 6.0), Microsoft (.NET), Sun (Java Composite Application Platform Suite), and IBM (WebSphere).

The technologies listed above span technology architecture, process architecture, and application architecture. What SOA promises is to combine all those elements in order to achieve the desired result. This task can be daunting when so many heterogeneous applications are involved.

As in other component-based architectures, the focus is on designing the interface of the service; the major difference is that SOA services are designed to be used over a network. Hence the developer does not design a function with a single purpose, but a service with well-defined interfaces that can be used for several business purposes.

5.1 SOA Roles

SOA includes three different roles: the service requestor, the service provider and the service registry.

The service provider is responsible for describing a service, deploying it in an environment that allows its execution, and making it accessible to whoever requests it. The service provider must also handle the service requests that come from users. The role of service provider can be held by a company that hosts a web service. In a client-server architecture, the service provider sits on the server side.

The service requestor is whoever tries to find the description of a service published in some registry in order to use it. Again using the client-server architecture as an example, the service requestor occupies the client side.

The service registry is responsible for publishing the service descriptions supplied by service providers, and allows service requestors to search among the entries for the description that meets their needs.

Figure 8 SOA Roles and Operations

None of the above roles applies exclusively to one entity. It is possible, and quite common, for a service to be simultaneously a provider and a requestor, when it needs the output of other services in order to carry out its own operation.
5.2 SOA Operations

Earlier we discussed the three basic roles of SOA. We now turn to the way in which those roles interact with each other through a set of operations.

The first operation is publishing a service. It is carried out through the service registry, by declaring and advertising a service, and resembles an agreement between the service provider and the registry. When the service provider publishes the description of its service, the description becomes immediately available to requestors. The details of the publish interface depend entirely on the implementation of the registry. In simple cases, the role of the registry can be played by the same server on which the service runs, which makes each service description available in one of its folders. More sophisticated publishing methods are also available, such as a UDDI registry.

The second operation is finding a service. It is the counterpart of the publish operation: an arrangement between the service requestor and the registry. The requestor specifies a set of search criteria, such as the category of the service, its quality and so on; the registry then processes this information and returns all the service descriptions that match the criteria. The complexity of this operation depends on the implementation chosen for the service registry; it can vary from a simple HTTP GET request to a UDDI system.

Finally, the third operation is bind. This operation embodies the client-server relationship between the provider and the requestor mentioned earlier. The relationship can be highly dynamic, as when a client proxy used to invoke a web service is generated from the service description, or it can follow a static model.
In the latter case, the developer defines how the web service will be invoked.
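The publish, find and bind operations can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not a UDDI or WSDL implementation: ServiceDescription, ServiceRegistry and bind are hypothetical names, the endpoint is a plain Python callable where a real deployment would use a network address, and the temperature conversion service exists only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical stand-in for a WSDL-style service description."""
    name: str
    category: str
    endpoint: object  # a callable here; a network address in practice


class ServiceRegistry:
    """Hypothetical in-memory registry holding published descriptions."""

    def __init__(self):
        self._entries = []

    def publish(self, description):
        # Publish: the provider hands its description to the registry.
        self._entries.append(description)

    def find(self, **criteria):
        # Find: return every description that matches all search criteria.
        return [d for d in self._entries
                if all(getattr(d, key) == value
                       for key, value in criteria.items())]


def bind(description):
    # Bind (dynamic model): build a client proxy from the description alone.
    def proxy(*args):
        return description.endpoint(*args)
    return proxy


def to_celsius(fahrenheit):
    # The provider's actual service implementation.
    return (fahrenheit - 32) * 5 / 9


registry = ServiceRegistry()
registry.publish(ServiceDescription("to_celsius", "conversion", to_celsius))
matches = registry.find(category="conversion")  # requestor searches
convert = bind(matches[0])                       # requestor binds
```

The requestor never refers to to_celsius directly; it reaches the service only through the description returned by the registry, which is the decoupling that the three operations are meant to provide.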
From the above it is clear that the service description is the key element of this architecture. It is the description that the provider publishes to a registry, and it is the description that the requestor receives as the result of its search. Finally, the service description tells the developer what actions to take in order to connect to or invoke a web service.

5.3 SOA Stack

As we already know, SOA presents a way to build systems in which applications and functions are implemented as services offered to the end user or to some other application. SOA consists of elements that can be classified into two categories: functional and Quality of Service (QoS).

Figure 9 SOA Stack

The functional category includes the following:
Transport: the mechanism that carries a service request from the requestor to the provider and the response in the opposite direction.

Service communication protocol: the agreed mechanism that both sides (requestor and provider) use in order to communicate requests and replies. One such protocol is SOAP.

Service description: a formal specification that describes each service, how it is invoked and what data are needed to invoke it correctly. WSDL is a well-known example of a description language.

Service: the actual service that has been created and is ready to be used.

Business Process: a set of services that can be invoked in a particular order and according to rules in order to fulfil complex business requirements. A service can itself be thought of as a business process, hence business processes can be composed of heterogeneous services.

Service Registry: a repository of service descriptions that can be used by requestors or by providers in order to invoke or publish services respectively.

The QoS category comprises:

Policy: a set of rules and conditions under which a provider delivers its services to requestors. Some policy matters are tied to both the functionality and the QoS of the services, and for that reason policy sits between the two categories, functional and QoS.

Security: the rules governing the authentication and authorization of those who wish to access a service.
Transaction: the characteristics that must be applied to a set of services so that they can guarantee the result. For example, when three services compose a business process, either all three should complete successfully or none should take effect.

Management: a set of rules that govern the correct use of the services that are offered and consumed.
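The all-or-nothing behaviour described under Transaction can be approximated in several ways. The sketch below uses compensating actions (each step is paired with an undo step), which is one common approach and is an assumption of this example rather than something the SOA stack prescribes. The order process with its reserve, charge and ship services is hypothetical.

```python
def run_atomically(steps):
    """Run (action, undo) pairs; undo the completed steps if any action fails.

    Returns True when every action succeeded, False when all were undone.
    """
    undo_stack = []
    for action, undo in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order so that later steps are undone first.
            for compensate in reversed(undo_stack):
                compensate()
            return False
        undo_stack.append(undo)
    return True


# Hypothetical three-service business process: reserve stock, charge, ship.
state = {"stock": 5, "charged": 0, "shipped": 0}


def reserve():
    state["stock"] -= 1


def unreserve():
    state["stock"] += 1


def charge():
    state["charged"] += 10


def refund():
    state["charged"] -= 10


def ship():
    # Simulated failure of the third service.
    raise RuntimeError("courier service unavailable")


def unship():
    state["shipped"] -= 1


ok = run_atomically([(reserve, unreserve), (charge, refund), (ship, unship)])
```

Because the third service fails, the first two are compensated and state returns to its initial values, so none of the three services has a lasting effect.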
6 Comparing SaaS and SOA

Building on the framework of (Zachman 1987), (Laplante 2008) presents the differences and similarities between SaaS and SOA. The differences can be categorised under six perspectives: objectives, business model, information system model, technology model, detailed representation and functioning system.

6.1 SaaS vs SOA

Objectives/scope: SOA provides a set of services that end users can take advantage of, while SaaS lists the services that it can actually deliver.

Business model: SOA implies a list of discovered business services to be used in the system; SaaS implies a list of business services to be provided. Using existing business services can significantly reduce software design and development expenses. In SaaS, a software system can be delivered either as a single service or as part of a larger collection of services that together compose the entire system.

Information system model: SOA uses an architectural model that describes how service components interact; SaaS does almost the same, but does not limit its scope to services.

Technology model: in order to realise the information system model, the technology that SOA and SaaS are going to use (such as Web Services) must be stated.

Detailed representation: WSDL, UDDI and SOAP are used for description, publishing and communication respectively. They are used by both SOA and SaaS when implementing Web services. The platform on which SOA and SaaS will implement their services must also be stated.
Functioning system: SOA systems demand monitoring and management of all communication, coordination, and collaboration among service components, while SaaS systems require the same only among their internal components.

Table 3 below summarizes the differences between SOA and SaaS.

Table 3 (Laplante 2008)

The main target of SaaS is business driven and directed at end users, whereas SOA is developed within various IT environments and is used to offer application-driven services. Although SaaS and SOA have different perspectives, they are closely related. In fact, the SaaS and SOA platforms complement each other, and neither stands entirely on its own.

As mentioned earlier, the SaaS platform is characterised by a set of core components. These are: multi-tenancy, ordering and provisioning, user authentication and authorization, service catalogue and pricing, service monitoring, SLA management, usage metering, billing, invoicing and payments. In addition, the SaaS platform needs to support various business functions. Those functions range from marketing, lead tracking, sales, customer support, revenue and financial management, and partner settlement to business intelligence and so on.
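Three of the core components listed above (service catalogue and pricing, usage metering, and billing per tenant) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the service names, the unit prices and the UsageMeter class are hypothetical and do not describe any particular SaaS platform.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical service catalogue: price per metered unit of each service.
CATALOGUE = {
    "crm.seat_hour": Decimal("0.05"),
    "storage.gb_day": Decimal("0.01"),
}


class UsageMeter:
    """Hypothetical usage meter shared by all tenants of the platform."""

    def __init__(self):
        # (tenant, service) -> units consumed in the current billing period
        self._units = defaultdict(Decimal)

    def record(self, tenant, service, units):
        if service not in CATALOGUE:
            raise KeyError(f"service not in catalogue: {service}")
        self._units[(tenant, service)] += Decimal(units)

    def invoice(self, tenant):
        # Bill a tenant only for its own usage, priced from the catalogue.
        lines = {service: units * CATALOGUE[service]
                 for (owner, service), units in self._units.items()
                 if owner == tenant}
        return lines, sum(lines.values(), Decimal("0"))


meter = UsageMeter()
meter.record("acme", "crm.seat_hour", 100)
meter.record("acme", "storage.gb_day", 30)
meter.record("globex", "crm.seat_hour", 10)
lines, total = meter.invoice("acme")
```

The same meter and catalogue serve every tenant and every service offering, which reflects the idea that these platform functions are written once and reused rather than rebuilt for each service.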
A SOA implementation is composed of service providers and service consumers that belong to an enterprise. The SOA platform is used by service providers to publish their services, which consumers in turn can take advantage of. Looking more closely at the technical implementation of a SOA platform, we can see that it is composed of a set of well-defined elements, among them a service bus, communication protocols (e.g. SOAP), service interface definitions (e.g. WSDL), service discovery (e.g. UDDI) and so on. The importance of service monitoring, management and governance is well understood, but it is also clear that these are not enough.

In a typical large enterprise, the service producers and consumers could be applications or systems belonging to different departments, organizations or even subsidiaries within the enterprise. In such environments, services cannot be produced and consumed informally without proper service management in place, since there is a cost associated with hosting and exposing a service. To derive this cost, the total cost of operations or ownership (TCO) needs to be taken into account in addition to the cost of creating the service. There are also security concerns around publishing services openly. At this point, therefore, the need arises for service catalogue management, provisioning, authentication, authorization, usage metering and cross-department charging. These elements are exactly the same as those of a SaaS platform. Hence one can conclude that, as enterprise SOA deployments become more mature, they have a growing need for the core functions of a SaaS platform.

Returning to the SaaS platform, there is an obvious need to offer new services or modify existing ones in the service catalogue with as few changes as possible to the core components of the platform.
Those modifications should be implemented in such a way that the resulting SaaS platform is not a totally new platform for the existing services. On the contrary, the basic functionality of the SaaS platform (e.g. ordering and provisioning, authentication and authorization, service catalogue and pricing, metering, billing and invoicing, payments and so on) should be reusable across as many service offerings as possible, if not all. Furthermore, the combined use of the SOA and SaaS platforms adds flexibility to the architecture and helps to keep the cost of ownership low.

It should already be clear that a SaaS architecture, like other similarly complex architectures, benefits from SOA capabilities. The opposite is not always true: a SOA platform will not always need the capabilities that a SaaS platform can offer. In the early days of SOA there were many expectations that unfortunately were not met over the years. SOA implementations in large enterprises have not been as successful as expected, or have not provided the expected Return on Investment (ROI), and this was caused by SaaS elements missing from those deployments. SaaS management functionality is considered essential for making people aware of how they can fully benefit from large-scale SOA deployments. This is also the point where SOA and SaaS together enable the concept of "IT as a service" and become capable of taking IT to the next natural step in its evolution.

Figure 10 below summarises the differences and similarities of the SaaS and SOA platforms.

Figure 10 Similarities and differences of SOA and SaaS
6.2 A Financial Perspective

(Thomas 2008) used a wide range of statistical methods in their research: they aggregated firms by industry group and performed univariate and multivariate regressions, t-tests, nonparametric median tests, and correlation analyses in order to make a financial comparison between SaaS and SOA. First, there is evidence that the SaaS pricing model, compared with the traditional perpetual license model, leaves a detectable imprint on the relationships among variables constructed from financial statements. Specifically, for companies that use SaaS there is no statistically significant relationship between operating margins and inventory divided by the cost of goods sold. This is consistent with the use of a subscription pricing model by SaaS providers, for whom inventory is an essentially meaningless accounting item.

On average, software providers that use the SaaS platform appear to be newer to the market and smaller in size. As a result they carry less financial leverage (debt). On the other hand, they maintain high sales and have high selling, general, and administrative (SGA) costs relative to net sales compared with other businesses.

Going a step further, the analysis of cost structures across industries gives the proponents of SOA a key talking point: they can argue that development costs will decrease because applications are hosted on only one platform (generally, the Internet). Companies that do not use SOA typically have their applications installed on their employees' personal computers, which usually have varied hardware platforms and sometimes different operating systems that require a lot of maintenance. For this reason, the conclusion is that the cost per unit sold is far lower for companies that use SOA than for providers that create custom-made software.
The comparison shows that companies that use SaaS benefit from significantly lower costs of goods sold, expressed as a portion of net sales. Of course, this holds only when SaaS-enabled companies are compared with traditional IT consultants and software providers that must invest time in developing customized solutions for their clients, which requires a lot of development time. When SOA software providers are able to deliver equal or even better functionality by the cheap route of configuring their application, as opposed to expensive custom development, they can be considered a serious threat to many of the traditional software providers in the IT consulting and business process outsourcing (BPO) market.

On the other hand, when the cost of goods sold is compared between companies that have adopted the SaaS platform and those that still operate in the traditional way, no competitive advantage of the SOA model is apparent in either of the two major industry categories in which providers compete (applications software and systems software). In fact, perhaps due to the very large volume of retail customers, software companies that do not follow the SaaS model appear to have significantly lower costs of goods sold as a portion of sales. This reinforces the belief that, in the mass-market software space, the market share of hosted, subscription software delivered along the SaaS model will have to be very high indeed before such firms are able to compete on a cost-per-unit basis with the likes of Microsoft or other providers of retail, mass-market software.
Although the concept of hosted services seems attractive and its per-unit cost of production is fairly small, it is equally inexpensive to ship a CD-ROM to a retail customer, or to have the software pre-installed on a new desktop or laptop. Additional factors that might hinder the spread of hosted services are switching costs and the response of dominant (monopoly) vendors, who will try to defend their market with lower prices and frequent upgrades. Moreover, functionality related to compatibility among integrated software applications might be more important than the advantage of a lower per-unit cost, at least for the foreseeable future in some areas of the information technology and telecoms economic sectors.
Conclusions

In conclusion to the analysis in this thesis, it is suggested that developers design their future systems with Cloud computing in mind. Those systems should scale horizontally across a number of virtual machines rather than rely on a single virtual machine. It seems that in the future, application software will run partly on the client and partly in the Cloud. The part that runs in the Cloud should scale up and down rapidly, while the part that runs on the client should remain useful when users are disconnected from the Cloud.

Hardware systems should be designed on the assumption that the lowest-level software runs on VMs rather than on a native operating system in the traditional way. Hardware design also needs to accommodate flash memory as a level of the memory hierarchy between DRAM and the hard disk.

What is expected to change over the years is not just the technology used to implement Cloud computing but also the pricing scheme; hence questions arise such as "What will billing units be like for the higher-level virtualization clouds?". Furthermore, the virtualization level of Cloud implementations will determine whether Cloud Computing will be dominated by low-level hardware virtual machines or by high-level frameworks.
References

[Amazon 2010] Amazon Web Services (WS) [Internet] http://aws.amazon.com/ [Accessed 21 May 2010]

[Armbrus 2009] Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, et al. (2009). Above the Clouds: A Berkeley View of Cloud Computing. EECS Department, University of California, Berkeley <http://www.eecs.berkeley.edu/pubs/techrpts/2009/eecs-2009-28.pdf> [Accessed 25 March 2010]

[ENISA 2009] ENISA editors. (2009). Cloud Computing Benefits, risks and recommendations for information security. <http://www.enisa.europa.eu/act/rm/files/deliverables/cloud-computing-riskassessment/at_download/fullreport> [Accessed 25 March 2010]

[Eucalyptus 2009] Eucalyptus Systems Inc (2009). [Internet] Eucalyptus Open-Source Cloud Computing Infrastructure - An Overview <http://www.eucalyptus.com/pdf/whitepapers/eucalyptus_overview.pdf> [Accessed 20 June 2010]

[Ferris 2003] C. Ferris and J. Farrell (2003). What Are Web Services? Comm. ACM, vol. 46, no. 6, p. 31. <http://portal.acm.org/citation.cfm?id=777335>

[Laplante 2008] Phillip A. Laplante, Jia Zhang and Jeffrey Voas (2008). Distinguishing between SaaS and SOA. IT Professional, Volume 10, Issue 3 (May 2008), p. 46-50 <http://portal.acm.org/citation.cfm?id=1373186> [Accessed 12 January 2010]

[Google 2010] Google App Engine [Internet] http://code.google.com/intl/el-GR/appengine/ [Accessed 21 May 2010]

[Turner 2003] M. Turner, D. Budgen, and P. Brereton (2003). Turning Software into a Service. Computer, vol. 36, no. 10, pp. 38-44 <http://www.computer.org/portal/web/csdl/doi/10.1109/mc.2003.1236470>

[Marinos 2009] Alexandros Marinos, Gerard Briscoe (2009). Community Cloud Computing <http://arxiv.org/ps_cache/arxiv/pdf/0907/0907.2485v3.pdf> [Accessed 21 March 2010]

[Mell, 2009] Peter Mell and Tim Grance (2009). The NIST Definition of Cloud Computing, Version 15. <http://csrc.nist.gov/groups/sns/cloud-computing/cloud-defv15.doc> [Accessed 26 March 2010]

[Microsoft 2010] Microsoft Azure [Internet] http://www.microsoft.com/azure/ [Accessed 21 May 2010]

[Thomas 2008] Thomas W. Hall, Joseph Luter, Christopher Newport (2008). Is SOA Superior? Evidence from SaaS Financial Statements. Journal of Software, vol. 3, no. 5.

[Traudt 2005] Traudt, E., Konary, A. (2005). Software-as-a-Service Taxonomy and Research Guide.

[Vaquero 2009] Luis M. Vaquero, Luis Rodero-Merino, Juan Caceres, Maik Lindner (2009). A Break in the Clouds: Towards a Cloud Definition. ACM SIGCOMM Computer Communication Review, Volume 39, no. 1.

[Wang 2008] Lizhe Wang, Jie Tao, Marcel Kunze, Alvaro Canales Castellanos, David Kramer, Wolfgang Karl (2008). Scientific Cloud Computing: Early Definition and Experience. p. 825-830 <http://www.computer.org/portal/web/csdl/doi/10.1109/hpcc.2008.38>

[Weisinger 2006] Dick Weisinger (2006). Viability of the SaaS Document Management Model <http://www.formtek.com/blog/?p=64> [Accessed 14 July 2010]

[Zachman 1987] J.A. Zachman (1987). A Framework for Information Systems Architecture. IBM Systems, vol. 26, no. 3, p. 276-292.