Masaryk University Faculty of Informatics. Master Thesis. Database management as a cloud based service for small and medium organizations



Masaryk University Faculty of Informatics
Master Thesis
Database management as a cloud based service for small and medium organizations
Dime Dimovski
Brno, 2013


Statement

I declare that I have worked on this thesis independently, using only the sources listed in the bibliography. All resources, sources, and literature that I used or drew upon in preparing this thesis are properly quoted, with full references to the source.

Dime Dimovski

Resume

The goal of this thesis is to explore cloud computing, mainly focusing on database management systems as a cloud service, and to propose a general scope breakdown structure for a project migrating a company's database to a cloud-based solution. It focuses on explaining the key deliverables of a migration to a database in the cloud and illustrates the tasks required to fulfill each part of the project. The potential challenges and risks that must be taken into consideration are discussed, and a comparison of some of the currently available SQL and NOSQL database management systems offered as a cloud service is provided, considering the advantages and disadvantages of cloud computing in general and the common considerations.

Keywords

Cloud computing, SaaS, PaaS, Database management, SQL, NOSQL, DBaaS, Work breakdown structure, WBS, Scope breakdown structure.

Contents
Statement
Resume
Keywords
Introduction
Introduction to Cloud Computing
Cloud computing definition
Cloud Types
NIST model
Cloud computing architecture
Infrastructure
Platform
Application
Scalability
Elasticity
Database Management Systems in the cloud (Database as a service)
Scope Breakdown Structure of project for migration of a database to a cloud based solution
SBS Deliverables, Challenges and Risks
Will cloud computing reduce the budget?
Conclusion
Appendix
Some of the currently available RDBM DBaaS, comparison and common considerations
Understand various available storage options
NOSQL options data models
Amazon DynamoDB Data Model
Amazon SimpleDB
Document oriented database

List of Abbreviations
References

1. Introduction

The boom of cloud computing over the past few years has made it common ground for many innovations and new technologies. It has become common for enterprises and individuals alike to acknowledge that cloud computing is a big deal and to use the services offered in the cloud, even when they are not clear on why that is so. Even the phrase "in the cloud" has entered our colloquial language. Many developers around the world are currently working on cloud-related products. The cloud is thus an amorphous entity that is supposed to represent the future of modern computing [1]. In an attempt to gain a competitive edge, businesses are looking for new, innovative ways to cut costs while maximizing value. They recognize the need to grow, but at the same time they are under pressure to save money. The cloud gives businesses this opportunity, allowing them to focus on their core business by offering hardware and software solutions that they do not have to develop on their own. In this thesis I will give an overview of what cloud computing is. I will describe its main concepts and architecture, take a look at the XaaS paradigm (something/everything as a service), and survey the currently available options in the cloud, mostly focusing on databases in the cloud, or Database as a service. Good planning and preparation are among the most important parts of a migration to the cloud. As part of this thesis I will propose a general scope breakdown structure for a project moving a company's databases to a cloud-based solution, describing the key deliverables and the potential challenges and risks connected with them, taking into consideration some of the currently available options.
I will also take a closer look at how cloud computing in general, and database as a service in particular, can be used by small and medium enterprises, what the main benefits are, and whether it will really help businesses to reduce their budget and focus on their core business.

2. Introduction to Cloud Computing

In reality the cloud is something that we have been using for a long time: it is the Internet, with all the standards and protocols that provide Web services to us. The Internet is usually drawn as a cloud; this abstraction is one of the essential characteristics of cloud computing. Cloud computing is distinguished by its concept of virtual resources that appear to be limitless, with the details of the physical systems on which software runs abstracted from the user. Cloud computing refers to services and applications that run on a distributed network using virtualized resources that are accessed by common Internet protocols and networking standards. [1] The advancements of the past few years in connectivity and wireless network speed are among the main things that make cloud computing practical, or even possible. In some way, cloud computing is an eventuality. The boom of mobile devices, smartphones, tablets, etc. is pushing cloud computing even faster. This represents a major breakthrough not only in computing but also in communication. The popularization of the Internet and the growing number of large service companies enabled cloud computing systems of massive scale. Cloud computing brings a real paradigm shift in the way systems are deployed. Cloud computing can be compared to standard utility companies: it makes the dream of utility computing possible with a universally available, pay-as-you-go, infinitely scalable system. In other words, everything comes from one central location, and things are just turned on and off. This gives more people access to a much larger pool of resources at a highly reduced cost. The ability of cloud computing to offer users access to off-site hardware and software is one of its biggest benefits. All the networks, processors, hardware, and software combined give individuals much more computing power.
By keeping things light and simple, individual access devices are going to last a lot longer, and losing or breaking a device is no longer of any particular concern: devices can be replaced, and there is no danger of losing your files or information, as they are in the cloud [1]. With cloud computing, you can start very small and become big very fast. That's why cloud computing is revolutionary, even if the technology it is built on is evolutionary. [3]

2.1 Cloud computing definition

The use of the word cloud makes reference to two essential concepts:

Abstraction: Cloud computing abstracts the details of the system implementation from users and developers. Applications run on physical systems that aren't specified, data is stored in locations that are unknown, administration of systems is outsourced to others, and access by users is ubiquitous. [1]

Virtualization: Cloud computing virtualizes systems by pooling and sharing resources. Systems and storage can be provisioned on demand from a centralized infrastructure, the resources are scalable, multi-tenancy is enabled, and costs are assessed on a metered basis.

Cloud computing is an abstraction based on the idea of pooling physical resources and presenting them as virtual resources. It represents a model for provisioning resources and for platform-independent user access to services and applications. There are many different types of clouds, and it is important to define what kind of clouds we are working with. The applications and services that run on a cloud are not necessarily delivered by a cloud service provider.

2.2 Cloud Types

Usually cloud computing is separated into two distinct sets of models:

Deployment models: refer to the location and management of the cloud's infrastructure.
Service models: particular types of services that can be accessed on a cloud computing platform.

2.2.1 NIST model

The NIST model is a set of working definitions published by the U.S. National Institute of Standards and Technology. The following section presents part of the NIST definition of cloud computing; its content is taken as it is defined in the paper "The NIST Definition of Cloud Computing". This cloud model is composed of five essential characteristics, three service models, and four deployment models. [2]

Essential Characteristics:

On-demand self-service - A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider.

Broad network access - Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).

Resource pooling - The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, and network bandwidth.

Rapid elasticity - Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

Measured service - Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.

Service Models:

Software as a Service (SaaS) - The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based e-mail), or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.

Platform as a Service (PaaS) - The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.

Infrastructure as a Service (IaaS) - The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g., host firewalls).

Deployment Models:

Private cloud - The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.

Community cloud - The cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be owned, managed, and operated by one or more of the organizations in the community, a third party, or some combination of them, and it may exist on or off premises.

Public cloud - The cloud infrastructure is provisioned for open use by the general public. It is usually an open system available to the general public via the Internet. It may be owned, managed, and operated by a business, academic, or government organization, or some combination of them. It exists on the premises of the cloud provider. Examples of public clouds: Google App Engine, Amazon Elastic Compute Cloud, Microsoft Azure.

Hybrid cloud - The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds). [2]

2.3 Cloud computing architecture

Cloud computing architecture is essentially a series of levels that function together in various ways to create a system. The cloud itself creates a system where resources can be pooled and distributed as needed. Cloud architecture can combine software running in multiple locations on virtualized hardware in order to provide an on-demand service to user-facing hardware and software. A cloud can be created within an organization's own infrastructure, or it can be outsourced to another datacenter. Because virtual resources are easier to optimize and modify, the resources in the cloud are mostly virtualized. A compute cloud requires virtualized storage to support the staging and storage of data. From a user's perspective, it is important that the resources appear to be infinitely scalable, that the service be measurable, and that the pricing be metered. [1]

Figure 1 Cloud computing stack

Applications in the cloud are usually composable systems; this means that they use standard components to assemble services that are tailored for a specific purpose. A composable component must be:

Modular: It is a self-contained and independent unit that is cooperative, reusable, and replaceable. It can be deployed independently.

Stateless: A transaction is executed independently, without regard to other transactions or requests.

In general cloud computing does not require hardware and software to be composable, but it is a highly desirable characteristic. Composable systems are much easier to implement, and solutions are more portable and interoperable. Some of the benefits of composable systems are [1]:

Easier to assemble systems
Cheaper system development
More reliable operation
A larger pool of qualified developers
A logical design methodology

The trend toward designing composable systems in cloud computing can be seen in the widespread adoption of what is called Service Oriented Architecture (SOA). The essence of a Service Oriented Architecture is designing applications from services or components, building the application in a modular fashion. The services are constructed from modules that use standard communications and service interfaces and that collectively provide the complete functionality of a large software application. One example of widely used XML-based standards describes the services in terms of:

Web Services Description Language (WSDL): describes the web service, how to invoke it, and what exactly it does.
Simple Object Access Protocol (SOAP): describes the communication between the services, i.e. the message format.
Universal Description, Discovery, and Integration (UDDI): a directory of web services that are available to be used.

There are, of course, alternative sets of standards. The nature of the module itself is not specified, and it can be developed in any programming language. From the standpoint of the system, the module is a black box; only the interface is well specified. This independence from how the internals of the module or component work means it can easily be replaced with a different module, relocated, or replaced at will, provided that the interface specification remains unchanged.
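The black-box principle above can be sketched in a few lines of code. The following Python example is illustrative only (the service name and methods are invented, not part of any SOA standard): client code depends solely on the declared interface, so the implementation behind it can be swapped without any change to the client.

```python
from abc import ABC, abstractmethod

# Hypothetical service interface: only the contract is specified;
# the implementation behind it is a black box to the rest of the system.
class StorageService(ABC):
    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> str: ...

# One possible implementation...
class InMemoryStorage(StorageService):
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

# ...can be replaced by any other implementation of StorageService,
# as long as the interface specification remains unchanged.
def client_code(storage: StorageService) -> str:
    storage.put("greeting", "hello")
    return storage.get("greeting")

print(client_code(InMemoryStorage()))  # hello
```

A replacement module (say, one backed by a remote store) would plug into `client_code` unchanged, which is exactly the replaceability property described above.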

Essentially there are three tiers in a basic cloud computing architecture:

Infrastructure
Platform
Application

If we further break down the standard cloud computing architecture, there are really two areas to deal with: the front end and the back end. [1]

Front End - The front end includes all client (user) devices and hardware, in addition to their computer network and the application that they actually use to make a connection with the cloud.

Back End - The back end is populated with the various servers, data storage devices, and hardware that facilitate the functionality of a cloud computing network.

Infrastructure

The infrastructure of a cloud computing architecture is basically all the hardware, including virtualized hardware, data storage devices, networking equipment, applications, and software that drives the cloud. Most Infrastructure as a Service (IaaS) providers use virtual machines to deliver servers that run applications. Virtual machine images, or instances, are containers that have specific resources assigned (number of CPU cycles, memory access, network bandwidth, etc.). Figure 2 shows the cloud computing stack that is defined as the server. The Virtual Machine Monitor, also called a hypervisor, is the low-level software or hardware that allows different guest operating systems to run in their own memory space and manages I/O for the virtual machines. [1]
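The idea that an instance is "a container with specific resources assigned" can be made concrete with a small sketch. The instance types and resource figures below are invented for illustration; they do not correspond to any real provider's offering.

```python
from dataclasses import dataclass

# A VM instance type is essentially a bundle of assigned resources.
@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gb: float
    network_gbps: float

# Hypothetical instance types (illustrative figures only).
SMALL = InstanceType("small", vcpus=1, memory_gb=2.0, network_gbps=0.5)
LARGE = InstanceType("large", vcpus=8, memory_gb=32.0, network_gbps=10.0)

def fits(workload_vcpus: int, workload_mem_gb: float,
         itype: InstanceType) -> bool:
    """Check whether a workload fits within an instance type's resources."""
    return (workload_vcpus <= itype.vcpus
            and workload_mem_gb <= itype.memory_gb)

print(fits(4, 16.0, SMALL), fits(4, 16.0, LARGE))  # False True
```

In a real IaaS platform this matching is done by the provider's scheduler; the sketch only illustrates why instances are described in terms of assigned resource quantities.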

Figure 2 "Server" stack

Platform

A cloud computing platform is the software layer that is used to create higher levels of service. It is the programming code and the implemented systems of interfacing that allow users and applications to connect to and use the available hardware and software resources of the cloud. A cloud computing platform is generally divided between the front end and the back end of a network. Its job is to provide a communication and access portal for clients, so that they may effectively utilize the resources of the cloud network. The platform may only be a set of directions, but it is in actuality the most integral part of a cloud computing network; without it, cloud computing would not be possible. [3] There are many different Platform as a Service (PaaS) providers; we will mention some of them:

Windows Azure Platform
Google Apps and Google AppEngine
Amazon Web Services

All platform services offer the hosted hardware and software needed to build and deploy Web applications or other custom services. Many operating system vendors already provide their development environments in the cloud, using the same technologies that have been successfully used to create Web applications. [1] Thus, you might find a platform based on an Oracle xVM hypervisor virtual machine that includes the NetBeans IDE and that supports the Oracle GlassFish Web stack, programmable using Perl or Ruby. For Windows, Microsoft with its Azure cloud provides a platform that allows Windows users to run on a Hyper-V VM, use the ASP.NET application framework, supports SQL Server and other enterprise applications, and can be programmed from within Visual Studio. With this approach developers can develop a program in the cloud that can be used by many others. Platforms usually come with tools and utilities to support application design and deployment. Depending on the vendor, these can be: tools for team collaboration, testing tools, versioning tools, database and web service integration, and storage tools. Platform providers begin with the creation of a developer community to support the work that is being done in the environment. The platform is exposed to users through an API; likewise, an application built in the cloud using a platform service would encapsulate the service through its own API. An API can control data flow, communications, and other important features of the cloud application. Until now there is no standard API, and each cloud vendor has its own.

Application

This area is composed of the client hardware and the interface used to connect to the cloud. Crucial problems arise from the design of Internet protocols, which treat each request to a server as an independent transaction (a stateless service). The standard HTTP commands are all atomic in nature.
While stateless servers are easier to architect and stateless transactions are more resilient and can survive outages, much of the useful work that computer systems need to accomplish is stateful [1]. Transaction servers, message-queuing servers, and other similar middleware are meant to bridge this problem. Standard methods that are part of Service Oriented Architecture, and that are used in cloud computing to help solve this issue, are:

Orchestration: the process flow can be choreographed as a service.
Use of a service bus that controls cloud components.

There are many ways clients can connect to a cloud service. The most common are:

Web browser
Proprietary application

These applications can run on a number of different devices: PCs, servers, smartphones, and tablets. They all need a secure way to communicate with the cloud. Some of the basic methods to secure the connection are:

Secure protocols such as SSL/TLS (HTTPS), FTPS, IPSec, or SSH
Virtual connection using a virtual private network (VPN)
Remote data transfer such as Microsoft RDP or Citrix ICA, which use a tunneling mechanism
Data encryption
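As a small, concrete example of the first item in the list above, Python's standard `ssl` module builds a TLS context for an HTTPS-style client connection. With the default settings, certificate verification and hostname checking are both enabled, which is the baseline for a secure channel to a cloud service.

```python
import ssl

# Build a client-side TLS context with secure defaults:
# the server's certificate chain is verified against trusted CAs,
# and the certificate's hostname must match the server we dial.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

Such a context would then be passed to an HTTPS or socket client; the point here is only that the secure defaults come from the library, not from application code.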

3. Scalability

Scalability is the ability of a system to handle a growing amount of work in a capable manner, or its ability to improve when additional resources are added. The scalability requirement arises due to the constant load fluctuations that are common in the context of Web-based services. These load fluctuations occur at varying frequencies: daily, weekly, and over longer periods. The other source of load variation is unpredictable growth (or decline) in usage. Scalable design ensures that the system capacity can be augmented by adding hardware resources whenever warranted by load fluctuations. Thus, scalability has emerged both as a critical requirement and as a fundamental challenge in the context of cloud computing. [1][4]

Typically there are two ways to increase scalability:

Vertical scalability (scaling up): adding hardware resources to an existing node, usually CPU, memory, etc. Vertical scaling enables more effective use of virtualization technologies by providing more resources for the hosted operating systems and applications to share.

Horizontal scalability (scaling out): adding more nodes to a system, such as adding a new node to a distributed software application or adding more access points within the current system. Hundreds of small computers may be configured in a cluster to obtain aggregate computing power. The horizontal model also increases the demand for shared data storage with high I/O performance, especially in cases where processing of large amounts of data is required. In general, the scale-out paradigm has served as the fundamental design paradigm for the large-scale datacenters of today.

Integrating multiple load balancers into the system is probably one of the best solutions for dealing with scalability issues.
There are many different forms of load balancers to choose from: server farms, software, and even hardware designed to handle and distribute increased traffic. Items that interfere with scalability:

Too much software clutter (no organization) within the hardware stack(s)
Overuse of third-party scaling
Reliance on the use of synchronous calls
Not enough caching

Database not being used properly

Creating a cloud network that offers the maximum level of scalability is entirely possible if we apply a more diagonal solution: by incorporating the best of both vertical and horizontal scaling, it is possible to reap the benefits of both models. [3] In order to keep a consistent architecture when adding new components, once the servers reach their limits (no possibility of further growth), we should simply start cloning them. Usually most of the problems arise from a lack of resources, not from the inherent architecture of the cloud itself. A more diagonal approach should help a business deal with the current and growing demands it is facing.
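The load-balancing idea discussed above can be illustrated with a toy round-robin balancer: requests are spread evenly across a pool of servers, and scaling out is nothing more than adding a server to the pool. This is a minimal sketch, not a model of any particular load-balancing product.

```python
import itertools

# A toy round-robin load balancer: each incoming request is routed
# to the next server in the pool, cycling back to the first.
class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self._cycle = itertools.cycle(self.servers)

    def route(self, request: str) -> str:
        server = next(self._cycle)
        return f"{server} handled {request}"

lb = LoadBalancer(["node-1", "node-2", "node-3"])
for i in range(4):
    print(lb.route(f"req-{i}"))
# node-1 handled req-0
# node-2 handled req-1
# node-3 handled req-2
# node-1 handled req-3
```

Real load balancers add health checks, weighting, and session affinity on top of this basic distribution scheme, but the round-robin core is the same.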

4. Elasticity

One of the most important attributes of cloud computing is certainly its elasticity: the ability to upgrade resources and capacities on the fly, at a moment's notice, instantly. Cloud computing creates the illusion of virtually infinite computing resources available on demand. Scalability of applications and storage are all elastic in the cloud. It must be noted that there is a subtle difference between elasticity and scalability when used to express a system's behavior. Scalability is a static property that specifies the behavior of the system on a static configuration. For example, a system design might scale to hundreds or even to thousands of nodes. On the other hand, elasticity is a dynamic property that allows the system to scale up or down on demand in a live system, without service disruption, while the system is operational. For example, a system design is elastic if it can scale from 5 servers to 10 servers (or vice versa) on demand. A system can have any combination of these two properties. The real-time infrastructure that actively responds to user requests for resources is the most remarkable thing about cloud computing. It is this elastic ability that allows service providers to offer their users access to cloud computing services at highly reduced costs. The pay-for-what-you-use model allows users to save money. As an example, with a traditional computing network users have their own hardware setup, of which most users rarely use more than 50% of the capacity. What cloud computing offers is the possibility for users to keep their expectations and current standards while still leaving the opportunity for expansion open for when they need it. This also makes computing more energy efficient while still providing the same computing experience, plus the benefit of virtually limitless resources. In other words, elasticity allows both user and provider to do more with less.
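The 5-servers-to-10-servers example above can be sketched as a tiny autoscaling rule: compute how many instances are needed so that average utilization stays near a target. The per-server capacity and the 50% utilization target are illustrative assumptions, not any provider's actual policy.

```python
import math

def autoscale(load: float, capacity_per_server: float = 100.0,
              target_utilization: float = 0.5) -> int:
    """Return the number of servers an elastic system would provision
    for the given load, keeping average utilization near the target.
    At least one server is always kept running."""
    needed = math.ceil(load / (capacity_per_server * target_utilization))
    return max(1, needed)

print(autoscale(500))  # 10 -- demand doubled, scale up to 10 servers
print(autoscale(200))  # 4  -- demand dropped, scale down to 4 servers
```

The essential point is that the server count follows demand in both directions, which is the dynamic behavior that distinguishes elasticity from static scalability.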

5. Database Management Systems in the cloud (Database as a service)

Data and database management are an integral part of a wide variety of applications. Relational DBMSs in particular have been massively used due to the many features they offer:

Overall functionality: offering an intuitive and relatively simple model for modeling different types of applications.
Consistency: dealing with concurrent workloads without worrying about the data getting out of sync.
Performance: low latency and high throughput, combined with many years of engineering and development.
Reliability: persistence of data in the presence of different types of failures, and ensuring safety.

The main concern is that DBMSs and RDBMSs are not cloud-friendly, because they are not as scalable as web servers and application servers, which can scale from a few machines to hundreds. Traditional DBMSs are not designed to run on top of a shared-nothing architecture (where a set of independent machines accomplish a task with minimal resource overlap), and they do not provide the tools needed to scale out from a few to a large number of machines. Technology leaders such as Google, Amazon, and Microsoft have demonstrated that data centers comprising thousands to hundreds of thousands of compute nodes provide unprecedented economies of scale, since multiple applications can share a common infrastructure. All three companies provide frameworks, such as Amazon's AWS, Google's AppEngine, and Microsoft Azure, for hosting third-party applications in their clouds (data-center infrastructures).
The RDBMSs, or transactional data management databases, that back banking, airline reservation, online e-commerce, and supply chain management applications typically rely on the ACID (Atomicity, Consistency, Isolation, Durability) guarantees that databases provide, and it is hard to maintain ACID guarantees in the face of data replication over large geographic distances 1. These companies have therefore developed proprietary data management

1 The CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency (all nodes see the same data at the same time), Availability (a guarantee that every request receives a response about whether it was successful or failed), Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system).

technologies referred to as key-value stores, informally called NOSQL database management systems. [6] (According to the CAP theorem, a distributed system can satisfy any two of the guarantees of consistency, availability, and partition tolerance at the same time, but not all three.) The need for web-based applications to support a virtually unlimited number of users, and to respond to sudden load fluctuations, raises the requirement to make them scalable in cloud computing platforms. Such scalability must be able to be provisioned dynamically without causing any interruption in the service. Key-value stores and other NOSQL database solutions, such as the Google Datastore offered with Google AppEngine, Amazon SimpleDB and DynamoDB, MongoDB, and others, have been designed so that they are elastic and can be dynamically provisioned in the presence of load fluctuations. We will explain some of these systems in more detail later on. As we move to the cloud computing arena, which typically comprises data centers with thousands of servers, the manual approach to database administration is no longer feasible. Instead, there is a growing need to make the underlying data management layer autonomic or self-managing, especially when it comes to load redistribution, scalability, and elasticity. [7]

Figure 3 Traditional VS Cloud Data Services

This issue becomes especially acute in the context of pay-per-use cloud computing platforms hosting multi-tenant applications. In this model, the service provider is interested in minimizing its operational cost by consolidating multiple tenants on as few machines as possible during periods of low activity and distributing these tenants over a larger number of servers during peak usage [7]. Due to the above desirable properties of key-value stores in the context of cloud computing and large-scale data centers, they are being widely used as the data management tier for cloud-enabled Web applications. Although it is claimed that atomicity at a single key is adequate in the context of many Web-oriented applications, evidence is emerging that in many application scenarios this is not enough. In such cases, the responsibility to ensure atomicity and consistency of multiple data entities falls on the application developers. This results in the duplication of multi-entity synchronization mechanisms many times in the application software. In addition, as it is widely recognized that concurrent programs are highly vulnerable to subtle bugs and errors, this approach adversely impacts application reliability. The need to provide atomicity beyond single entities is widely discussed in developer blogs. Recently, this problem has also been recognized by senior architects from Amazon and Google, leading to systems like Megastore [10] that provide transactional guarantees on key-value stores. In the next part I will offer a general work breakdown structure for a project migrating a database to a cloud-based solution; the key project activities will be explained, the potential challenges encountered in each activity discussed, and the currently available solutions compared, considering both RDBMS and NOSQL DBMS offerings in the cloud and explaining to a considerable level of detail how they work and how they are provisioned.
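The single-key atomicity limitation described above can be illustrated with a toy in-memory key-value store. This is a deliberately simplified sketch, not a model of any actual NOSQL product: each single-key write is atomic, but updating several entities together (e.g. debiting one account and crediting another) is only safe because the application itself adds a synchronization mechanism, which is exactly the burden the text describes falling on application developers.

```python
import threading

# A toy key-value store providing atomicity at the single-key level.
class KeyValueStore:
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:          # a single-key write is atomic
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def multi_update(self, updates: dict):
        # Application-level "transaction": the application, not the
        # store, takes one lock across several keys so no reader can
        # observe a half-applied transfer.
        with self._lock:
            self._data.update(updates)

store = KeyValueStore()
store.put("alice", 100)
store.put("bob", 0)
store.multi_update({"alice": 50, "bob": 50})  # transfer 50 atomically
print(store.get("alice"), store.get("bob"))   # 50 50
```

Systems like Megastore move this multi-entity coordination back into the storage layer, so applications no longer have to reimplement it (often subtly incorrectly) themselves.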

6. Scope Breakdown Structure of a project for migration of a database to a cloud based solution

A work breakdown structure is a project management tool which organizes the project deliverables in a hierarchical structure. According to the IPMA (International Project Management Association) definition, the work breakdown structure is a document containing a hierarchical breakdown of the project goal into individual deliverables and further into individual products and sub-products, down to the level of the individual work packages to be delivered in the course of project implementation. It defines 100% of the overall scope of the project. Subsequent levels list increasingly detailed definitions of project products. Because some methodologies describe this term as a hierarchical breakdown of activities or tasks, sub-tasks, and phases, this thesis uses the term Scope Breakdown Structure instead, as recommended by RNDr. Zdenko Stanicek and as planned for the next release, version four, of the IPMA Competency Baseline (ICB).

The scope represents the total content included in the project; the project should deliver all that is described within its scope. The scope definition is expressed in a scope breakdown structure (a tree structure) where each node of the tree represents a deliverable or sub-deliverable on a particular level. [citation Stanicek] Each deliverable or sub-deliverable that is part of the SBS should be absolutely clear, without any reference to how it should be delivered. The scope definition encompasses the state which has to be achieved by the project, without any reference to a possible way of achieving it. Moreover, the scope includes the totality of the goals and deliverables of the project. The scope is refined as the project develops: the continuous refinement of the scope definition is visualized through documents that define the deliverables and sub-deliverables in a step-by-step improved manner, i.e.
with growing detail and precision, as the knowledge of the solved problem progresses. [citation Stanicek] A deliverable is any product of a project; it is not concerned with the tasks and activities that lead to its completion. The scope embraces the totality of the goals and deliverables of the project and defines its boundaries, i.e., what is included in the project and, moreover, what is not included. [citation Stanicek] Although the two terms are almost identical and very often confused, it is of crucial importance to properly distinguish the WBS from the scope. In order to avoid further confusion and

misleading, it is suggested to use SBS instead of WBS, and therefore the SBS terminology will be applied. This section describes the Scope Breakdown Structure (SBS) of a project for migration of databases to a cloud based solution. In Figure 4 the SBS is presented to the first level; the lower levels are presented as the deliverables are described in more detail.

Figure 4 Scope Breakdown Structure

7. SBS Deliverables, Challenges and Risks

In this section, the SBS project deliverables will be described. The first- and second-level deliverables are the focus of this section, as well as the potential challenges that could emerge during the execution of each of them.

Management and support provided

This deliverable covers the project preparation and all the initial planning. The steps taken here help to identify the primary focus areas: scope, objectives, plan, definition of the project team, and risk management. The project charter and the project strategy are defined here. Some of the challenges and risks that might be present are:

Not clearly defined project goals, or lack of agreement on the project goals,
Lack of senior management involvement,
No effective project management methodology defined.

Planning completed

The point of this deliverable is to provide a preliminary cloud assessment: evaluate the offerings of the different cloud providers on the market, get a clear understanding of the separation of responsibilities between the cloud provider and the client, agree on it, and define a responsible team. The legacy applications should also be evaluated at this point to get the overall picture, early enough, of whether a cloud based solution should be considered for them.

Cloud computing represents a shared responsibility between the provider and the client. The demarcation line that separates the responsibilities of the client and the provider varies according to the area and according to the delivery model being evaluated. This demarcation line is sometimes referred to as the trust boundary: it illustrates that, for the areas that fall under the cloud provider's responsibility, the client must trust the provider's execution and implementation. Most cloud providers' service agreements limit the provider's liability, typically to a refund of fees, whether the application fails in availability causing financial loss to the client or fails to comply with important compliance requirements; such agreements shift the primary risk responsibility to the client. With this in mind, it is of utmost importance to clearly understand where the trust boundary lies. Table 1, presented below, can help in understanding the trust boundary and provides some guidance regarding the security responsibilities in the various delivery models; the point at which responsibility shifts from the user to the provider marks the trust boundary for each delivery model. Creating a comparison table of the available solutions can also be useful. The Appendix presents an example of such a table, Table 2, comparing some of the currently available RDBMSs offered as a service on the market, based on the common considerations.

Table 1 Trust boundary in cloud security

                  IaaS                          PaaS                          SaaS
Application       User: apply best practices    User: apply best practices    Provider: evaluation
                  and certification             and certification             and certification
OS/Middleware     User: apply best practices    Provider: evaluation          Provider: evaluation
                  and certification             and certification             and certification
Infrastructure    Provider: evaluation          Provider: evaluation          Provider: evaluation
                  and certification             and certification             and certification

Potential challenges and risks are:

Provide accurate cost analysis - Weighing the cost of owning and operating a data center against going to a cloud provider, and choosing the option that meets your requirements, requires careful and detailed analysis. Businesses have to take multiple options into consideration in order to get a valid comparison between the alternatives. Most cloud providers, such as Amazon and Microsoft, have published whitepapers that can help in the process of gathering data for an appropriate comparison, and they offer cost calculators that can help decision makers with the analysis.

Failure to involve company security advisors early in the cloud security assessment - Some organizations are bound by specific IT security policies and compliance requirements, and it is very important to include the company's security advisors in the cloud assessment process to help with the decision. First the information needs to be classified: the organization's data has to be evaluated, its value has to be understood, and the risks if the data is compromised have to be assessed. Key challenges and risks here are:

o Identification and correct classification of the data,
o Where the data currently resides,
o Whether there is any obligation to store the data in a specific jurisdiction - for example, Microsoft Azure, Amazon and Google allow users to designate the region in which the data is stored,
o Clarifying the options for retrieving all the data from the cloud provider and moving it to a different provider. This also covers interoperability (being able to communicate and work with multiple cloud service providers) and portability (being able to move the system to a different cloud provider, so as not to be dependent on a single one). It must be taken into consideration that some cloud database offerings, for example Google Cloud SQL, are only accessible through their platform, Google AppEngine.

Data security can be a big issue, but if it is properly understood, analyzed and classified, with a proper understanding of the risks and threats, it can help to identify which databases can be moved to the cloud and which should be kept in house.

Technical architecture assessment - An application dependency tree can help to identify which applications are suitable to be moved into the cloud. The main considerations here should be:

o Will the cloud provide the entire infrastructure that we require?
o Is it possible to reuse the management and configuration tools that we have?
o Will it allow us to cancel the support contracts for software, network and hardware?

Creating a dependency tree based on a detailed examination of the construction of the enterprise applications will help to classify applications based on their dependencies. This dependency tree should highlight all the different parts of the applications and identify their upstream and downstream dependencies on other applications. The diagram should be an accurate snapshot of the enterprise application assets.
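As a sketch of how such a dependency tree can be put to work, migration candidates can be found by looking for applications that nothing else depends on. The application names below are invented for illustration:

```python
# Illustrative sketch: a toy application dependency tree used to find
# migration candidates with no downstream dependents (leaf applications).
deps = {
    "crm": ["database", "auth"],
    "reporting": ["database"],
    "database": [],
    "auth": [],
}

def downstream_dependents(app, graph):
    """Return the applications that directly depend on `app`."""
    return [a for a, uses in graph.items() if app in uses]

# Applications nothing else depends on are the safest first candidates:
# moving them cannot break any other system.
candidates = [a for a in deps if not downstream_dependents(a, deps)]
print(sorted(candidates))
```

In a real assessment the graph would be populated from the detailed examination of the enterprise applications described above, and transitive dependencies would also be traced.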
In order to identify good candidates for the cloud we should look for applications with under-utilized assets; applications that have an immediate business need to scale and are running out of capacity; applications that have architectural flexibility; and applications that use traditional tape drives to back up data. Avoid applications that require specialized hardware to function (for example, mainframe or specialized encryption hardware).

Another important point during this activity is to evaluate the possibility of migrating licensed products. For example, Amazon offers the possibility to bring your own license: if the organization has purchased a license in the traditional way, or already holds one, it can be applied to the products that are available as preconfigured Amazon Machine Images. [42] Similarly, Microsoft offers Azure VMs with an included license for the operating system, as well as VMs with an included license for SQL Server.

Not dedicating a team - Expecting the IT staff to be able to do their business-as-usual job while moving to the cloud. A dedicated team should be created that will focus on the challenges to come, overcome them, and succeed.

Inability to move or link legacy applications - Focus on the applications that provide the maximum benefit for the minimum cost/risk. Assess legacy application compatibility and how much re-work is needed for their migration. Prioritize which applications to migrate to the cloud and in which order.

Understanding the SLA - Small business owners usually do not have much experience with these types of agreements, and failing to review them fully might open up big problems in the future. The business impact of the SLA must be carefully considered and analyzed. Close attention should be paid to the availability guarantees and penalty clauses:

o Does the availability fit in with the organization's business model?
o What do you need to do to receive the credits when the hosting provider fails to achieve the guaranteed service levels?
o Are the credits processed automatically, or do you need to request them in writing?
Usually cloud providers have one SLA for all users and do not provide customization of the SLA. All of the above considerations must be evaluated carefully before moving to a cloud based solution, in order to mitigate the risk and be confident of choosing the right cloud services to support and ensure the growth of the business.
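To make an availability guarantee concrete, it helps to translate it into the downtime it actually permits. A minimal sketch, assuming a 30-day month; the SLA percentages are examples, not any particular provider's terms:

```python
# Translate an SLA availability percentage into the maximum downtime
# it allows per month (assuming a 30-day month of 43 200 minutes).
def max_monthly_downtime_minutes(availability_percent: float) -> float:
    minutes_per_month = 30 * 24 * 60
    return minutes_per_month * (1 - availability_percent / 100)

if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.95, 99.99):
        print(f"{sla}% availability -> "
              f"{max_monthly_downtime_minutes(sla):.1f} min/month")
```

A 99.9% guarantee, for instance, still permits about 43 minutes of outage per month; whether that fits the organization's business model is exactly the question raised above.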

Proof of concept created

Once the cloud assessment is complete and the possible candidates are identified, it is time to test the cloud solution with a small proof of concept. The main goal here is to learn the chosen cloud provider/solution and to test whether the assumptions regarding suitability for migration to the cloud are feasible and accurate. At a minimum you should get familiar with the APIs, tools, SDKs and plugins that the cloud provider offers. It is a good idea at this stage to deploy some small application as a test and, in the process, get really involved in the cloud; this can be easily achieved, as most cloud providers offer a limited free account or a free trial period. The proof of concept should represent the real application in small: it should test its critical functionalities in the cloud environment. You should start with a small database, and users should not be afraid to play around with the offered possibilities, for example launching and terminating instances. In order to gather all the necessary benchmarks, stress testing of the cloud system should be included here too. While building the proof of concept there is a lot that can be learned about the capabilities and applicability of the chosen cloud solution, and it can quickly broaden the set of applications that can be migrated. The proof of concept should raise awareness of the power of the cloud within the organization, and it can help to set expectations, validate the technology and perform the necessary benchmarks. It provides hands-on experience with the new cloud environment and gives more insight into the challenges you might face and need to overcome in order to move ahead with the migration. Possible challenges and risks might be:

Unclear and misunderstood requirements,

Lack of an effective methodology to build the correct proof of concept - building an inappropriate proof-of-concept test that is missing some of the key functionalities

and/or is using data that is too simple and does not correspond to the real production data; this might be one of the biggest risks for the project.

Poor estimation and failure to perform all the needed activities,

Not documenting the lessons learned - capturing the lessons learned in the form of a whitepaper or a presentation and sharing it within the company is one of the most important deliverables.

Data migration completed

First and most important, the different available storage options should be carefully evaluated. There are several points that have to be considered to make sure the solution will meet the need for easily scaling the applications: cost, query ability, relational SQL, size of the objects, update frequency, read vs. write ratio, and consistency (strict vs. eventual). All these points have to be taken into consideration and the right tradeoffs have to be made. Creating a table of the available storage options, with examples of what each can be used for, can be very beneficial; an example of such a table, Table 3, is presented in the Appendix.

At this point it should be decided how the data will be migrated: whether it will be migrated to a cloud native solution (for example, a MySQL implementation can be migrated to Amazon RDS or Google Cloud SQL, MS SQL to SQL Azure, etc.) or to a VM preloaded with the desired product (Oracle, MS SQL, DB2, etc.). In the case where a cloud native solution is chosen, there might be a need to develop a new database architecture specific to that cloud solution. (For example migrating to will require complete re-engineering of the database.) In the case of migration of a large amount of data (multiple terabytes), options such as the import/export services that some cloud providers offer should be considered. For example Amazon, with the AWS Import/Export service, offers the ability to load the data on USB 2.0 or eSATA storage devices and ship them via a carrier to AWS.
AWS then uploads the data into the designated buckets in Amazon S3. [41]
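A back-of-the-envelope calculation shows why shipping physical media can beat a network upload for multi-terabyte migrations. The data size and bandwidth figures below are assumptions for illustration, not measurements:

```python
# Rough illustration: time to upload a multi-terabyte database over a
# network uplink, assuming full sustained throughput (optimistic).
def transfer_days(data_tb: float, uplink_mbps: float) -> float:
    """Days needed to move `data_tb` terabytes at `uplink_mbps` Mbit/s."""
    bits = data_tb * 1e12 * 8              # decimal terabytes -> bits
    seconds = bits / (uplink_mbps * 1e6)   # bits / (bits per second)
    return seconds / 86_400                # seconds -> days

if __name__ == "__main__":
    # 5 TB over a 100 Mbit/s uplink: about 4.6 days of continuous transfer
    print(f"{transfer_days(5, 100):.1f} days")
```

Real transfers are slower still (protocol overhead, shared links, retries), which is why carrier-shipped devices are often the faster option at this scale.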

Another very important point at this time is to set up the backups and the retention period for the already migrated data, and also to consider the possibility of moving backups that are done on tape to cloud based storage.

Data migration challenges and risks include:

Failure to identify the right storage option,

Failure to implement a good migration strategy - the time needed to move an existing workload to the cloud is usually underestimated and overlooked, as are the bandwidth cost of moving a large amount of data to the cloud provider, the time taken to transfer the data, and the business processes involved in the migration.

Cloud tested and leveraged

After the data migration is complete, the data is successfully set up and working in the cloud, and tests have been run confirming that everything works, it is necessary to invest some time and resources in determining how to draw additional benefits from the cloud. The following questions can be asked: What needs to be changed in order to leverage and implement the scalability and elasticity that the cloud offers? What processes can be automated for easier management and maintenance? What steps should be taken to secure the organization in the event of failure?

Even though the data is migrated to the cloud, you still have the responsibility for securing it. Security best practices should always be implemented:

Passwords should be changed on a regular basis;
Users should have restricted access to the resources;
Users and groups with different access privileges should be created;
It is advisable to encrypt the data, no matter if it is at rest (e.g. AES) or in transit (SSL/TLS).
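On the in-transit side, Python's standard ssl module illustrates the kind of secure-by-default configuration to insist on before wiring a database driver's connection through it. This is a minimal sketch, not tied to any particular provider or driver:

```python
import ssl

# Build a client-side TLS context with secure defaults: certificate
# verification and hostname checking are enabled, so plaintext or
# unverified connections to a cloud database endpoint are refused.
context = ssl.create_default_context()

# Confirm the secure defaults are in force before passing the context
# to a driver's connect() call.
assert context.verify_mode == ssl.CERT_REQUIRED
assert context.check_hostname is True
```

Encryption at rest (e.g. AES) is handled either by the provider's storage-level encryption options or by encrypting in the application before writing.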

With the move into the cloud arena it is necessary to revise the software development lifecycle and the upgrade process already in place. With the possibility of requesting infrastructure minutes before it is needed, and with a scriptable environment, the software deployment process can be fully automated. The development, testing, staging and production environments can be managed by creating re-usable configuration tools and launching specific VMs for each environment on demand. The upgrade process can also be automated and simplified: with the cloud at hand there is no need to upgrade the software version on the old machines; instead, new pre-configured instances can be launched and the old ones thrown away. This also gives the opportunity for a quick rollback, minimizing the downtime in case of upgrade problems.

It is highly advisable to create a Business Continuity Plan as part of this deliverable. The business continuity plan should include:

A data replication strategy for the databases
A data backup and retention policy
Using VMs with the latest patches applied
A disaster recovery plan in the cloud
A disaster recovery plan to fail back to an in-house or corporate center

Smaller organizations usually do not have a disaster recovery plan in place because it is prohibitively costly to maintain separate hardware or a separate datacenter for disaster recovery. With the use of virtualization and data snapshots, the cloud makes the implementation of a disaster recovery plan noticeably less expensive and much simpler. The process of launching cloud resources can be fully automated, so that the entire cloud environment can be brought up within a couple of minutes.

Potential challenges and risks here may include:

Discovering that a substantial amount of refactoring and decomposing of the application needs to be done to make it more scalable,
Lack of understanding of the cloud,
Failure to realize the importance of creating a business continuity plan.
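The launch-new, throw-away-old upgrade pattern described in this deliverable can be sketched as a toy traffic switch. The environment names are invented; a real setup would switch via the provider's load balancer or DNS:

```python
# Toy sketch of the launch-new/discard-old upgrade pattern: instead of
# upgrading machines in place, a freshly configured environment is
# launched and traffic is switched over; rollback is switching back.
class TrafficRouter:
    def __init__(self, live_env: str):
        self.live_env = live_env

    def switch(self, new_env: str) -> str:
        """Point traffic at new_env; return the previous environment."""
        previous, self.live_env = self.live_env, new_env
        return previous

router = TrafficRouter("app-v1-instances")
old = router.switch("app-v2-instances")  # upgrade: traffic to new VMs
# If the upgrade misbehaves, rollback is a single switch back:
router.switch(old)
```

The old instances are kept only until the new environment is confirmed healthy, which is what makes the quick rollback mentioned above possible.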

Documentation created

In order to achieve efficient planning, operation and reporting, every solution requires reliable and understandable documentation. The goal is to create central documentation for the implemented cloud solution. One of the main challenges is to preserve the knowledge gathered during the project and to keep it up to date. The information stored in the documentation is needed in the operation and in the continuous improvement and optimization of the application. The documentation can include on-line help, user guides, whitepapers and quick reference guides. The usual challenges and risks during this activity are:

Failure to understand the importance of clear and complete documentation,
Lack of user commitment and willingness to write documentation.

Training provided

Training involves imparting knowledge of the implemented solution to the users before the system goes into live operation. This activity entails defining business processes for the respective roles and defining business scenarios to suit these processes; these scenarios enable the users to understand the system functionality better. Training should also be provided for the application and database administrators: they will need to learn the tools and the automation possibilities that the specific cloud provider offers. As mentioned earlier, the cloud changes a lot of things from the point of view of management and maintenance of the system. In order to fully exploit all the possibilities that the cloud promises, the IT support staff should be properly educated.

Potential challenges and risks can be:

Ineffective communication,
Lack of user commitment,
Lack of understanding of the cloud and its benefits,
Failure to provide proper training for the decision makers - decision makers must be able to understand the cloud and the functional and management changes that it brings in order to buy into the solution,
Resistance to change,
Conflicts between departments.

Operation switched over/cutover to the cloud

This means going live, from the in-house solution to the new cloud based one. With this, the cloud database and application become operational in the live environment. This includes the final migration of the live data from the old system to the new cloud based system. Typical challenges and risks that might be present:

Underestimating the time needed to move an existing workload to the cloud,
The time taken to transfer the data, and the business processes involved in the migration, resulting in prolonged downtime,
Lack of testing with the whole production data set transferred to the cloud,
Lack of business readiness,
Failure to educate the IT support staff in time.

Monitoring set and optimization complete

Proper optimization of the cloud based solution can bring an immediately visible improvement in cost savings. With the pay-for-what-you-use model in place, you should always strive to optimize the system in whatever way possible: a small optimization can result in a large amount of savings on next month's bills. [42] To achieve this you should:

Understand your usage patterns - With the cloud's ability to create an automated elastic environment, and a good understanding of the usage patterns, you can easily scale down your infrastructure during inactive time periods and reduce costs. For example, with proper monitoring and log inspection you can easily identify underutilized instances and eliminate them, or scale them down to a smaller and cheaper VM instance instead.

Improve efficiency and reduce waste during deployment - As all cloud providers charge based on traffic, compressing the data before transmitting it can result in significant cost savings.

Evaluate whether you have all the cloud-aware system administration tools required for the management and maintenance of the database and application.

Implement advanced monitoring - Proper monitoring gives the must-have visibility into the business critical applications and services. It is important to keep in mind that the end-user response time of databases and applications in the cloud does not depend only on the cloud infrastructure; various factors such as internet connectivity, browsers and third party services, just to name a few, can have a significant impact. Measuring and monitoring the performance of your cloud applications can help you identify performance issues and diagnose their root causes, so that appropriate actions can be taken.
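The saving from compressing data before transfer can be measured directly with Python's standard gzip module. The payload below is a contrived, highly repetitive sample, so the ratio is far better than typical; real savings depend on how compressible the data actually is:

```python
import gzip

# Illustrative sketch: measure the size reduction from compressing data
# before transmitting it to a traffic-billed cloud provider.
payload = b"2013-01-01,order,42,EUR,completed\n" * 10_000
compressed = gzip.compress(payload)

ratio = len(compressed) / len(payload)
print(f"{len(payload)} bytes -> {len(compressed)} bytes "
      f"({ratio:.1%} of original)")
```

Since providers bill on bytes transferred, a measurement like this on a representative data sample turns the "compress before transmitting" advice into a concrete cost estimate.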
