Proc. of Int. Conf. on Multimedia Processing, Communication & Info. Tech., MPCIT

Hosting Transaction Based Applications on Cloud

A.N. Diggikar 1, Dr. D.H. Rao 2
1 Jain College of Engineering, Belgaum, India. Email: vm4anand@gmail.com
2 Dean, Faculty of Engineering, Visvesvaraya Technological University, Belgaum, India. Email: dr.raodh@gmail.com

Abstract — Cloud computing is an established and accepted paradigm of computing in both industry and academia. Its primary success factors include elasticity of computing resources and a flexible payment model. The spectrum of challenges created by this computing model is wide. One such challenge is to determine the types of software applications that can be hosted on the compute cloud. Traditionally, applications have been broadly classified into analytical and transaction-oriented systems, and the features of the compute cloud are more inclined to facilitate the hosting of analytical systems. In this paper we analyse the support for hosting transaction-oriented applications on the compute cloud. We propose a model that identifies the components required to enable transactional applications on the cloud. The model comprises well-defined components that help designers make important design decisions, and it can be considered a pattern for hosting cloud based applications.

Index Terms — cloud computing, distributed transactions, data store, data distribution

I. INTRODUCTION

The cloud computing paradigm has revolutionised the way in which Information Technology giants like Google, Amazon, and Yahoo do business. This revolution is similar to that of the electric grid, which liberated corporations from generating electricity on their own and let them focus on their business goals [1]. Data and programs are moved away from personal computers and corporate infrastructure into the cloud. The compute cloud paradigm provides a variety of resource pools, including storage and servers, which are offered as a service.
Any computer with internet access can avail the service. A subscriber can access the service when required, unsubscribe when it is no longer needed, and pay only for the time for which the service was used. This on-demand subscription to computing resources, without having to buy any hardware or software, together with a flexible payment model, makes the compute cloud a successful paradigm. To illustrate the impact of the compute cloud: The New York Times converted 4 TB of data containing images of articles into sorted PDF format, made available online, in just 24 hours for $300 using cloud computing services [3]. The services offered by the cloud can be at different levels of abstraction, namely Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS provides hardware as a service, including network, servers, and storage. PaaS embeds, along with the hardware, software such as the operating system and middleware. In SaaS, cloud service providers maintain everything, including the data. As we traverse from IaaS to SaaS, risk and control shift from users to cloud service providers, as shown in Fig. 1 [2]. The highlighted parts of the figure indicate the parts that are under the control of the cloud service providers.

[Figure 1: Shift of Control and Risk in the Compute Cloud. For an organization's dedicated IT infrastructure, IaaS, PaaS, and SaaS, the stack of data, applications and environments, servers, storage, and network is shown, divided between control with the organization and control with the service providers.]

DOI: 03.AETS.2013.4.86 © Association of Computer Electronics and Electrical Engineers, 2013

The important characteristics that motivate enterprises to host applications on the compute cloud are:
Elasticity: ensures that applications deployed in the cloud can handle spikes in user requests for resources on demand. For example, in Amazon EC2 a server can be added in minutes when required.
Zero maintenance: enterprises maintaining internal IT infrastructure invest time and money in running upgrade routines and applying service packs, which is taken care of by the cloud service providers [4]. This allows enterprises to focus on their core competencies and leaves maintenance to the service providers.
Reliability: the compute cloud ensures higher reliability, as service providers have redundant data centres for backup and on-demand access to resources to handle failures. Risk is also transferred to the provider, who is better equipped to manage security with dedicated expert staff, leading to enhanced reliability.
Efficient pricing: usage-based pricing allows enterprises to start small, with an option to scale up as requirements increase. This avoids a huge initial capital investment to set up the infrastructure. Service providers maintain data centres in locations with lower costs for cooling, electricity, taxes, property, and labour, which ensures efficient pricing.
These characteristics of the compute cloud attract enterprises to deploy their applications on the cloud to exploit the advantages it offers. However, it has been observed that the cloud cannot support all types of applications. A study of data management tools on the cloud suggests that it is difficult to deploy transaction-oriented systems and that the cloud is more suitable for analytical and batch processing systems. 
Currently the cloud is more suitable for applications that use a Shared-Nothing (SN) architecture, comprising multiple nodes each with its own input and output devices, memory, and disks [5]. Hence, to support transaction-oriented systems on the cloud, the distributed database must implement the SN architecture. We can consider two data management options for deploying a distributed database on the cloud: one is to use traditional relational databases, the other is to implement the distributed database using available cloud data management services. The relational databases from Microsoft SQL Server, Sybase, IBM, and Oracle either do not implement the SN architecture or are suited to data warehousing systems. Prominent cloud data management services include Google's Bigtable [6], Amazon's SimpleDB [7], and PNUTS from Yahoo [8]. The design choices of these cloud data management tools include the SN architecture, but they were developed for the providers' internal systems, with availability of the system given higher priority than data consistency. Transaction-oriented systems, by contrast, prioritise strong data consistency with support for the atomicity, consistency, isolation, and durability (ACID) properties; however, none of the cloud data management tools supports the ACID properties. Hence the study on data management techniques concludes that neither traditional databases nor cloud data management services are suitable for supporting transaction-oriented systems [1]. In this paper we propose a model that identifies the components required to support transactions on the compute cloud. We provide the important design choices for these components, which will help enterprises design a transaction-oriented system on the cloud. The design choices ensure the ACID properties in the cloud while also ensuring that the system can leverage the advantages of the compute cloud. 
The design choices for the components combine established principles of traditional database systems with distributed data management principles, to deal with the drawbacks of cloud data management services.

II. RELATED STUDY

With the advent of cloud computing, many data management services evolved to leverage the cloud's characteristics. Important services include Google's Bigtable, Amazon's SimpleDB, and Yahoo's PNUTS. The Bigtable data model is a multidimensional sorted map indexed by row key, column key, and timestamp, as {row: string, column: string, timestamp}. Read and write operations are based on row keys, which are sorted. Tables are horizontally and dynamically partitioned into tablets, which store an
ordered range of row keys. Simple operations such as insert, delete, and update are possible on a single row using APIs. The APIs support only single-row transactions, which means multi-row update operations are not possible; hence Bigtable cannot provide atomic distributed transactions. As Bigtable was created for internal applications at Google, it is ideal for massively parallel computations like MapReduce rather than for distributed transactions. It uses a distributed lock service called Chubby to control data replication and to coordinate the different components of Bigtable [9], and it uses the Paxos algorithm to ensure consistency among replicas [10]. The Chubby lock service is the critical part of the system, holding its control and coordination information, and its failure will bring down the entire system [6]. Amazon's SimpleDB is designed to support other Amazon services such as EC2 (servers on demand) and S3 (storage on demand), which form the Amazon cloud infrastructure [7]. Tables of data are composed as domains, which allow a column to hold multiple values, and APIs are provided for operations on the data in domains. All values are stored as strings, the only supported data type. Replicas are managed by different clusters and are updated asynchronously; this eventual consistency model is not suitable for enabling transactions on the cloud. SimpleDB runs on top of Amazon's infrastructure, which means the database is not portable to other database service providers on the cloud [4]. Yahoo's PNUTS [8] was created to aid its internal applications. The database is horizontally partitioned, with each partition replicated in different regions. It uses the Yahoo Message Broker for asynchronous replication management that keeps all replicas consistent. A record-level mastering scheme is used for the consistency model, which means that only one region at a time holds the master record to serve data requests. 
It allows record mastership to migrate based on request load. PNUTS has three components, namely storage units, routers, and a controller. A storage unit stores data and serves data requests. The router holds the configuration details of record distribution across the different regions; a B+ tree data structure stores the mapping of record to region based on the primary key. The controller is used for recovery on failure of storage units, and it enables load balancing between the different regions. To summarise, most cloud data service providers have opted for a data model that enables horizontal partitioning, with data stored as key-value pairs and accessed by row key. The design choices prioritise availability, scalability, and replication, but relax the strict constraints of the relational model. All have opted for an eventual consistency model, and the APIs provide data access based on a single row key and cannot support transactions that involve more than one row key. Das et al. [11] have proposed a scalable system that supports transactions on the cloud. In this paper we propose a model that illustrates the fundamental blocks of a system that requires transaction support on the cloud. The important challenges to be addressed are: the data structure; partitioning and distribution of application data; managing data requests; and coordination among the components for load balancing and recovery.

III. SYSTEM DESIGN

A. Data model

Cloud based applications demand a simple data representation for storing and accessing data through a programming interface, and the data model must enable simple partitioning of the application data. We opt for a data model similar to Google's Bigtable, with application data stored as key-value pairs. The reason behind this choice is that we must group the data into partitions such that transactions are confined to a single region. 
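The Bigtable-like data model adopted here can be sketched as follows. This is a minimal, hypothetical Python illustration (the Tablet and Table classes and their methods are our own invention, not Bigtable's API or the model's actual implementation): a sorted multidimensional map indexed by (row, column, timestamp), horizontally partitioned into tablets by row-key range, so that rows grouped under one key range stay within a single partition.

```python
from collections import defaultdict

class Tablet:
    """Stores an ordered range [start_key, end_key) of row keys.

    Cells are indexed by (row, column, timestamp), mirroring the
    multidimensional sorted map described above.
    """
    def __init__(self, start_key, end_key):
        self.start_key, self.end_key = start_key, end_key
        # row -> column -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def owns(self, row_key):
        return self.start_key <= row_key < self.end_key

    def put(self, row_key, column, timestamp, value):
        # Single-row operation: atomic within one tablet only,
        # which is why multi-row transactions need extra machinery.
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key, column):
        versions = self.rows[row_key][column]
        if not versions:
            return None
        return versions[max(versions)]  # latest timestamp wins

class Table:
    """Horizontally partitions rows into tablets by key range."""
    def __init__(self, split_keys):
        # split_keys define tablet boundaries, e.g. ["g", "p"]
        bounds = [""] + list(split_keys) + ["\uffff"]
        self.tablets = [Tablet(a, b) for a, b in zip(bounds, bounds[1:])]

    def locate(self, row_key):
        for t in self.tablets:
            if t.owns(row_key):
                return t
        raise KeyError(row_key)

table = Table(split_keys=["g", "p"])
table.locate("alice").put("alice", "balance", 1, "100")
print(table.locate("alice").get("alice", "balance"))  # -> 100
```

Grouping related row keys into the same key range means a transaction over those rows touches a single tablet, which is exactly the locality the partitioning scheme aims for.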
This is possible because most applications have good spatial and temporal locality when there is an effective data organisation scheme [12].

B. Dynamics

Application data requests are directed to the TP system, which consists of many Transaction and Data Managers (TDMs) and a Central Coordination Service (CCS) to enable distributed transactions. Each TDM contains a partition of the database. The application data is persisted on cloud storage for backup and recovery. The CCS coordinates and monitors all the components of the system: it organises the metadata related to application tables, and monitors and coordinates the mapping of partitions to TDMs. All the components of the model are set up in the compute cloud.

C. Persistent Data Store (PDS)

The responsibility of the PDS is to store the application data and logs persistently. There are many alternatives for distributed file storage, among them Google's GFS [13], Apache Hadoop HDFS [14],
and Amazon S3. One advantage of these cloud storage systems is that they are designed to handle replication and recovery from failures. Although there are backups of the system data across different regions, the model will have only one active region.

D. Transaction and Data Manager (TDM)

Figure 2: Overview of the model.

The database is partitioned across TDMs, each of which handles the operations on the data partition it stores. A TDM is a combination of a data store and a transaction manager. The transaction manager sits on top of a data store such as Bigtable or HBase [15] to execute transactions on the partition allocated to it. The TDM is also responsible for transaction logging and concurrency control during the execution of transactions. To reduce the latency of distributed transactions during the commitment protocol, an effective logging technique such as force logging or neighbour main-memory logging can be used [16]. Each TDM stores the mapping of partitions to TDMs obtained from the CCS, in order to identify the TDMs responsible for handling transaction requests.

E. Central Coordination Service (CCS)

This component maintains the configuration and metadata information required to coordinate all the components of the model. The critical mapping and metadata information is referred to as the system state, which keeps all the other components of the system consistent. The CCS ensures that all TDMs have an updated copy of the mapping information. It carries out regular maintenance operations to detect failure of components and initiates recovery in the event of failure. On failure of a TDM, it creates a new TDM, allocates the partition of the failed TDM to it, and informs the other TDMs. An effective data distribution will reduce the latency of distributed transactions; to ensure this, the Improved Range data distribution and an online migrating algorithm can be used [17]. 
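The CCS failure-handling behaviour just described can be sketched as follows. This is a minimal illustration under our own assumptions: the class names mirror the paper's TDM and CCS components, but the methods (heartbeat_check, receive_mapping) and the details of provisioning are invented for illustration, and the actual log-based data recovery from the PDS is elided.

```python
import itertools

class TDM:
    """Transaction and Data Manager: holds one partition and a
    local copy of the partition-to-TDM mapping pushed by the CCS."""
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.mapping = {}  # partition id -> TDM name

    def receive_mapping(self, mapping):
        self.mapping = dict(mapping)

class CCS:
    """Central Coordination Service: owns the authoritative system
    state; not on the transaction execution path."""
    _ids = itertools.count(1)

    def __init__(self, partitions):
        self.tdms = {}
        self.mapping = {}
        for pid in partitions:
            self.mapping[pid] = self._new_tdm().name
        self._broadcast()

    def _new_tdm(self):
        # Stand-in for provisioning a server via the cluster
        # management system.
        tdm = TDM(f"tdm-{next(self._ids)}")
        self.tdms[tdm.name] = tdm
        return tdm

    def _broadcast(self):
        # Ensure every live TDM holds the updated mapping.
        for tdm in self.tdms.values():
            if tdm.alive:
                tdm.receive_mapping(self.mapping)

    def heartbeat_check(self):
        """Detect failed TDMs, reassign their partitions to newly
        created TDMs, and push the updated mapping to all TDMs
        (recovery of data from logs and the PDS is elided)."""
        for pid, name in list(self.mapping.items()):
            if not self.tdms[name].alive:
                self.mapping[pid] = self._new_tdm().name
        self._broadcast()

ccs = CCS(partitions=["p0", "p1"])
ccs.tdms[ccs.mapping["p0"]].alive = False  # simulate a TDM crash
ccs.heartbeat_check()                      # p0 is reassigned
```

Because the CCS only maintains and distributes the mapping while the TDMs serve the transaction requests directly, a slow or busy CCS does not sit on the latency path of individual transactions.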
The Chubby lock service could be part of the CCS, but it uses the Paxos consensus algorithm, which needs 2F+1 servers to tolerate the failure of F servers [18]. The main advantage of the CCS is that it is not involved in transaction execution; transaction requests are forwarded directly to the TDMs, so the CCS is decoupled from the operations of the TDMs and can manage the system efficiently. A cluster management system is a base component that helps set up the cluster of servers on which the other components are built. The CCS builds on the services of the cluster management system to communicate with and monitor the TDMs in the cluster.

F. Transaction Management

In this section we discuss how to ensure ACID transactions on the cloud. Atomicity of transactions is determined by the commitment protocol; an optimised two-phase commit protocol, such as the one in the ClustRa Telecom Database [16], is suitable for distributed transactions. Backup TDMs can be used to store log statements during transactions to reduce the latency of the commitment protocol; they store logs to speed up transactions and also help with recovery in case a TDM fails during transaction execution. Since the TDMs are in the cloud, communication between nearby backup TDMs can ensure atomicity using the transaction logs. The consistency of a transaction, which ensures referential integrity constraints, can be handled by the application logic and hence is not part of our model. The isolation property ensures that updates are not affected by transactions operating on the same data simultaneously. In the design of our model we adopt timestamp ordering for concurrency control, where the transaction with the older timestamp gets priority
[19]. Durability is ensured in our model by writing to the persistent data store after completion of update transactions; on failure of TDMs, durability is assured by using the transaction logs in the backup TDMs during recovery. A recovery component is part of our model and ensures recovery of application data on failure of TDMs. Recovery in our model is based on the logging at the backup TDMs. The CCS detects the failure of a TDM and uses the configuration information and the services of the cluster management system to set up a new TDM. The CCS then recovers the data of the failed TDM by using the logs and the application data from the persistent data store. Finally, it propagates the updated mapping information for the new TDM to all the TDMs. The logging component is embedded in the TDMs, as it is part of the commitment protocol during transactions. Hence our model consists of the following components: the data model, persistent data store, TDMs, CCS, logging, commitment protocol, recovery, and the cluster management system. Regarding the implementation direction, we plan to use the open-source Hadoop framework, which facilitates cluster management and persistent data storage, along with the HBase distributed database, which is similar to Bigtable. The CCS and logging can be managed by ZooKeeper, which is part of the Hadoop project. We are designing a transaction based application that will be prepared as a case study to demonstrate the design and development of a cloud based application from the components illustrated in this paper.

IV. CONCLUSIONS

The proposed model consists of components that together ensure distributed transactions on the compute cloud. We have identified the set of components that are part of any system that needs to support the execution of distributed transactions on the cloud. We have briefly highlighted implementation directions that indicate the feasibility of our model, and we have highlighted design choices for most of the components. 
This paper will help in the design of transaction-oriented systems to be hosted on the compute cloud.

REFERENCES
[1] D. Abadi, "Data Management in the Cloud: Limitations and Opportunities," Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 32, no. 1, pp. 3-12, 2009.
[2] D. Blum, "Security and Risk Management," Burton Group, 2009. [Online]. Available: http://srmsblog.burtongroup.com/2009/06/cloud-computing-who-is-in-control.html. [Accessed 2013].
[3] D. Gottfrid, The New York Times, 1 Nov. 2007. [Online]. Available: http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/?_r=0. [Accessed 2013].
[4] G. Reese, Cloud Application Architectures. Sebastopol, CA: O'Reilly Media, Inc., 2009.
[5] M. Mehta and D. DeWitt, "Data placement in shared-nothing parallel database systems," The VLDB Journal, pp. 53-72, 1997.
[6] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes and R. E. Gruber, "Bigtable: A distributed storage system for structured data," ACM Transactions on Computer Systems (TOCS), vol. 26, no. 2, 2008.
[7] Amazon, "Amazon SimpleDB," 2009. [Online]. Available: http://aws.amazon.com/simpledb/. [Accessed August 2009].
[8] B. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver and R. Yerneni, "PNUTS: Yahoo!'s hosted data serving platform," in Proceedings of the VLDB Endowment, 2008.
[9] M. Burrows, "The Chubby lock service for loosely-coupled distributed systems," in Proceedings of the 7th Symposium on Operating Systems Design and Implementation, Seattle, 2006.
[10] T. D. Chandra, R. Griesemer and J. Redstone, "Paxos made live: an engineering perspective," in Proceedings of the Twenty-Sixth Annual ACM Symposium on Principles of Distributed Computing, Portland, Oregon, 2007.
[11] S. Das, D. Agrawal and A. El Abbadi, "ElasTraS: An Elastic Transactional Data Store in the Cloud," in USENIX HotCloud Workshop, 2009.
[12] G. Urdaneta, G. Pierre and M. van Steen, "Wikipedia workload analysis for decentralized hosting," Computer Networks: The International Journal of Computer and Telecommunications Networking, vol. 53, no. 11, pp. 1830-1845, 2009.
[13] S. Ghemawat, H. Gobioff and S.-T. Leung, "The Google file system," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29-43, 2003.
[14] D. Borthakur, "HDFS Design," The Apache Software Foundation, 2013. [Online]. Available: http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html. [Accessed 2013].
[15] Apache HBase, "Apache HBase Reference Guide," October 2013. [Online]. Available: http://hbase.apache.org/book.html. [Accessed 2013].
[16] S.-O. Hvasshovd, O. Torbjornsen, S. E. Bratsberg and P. Holager, "The ClustRa Telecom Database: High Availability, High Throughput, and Real-Time Response," in Proceedings of the 21st International Conference on Very Large Data Bases, San Francisco, CA, USA, 1995.
[17] W. Gong, L. Yang, D. Huang and L. Chen, "New Balanced Data Allocating and Online Migrating Algorithms in Database Cluster," in Advances in Data and Web Management, Berlin: Springer, 2009, pp. 526-531.
[18] Z. Wei, G. Pierre and C.-H. Chi, "Scalable Transactions for Web Applications in the Cloud," in Lecture Notes in Computer Science, Springer, 2009, pp. 442-453.
[19] P. A. Bernstein and N. Goodman, "Concurrency Control in Distributed Database Systems," ACM Computing Surveys (CSUR), vol. 13, no. 2, pp. 185-221, 1981.