Transactions Management in Cloud Computing
Nesrine Ali Abd-El Azim 1, Ali Hamed El Bastawissy 2
1 Computer Science & Information Dept., Institute of Statistical Studies & Research, Cairo, Egypt
2 Faculty of Computers and Information, Cairo University, Cairo, Egypt
[email protected]

Abstract
Cloud computing has emerged as a successful paradigm for web application deployment. Economies of scale, elasticity, and pay-per-use pricing are the biggest promises of the cloud. Database management systems serving these web applications form a critical component of the cloud environment. To serve thousands of varied applications and their huge amounts of data, these database management systems must not only scale out to clusters of commodity servers, but also be self-managing, fault-tolerant, and highly available. In this paper we survey and analyze the currently applied transaction management techniques, and we propose a paradigm according to which transaction management can be depicted and handled.

Keywords: Cloud computing, NoSQL, CAP theorem, Multi nodes access, Consistency.

1. Introduction
Cloud computing has emerged as an extremely successful paradigm for deploying web applications. The major reasons for the successful and widespread adoption of cloud infrastructures are scalability, elasticity, pay-per-use pricing, and economies of scale [2]. Since the majority of cloud applications are data driven, database management systems (DBMSs) are an integral technology component in the overall service architecture [3].
DBMSs are widely used in the cloud computing space because of the features they offer: overall functionality (modeling diverse types of applications using the relational model), consistency (dealing with concurrent workloads without worrying about data becoming out of sync), performance (both high throughput and low latency), and reliability (ensuring safety and persistence of data in the presence of different types of failures). In spite of this success, DBMSs are not cloud-friendly. Unlike other technology components for cloud services, such as web servers and application servers, which can easily scale from a few machines to hundreds or even thousands of machines, DBMSs cannot be scaled easily and often become the scalability bottleneck of the overall system [14, 16].

A new trend has arisen that abandons traditional DBMSs and instead develops new data management technologies referred to as not only SQL (NoSQL) databases. The main distinction is that in traditional DBMSs, all data within a database is treated as a whole, and it is the responsibility of the DBMS to guarantee the consistency of the entire data. In NoSQL systems, data is represented in different ways, such as key-value pairs [6, 7, 12, 19], where each entity is considered an independent unit of data and hence can be freely moved from one machine to another. Furthermore, the atomicity of application and user accesses is guaranteed only at the single-key level. These single-key semantics allow data to be less correlated. As a result, modern systems can tolerate the unavailability of certain portions of data while still providing reasonable service for the rest of the data.

Scalability has emerged as a critical requirement in cloud computing: the system capacity must be expandable by adding hardware resources whenever needed (scale out) without causing any interruption in the service. Cloud applications are deployed on top of cheap commodity machines for which failures are common; hence fault tolerance, high availability, eventual consistency, and ease of administration are essential features of data management systems in the cloud [1]. As a result, data management systems in the cloud should be able to scale out using commodity servers and be fault-tolerant, highly available, and easy to administer. Key-value stores provide these features, which makes them the preferred data management solution in the cloud.

The scalability and high availability of key-value stores, however, come at a cost. First, they allow data to be queried only by primary key and do not support join queries. Second, they provide only eventual consistency: any data update becomes visible after a finite amount of time. Despite this weak consistency property, key-value stores are suitable for a wide range of applications. However, many other applications, such as payment services, flight reservation, online auctions, web 2.0 applications, and collaborative editing, cannot afford any data inconsistency. These applications either have to fall back to traditional databases or rely on various ad hoc solutions [1, 2, 3]. It is now clear that neither key-value stores nor traditional databases can fit all types of applications [2, 3, 16]. As a result, there is a huge demand for a solution that can bridge the gap between scalable key-value stores (to deal with the consistency issue) and traditional database systems (to overcome the scalability problem).
In this paper, we describe the significant efforts made to bridge the data management gap following a proposed paradigm, and we also outline future research trends in this area. Section 2 provides a survey of the CAP theorem. Section 3 proposes our more detailed Multi layers Paradigm for Cloud Transaction Management, and Section 4 presents the Paradigm-based Systems Classification. Section 5 provides the Paradigm-based Systems Transactional Features, and Section 6 concludes the paper and outlines some research directions.

2. The CAP Theorem
Cloud-based applications need consistency, availability, and partition tolerance in order to work properly. However, the CAP theorem states that any distributed system can satisfy at most two of the following three properties: Consistency (all records are the same in all replicas), Availability (all replicas can accept updates or inserts), and Partition tolerance (the system still functions when distributed replicas cannot talk to each other). Therefore, when data is replicated over a wide area, the system has to choose between consistency and availability, as shown in Figure 1. If the consistency (C) part of CAP is relaxed, the system implements various forms of weaker consistency models (e.g. eventual consistency, timeline consistency), so that the replicas do not have to agree on the same value of a data item at every moment of time (e.g. PNUTS [7], ecStore [17, 20]). If the availability (A) part is relaxed, the system implements strict consistency (e.g. MySQL).
The CAP theorem clusters all types of applications according to the three previously mentioned properties [13]; however, we need to analyze systems according to other characteristics such as system type, data structure type, transaction execution type, and partitioning type.

Figure 1: CAP Theorem and different data models [13]. Vertices: C, A, P. Labels: relational data models (e.g. MySQL, Postgres); key-value / document-based stores (e.g. PNUTS, Dynamo, SimpleDB); column / graph-based data models (e.g. Bigtable, MongoDB).

3. Multi layers Paradigm for Cloud Transaction Management
After surveying the area of transaction management in the cloud, we suggest a four-layer paradigm that covers all the research efforts in the area, as shown in Figure 2.

The first layer depicts the system type, of which there are two: shared nothing and decoupled storage. In a shared nothing system, the persistent data is stored on disks locally attached to the nodes or DBMS servers. In a decoupled storage system, the persistent data is stored in network-addressable storage that is accessible from all the database nodes and is logically separate from the servers executing the transactions. In decoupled storage, the transaction execution logic (ACID) is decoupled from the storage logic (replication, caching, scalability, and fault tolerance).

The second layer depicts two data structure types: highly structured and less structured. Highly structured data stores (DBMSs) support a simple data model to store and manage data. This type has been extremely successful in classical enterprise settings due to its rich functionality, data consistency, high performance, high reliability, durability, and transactional access to data. Less structured data stores provide a mechanism for storing and retrieving huge amounts of data that uses looser consistency models than highly structured data stores in order to achieve horizontal scaling and higher availability. They may be schema free or support a simple, flexible data model.
They support simple functionality based on single-key operations (e.g. key-value stores).

The third layer depicts the transaction execution type, which can be classified as single node or multi nodes. In the single node type, transactions can be executed at a single node without the need for distributed synchronization and the Two Phase Commit Protocol (2PC). This type allows the system to scale out by horizontally partitioning the data. It also limits the effect of a failure to only the data served by the failed component and does not affect the operation of the remaining components, thus allowing graceful performance degradation in the event of failures. In the multi nodes type, transaction execution cannot be contained in a single node, so the execution of a transaction can span multiple servers. Multi nodes transactions use 2PC to permit atomic transaction commitment across the involved servers. The scalability of this type is weak due to (1) the waiting time to get the commit from all participant servers, and (2) the notification overhead. This is why we do not consider dynamic partitioning for multi nodes.

Figure 2: Multi-layers paradigm for cloud Transaction Management. Layers: system type (shared nothing, decoupled storage); data structure type (highly structured (DBMS), less structured); transaction execution type (single node, multi nodes); partitioning type (static, dynamic).

The fourth layer is concerned with the partitioning types, which are static and dynamic. Partitioning in general is a technique to scale out a database by splitting the individual tables among a cluster of nodes while providing transactional guarantees among these nodes. The challenge is therefore to partition the individual tables in such a way that most accesses are limited to a single partition.

In static partitioning, the system co-locates in a single partition the data items needed by the most frequent application accesses. The static partitions are then distributed among a cluster of nodes, and a transaction is allowed to access one and only one partition so that the system scales out efficiently. Many techniques can be applied for static partitioning, such as range partitioning [17], hash partitioning [7], entity (hierarchical) group [4], and schema-based partitioning [9, 10].

In dynamic partitioning, the system periodically analyzes the relationships between transactions and data items; based on this analysis, it relocates the data items to the nodes most suitable for these transactions. Many techniques can be applied for dynamic partitioning, such as the graph-based data technique [8] and the key group abstraction [11].
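The goal stated above, keeping most accesses within a single static partition, can be illustrated with a small sketch. It assumes a range-partitioned key space; the split points in `RANGE_BOUNDARIES` and the function names are hypothetical and are not taken from any of the surveyed systems.

```python
from bisect import bisect_right

# Hypothetical split points for a static range partitioning of string keys:
# partition 0 holds keys < "g", partition 1 holds "g" <= key < "n",
# partition 2 holds "n" <= key < "t", and partition 3 holds keys >= "t".
RANGE_BOUNDARIES = ["g", "n", "t"]


def range_partition(key: str) -> int:
    """Return the index of the static range partition that owns `key`."""
    return bisect_right(RANGE_BOUNDARIES, key)


def partitions_touched(keys) -> set:
    """Return the set of partitions a transaction over `keys` would access."""
    return {range_partition(k) for k in keys}


def is_single_partition(keys) -> bool:
    """True if every key of the transaction lives in the same partition."""
    return len(partitions_touched(keys)) == 1
```

Under these assumed boundaries, a transaction over the keys "apple" and "banana" stays in one partition, while one over "apple" and "zebra" touches two partitions and would therefore fall into the multi nodes execution type.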
In brief, our paradigm in Figure 2 is able to describe all transactional systems. Systems may use the classical shared nothing architecture or the decoupled storage architecture (first layer). In both cases, data can be modeled and managed using highly structured (DBMS) or less structured data stores (second layer). In both cases, the transactions may be constrained to run over data on one single node or allowed to run over data distributed on multiple nodes (third layer). The accessed data may be stored statically in some node(s) or may change its location periodically (fourth layer).

4. Paradigm-based Systems Classification
In this section we classify the research efforts according to the paradigm suggested above.

4.1 SHSS
SHSS systems share nothing and use a highly structured data store (DBMS) on a single node to access data partitioned statically at this node, as in [5].

4.2 SHSD
SHSD systems share nothing and use a highly structured data store (DBMS) on a single node to access data partitioned dynamically at this node. They combine a workload-aware approach for efficient data placement with a graph-based algorithm that automatically analyzes complex query workloads and the way in which transactions and data items relate to one another. The algorithm then maps data items to nodes so as to minimize the number of multi nodes transactions/statements by co-locating data items that are frequently accessed together, as in [8].

4.3 SHMS
SHMS systems share nothing and use a highly structured data store (DBMS) on multiple nodes to access data partitioned statically at these nodes.

4.4 SLSS
SLSS systems share nothing and use a less structured data store on a single node to access data partitioned statically at this node, as in [7].

4.5 SLSD
SLSD systems share nothing and use a less structured data store on a single node to access data partitioned dynamically at this node.
4.6 SLMS
SLMS systems share nothing and use a less structured data store on multiple nodes to access data partitioned statically at these nodes by using range partitioning. Range partitioning involves splitting the tables into non-overlapping ranges of their keys and then mapping the ranges to a set of nodes. It distributes tuples based on the value intervals (ranges) of some attribute. In addition to supporting exact-match queries, it is well suited for range queries, as in [17].

4.7 DHSS
DHSS systems use decoupled storage with a highly structured data store (DBMS) on a single node to access data partitioned statically at this node by using schema-based partitioning, which allows designing practical and meaningful applications while restricting transactional access to a single database partition. The rationale behind schema-level partitioning is that in a large number of database schemas and applications, transactions access only a small number of related rows, which can potentially be spread across a number of tables. This access or schema pattern can be used to group related data into the same partition while placing unrelated data in different partitions, thus limiting accesses to a single database partition, as in [9, 10].

4.8 DHSD
DHSD systems use decoupled storage with a highly structured data store (DBMS) on a single node to access data partitioned dynamically at this node, as in [15].

4.9 DHMS
DHMS systems use decoupled storage with a highly structured data store (DBMS) on multiple nodes to access data partitioned statically at these nodes, as in [15].

4.10 DLSS
DLSS systems use decoupled storage with a less structured data store on a single node to access data partitioned statically at this node by using the entity group technique. An entity group is essentially a hierarchical key structure that eliminates the need for most joins by storing data that is accessed together in nearby rows or denormalized into the same row. It consists of a root entity along with all entities in child tables that reference it, as in [4].

4.11 DLSD
DLSD systems use decoupled storage with a less structured data store on a single node to access data partitioned dynamically at this node by using the key group abstraction. This abstraction allows applications to select members of a group from any set of keys in the data store and to dynamically create (and dissolve) groups on the fly, while allowing the data store to provide efficient, scalable, and transactional access to these groups of keys.
At any instant of time, a given key can participate in a single group, but during its lifetime a key can be a member of multiple groups. Multi-key accesses are allowed only for keys that are part of a group, and only during the lifetime of the group. Groups are dynamic in nature. Groups are independent of each other, and transactions on a group guarantee consistency only within that group; transactions are not allowed across groups, as in [11].

4.12 DLMS
DLMS systems use decoupled storage with a less structured data store on multiple nodes to access data partitioned statically at these nodes by using the hash partitioning technique. In hash partitioning, the keys are hashed to the nodes serving them: a hash function is applied to some attributes and yields the partition number. This strategy allows exact-match queries on the selection attribute to be processed by exactly one node and all other queries to be processed by all the nodes in parallel, as in [18].
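The routing behaviour of hash partitioning described above can be sketched as follows. This is only an illustration under stated assumptions: the node names in `NODES`, the use of SHA-256 as the hash function, and the function names are hypothetical and do not describe how CloudTPS [18] or any other surveyed system is implemented.

```python
import hashlib

# Hypothetical cluster of nodes, one static hash partition per node.
NODES = ["node-0", "node-1", "node-2", "node-3"]


def hash_partition(key: str, num_partitions: int = len(NODES)) -> int:
    """Map a key to a partition number with a stable hash (SHA-256 here)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route_exact_match(key: str) -> list:
    """An exact-match query on the hashed attribute goes to exactly one node."""
    return [NODES[hash_partition(key)]]


def route_other_query() -> list:
    """Any other query (e.g. a range scan) is sent to all nodes in parallel."""
    return list(NODES)
```

A stable hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would send the same key to different nodes after a restart.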
Table 1: Cloud shared nothing systems based on our paradigm

Criteria | SHSS | SHSD | SHMS | SLSS | SLSD | SLMS

System Availability ( due replication and dynamic quorum) Fixed (due to geographic replication) System Scalability Fixed Dynamic Fixed Dynamic Dynamic Low because no separation of system state and application state (we must consult the nodes in BATON) System Reliability Atomicity Consistency Isolation Primary copy Supported at single node consistency guarantee Serializable (locks) Supported at single node consistency guarantee Serializable Replication Supported consistency guarantee Serializable Geographic (asynchronous) Supported Per record Time line Durability Log Log Log Yahoo Message Broker (YMB) Recovery After image Redo log Redo log Applicability Transaction Types Exchange Hosted Archive (EHA) and SQL Azure Non distributed short Not yet functioning Load Balancing Flexible (monitoring and prediction) Not yet functioning transactions Eventual Supported Eventual using BASE instead of 2PC Snapshot Snapshot Snapshot (MVOCC) Backup replicas Social applications Range and exact match Log Redo log Not yet functioning Unknown Log Redo operations for long term failure Web shop applications Multi keys Fixed Flexible Flexible

Replica Contents | Static | Dynamic | Static | Static | Static | Dynamic using two-tier replication mechanism
Replica Placement | Static | Dynamic | Static | Static | Static | Dynamic using shift key value scheme
Number of copies for each replica | Dynamic | Dynamic | Static | Dynamic | Static | Dynamic using self-tuning range histogram
Partitioning Technique | Row group or table group | Workload-aware approach with graph based | Any | Hash | Any static type | Range
CAP type | CP | CP | CP | AP | AP | AP
Systems Example | Cloud SQL Server [5] | Relational Cloud [8] | Not yet | PNUTS [7] | Not yet | ecStore [17]
Table 2: Cloud decoupled storage systems based on our paradigm

Criteria | DHSS | DHSD | DHMS | DLSS | DLSD | DLMS

System Availability System Scalability System Reliability Atomicity as long as the transaction component (TC) is active Limited due to single TC as long as the transaction component (TC) is active Limited due to single TC Dynamic through Synchronous replication Supported at single node Supported Supported Supported per entity group Consistency within single entity group Isolation OCC (Serializable) Locks (serializable) Locks (serializable) MVCC (Serializable) but depends on group leader Dynamic Supported per key group depends on underlying key value store OCC (Serializable) Durability Log Log Log Log Log Log Recovery Redo log Undo and redo log Applicability Transaction Types Load Balancing Replica Content Replica Placement Number of copies for each replica Partitioning Technique Cloud data intensive applications, payment applications that access single partition Not yet functioning Not high due to single TC Undo and redo log Not yet functioning Not high due to single TC Redo log Interactive online service per entity group Undo and redo log Collaboration based application, On line multi players Long Not high during server and network failure Linear Supported (2PC) Timestamp ordering (Serializable) Redo log Low Linear Static Static Static Static Static Static Dynamic Dynamic Dynamic Static near the partition Static Scheme based Static due to single TC Any technique Static due to single TC Any technique Dynamic On line book store 1. that access well identified data items 2. Range query for read only Static Static Dynamic Static Entity group Key group Consistent hashing

CAP Type | CP | CP | CA | CP | CP | CP
Systems Example | ElasTraS [9, 10] | Deuteronomy [15] | Deuteronomy [15] | Megastore [4] | G-Store [11] | CloudTPS [18]
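The twelve class codes that head the columns of Tables 1 and 2 follow mechanically from the four layers of the paradigm once dynamic partitioning is excluded for multi nodes execution, as stated in Section 3. The sketch below enumerates them; the dictionary names and the `class_codes` function are illustrative only.

```python
from itertools import product

# One letter per option in each layer of the paradigm (Figure 2).
SYSTEM_TYPES = {"S": "shared nothing", "D": "decoupled storage"}
DATA_STRUCTURES = {"H": "highly structured", "L": "less structured"}
EXECUTION_TYPES = {"S": "single node", "M": "multi nodes"}
PARTITIONING_TYPES = {"S": "static", "D": "dynamic"}


def class_codes() -> list:
    """Enumerate the four-letter class codes allowed by the paradigm."""
    codes = []
    for system, data, execution, partitioning in product(
        SYSTEM_TYPES, DATA_STRUCTURES, EXECUTION_TYPES, PARTITIONING_TYPES
    ):
        # Dynamic partitioning is not considered for multi nodes execution.
        if execution == "M" and partitioning == "D":
            continue
        codes.append(system + data + execution + partitioning)
    return codes


if __name__ == "__main__":
    print(class_codes())
```

Of the 2 x 2 x 2 x 2 = 16 combinations, the four that pair multi nodes execution with dynamic partitioning are dropped, which leaves the 12 classes used in Section 4.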
5. Paradigm-based Systems Transactional Features
Following the preceding paradigm-based classification, in this section we discuss the expected features of the shared nothing and decoupled storage system classes (Tables 1 and 2, respectively).

The shared nothing systems are classified (based on data structure type) into two groups: highly structured and less structured. The first group is further classified (based on transaction execution type) into two subgroups: single node transactions (SHSS, SHSD) or multi nodes transactions (SHMS). Single node transactions are well suited for OLTP and Web applications, which are characterized by short-lived transactions/queries with little internal parallelism. The main practical difficulty facing single node transactions is scaling to handle skewed workloads; a possible solution to this problem is partitioning the data in a way that minimizes multi nodes transactions. In this type, availability on unreliable commodity hardware can be maintained through replication. On the other hand, multi nodes transactions use two phase commit (2PC). The main disadvantages of SHMS are that sites block if the coordinator fails and that the large number of messages affects performance. In this type, availability is also achieved through replication.

The second group of the shared nothing systems is also classified into two subgroups based on transaction execution type: single node transactions (SLSS, SLSD) or multi nodes transactions (SLMS). Single node transactions focus on system availability and scalability with a weaker consistency guarantee and no serializable transactions. These systems are applicable to social applications, which require scalability, good response time for geographically dispersed users, and high availability, and which at the same time can tolerate relaxed consistency guarantees. On the other hand, multi nodes transactions (SLMS) focus primarily on atomicity and durability, with weaker isolation levels through optimistic multi-version concurrency control.
However, update transactions must always access the primary copy of the data, in both the read and write phases (based on the assumption that users are more likely to operate on their own data and that this data tends to be independent between concurrent transactions of different users, so that no information is shared between users). Another disadvantage is that there is no separation of system state and application state, which implies limited scalability. These systems are suitable for applications that might use cloud range storage (e.g. a web shop that wants to store listings sorted by their timestamps).

Similarly, the decoupled storage systems are classified (based on data structure type) into two groups: highly structured and less structured. The first group is further classified (based on transaction execution type) into two subgroups: single node transactions (DHSS, DHSD) or multi nodes transactions (DHMS). Single node transactions are well suited for DBMS applications; however, they are not suitable for collaborative applications (which require consistent access to a dynamically formed group of keys). In single node transactions, availability is achieved by replication, while scalability is achieved by partitioning the data in a way that minimizes multi nodes transactions.
On the other hand, multi nodes transactions use 2PC (which may affect performance, as mentioned earlier) with an ACID guarantee; they are suitable for OLTP applications on disjoint sets of data.

The second group of decoupled storage systems is also classified into two subgroups (based on transaction execution type): single node transactions (DLSS, DLSD) or multi nodes transactions (DLMS). Single node transactions focus on achieving high consistency by supporting transactional access to groups of keys on a single node and hence do not need 2PC. These groups of keys (partitions) are either static or dynamic, and groups are independent of each other. Transactions on a group guarantee consistency only within that group; therefore, transactions across partitions have to accept weaker consistency. These systems target applications whose data has a key-value schema, that require scalable and transactional access to non-disjoint groups of keys, and that perform a large number of operations (long transactions). On the other hand, multi nodes transactions use 2PC with an ACID guarantee as long as all transactions are short and span a relatively small number of well-identified data items (transactions are allowed to access any number of data items by primary key, but the list of primary keys must be given before executing the transaction). Other disadvantages include a temporary drop in throughput as well as a few aborted transactions as a result of server failures. In addition, DLMS is not suitable for applications that require strict consistency.

Finally, we expect that a system, regardless of its type (shared nothing or decoupled storage) and its data structure type (highly or less structured), designed with the single node transaction execution type and dynamic data partitioning (XXSD), would enhance transaction capability and scalability by supporting a wider class of applications that involve overlapping partitions or require queries that span multiple partitions.

6. Conclusion and Future Work
Transaction management in the cloud faces many challenges, which is why we proposed a paradigm that deals with system types, data structure types, transaction execution types, and partitioning types. In this paper, we presented the significant efforts made to bridge the data management gaps, we classified all systems into 12 types, and we expect that a system of type XXSD is able to enhance the current transaction features in the cloud.

References:
[1] D. J. Abadi, "Data Management in the Cloud: Limitations and Opportunities". IEEE Data Eng. Bull, vol 32(1), pp. 3-12,
[2] D. Agrawal, A. E. Abbadi, S. Das, and A. J. Elmore, "Database scalability, elasticity, and autonomy in the cloud". In Proceedings of the 16th international conference on Database systems for advanced applications, pp. 2-15,
[3] D. Agrawal, A. El Abbadi, S. Antony, and S. Das, "Data Management Challenges in Cloud Computing Infrastructures". In DNIS,
[4] J. Baker, C. Bond, J. C. Corbett, J. J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh, "Megastore: Providing scalable, highly available storage for interactive services", in Proc. CIDR, pp , Jan
[5] P. A. Bernstein, I. Cseri, N. Dani, N. Ellis, A. Kalhan, G. Kakivaya, D. B. Lomet, R. Manne, L. Novik, and T. Talius, "Adapting Microsoft SQL Server for Cloud Computing", in Proc. ICDE, pp ,
[6] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, "Bigtable: a distributed storage system for structured data", in Proc. OSDI, pp ,
[7] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni, "PNUTS: Yahoo!'s hosted data serving platform", Proc. VLDB Endow., vol(1), pp , August
[8] C. Curino, E. P. C. Jones, R. A. Popa, N. Malviya, E. Wu, S. Madden, H. Balakrishnan, and N. Zeldovich, "Relational Cloud: A Database as a Service for the Cloud", in Proc. CIDR, pp ,
[9] S. Das, D. Agrawal, and A. El Abbadi, "ElasTraS: an elastic transactional data store in the cloud", in Proc. USENIX HotCloud,
[10] S. Das, D. Agrawal, and A. El Abbadi, "ElasTraS: An elastic, scalable, and self managing transactional database for the cloud", Technical Report, CS UCSB, April
[11] S. Das, D. Agrawal, and A. El Abbadi, "G-Store: a scalable data store for transactional multi key access in the cloud", in Proc. ACM SoCC, pp ,
[12] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, "Dynamo: Amazon's highly available key-value store", SIGOPS Oper. Syst. Rev, vol(41), pp , Oct. 2007
[13] S. Gilbert and N. Lynch, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services". SIGACT News, vol(33), number 2, pp
[14] D. Kossmann, T. Kraska, and S. Loesing, "An evaluation of alternative architectures for transaction processing in the cloud", in SIGMOD Conference, pp ,
[15] J. J. Levandoski, D. Lomet, M. F. Mokbel, and K. K. Zhao, "Deuteronomy: Transaction support for cloud data", in Proc. CIDR, pp , Jan
[16] S. Sakr, A. Liu, D. M. Batista, and M. Alomari, "A survey of large scale data management approaches in cloud environments", IEEE Communications Surveys and Tutorials, vol(13), no. 3,
[17] H. T. Vo, C. Chen, and B. C. Ooi, "Towards Elastic Transactional Cloud Storage with Range Query Support", VLDB, Sept
[18] Z. Wei, G. Pierre, and C.-H. Chi, "CloudTPS: Scalable transactions for web applications in the cloud", Services Computing, IEEE Transactions, vol(99), 2011
[19] Amazon.com. Amazon SimpleDB.,
[20] Amazon.com. EC2 elastic compute cloud,
Data Consistency on Private Cloud Storage System
Volume, Issue, May-June 202 ISS 2278-6856 Data Consistency on Private Cloud Storage System Yin yein Aye University of Computer Studies,Yangon [email protected] Abstract: Cloud computing paradigm
A Distribution Management System for Relational Databases in Cloud Environments
JOURNAL OF ELECTRONIC SCIENCE AND TECHNOLOGY, VOL. 11, NO. 2, JUNE 2013 169 A Distribution Management System for Relational Databases in Cloud Environments Sze-Yao Li, Chun-Ming Chang, Yuan-Yu Tsai, Seth
MyDBaaS: A Framework for Database-as-a-Service Monitoring
MyDBaaS: A Framework for Database-as-a-Service Monitoring David A. Abreu 1, Flávio R. C. Sousa 1 José Antônio F. Macêdo 1, Francisco J. L. Magalhães 1 1 Department of Computer Science Federal University
Data Management in the Cloud
Data Management in the Cloud Ryan Stern [email protected] : Advanced Topics in Distributed Systems Department of Computer Science Colorado State University Outline Today Microsoft Cloud SQL Server
A Review of Column-Oriented Datastores. By: Zach Pratt. Independent Study Dr. Maskarinec Spring 2011
A Review of Column-Oriented Datastores By: Zach Pratt Independent Study Dr. Maskarinec Spring 2011 Table of Contents 1 Introduction...1 2 Background...3 2.1 Basic Properties of an RDBMS...3 2.2 Example
Joining Cassandra. Luiz Fernando M. Schlindwein Computer Science Department University of Crete Heraklion, Greece [email protected].
Luiz Fernando M. Schlindwein Computer Science Department University of Crete Heraklion, Greece [email protected] Joining Cassandra Binjiang Tao Computer Science Department University of Crete Heraklion,
Not Relational Models For The Management of Large Amount of Astronomical Data. Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF)
Not Relational Models For The Management of Large Amount of Astronomical Data Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF) What is a DBMS A Data Base Management System is a software infrastructure
SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford
SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems
Communication System Design Projects
Communication System Design Projects PROFESSOR DEJAN KOSTIC PRESENTER: KIRILL BOGDANOV KTH-DB Geo Distributed Key Value Store DESIGN AND DEVELOP GEO DISTRIBUTED KEY VALUE STORE. DEPLOY AND TEST IT ON A
Cloud Computing: Meet the Players. Performance Analysis of Cloud Providers
BASEL UNIVERSITY COMPUTER SCIENCE DEPARTMENT Cloud Computing: Meet the Players. Performance Analysis of Cloud Providers Distributed Information Systems (CS341/HS2010) Report based on D.Kassman, T.Kraska,
A Survey on Cloud Database Management
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 2 Issue 1 Jan 2013 Page No. 229-233 A Survey on Cloud Database Management Ms.V.Srimathi, Ms.N.Sathyabhama and
Slave. Master. Research Scholar, Bharathiar University
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper online at: www.ijarcsse.com Study on Basically, and Eventually
Consistency Models for Cloud-based Online Games: the Storage System s Perspective
Consistency Models for Cloud-based Online Games: the Storage System s Perspective Ziqiang Diao Otto-von-Guericke University Magdeburg 39106 Magdeburg, Germany [email protected] ABSTRACT The
Review of Query Processing Techniques of Cloud Databases Ruchi Nanda Assistant Professor, IIS University Jaipur.
Suresh Gyan Vihar University Journal of Engineering & Technology (An International Bi Annual Journal) Vol. 1, Issue 2, 2015,pp.12-16 ISSN: 2395 0196 Review of Query Processing Techniques of Cloud Databases
The Cloud Trade Off IBM Haifa Research Storage Systems
The Cloud Trade Off IBM Haifa Research Storage Systems 1 Fundamental Requirements form Cloud Storage Systems The Google File System first design consideration: component failures are the norm rather than
DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES
DATABASE REPLICATION A TALE OF RESEARCH ACROSS COMMUNITIES Bettina Kemme Dept. of Computer Science McGill University Montreal, Canada Gustavo Alonso Systems Group Dept. of Computer Science ETH Zurich,
NewSQL: Towards Next-Generation Scalable RDBMS for Online Transaction Processing (OLTP) for Big Data Management
NewSQL: Towards Next-Generation Scalable RDBMS for Online Transaction Processing (OLTP) for Big Data Management A B M Moniruzzaman Department of Computer Science and Engineering, Daffodil International
Evaluation of NoSQL and Array Databases for Scientific Applications
Evaluation of NoSQL and Array Databases for Scientific Applications Lavanya Ramakrishnan, Pradeep K. Mantha, Yushu Yao, Richard S. Canon Lawrence Berkeley National Lab Berkeley, CA 9472 [lramakrishnan,pkmantha,yyao,scanon]@lbl.gov
The Sierra Clustered Database Engine, the technology at the heart of
A New Approach: Clustrix Sierra Database Engine The Sierra Clustered Database Engine, the technology at the heart of the Clustrix solution, is a shared-nothing environment that includes the Sierra Parallel
Big Data, Deep Learning and Other Allegories: Scalability and Fault- tolerance of Parallel and Distributed Infrastructures.
Big Data, Deep Learning and Other Allegories: Scalability and Fault- tolerance of Parallel and Distributed Infrastructures Professor of Computer Science UC Santa Barbara Divy Agrawal Research Director,
Big Data & Scripting storage networks and distributed file systems
Big Data & Scripting storage networks and distributed file systems 1, 2, adaptivity: Cut-and-Paste 1 distribute blocks to [0, 1] using hash function start with n nodes: n equal parts of [0, 1] [0, 1] N
Introduction to NoSQL
Introduction to NoSQL NoSQL Seminar 2012 @ TUT Arto Salminen What is NoSQL? Class of database management systems (DBMS) "Not only SQL" Does not use SQL as querying language Distributed, fault-tolerant
Megastore: Providing Scalable, Highly Available Storage for Interactive Services
Megastore: Providing Scalable, Highly Available Storage for Interactive Services J. Baker, C. Bond, J.C. Corbett, JJ Furman, A. Khorlin, J. Larson, J-M Léon, Y. Li, A. Lloyd, V. Yushprakh Google Inc. Originally
Providing Scalable Database Services on the Cloud
Providing Scalable Database Services on the Cloud Chun Chen 1, Gang Chen 2, Dawei Jiang #3, Beng Chin Ooi #4, Hoang Tam Vo #5, Sai Wu #6, and Quanqing Xu #7 # National University of Singapore Zhejiang
Where We Are. References. Cloud Computing. Levels of Service. Cloud Computing History. Introduction to Data Management CSE 344
Where We Are Introduction to Data Management CSE 344 Lecture 25: DBMS-as-a-service and NoSQL We learned quite a bit about data management see course calendar Three topics left: DBMS-as-a-service and NoSQL
NoSQL Databases: a step to database scalability in Web environment
NoSQL Databases: a step to database scalability in Web environment Jaroslav Pokorny Charles University, Faculty of Mathematics and Physics, Malostranske n. 25, 118 00 Praha 1 Czech Republic +420-221914265
Benchmarking and Analysis of NoSQL Technologies
Benchmarking and Analysis of NoSQL Technologies Suman Kashyap 1, Shruti Zamwar 2, Tanvi Bhavsar 3, Snigdha Singh 4 1,2,3,4 Cummins College of Engineering for Women, Karvenagar, Pune 411052 Abstract The
Structured Data Storage
Structured Data Storage Xgen Congress Short Course 2010 Adam Kraut BioTeam Inc. Independent Consulting Shop: Vendor/technology agnostic Staffed by: Scientists forced to learn High Performance IT to conduct
F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013
F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords
these three NoSQL databases because I wanted to see a the two different sides of the CAP
Michael Sharp Big Data CS401r Lab 3 For this paper I decided to do research on MongoDB, Cassandra, and Dynamo. I chose these three NoSQL databases because I wanted to see a the two different sides of the
Challenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
Report Data Management in the Cloud: Limitations and Opportunities
Report Data Management in the Cloud: Limitations and Opportunities Article by Daniel J. Abadi [1] Report by Lukas Probst January 4, 2013 In this report I want to summarize Daniel J. Abadi's article [1]
CLOUD DATA MANAGEMENT: SYSTEM INSTANCES AND CURRENT RESEARCH
CLOUD DATA MANAGEMENT: SYSTEM INSTANCES AND CURRENT RESEARCH 1 CHEN ZHIKUN, 1 YANG SHUQIANG, 1 ZHAO HUI, 2 ZHANG GE, 2 YANG HUIYU, 1 TAN SHUANG, 1 HE LI 1 Department of Computer Science, National University
Survey on Comparative Analysis of Database Replication Techniques
72 Survey on Comparative Analysis of Database Replication Techniques Suchit Sapate, Student, Computer Science and Engineering, St. Vincent Pallotti College, Nagpur, India Minakshi Ramteke, Student, Computer
TECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED DATABASES
Constantin Brâncuşi University of Târgu Jiu ENGINEERING FACULTY SCIENTIFIC CONFERENCE 13 th edition with international participation November 07-08, 2008 Târgu Jiu TECHNIQUES FOR DATA REPLICATION ON DISTRIBUTED
CSCI 550: Advanced Data Stores
CSCI 550: Advanced Data Stores Basic Information Place and time: Spring 2014, Tue/Thu 9:30-10:50 am Instructor: Prof. Shahram Ghandeharizadeh, [email protected], 213-740-4781 ITS Help: E-mail: [email protected]
An Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
Data Distribution with SQL Server Replication
Data Distribution with SQL Server Replication Introduction Ensuring that data is in the right place at the right time is increasingly critical as the database has become the linchpin in corporate technology
How To Write A Key Value Database (Kvs) In Ruby On Rails (Ruby)
Persisting Objects in Redis Key-Value Database Matti Paksula University of Helsinki, Department of Computer Science Helsinki, Finland [email protected] Abstract In this paper an approach to
Towards secure and consistency dependable in large cloud systems
Volume :2, Issue :4, 145-150 April 2015 www.allsubjectjournal.com e-issn: 2349-4182 p-issn: 2349-5979 Impact Factor: 3.762 Sahana M S M.Tech scholar, Department of computer science, Alvas institute of
The CAP-Theorem & Yahoo s PNUTS
The CAP-Theorem & Yahoo s PNUTS Stephan Müller June 5, 2012 Abstract This text is thought as an introduction to the CAP-theorem, as well as for PNUTS, a particular distributed databased. It subsumes a
Scalable and Elastic Transactional Data Stores for Cloud Computing Platforms
UNIVERSITY OF CALIFORNIA Santa Barbara Scalable and Elastic Transactional Data Stores for Cloud Computing Platforms A Dissertation submitted in partial satisfaction of the requirements for the degree of
AN EFFECTIVE PROPOSAL FOR SHARING OF DATA SERVICES FOR NETWORK APPLICATIONS
INTERNATIONAL JOURNAL OF REVIEWS ON RECENT ELECTRONICS AND COMPUTER SCIENCE AN EFFECTIVE PROPOSAL FOR SHARING OF DATA SERVICES FOR NETWORK APPLICATIONS Koyyala Vijaya Kumar 1, L.Sunitha 2, D.Koteswar Rao
Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com [email protected]
Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com [email protected] How do you model data in the cloud? Relational Model A query operation on a relation
SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE SYSTEM IN CLOUD
International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol-1, Iss.-3, JUNE 2014, 54-58 IIST SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE
Report for the seminar Algorithms for Database Systems F1: A Distributed SQL Database That Scales
Report for the seminar Algorithms for Database Systems F1: A Distributed SQL Database That Scales Bogdan Aurel Vancea May 2014 1 Introduction F1 [1] is a distributed relational database developed by Google
Data Management Course Syllabus
Data Management Course Syllabus Data Management: This course is designed to give students a broad understanding of modern storage systems, data management techniques, and how these systems are used to
This paper defines as "Classical"
Principles of Transactional Approach in the Classical Web-based Systems and the Cloud Computing Systems - Comparative Analysis Vanya Lazarova * Summary: This article presents a comparative analysis of
Distributed Data Management
Introduction Distributed Data Management Involves the distribution of data and work among more than one machine in the network. Distributed computing is more broad than canonical client/server, in that
Lecture Data Warehouse Systems
Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores
Big Objects: Amazon S3
Cloud Platforms 1 Big Objects: Amazon S3 S3 = Simple Storage System Stores large objects (=values) that may have access permissions Used in cloud backup services Used to distribute software packages Used
A REVIEW ON EFFICIENT DATA ANALYSIS FRAMEWORK FOR INCREASING THROUGHPUT IN BIG DATA. Technology, Coimbatore. Engineering and Technology, Coimbatore.
A REVIEW ON EFFICIENT DATA ANALYSIS FRAMEWORK FOR INCREASING THROUGHPUT IN BIG DATA 1 V.N.Anushya and 2 Dr.G.Ravi Kumar 1 Pg scholar, Department of Computer Science and Engineering, Coimbatore Institute
Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.
Architectures Cluster Computing Job Parallelism Request Parallelism 2 2010 VMware Inc. All rights reserved Replication Stateless vs. Stateful! Fault tolerance High availability despite failures If one
