Data Management in the Cloud




Data Management in the Cloud
Ryan Stern (stern@cs.colostate.edu)
Advanced Topics in Distributed Systems
Department of Computer Science, Colorado State University

Outline
Today:
- Microsoft Cloud SQL Server
- Pig and Pig Latin
Next week:
- H-Store

Adapting Microsoft SQL Server for Cloud Computing

Cloud SQL Server
- A relational database system
- Backed by Microsoft SQL Server
- Designed to scale out to cloud computing workloads

Shared-nothing Architecture
- A distributed computing architecture
- Each node in the system is independent and self-sufficient
- The nodes do not share disk storage

Data Model
- Provides relational access to large datasets
- A logical database is called a table group
- Table groups may be keyless or keyed
- Goal: avoid running transactions across multiple nodes, which would
  - require two-phase commit, whose participants block in the event of a failure
  - introduce messages with relatively random distribution patterns

Keyed vs. Keyless Tables
- Table groups can be keyed or keyless
- A keyed table group has an additional column, the partition key
  - All rows with the same partition key are part of the same row group
  - Transactions are limited to operations on a single row group
- A keyless table group does not have a partition key
  - Transactions can operate on any row in the table group
  - Tradeoff: the entire table group must fit on a single node
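The row-group rule can be sketched in a few lines of Python. This is a hypothetical illustration, not Cloud SQL Server's implementation; the class and method names are invented:

```python
# Sketch of a keyed table group: rows with the same partition key form
# one row group, and a transaction may only touch a single row group,
# so it never spans partitions and never needs two-phase commit.
from collections import defaultdict

class KeyedTableGroup:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.rows = defaultdict(list)  # partition key -> row group

    def partition_of(self, partition_key):
        # Rows sharing a partition key always land on the same partition.
        return hash(partition_key) % self.num_partitions

    def insert(self, partition_key, row):
        self.rows[partition_key].append(row)

    def run_transaction(self, partition_key, txn):
        # The transaction sees only its own row group.
        return txn(self.rows[partition_key])

tg = KeyedTableGroup(num_partitions=4)
tg.insert("cust42", {"order": 1, "total": 30})
tg.insert("cust42", {"order": 2, "total": 70})
total = tg.run_transaction("cust42", lambda rows: sum(r["total"] for r in rows))
# total == 100
```

A keyless table group would drop `partition_of` entirely, which is why it must fit on one node.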

Data Partitioning
- Partition size is limited by the capacity of a single server
- Partitioning depends on the table type

Partition Replication
- Partitions are replicated for high availability
- A single partition is the unit of failover
- No two replicas are placed in the same failure domain, meaning not on the same rack or behind the same switch
- There is always one primary replica per partition
  - Responsible for processing queries, updates, and data definition operations
  - Ships updates to the secondaries
- Similar to BigTable

Partition Load Balancing
- Merely balancing the number of primary and secondary partitions per node might not balance the load
- Dynamic rebalancing is possible by demoting a primary
  - Allowed only if no transactions are in progress on that node
  - A secondary is then promoted to primary
  - Managed by a global partition manager
- Keyed table groups can be partitioned dynamically
  - Triggered when the data exceeds the allowed size or the node is overloaded
  - Partitions are split using the existing replicas, with no data movement: a secondary becomes the primary for the split-off subset of rows

System Architecture

System Architecture
SQL Server Instance
- Manages many individual databases
- Different sub-databases are isolated
- Saves memory on internal structures
- Shares a common transaction log, improving performance
Distributed Fabric
- Runs on every node with a SQL Server instance
- Maintains the up/down status of servers
- Detects server failures and recoveries
- Performs leadership elections for various roles

System Architecture
Global Partition Manager
- Highly available
- Knows the key range for each partition, the location of all replicas, and each primary's state and history
- Decides whether to refresh, replace, or discard replicas
Protocol Gateway
- Understands the native wire protocol of SQL Server
- Accepts inbound database connections and binds them to the primary replica
- Masks some failures from clients
Infrastructure and Deployment
- Upgrades the cluster while it is operational and enables new features

During a Transaction
- A transaction on the primary creates update records
  - Each is an after-image of the data changed by an update
  - These update records serve as redo records
- Secondaries can be used as read-only copies of the data if needed, with read-committed isolation
- Update records are identified by table key rather than page ID, so the SQL instances holding replicas do not need to be identical
- The primary streams updates to the secondaries
  - After-images of modified indexes are sent as well
  - Updates are deleted if the secondary receives an abort from the primary

When a Transaction Commits
- The primary assigns the next commit sequence number (CSN)
- The secondaries apply the updates to their databases in CSN order
- A secondary sends an ACK back to the primary when finished
- When the primary obtains ACKs from a quorum of replicas, it writes a persistent commit record
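The commit protocol above can be sketched as follows. This is a simplified, hypothetical model (not the real engine); the quorum is taken over the secondaries, and the class names are invented:

```python
# Sketch of CSN-ordered quorum commit: the primary numbers each commit,
# secondaries apply updates strictly in CSN order and ACK, and the
# commit record is persisted once a quorum of replicas has ACKed.

class Secondary:
    def __init__(self):
        self.applied = []   # updates applied in CSN order
        self.last_csn = 0

    def apply(self, csn, update):
        assert csn == self.last_csn + 1  # enforce strict CSN order
        self.applied.append(update)
        self.last_csn = csn
        return True                       # the ACK

class Primary:
    def __init__(self, secondaries):
        self.secondaries = secondaries
        self.next_csn = 0
        self.commit_log = []              # persistent commit records

    def commit(self, update):
        self.next_csn += 1
        acks = sum(1 for s in self.secondaries
                   if s.apply(self.next_csn, update))
        quorum = len(self.secondaries) // 2 + 1
        if acks >= quorum:
            self.commit_log.append((self.next_csn, update))
            return True
        return False

primary = Primary([Secondary() for _ in range(3)])
assert primary.commit("UPDATE t SET x = 1")
```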

Commits and Recovery

Recovery
- Update records lost by a secondary can be recovered
  - The primary sends the queue of updates following the secondary's last CSN
  - If the secondary is too far behind, a fresh copy can be transferred
- Secondaries are always nearly up-to-date
  - They apply committed updates almost immediately
  - They act as hot standbys in case of primary failure
- In the event of primary failure
  - A quorum of secondaries is contacted, ensuring that no committed updates are lost
  - The secondary with the latest state is selected as leader
  - The leader propagates updates to secondaries with older state
- Because a quorum of secondaries is used, most failed secondaries can be silently replaced in the background
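The catch-up path can be sketched as a small function. This is an invented illustration under assumed names; the threshold for preferring a full copy is a made-up constant:

```python
# Sketch of CSN-based catch-up: a lagging secondary reports its last
# commit sequence number and the primary replays only the updates it
# missed, or ships a fresh copy if it has fallen too far behind.

CATCH_UP_LIMIT = 100  # assumed threshold beyond which a full copy is cheaper

def recover(primary_log, secondary_state):
    last_csn = secondary_state["last_csn"]
    missed = [(csn, upd) for csn, upd in primary_log if csn > last_csn]
    if len(missed) > CATCH_UP_LIMIT:
        # Too far behind: transfer a fresh copy instead of replaying.
        secondary_state["data"] = [upd for _, upd in primary_log]
    else:
        secondary_state["data"].extend(upd for _, upd in missed)
    if primary_log:
        secondary_state["last_csn"] = primary_log[-1][0]
    return secondary_state

log = [(1, "u1"), (2, "u2"), (3, "u3")]
state = recover(log, {"last_csn": 1, "data": ["u1"]})
# state now holds u1..u3 with last_csn == 3
```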

Benchmarks
- Used a variant of TPC-C
- Reduced memory and disk I/O to match the cloud environment
- Used a database that fits in RAM

Benchmarks

Data Management in the Cloud: Limitations and Opportunities

Transactional data management systems
- Do not tend to scale well without limiting transactions
- ACID guarantees are difficult to maintain because of the required replication
- Many systems therefore choose to relax ACID guarantees
  - BigTable, SimpleDB, and PNUTS are systems with eventual consistency
- Require storage of sensitive, mission-critical data
  - Customer information, credit card numbers

Analytical data management systems
- Applications are mostly read-only, so ACID guarantees are not needed
- Updates are infrequent, so reading a snapshot of the data is acceptable
- Sensitive data is often not required
  - It can be encrypted with an offsite key or left out entirely

Hybrid Solution
Requirements of a cloud data management system:
- Efficiency
- Fault tolerance
- Ability to run in a heterogeneous environment
- Ability to operate on encrypted data
- Ability to interface with existing business intelligence products
Neither MapReduce nor parallel databases fully satisfy these requirements; a hybrid solution is needed that combines the ease of MapReduce with the performance-enhancing data structures of parallel databases.

Pig and Pig Latin

Motivation
- There is often a need for ad-hoc analysis of very large datasets
  - Driven by innovation at a number of organizations
  - Data includes web crawls, search logs, and click streams
- Existing distributed databases tend to be prohibitively expensive at extremely large scales
- MapReduce works well for data analysis, but requires custom code for operations that databases provide out of the box: projection, grouping, filtering

SQL vs. Pig Latin
- Encoding efficient dataflows in SQL is difficult
  - Procedural solutions are easier to reason about
  - Automatic query optimization performs poorly in this environment
- Pig Latin is a middle ground between SQL and procedural code
  - A sequence of steps transforming the data
  - Each transformation is high level, like SQL

SQL vs. Pig Latin

In SQL:

SELECT category, AVG(pagerank)
FROM urls
WHERE pagerank > 0.2
GROUP BY category
HAVING COUNT(*) > 10000000

In Pig Latin:

good_urls = FILTER urls BY pagerank > 0.2;
groups = GROUP good_urls BY category;
big_groups = FILTER groups BY COUNT(good_urls) > 10000000;
output = FOREACH big_groups GENERATE category, AVG(good_urls.pagerank);

Dataflow
- Pig Latin is compiled into a series of MapReduce jobs, which then execute on Hadoop
- Operations do not need to be executed in program order
  - Some operations may be completed concurrently
  - Filters can be performed in an order that returns results efficiently

spam_urls = FILTER urls BY isspam(url);
culprit_urls = FILTER spam_urls BY pagerank > 0.8;

Features
- Runs directly on data files; no need to import them into a database
- User-defined functions are allowed, written in Java
- Nested data is allowed, as well as nested operations
  - Would require multiple tables in a traditional database
  - Sets, maps, and tuples can be read from a single field

Data Model
There are four basic data types:
- Atom: a single value, such as a string
- Tuple: a sequence of fields of any data type
- Bag: a collection of tuples
- Map: a collection of data items, each identified by a key
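As an illustrative analogy (not Pig's actual implementation), the four types map naturally onto Python values, and the nesting the next slides rely on falls out directly:

```python
# Pig Latin data types rendered as Python values:
#   Atom  -> a scalar such as a string
#   Tuple -> a Python tuple of fields
#   Bag   -> a list of tuples
#   Map   -> a dict of data items keyed by name

atom = "lakers"
tup = ("lakers", 1)                       # fields may be of any type
bag = [("lakers", 1), ("ipod", 2)]        # a collection of tuples
mapping = {"age": 20, "avatar": "x.jpg"}  # items addressed by key

# Nesting is allowed: a field of a tuple can itself be a bag or a map,
# which a traditional database would spread across multiple tables.
nested = ("alice", [("q1", 0.8), ("q2", 0.1)], {"spam": False})
assert isinstance(nested[1], list) and isinstance(nested[2], dict)
```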

Operations
- LOAD: specify the input file and schema
- FOREACH: perform an operation on each tuple
- FILTER: discard unwanted data
- ORDER: sort the data
- STORE: write the output to a file

Grouping Operations
- COGROUP: groups tuples into nested bags
- GROUP: special case of COGROUP for a single dataset
- JOIN: shortcut for COGROUP followed by flattening
- CROSS: cross product of two or more bags
- UNION: union of two or more bags

Grouping Operations

join_result = JOIN results BY querystring, revenue BY querystring;

is equivalent to:

temp_var = COGROUP results BY querystring, revenue BY querystring;
join_result = FOREACH temp_var GENERATE FLATTEN(results), FLATTEN(revenue);

Conversion to MapReduce
- Commands are parsed and a logical plan is built for each bag
- Evaluation is lazy: processing starts only once a STORE command is issued
  - This allows filter reordering and other optimizations
- FOREACH: multiple MapReduce instances are started
- GROUP: mappers assign keys, reducers process each group
- ORDER: a first job samples the input; a second job range-partitions based on those samples
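The GROUP-to-MapReduce step can be shown with a toy in-memory pipeline. This is a teaching sketch, not Hadoop code; all function names and the sample data are invented:

```python
# Toy MapReduce showing how Pig's GROUP compiles down: mappers tag each
# tuple with its grouping key, the shuffle collects tuples by key, and
# reducers see one nested bag per group.
from collections import defaultdict

def map_phase(records, key_fn):
    return [(key_fn(r), r) for r in records]   # mapper assigns keys

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)              # collect tuples per key
    return groups

def reduce_phase(groups, reduce_fn):
    return {key: reduce_fn(bag) for key, bag in groups.items()}

urls = [("a.com", "news", 0.5), ("b.com", "news", 0.9), ("c.com", "shop", 0.3)]
groups = shuffle(map_phase(urls, key_fn=lambda r: r[1]))   # GROUP urls BY category
avg_rank = reduce_phase(groups, lambda bag: sum(r[2] for r in bag) / len(bag))
# e.g. avg_rank["shop"] == 0.3
```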

H-Store Concurrency Control

H-Store
- A distributed cluster of shared-nothing machines
- All data resides in main memory
- Uses stored procedures for transactions
  - Each is identified by a unique name
  - Procedures are provided when the cluster is deployed
  - Used to determine how data is partitioned and replicated
  - Ad-hoc queries are still possible
- Each partition is managed by a single thread
  - Transactions are queued and execute one at a time

H-Store Architecture

Assumptions
- Designed for a partitioned main-memory database
- Transactions are stored procedures
- All data fits into main memory at each node
- Most transactions should access a single partition

Components
- One process per partition, replicated across nodes
- A central coordinator process

When a Client Connects
- The client downloads parts of the system catalog, containing:
  - The available stored procedures
  - The locations of each partition
  - Details on how data is distributed
- This allows clients to direct queries to the appropriate process

Transactions
- Available as stored procedures, a mixture of control code and SQL operations
- Transactions can be divided into fragments
  - A fragment is a unit of work that can be executed on one partition
- Single-partition transactions:
  - The request is sent directly to the primary partition for the data
  - The primary forwards the request to its replicas
  - The primary executes the transaction and waits for acknowledgements from the replicas
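The single-thread-per-partition rule can be sketched in Python. This is a minimal illustration of the idea, assuming invented names, and omits replication and networking entirely:

```python
# Sketch of H-Store-style single-partition execution: one worker thread
# owns each partition and drains a queue, so transactions on that
# partition run serially with no latches or locks.
import queue
import threading

class Partition:
    def __init__(self):
        self.data = {}
        self.tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            proc, result = self.tasks.get()
            result.append(proc(self.data))  # one transaction at a time
            self.tasks.task_done()

    def execute(self, proc):
        result = []
        self.tasks.put((proc, result))
        self.tasks.join()                   # block until the worker finishes
        return result[0]

p = Partition()
p.execute(lambda d: d.update({"x": 1}))     # a "stored procedure"
value = p.execute(lambda d: d["x"])
# value == 1
```

Because the worker is the only thread touching `self.data`, no concurrency control is needed for single-partition work, which is the assumption H-Store optimizes for.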

Multi-Partition Transactions
- All multi-partition transactions can be sent through a central coordinator
  - Only needed when using the speculative concurrency scheme
  - Assigns a global order so that there can be no deadlocks
  - Downside: limits the rate of multi-partition transactions
- Use two-phase commit with an undo buffer
  - The in-memory buffer is discarded when the transaction commits
- Transactions may need to wait for replies from other nodes
  - The partition could be doing useful work during this time

Concurrency Control Schemes
- Blocking: the simplest scheme, but it limits system throughput
- Speculative execution: execute other transactions while waiting; roll back if there are conflicts
- Locking: utilizes read and write locks

Speculative Execution
- Multi-partition transactions must wait on the coordinator for the commit decision
- Most of the time, the transaction will commit
- Solution: execute queued transactions speculatively
  - When the blocked transaction commits, the speculative transactions immediately commit and their results are returned to the clients
- Multi-partition transactions can be speculated as well
  - Results are returned to the coordinator early, noting the blocked transaction
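A minimal sketch of the scheme, assuming invented names and a whole-state undo snapshot rather than H-Store's per-transaction undo buffer:

```python
# Sketch of speculative execution: while a multi-partition transaction
# stalls waiting on the coordinator, queued transactions run
# speculatively. They commit only if the blocked transaction commits;
# on abort, the partition state is rolled back and they must re-run.

class SpeculativePartition:
    def __init__(self):
        self.data = {}
        self.undo = []          # state snapshots for rollback
        self.speculative = []   # (name, result) awaiting the outcome

    def run_speculative(self, name, proc):
        self.undo.append(dict(self.data))   # snapshot before executing
        self.speculative.append((name, proc(self.data)))

    def resolve(self, blocked_committed):
        if blocked_committed:
            outcomes = list(self.speculative)  # all commit immediately
        else:
            if self.undo:
                self.data = self.undo[0]       # roll back to pre-speculation state
            outcomes = []                      # callers must re-execute
        self.undo.clear()
        self.speculative.clear()
        return outcomes

p = SpeculativePartition()
p.run_speculative("t1", lambda d: d.update({"x": 1}) or "ok")
committed = p.resolve(blocked_committed=True)
# committed == [("t1", "ok")]
```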

Speculative Execution

Locking
- Transactions acquire read and write locks, and block on a conflicting lock request
- Allows non-conflicting single-partition transactions to execute during the network stalls of multi-partition transactions
- Multi-partition transactions are sent directly to the partitions, not forwarded through the central coordinator
- Two-phase locking ensures transactions have a serializable order
- More efficient when there are no conflicts
- Deadlock is possible in this scheme
  - Cycle detection is used locally
  - Timeouts handle distributed deadlocks
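The local cycle detection can be sketched as a depth-first search over a waits-for graph (an illustrative sketch with invented names, not H-Store's detector):

```python
# Local deadlock detection: build a waits-for graph mapping each
# transaction to the transactions holding locks it wants, then look
# for a cycle with a depth-first search.

def has_deadlock(waits_for):
    visiting, done = set(), set()

    def dfs(txn):
        if txn in visiting:
            return True              # back edge found: a cycle exists
        if txn in done:
            return False
        visiting.add(txn)
        for holder in waits_for.get(txn, []):
            if dfs(holder):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(dfs(t) for t in waits_for)

# T1 waits on T2 and T2 waits on T1: the classic two-transaction deadlock.
assert has_deadlock({"T1": ["T2"], "T2": ["T1"]})
assert not has_deadlock({"T1": ["T2"], "T2": []})
```

Distributed deadlocks span nodes, so no single node sees the whole graph; that is why the slide falls back to timeouts for them.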

Performance

Concurrency Control Schemes

Issues with H-Store
- Data is volatile
  - Assumes data will be recoverable from replicas
  - What happens when the data center loses power?
- Requires transactions as stored procedures; no ad-hoc queries
- Does not work well with long transactions

Questions?