Parallel Databases. Parallel Architectures. Parallelism Terminology. 1/4/2015




Parallel Databases
Increase performance by performing operations in parallel.

Parallel Architectures
- Shared memory (closely coupled)
- Shared disk
- Shared nothing (loosely coupled)

Parallelism Terminology
- Speedup: more processors → faster. If you increase the number of processors, how much faster is the system (typically measured in transactions per second, TPS)? (Figure: TPS vs. # of processors.)
- Scaleup: more processors → process more data. If you increase the number of processors and the data size proportionally, what is the performance of the system? (Figure: TPS vs. # of processors & data size.)
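As a worked definition, using the standard textbook formulation (the notation below is added here, not taken from the slides): speedup(n) = T_1 / T_n, where T_1 is the elapsed time of the workload on one processor and T_n its elapsed time on n processors; scaleup(n) = T_small / T_big, where T_big is measured on an n-times-larger system running an n-times-larger workload. Linear speedup means speedup(n) = n; linear scaleup means scaleup(n) = 1.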

Challenges
For linear speedup and scaleup:
- Startup cost: cost to start processes
- Interference: contention for resources between processors
- Skew: different-sized jobs between processors

Parallel Architectures: Shared-Memory
(Figure: processors, e.g. multicore CPUs, connected through an interconnection network to a global shared memory and to the disks.)

Parallel Architectures: Shared-Disk
(Figure: each processor has its own memory; the processors share the disks through an interconnection network.)

Parallel Architectures: Shared-Nothing
(Figure: each processor has its own memory and disk; processors communicate only through an interconnection network.)

Architecture Comparison
- Shared memory: + good for load balancing, easiest to program; - memory contention: does not scale
- Shared disk: - disk contention: does not scale
- Shared nothing: + scales linearly, can use commodity hardware; - hardest to program

Types of Parallelism
In query evaluation:
- Inter-query parallelism: each query runs on one processor
- Inter-operator parallelism: a query runs on multiple processors, each operator runs on one processor
- Intra-operator parallelism: an operator runs on multiple processors (+ most scalable)
(Figure: queries q1, q2, q3 and a query plan with π and σ operators illustrating the three cases.)

Horizontal Data Partitioning
Divide the tuples of a relation among n nodes:
- Round robin: send tuple t_i to node [i mod n]
- Hash partitioning on an attribute C: send tuple t to node [h(t.C) mod n]
- Range partitioning on an attribute C: send tuple t to node i if v_(i-1) < t.C < v_i
Which is better? Let's see. (A Python sketch of the three schemes follows below.)

Parallel Operators
Selection, Group-By, Join, Sort, Duplicate Elimination, Projection

Parallel Selection
Selections of the form σ_{A=v} or σ_{v1<A<v2}, over data partitioned on an attribute C. Where is the selection done in parallel?
- Round robin: all nodes
- Hash partitioning on C: one node for C = v; all nodes for v1 < C < v2; all nodes for A = v (when the selection attribute is not the partitioning attribute)
- Range partitioning on C: only the nodes whose range overlaps with the selection
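To make the three partitioning schemes concrete, here is a minimal Python sketch; the function names and the dictionary-style tuples are invented for the example, not from the slides.

    import hashlib

    def round_robin(tuples, n):
        """Send tuple t_i to node i mod n."""
        nodes = [[] for _ in range(n)]
        for i, t in enumerate(tuples):
            nodes[i % n].append(t)
        return nodes

    def hash_partition(tuples, n, attr):
        """Send tuple t to node h(t[attr]) mod n."""
        nodes = [[] for _ in range(n)]
        for t in tuples:
            h = int(hashlib.md5(str(t[attr]).encode()).hexdigest(), 16)
            nodes[h % n].append(t)
        return nodes

    def range_partition(tuples, boundaries, attr):
        """boundaries holds n-1 split points for n nodes; tuple t goes to the
        node whose range contains t[attr]."""
        nodes = [[] for _ in range(len(boundaries) + 1)]
        for t in tuples:
            i = sum(1 for b in boundaries if t[attr] > b)  # index of the containing range
            nodes[i].append(t)
        return nodes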

Parallel Group By: γ_{A; SUM(B)}
1) Each node partitions its data using a hash function h(t.A) into at most as many partitions as there are nodes.
2) Each node sends each partition to one node, so all tuples with the same group-by value get collected on one node.
3) Each node computes the aggregate for its partition.
High communication cost (although << I/O cost).
(Figure: each node applies h locally, then every node computes γ_{A; SUM(B)} on the partition it receives.)
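A minimal Python simulation of the three steps, as a sketch under invented names (the "sending" between nodes is just list assignment here):

    from collections import defaultdict

    def parallel_group_by_sum(node_data, n_nodes):
        """Simulation of gamma_{A; SUM(B)}.
        node_data: list of per-node tuple lists, each tuple is (A, B)."""
        # Steps 1+2: each node hash-partitions its tuples on A and "sends"
        # each partition to node hash(A) % n_nodes.
        received = [[] for _ in range(n_nodes)]
        for tuples in node_data:
            for a, b in tuples:
                received[hash(a) % n_nodes].append((a, b))
        # Step 3: each node aggregates the partition it received.
        results = []
        for tuples in received:
            sums = defaultdict(int)
            for a, b in tuples:
                sums[a] += b
            results.append(dict(sums))
        return results  # the union of the per-node results is the answer

    # Example: two nodes, tuples of the form (A, B)
    print(parallel_group_by_sum([[("x", 1), ("y", 2)], [("x", 3)]], n_nodes=2))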

Parallel Group By Optimization: γ_{A; SUM(B)}
0) Each node does local aggregation first and then applies the previous steps on the aggregation result.
(Figure: a local γ_{A; SUM(B)} on each node before repartitioning, followed by the global γ_{A; SUM(B)}.)

Parallel Join (Equi-join): R ⋈_{R.A = S.B} S
Follow similar logic to Group By (see the sketch below):
- Partition the data of R and S using hash functions
- Send partitions to the corresponding nodes
- Compute the join for each partition locally on each node

Parallel Join (General): R ⋈_{R.A < S.B} S
Does the previous solution work? If not, why?
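Before turning to the general case, a minimal Python sketch of the hash-partitioned equi-join described above (names invented; tuples carry the join attribute in their first field):

    from collections import defaultdict

    def parallel_hash_join(R, S, n_nodes):
        """Simulation of the equi-join R.A = S.B via hash partitioning."""
        # Partition both relations on the join attribute and "send" to nodes.
        r_parts = [[] for _ in range(n_nodes)]
        s_parts = [[] for _ in range(n_nodes)]
        for r in R:
            r_parts[hash(r[0]) % n_nodes].append(r)
        for s in S:
            s_parts[hash(s[0]) % n_nodes].append(s)
        # Each node joins its local partitions (here with a simple hash table).
        output = []
        for node in range(n_nodes):
            index = defaultdict(list)
            for r in r_parts[node]:
                index[r[0]].append(r)
            for s in s_parts[node]:
                for r in index[s[0]]:
                    output.append(r + s)
        return output

    print(parallel_hash_join([(1, "r1"), (2, "r2")], [(1, "s1"), (3, "s3")], n_nodes=2))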

Parallel Join (General): R ⋈_{R.A < S.B} S
Use "fragment and replicate" (see the sketch below):
1) Fragment both relations: partition R into m partitions and S into n partitions. (Figure: R split into R1, R2; S split into S1, S2, S3.)
2) Assign each possible combination of partitions (Ri, Sj) to one node (requires m x n nodes). Each partition is replicated across many nodes.

Parallel Sort
Use the same partitioning idea as in group-by and equi-join:
- Partition the data using range partitioning
- Send partitions to the corresponding nodes
- Sort each partition locally on each node
Note: different partitioning techniques are suitable for different operators.
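A minimal Python sketch of fragment and replicate for a theta-join such as R.A < S.B; the helper names and the round-robin fragmentation are invented for the example.

    import itertools

    def fragment_and_replicate_join(R, S, m, n, theta):
        """Partition R into m fragments and S into n fragments, assign each
        (Ri, Sj) pair to one of the m*n (simulated) nodes, and evaluate the
        theta-join there."""
        R_frags = [R[i::m] for i in range(m)]   # simple round-robin fragmentation
        S_frags = [S[j::n] for j in range(n)]
        output = []
        # One "node" per combination of fragments: every Ri is replicated to
        # n nodes and every Sj to m nodes.
        for Ri, Sj in itertools.product(R_frags, S_frags):
            for r in Ri:
                for s in Sj:
                    if theta(r, s):
                        output.append(r + s)
        return output

    # Example: R.A < S.B
    print(fragment_and_replicate_join([(1,), (5,)], [(2,), (4,), (6,)], m=2, n=3,
                                      theta=lambda r, s: r[0] < s[0]))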

Partitioning Revisited
- Round robin: + good for load balancing; - always have to access all the data
- Hash partitioning: + good for load balancing; - works only for equality predicates
- Range partitioning: + works for range queries; - may suffer from skew

Other Operators
- Duplicate elimination: use sorting, or use partitioning (range or hash)
- Projection (without duplicate elimination): independently at each node

Parallel Query Plans
The same relational operators, with special split and merge operators to handle data routing, buffering and flow control.

Cost of Parallel Execution
Ideally: Parallel Cost = Sequential Cost / # of nodes, but
- parallel evaluation introduces overhead
- the partitioning may be skewed
In practice: Parallel Cost = T_part + T_asm + max(T_1, ..., T_n), where
- T_part: time to partition the data
- T_asm: time to assemble the results
- T_i: time to execute the operation at node i

Parallel Databases Review
- Parallel architectures
- Ways to parallelize a database
- Partitioning schemes
- Algorithms
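A small worked example of the cost formula above (the numbers are invented for illustration): with 4 nodes, T_part = 2, T_asm = 1, and per-node times T_1..T_4 = 10, 10, 10, 25 (skewed), the parallel cost is 2 + 1 + max(10, 10, 10, 25) = 28, even though spreading the total work of 55 evenly over 4 nodes would suggest roughly 55/4 ≈ 14 plus overhead; skew makes the slowest node dominate.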

Shared-Nothing Parallelism and MapReduce
Large-scale data processing: a high-level programming model (API) and a corresponding implementation.
Don't worry about:
- Parallelization
- Data distribution
- Load balancing
- Fault tolerance

A little history
- First paper by Google in 2004; it was used to regenerate Google's index.
- Apache Hadoop: popular open-source implementation. Nowadays you can run Hadoop without even setting up your own infrastructure.

MR Performance
How long does it take to sort 9 TB of data on 900 nodes? What about 20 TB on 2000 nodes? From the Hadoop FAQs:

Anatomy of a MR Program
- Input: bag of (input key, value) pairs
- Output: bag of output values
- Structure: a Map function and a Reduce function

Map function
- Input: an (input key, value) pair
- Output: (intermediate key, value) pairs
- Semantics: the system applies the function in parallel to all (input key, value) pairs in the input

Reduce function
- Input: an (intermediate key, bag of values) pair
- Output: bag of output values
- Semantics: the system groups all pairs with the same intermediate key and passes the bag of values to the Reduce function

Example 1: Word Occurrence Count
For each word in a set of documents, compute the number of its occurrences in the entire set.

    map(String key, String value) {
        // key: document name
        // value: document contents
        for each word w in value:
            EmitIntermediate(w, "1");
    }

    reduce(String key, Iterator values) {
        // key: a word
        // values: a list of counts
        int result = 0;
        for each v in values:
            result += ParseInt(v);
        EmitString(key, AsString(result));
    }

(Figure: doc1 = "x x x y w" and doc3 = "y y w z" go to map 1; doc2 = "x y z w" and doc4 = "x x" go to map 2; three reducers produce (x, 6), (y, 4), (w, 3), (z, 2).)
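For reference, a minimal runnable Python simulation of the same word-count job; the framework's shuffle is replaced by an in-memory grouping, and the function names are invented for the example.

    from collections import defaultdict

    def map_fn(doc_name, contents):
        """Emit (word, 1) for every word occurrence in the document."""
        return [(w, 1) for w in contents.split()]

    def reduce_fn(word, counts):
        """Sum all counts for one word."""
        return (word, sum(counts))

    def run_word_count(documents):
        # Map phase
        intermediate = []
        for name, contents in documents.items():
            intermediate.extend(map_fn(name, contents))
        # Shuffle: group values by intermediate key (the framework does this in MapReduce)
        groups = defaultdict(list)
        for word, count in intermediate:
            groups[word].append(count)
        # Reduce phase
        return [reduce_fn(word, counts) for word, counts in groups.items()]

    docs = {"doc1": "x x x y w", "doc2": "x y z w", "doc3": "y y w z", "doc4": "x x"}
    print(sorted(run_word_count(docs)))   # [('w', 3), ('x', 6), ('y', 4), ('z', 2)]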

Example 1: Word Occurrence Count
Do you see any way to improve this solution?

Improve the MR program by reducing the number of intermediate pairs produced by each mapper:

    map(String key, String value) {
        // key: document name
        // value: document contents
        H = new AssociativeArray;
        for each word w in value:
            H(w) += 1;
        for each word w in H:
            EmitIntermediate(w, H(w));
    }

Each mapper sums up the counts for each document.
(Figure: map 1 now emits per-document counts such as (x, 3) and (y, 2), map 2 emits pairs such as (x, 2); the reducers still produce (x, 6), (y, 4), (w, 3), (z, 2).)
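The same idea in the Python simulation used above, as a sketch: each map call pre-aggregates within its document before emitting, and the reduce function from before is unchanged since it already sums counts (the function name is invented).

    from collections import Counter

    def map_fn_local_agg(doc_name, contents):
        """Emit one (word, count) pair per distinct word in the document,
        instead of one (word, 1) pair per occurrence."""
        return list(Counter(contents.split()).items())

    # e.g. map_fn_local_agg("doc1", "x x x y w") -> [('x', 3), ('y', 1), ('w', 1)]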

Example 1: Word Occurrence Count
Do you see any other possibility to improve performance?

Introducing the Combiner
Combiner: combine the values output by each Mapper (since they already exist in main memory). Similar to an intermediate reduce for each individual Mapper.
- Input: bag of (intermediate key, bag of values) pairs
- Output: bag of (intermediate key, bag of values) pairs
- Semantics: the system groups, for each mapper separately, all pairs with the same intermediate key and passes the bag of values to the Combiner function

Improve the MR program by utilizing a combiner:

    combine(String key, Iterator values) {
        // key: a word
        // values: a list of counts
        int result = 0;
        for each v in values:
            result += ParseInt(v);
        EmitString(key, AsString(result));
    }

Does it look familiar? It has the same code as the reduce function!

Example 1: Word Occurrence Count
(Figure: the same documents and mappers as before; the combiner merges each mapper's output locally, e.g. collapsing a mapper's per-document pairs for x into a single (x, count) pair, before the shuffle to the reducers.)

More examples
Sorting, Selection, Join

More examples: Sorting
Leverage the fact that the data are sorted (by key) before being pushed to the reducers, and that the reducers themselves are also ordered:
- Map: (k, v) → (v, _)
- Reduce: (v, _) → v
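A sketch of this sorting trick in the Python simulation style used above; the framework's sort-by-key during the shuffle is simulated explicitly, and duplicates are handled loosely (names invented).

    def sort_with_mapreduce(values):
        """Map emits (v, None); the framework sorts intermediate pairs by key;
        Reduce just emits the key back."""
        intermediate = [(v, None) for v in values]     # Map: (k, v) -> (v, _)
        intermediate.sort(key=lambda kv: kv[0])        # done by the framework before reduce
        return [k for k, _ in intermediate]            # Reduce: (v, _) -> v

    print(sort_with_mapreduce([5, 3, 9, 1]))  # [1, 3, 5, 9]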

More examples: Join
Hash on the join attribute; have to encode the relation name as well (a sketch follows after the implementation notes below):
- Map: (table, bag of tuples) → (join attribute, (tuple, relname))
- Reduce: (join attribute, bag of (tuple, relname)) → output tuples
Other possibility? Join on the mappers, if one of the relations fits in main memory.

Implementation
- One Master node: scheduler and coordinator
- Many Workers: servers for Map/Reduce tasks

Map Phase (needs M workers)
- The Master partitions the input file into M splits, by key
- The Master assigns workers to the M map tasks
- Workers execute the map task and write their output to local disk, partitioned into R regions according to some hash function

Reduce Phase (needs R workers)
- The Master assigns workers to the R reduce tasks
- Each worker reads data from the corresponding mappers' disks, groups it by key and executes the reduce function
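A minimal Python sketch of the reduce-side join mentioned above, again with an in-memory stand-in for the shuffle (names and sample data invented):

    from collections import defaultdict

    def map_join(relname, tuples, join_attr_pos):
        """Tag every tuple with its relation name, keyed by the join attribute."""
        return [(t[join_attr_pos], (relname, t)) for t in tuples]

    def reduce_join(join_value, tagged_tuples):
        """Combine every R tuple with every S tuple that shares the join value."""
        r_side = [t for name, t in tagged_tuples if name == "R"]
        s_side = [t for name, t in tagged_tuples if name == "S"]
        return [r + s for r in r_side for s in s_side]

    # Simulated run: R join S on R.A = S.B, join attribute in position 0
    intermediate = map_join("R", [(1, "r1"), (2, "r2")], 0) + \
                   map_join("S", [(1, "s1"), (1, "s1b")], 0)
    groups = defaultdict(list)
    for key, tagged in intermediate:
        groups[key].append(tagged)      # shuffle: group by join attribute
    result = [out for key, tagged in groups.items() for out in reduce_join(key, tagged)]
    print(result)   # [(1, 'r1', 1, 's1'), (1, 'r1', 1, 's1b')]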

Implementation: Filesystem
- The input and output of MapReduce are stored in a distributed file system
- Each file is partitioned into chunks
- Each chunk is replicated on multiple machines
- Implementations: GFS (Google File System, proprietary) and HDFS (Hadoop Distributed File System, open source)

Some more details: Fault Tolerance
- The Master pings each worker periodically
- If a worker does not reply, the tasks assigned to it are reset to their initial state and rescheduled on other workers

Some more details: Tuning
- The developer has to specify M and R: M = # of map tasks, R = # of reduce tasks
- Larger values: better load balancing
- Limit: the Master needs O(M x R) memory
- Also specify: ~100 other parameters (50 of which affect runtime significantly)
- Automatic tuning?

MR: The ultimate solution?
Problems:
- Batch oriented: not suited for real-time processing
- A phase (e.g., Reduce) has to wait for the previous phase (e.g., Map) to complete
- Can suffer from stragglers: workers that take a long time to complete
- The data model is extremely simple for databases: not everything is a flat file
- Tuning is hard