MASSIVE DATA PROCESSING (THE GOOGLE WAY): Fundamentals of Distributed Systems, 27/04/2015




Fundamentals of Distributed Systems
CC5212-1 PROCESAMIENTO MASIVO DE DATOS, OTOÑO 2015
Lecture 4: DFS & MapReduce I
Aidan Hogan aidhog@gmail.com

Inside Google circa 1997/98

MASSIVE DATA PROCESSING (THE GOOGLE WAY)

Inside Google circa 2015

Google's Cooling System

Google's Recycling Initiative

Google Architecture (ca. 1998): Information Retrieval: Crawling, Inverted indexing, Word-counts, Link-counts, greps/sorts, PageRank, Updates ...

Google Engineering: Massive amounts of data. Each task needs communication protocols. Each task needs fault tolerance. Multiple tasks running concurrently. Ad hoc solutions would repeat the same code.

Google Engineering: Google File System: Store data across multiple machines; Transparent Distributed File System; Replication / Self-healing. MapReduce: Programming abstraction for distributed tasks; Handles fault tolerance; Supports many simple distributed tasks! BigTable, Pregel, Percolator, Dremel ...

Google Re-Engineering: Google File System (GFS), MapReduce, BigTable.

GOOGLE FILE SYSTEM (GFS)

What is a File-System? Breaks files into chunks (or clusters); remembers the sequence of clusters; records directory/file structure; tracks file meta-data (file size, last access, permissions, locks).

What is a Distributed File-System? Same thing, but distributed. What would transparency / flexibility / reliability / performance / scalability mean for a distributed file system? Transparency: like a normal file-system. Flexibility: can mount new machines. Reliability: has replication. Performance: fast storage/retrieval. Scalability: can store a lot of data / support a lot of machines.

Google File System (GFS): Files are huge. Files often read or appended; writes in the middle of a file not (really) supported. Concurrency important. Failures are frequent. Streaming important.

GFS: Pipelined Writes. The Master keeps the file system in memory, e.g. /blue.txt [3 chunks] 1: {A, C, E}, 2: {A, B, D}, 3: {B, D, E}; /orange.txt [2 chunks] 1: {B, D, E}, 2: {A, C, E}. 64MB per chunk; 64-bit label for each chunk; assume replication factor of 3. The chunks of blue.txt (150 MB: 3 chunks) and orange.txt (100 MB: 2 chunks) are spread over the chunk-servers (slaves) A to E.

GFS: Pipelined Writes (In Words): 1. Client asks Master to write a file. 2. Master returns a primary chunkserver and secondary chunkservers. 3. Client writes to primary chunkserver and tells it the secondary chunkservers. 4. Primary chunkserver passes data onto secondary chunkserver, which passes on ... 5. When finished, a message comes back through the pipeline that all chunkservers have written; otherwise the client sends again.

GFS: Fault Tolerance. After a chunkserver fails, the Master's in-memory metadata is updated, e.g. /blue.txt chunk 1 moves from {A, C, E} to {A, B, E} and /orange.txt chunk 2 from {A, C, E} to {A, D, E} once chunk-server C is lost (still assuming a replication factor of 3).
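GFS itself is proprietary and no code accompanies these slides; purely as an illustration of the "File System (In-Memory)" table above, the Master's metadata can be pictured as two maps. All class and variable names below are invented for this sketch, not GFS code.

    import java.util.*;

    // Illustrative sketch only: the shape of the master's in-memory state.
    public class GfsMasterMetadata {
      static final long CHUNK_SIZE = 64L * 1024 * 1024;   // 64MB per chunk

      // file path -> ordered list of 64-bit chunk handles ("labels")
      final Map<String, List<Long>> fileToChunks = new HashMap<>();
      // chunk handle -> chunk-servers holding a replica (replication factor 3)
      final Map<Long, Set<String>> chunkToServers = new HashMap<>();

      public static void main(String[] args) {
        GfsMasterMetadata master = new GfsMasterMetadata();
        // /blue.txt (150 MB -> 3 chunks), as in the slide's example
        master.fileToChunks.put("/blue.txt", Arrays.asList(1L, 2L, 3L));
        master.chunkToServers.put(1L, new HashSet<>(Arrays.asList("A", "C", "E")));
        master.chunkToServers.put(2L, new HashSet<>(Arrays.asList("A", "B", "D")));
        master.chunkToServers.put(3L, new HashSet<>(Arrays.asList("B", "D", "E")));

        // A read of chunk 1 of /blue.txt can go directly to any of its replicas
        long firstChunk = master.fileToChunks.get("/blue.txt").get(0);
        System.out.println("Replicas of chunk " + firstChunk + ": "
            + master.chunkToServers.get(firstChunk));
      }
    }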

GFS: Fault Tolerance (In Words). Master sends regular Heartbeat pings. If a chunkserver doesn't respond: 1. Master finds out what chunks it had. 2. Master assigns a new chunkserver for each chunk. 3. Master tells the new chunkserver to copy from a specific existing chunkserver. Chunks are prioritised by number of remaining replicas, then by demand.

GFS: Direct Reads. The client ("I'm looking for /blue.txt") contacts the Master, whose in-memory file system maps each chunk (e.g. /blue.txt chunks 1 to 3, /orange.txt chunks 1 to 2) to the chunk-servers (slaves) A to E holding its replicas; the client then reads the chunks directly from those chunk-servers.

GFS: Direct Reads (In Words): 1. Client asks Master for the file. 2. Master returns the location of a chunk; returns a ranked list of replicas. 3. Client reads the chunk directly from a chunkserver. 4. Client asks Master for the next chunk. Software makes this transparent for the client!

GFS: Modification Consistency. Masters assign leases to one replica: a primary replica. Client wants to change a file: 1. Client asks Master for the replicas (incl. primary). 2. Master returns replica info to the client. 3. Client sends change data. 4. Client asks primary to execute the changes. 5. Primary asks secondaries to change. 6. Secondaries acknowledge to primary. 7. Primary acknowledges to client. Remember: Concurrency! Data & Control Decoupled.

GFS: Rack Awareness. (Diagram: a core switch connecting Rack A, Rack B, Rack C.)
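The re-replication ordering ("prioritised by number of remaining replicas, then by demand") can be sketched as follows. This is a hedged illustration, not GFS code; the Chunk record and the demand measure are invented for the example.

    import java.util.*;

    // Hedged illustration of GFS's re-replication priority after a chunkserver failure.
    public class ReReplicationQueue {

      static class Chunk {
        final long handle;
        final int remainingReplicas;   // replicas still alive after the failure
        final int demand;              // e.g. recent read requests for this chunk
        Chunk(long handle, int remainingReplicas, int demand) {
          this.handle = handle;
          this.remainingReplicas = remainingReplicas;
          this.demand = demand;
        }
      }

      public static void main(String[] args) {
        List<Chunk> lost = new ArrayList<>(Arrays.asList(
            new Chunk(1L, 2, 5),     // lost one replica, moderately popular
            new Chunk(7L, 1, 1),     // only one replica left: most urgent
            new Chunk(4L, 2, 90)));  // lost one replica, very popular

        // Fewest remaining replicas first; ties broken by higher demand
        lost.sort(Comparator.comparingInt((Chunk c) -> c.remainingReplicas)
                            .thenComparingInt(c -> -c.demand));

        for (Chunk c : lost) {
          System.out.println("re-replicate chunk " + c.handle
              + " (replicas=" + c.remainingReplicas + ", demand=" + c.demand + ")");
        }
      }
    }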

GFS: Rack Awareness. Example: Rack A holds machines A1 to A5 and Rack B holds B1 to B5, connected through a core switch. Files: /orange.txt 1: {A1, A4, B3}, 2: {A5, B1, B5}. Racks: A: {A1, A2, A3, A4, A5}; B: {B1, B2, B3, B4, B5}.

GFS: Rack Awareness (In Words). Make sure replicas are not on the same rack, in case the rack switch fails! But communication can be slower: within a rack you pass one switch (the rack switch); across racks you pass three switches (two racks and a core). (Typically) pick two racks with low traffic: two nodes in the same rack, one node in another rack (assuming 3x replication). Only necessary if there is more than one rack! (A placement sketch follows below.)

GFS: Other Operations. Rebalancing: spread storage out evenly. Deletion: just rename the file with a hidden file name; to recover, rename back to the original version; otherwise, three days later it will be wiped. Monitoring Stale Replicas: a dead slave reappears with old data: the master keeps version info and will recycle old chunks.

GFS: Weaknesses? What do you see as the core weaknesses of the Google File System? Master node single point of failure: use hardware replication; logs and checkpoints! Master node is a bottleneck: use a more powerful machine; minimise master node traffic. Master-node metadata kept in memory: each chunk needs 64 bytes; chunk data can be queried from each slave.

GFS: White-Paper (The Google File System, Ghemawat, Gobioff & Leung, SOSP 2003).

HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
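A hedged sketch of the placement rule just described (two replicas in one rack, the third in another). The rack names, node names and the traffic figures are invented for the example; this is not the actual GFS/HDFS placement code.

    import java.util.*;

    // Illustration of rack-aware replica placement (2 nodes in one rack + 1 in another).
    public class RackAwarePlacement {

      // Pick 3 replica locations: two nodes from the least-loaded rack,
      // one node from the second least-loaded rack (assuming 3x replication).
      static List<String> placeReplicas(Map<String, List<String>> rackToNodes,
                                        Map<String, Integer> rackTraffic) {
        List<String> racks = new ArrayList<>(rackToNodes.keySet());
        racks.sort(Comparator.comparingInt(rackTraffic::get));  // low traffic first

        List<String> replicas = new ArrayList<>();
        List<String> first = rackToNodes.get(racks.get(0));
        replicas.add(first.get(0));
        replicas.add(first.get(1));                 // two nodes in the same rack
        if (racks.size() > 1) {                     // only possible with more than one rack!
          replicas.add(rackToNodes.get(racks.get(1)).get(0));
        } else {
          replicas.add(first.get(2));               // single-rack cluster: no choice
        }
        return replicas;
      }

      public static void main(String[] args) {
        Map<String, List<String>> racks = new HashMap<>();
        racks.put("A", Arrays.asList("A1", "A2", "A3", "A4", "A5"));
        racks.put("B", Arrays.asList("B1", "B2", "B3", "B4", "B5"));
        Map<String, Integer> traffic = new HashMap<>();
        traffic.put("A", 10);   // invented traffic figures
        traffic.put("B", 40);

        System.out.println(placeReplicas(racks, traffic));  // e.g. [A1, A2, B1]
      }
    }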

Google Re-Engineering: Google File System (GFS), implemented as HDFS.

HDFS-to-GFS: Data-node = Chunkserver/Slave; Name-node = Master. HDFS does not support modifications. Otherwise pretty much the same, except: GFS is proprietary (hidden in Google); HDFS is open source (Apache!).

HDFS Interfaces. (Slides showing screenshots of the HDFS interfaces; see the example below.)

Google Re-Engineering: Google File System (GFS), MapReduce.

GOOGLE'S MAP-REDUCE
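The "HDFS Interfaces" slides are only screenshots in the transcription. As one concrete, hedged example, HDFS can also be reached from Java through Hadoop's FileSystem abstraction, which the programming section of this lecture uses; the NameNode URI below is an assumption for illustration.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListHdfsRoot {
      public static void main(String[] args) throws Exception {
        // Hypothetical NameNode address; normally taken from core-site.xml (fs.defaultFS)
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000/"),
                                       new Configuration());

        // List the files and directories under the HDFS root
        for (FileStatus status : fs.listStatus(new Path("/"))) {
          System.out.println(status.getPath() + "\t" + status.getLen() + " bytes");
        }
        fs.close();
      }
    }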

MapReduce in Google. Divide & Conquer: 1. Word count (how could we do a distributed top-k word count?); 2. Total searches per user; 3. PageRank; 4. Inverted-indexing.

MapReduce: Word Count. (Diagram: Input -> Distr. File Sys. -> Map() -> (Partition/Shuffle) -> (Distr. Sort) -> Reduce() -> Output, with three machines A, B, C each reducing one key range (a-k, l-q, r-z) into per-word counts for words such as "a", "aa", "lab", "label", "rag", "rat".) Better partitioning method?

MapReduce (in more detail): 1. Input: read from the cluster (e.g., a DFS); chunks raw data for mappers; maps raw data to initial (key_in, value_in) pairs. What might Input do in the word-count case? 2. Map: for each (key_in, value_in) pair, generate zero-to-many (key_map, value_map) pairs; key_in/value_in can be a different type to key_map/value_map. What might Map do in the word-count case? 3. Partition: assign sets of key_map values to reducer machines. How might Partition work in the word-count case? 4. Shuffle: data are moved from mappers to reducers (e.g., using the DFS). 5. Comparison/Sort: each reducer sorts the data by key using a comparison function; sort is taken care of by the framework. 6. Reduce: takes a bag of (key_map, value_map) pairs with the same key_map value, and produces zero-to-many outputs for each bag; typically zero-or-one outputs. How might Reduce work in the word-count case? 7. Output: merge-sorts the results from the reducers / writes to stable storage.

MapReduce: Word Count PseudoCode (not preserved in the transcription; a sketch follows below).
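Since the pseudocode slide itself is not preserved, here is a minimal word-count Mapper and Reducer written against Hadoop's classic org.apache.hadoop.mapred API (the same API used later in these slides). Class names and the whitespace tokenisation are my own choices, not taken from the slides.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    // Map: (offset, line) -> (word, 1) for every word in the line
    public class WordCountMap extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

      private static final IntWritable ONE = new IntWritable(1);
      private final Text word = new Text();

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        for (String token : value.toString().split("\\s+")) {
          if (!token.isEmpty()) {
            word.set(token);
            output.collect(word, ONE);           // emit (word, 1)
          }
        }
      }
    }

    // Reduce: (word, [1, 1, ...]) -> (word, total count)
    class WordCountReduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

      public void reduce(Text key, Iterator<IntWritable> values,
                         OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        int sum = 0;
        while (values.hasNext()) {
          sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum)); // emit (word, sum)
      }
    }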

MapReduce: Scholar Example. Assume that in Google Scholar we have inputs like: "paper A citedby paper B". How can we use MapReduce to count the total incoming citations per paper? (One possible answer is sketched below.)

MapReduce as a Dist. Sys. Transparency: abstracts physical machines. Flexibility: can mount new machines; can run a variety of types of jobs. Reliability: tasks are monitored by a master node using a heart-beat; dead jobs restart. Performance: depends on the application code but exploits parallelism! Scalability: depends on the application code but can serve as the basis for massive data processing!

MapReduce: Benefits for Programmers. Takes care of low-level implementation: easy to handle inputs and output; no need to handle network communication; no need to write sorts or joins. Abstracts machines (transparency): fault tolerance (through heart-beats); abstracts physical locations; add / remove machines; load balancing. Time for more important things.

HADOOP OVERVIEW
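One possible answer, again as a hedged sketch in the classic mapred API: the exact input format and the "citedby" token are assumptions based on the slide's example, and the class names are mine.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    // Input lines assumed to look like: "paperA citedby paperB"
    // Map emits (cited paper, 1); Reduce sums the 1s per cited paper.
    public class CitationCount {

      public static class CiteMap extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> out, Reporter reporter)
            throws IOException {
          String[] parts = value.toString().split("\\s+");
          if (parts.length == 3 && parts[1].equals("citedby")) {
            // parts[0] is cited by parts[2], so parts[0] gains one incoming citation
            out.collect(new Text(parts[0]), ONE);
          }
        }
      }

      public static class CiteReduce extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> out, Reporter reporter)
            throws IOException {
          int total = 0;
          while (values.hasNext()) total += values.next().get();
          out.collect(key, new IntWritable(total)); // (paper, total incoming citations)
        }
      }
    }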

Hadoop Architecture. (Diagram: Client; NameNode with DataNode 1 ... DataNode n; JobTracker with JobNode 1 ... JobNode n.)

HDFS: Traditional / SPOF. (Diagram: NameNode with fsimage, SecondaryNameNode, DataNode 1 ... DataNode n; client operations such as copy dfs/blue.txt, rm dfs/orange.txt, rmdir dfs/, mkdir new/, mv new/ dfs/.) 1. NameNode appends edits to a log file. 2. SecondaryNameNode copies the log file and image, makes a checkpoint, copies the image back. 3. NameNode loads the image on start-up and makes the remaining edits. SecondaryNameNode is not a backup NameNode.

What is the secondary name-node? The name-node quickly logs all file-system actions in a sequential (but messy) way. The secondary name-node keeps the main fsimage file up-to-date based on the logs. When the primary name-node boots back up, it loads the fsimage file and applies the remaining log to it. Hence the secondary name-node helps make boot-ups faster, helps keep the file system image up-to-date and takes load away from the primary.

Hadoop: High Availability. (Diagram: Active NameNode and Standby Active NameNode, each with a JournalManager and fsimage, sharing fs edits through JournalNode 1 ... JournalNode n.)

PROGRAMMING WITH HADOOP

1. Input/Output (cmd): > hdfs dfs
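The transcription only preserves the command prompt ("> hdfs dfs"). As a hedged illustration, the same kinds of file-system operations shown in the SPOF diagram (copy, remove, make directory, move) can also be issued programmatically through Hadoop's org.apache.hadoop.fs.FileSystem API; the paths below just mirror the diagram's examples.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DfsOps {
      public static void main(String[] args) throws Exception {
        // Uses the default file system configured in core-site.xml (e.g. an HDFS NameNode)
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        fs.mkdirs(new Path("new/"));                               // ~ mkdir new/
        fs.rename(new Path("new/"), new Path("dfs/"));             // ~ mv new/ dfs/
        fs.copyFromLocalFile(new Path("blue.txt"),
                             new Path("dfs/blue.txt"));            // ~ copy blue.txt into dfs/
        fs.delete(new Path("dfs/blue.txt"), false);                // ~ rm dfs/blue.txt
        fs.delete(new Path("dfs/"), true);                         // ~ rmdir dfs/ (recursive)

        fs.close();
      }
    }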

1. Input/Output (Java): creates a file system for the default configuration.

1. Input (Java): check if the file exists; if so delete. Create a file and write a message. Open and read back. (A sketch follows below.)

2. Map: Mapper<InputKeyType, InputValueType, MapKeyType, MapValueType> (Writable for values; WritableComparable for keys/values). The OutputCollector will collect the Map key/value pairs, in the same order: emit output. The Reporter can provide counters and progress to the client (not needed in the running example).

3. Partition: the Partitioner interface. Same as before: needed to sort keys; needed for the default partition function (this happens to be the default partition method; not needed in the running example).
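The Java code on these slides is only a screenshot in the transcription. A minimal sketch of the steps described ("check if the file exists; if so delete; create file and write a message; open and read back"), using the real FileSystem API; the file path and message are assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadWrite {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);    // file system for the default configuration

        Path file = new Path("dfs/hello.txt");   // hypothetical path for the example

        // Check if the file exists; if so delete
        if (fs.exists(file)) {
          fs.delete(file, false);
        }

        // Create file and write a message
        FSDataOutputStream out = fs.create(file);
        out.writeUTF("Hello, HDFS!");
        out.close();

        // Open and read back
        FSDataInputStream in = fs.open(file);
        System.out.println(in.readUTF());
        in.close();

        fs.close();
      }
    }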

4. Shuffle.

5. Sort/Comparison: methods in WritableComparator (not needed in the running example).

6. Reduce: Reducer<MapKey, MapValue, OutputKey, OutputValue>; the reduce method takes a key, an Iterator over all values for that key, an output key/value pair collector, and a reporter.

7. Output / Input (Java): creates a file system for the default configuration; check if the file exists; if so delete; create a file and write a message; open and read back; write to output.

7. Output (Java).

Control Flow: create a JobClient and a JobConf and pass it the main class; set the type of output key and value in the configuration; set input and output paths; set the mapper class; set the reducer class (and optionally the combiner); pass the configuration to the client and run. (A driver sketch follows below.)
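The driver code itself is again a screenshot in the original slides; a minimal sketch following the control flow just described could look as below. It reuses the WordCountMap and WordCountReduce classes sketched earlier, and the job name and argument conventions are my own assumptions.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class WordCountDriver {
      public static void main(String[] args) throws Exception {
        // Create a JobConf and pass it the main class
        JobConf conf = new JobConf(WordCountDriver.class);
        conf.setJobName("word-count");

        // Set the type of output key and value in the configuration
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Set the mapper, (optional) combiner and reducer classes
        conf.setMapperClass(WordCountMap.class);
        conf.setCombinerClass(WordCountReduce.class);
        conf.setReducerClass(WordCountReduce.class);

        // Set input and output paths
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Pass the configuration to the client and run
        JobClient.runJob(conf);
      }
    }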

More in Hadoop: Combiner. Map-side mini-reduction: keeps a fixed-size buffer in memory; reduce within that buffer (e.g., count words in the buffer); lessens bandwidth needs. In Hadoop: can simply use the Reducer class.

More in Hadoop: Reporter. Reporter has a group of maps of counters.

More in Hadoop: Chaining Jobs. Sometimes we need to chain jobs. In Hadoop, we can pass a set of Jobs to the client: x.addDependingJob(y). (A sketch follows below.)

More in Hadoop: Distributed Cache. Some tasks need global knowledge, for example a white-list of conference venues and journals that should be considered in the citation count. Typically small. Use a distributed cache: makes data available locally to all nodes.

RECAP

Distributed File Systems. Google File System (GFS): Master and chunkslaves; replicated pipelined writes; direct reads; minimising master traffic; fault-tolerance: self-healing; rack awareness; consistency and modifications. Hadoop Distributed File System: NameNode and DataNodes.
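As a hedged illustration of chaining (the job configurations and names are placeholders, not from the slides), the classic API's JobControl lets one job declare a dependency on another via addDependingJob:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.jobcontrol.Job;
    import org.apache.hadoop.mapred.jobcontrol.JobControl;

    public class ChainedJobs {
      public static void main(String[] args) throws Exception {
        // Placeholder configurations; in practice each would be set up like the
        // word-count driver above (mapper, reducer, input/output paths, ...).
        JobConf yConf = new JobConf();   // e.g. count citations per paper
        JobConf xConf = new JobConf();   // e.g. filter counts against a white-list

        Job y = new Job(yConf);
        Job x = new Job(xConf);
        x.addDependingJob(y);            // x can only be scheduled once y has completed

        JobControl control = new JobControl("chained-jobs");
        control.addJob(x);
        control.addJob(y);

        // JobControl is a Runnable: run it in a thread and poll until everything is done
        new Thread(control).start();
        while (!control.allFinished()) {
          Thread.sleep(1000);
        }
        control.stop();
      }
    }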

MapReduce: 1. Input; 2. Map; 3. Partition; 4. Shuffle; 5. Comparison/Sort; 6. Reduce; 7. Output.

MapReduce/GFS Revision. GFS: distributed file system, implemented as HDFS. MapReduce: distributed processing framework, implemented as Hadoop.

Hadoop: FileSystem; Mapper<InputKey,InputValue,MapKey,MapValue>; OutputCollector<OutputKey,OutputValue>; Writable, WritableComparable<Key>; Partitioner<KeyType,ValueType>; Reducer<MapKey,MapValue,OutputKey,OutputValue>; JobClient/JobConf.