INFO5011. Cloud Computing Semester 2, 2011 Lecture 7, MapReduce (II)


1 COMMONWEALTH OF AUSTRALIA
Copyright Regulations 1969
WARNING
This material has been reproduced and communicated to you by or on behalf of the University of Sydney pursuant to Part VB of the Copyright Act 1968 (the Act). The material in this communication may be subject to copyright under the Act. Any further reproduction or communication of this material by you may be the subject of copyright protection under the Act. Do not remove this notice.

2 Outline
Counters
Table join
  - Multiple inputs
  - Secondary sort
  - Grouping comparator
DistributedCache
Map-side join
Fault tolerance and other features

3 A typical job status output

4 Counters
Counters are a useful channel for gathering statistics about a job, either for quality control or for application-level statistics.
Hadoop has many built-in counters
  - E.g.: Spilled Records: Map (355,671), Reduce (91,650); Map output records (11,760,740); Reduce input groups (72,220)
Developers can include application-level statistics as well
  - E.g.: number of bad (incomplete) records, number of words containing non-ASCII characters

5 Writing a simple custom counter
    import java.io.IOException;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetEncoder;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class TagReducerWCounter extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();
        private final static int MINFREQ = 5;
        // Custom counter group; each enum constant is one counter
        private enum Language { FOREIGN }
        static CharsetEncoder asciiEncoder = Charset.forName("US-ASCII").newEncoder();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Count keys that cannot be encoded in US-ASCII
            if (!asciiEncoder.canEncode(key.toString())) {
                context.getCounter(Language.FOREIGN).increment(1);
            }
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            if (sum > MINFREQ) { // only emit when sum is bigger than the threshold
                result.set(sum);
                context.write(key, result);
            }
        }
    }
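The ASCII test behind the counter can be tried without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: the class `ForeignWordCounterSketch`, the extra `ASCII` counter, and the `EnumMap` standing in for Hadoop's counter group are all assumptions made for illustration. Only `CharsetEncoder.canEncode` is taken from the slide.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hadoop counter group, for local experiments only.
class ForeignWordCounterSketch {
    enum Language { FOREIGN, ASCII }

    static Map<Language, Long> count(List<String> words) {
        // One encoder per call: CharsetEncoder instances are not thread-safe.
        CharsetEncoder ascii = Charset.forName("US-ASCII").newEncoder();
        Map<Language, Long> counters = new EnumMap<>(Language.class);
        for (Language l : Language.values()) {
            counters.put(l, 0L);
        }
        for (String w : words) {
            // Same test as the reducer: can every character be encoded in US-ASCII?
            Language bucket = ascii.canEncode(w) ? Language.ASCII : Language.FOREIGN;
            counters.put(bucket, counters.get(bucket) + 1);
        }
        return counters;
    }

    public static void main(String[] args) {
        // "caf\u00e9" and "\u5317\u4eac" contain non-ASCII characters.
        System.out.println(count(Arrays.asList("sydney", "caf\u00e9", "\u5317\u4eac", "boston")));
    }
}
```

In a real job the framework aggregates the per-task counter values and reports the totals in the job status output.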

6 A table join example
Two input sources
  - Photo: user \t date \t place_id \n
  - Geo: place_id \t woeid \t lat \t longi \t place_name \t place_url
Output:
  - place_id \t user \t date \t place_name1
  - place_id \t user \t date \t place_name2
This is like joining two tables on the key place_id
Map:
  - Photo record -> (k: place_id, v: user \t date)
  - Geo record -> (k: place_id, v: place_name)
Reduce:
  - (k: place_id, v: (place_name, user1 \t date1, user2 \t date2)) -> (k: place_id, v: user1 \t date1 \t place_name), (k: place_id, v: user2 \t date2 \t place_name)
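The map and reduce steps of this join can be traced in memory. This is a hedged sketch rather than a Hadoop program: the class `ReduceSideJoinSketch` and its methods are hypothetical, a `TreeMap` stands in for the shuffle, and each place_id is assumed to occur at most once in the Geo input. The place name is put first in each value list simply by processing the Geo records first, which a real job cannot rely on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; class and method names are hypothetical.
class ReduceSideJoinSketch {
    // Photo record "user \t date \t place_id" -> (place_id, "user \t date")
    static String[] mapPhoto(String line) {
        String[] f = line.split("\t");
        return new String[] { f[2], f[0] + "\t" + f[1] };
    }

    // Geo record "place_id \t woeid \t lat \t longi \t place_name \t place_url"
    // -> (place_id, place_name)
    static String[] mapGeo(String line) {
        String[] f = line.split("\t");
        return new String[] { f[0], f[4] };
    }

    // One reduce call; values.get(0) is assumed to be the place name.
    static List<String> reduce(String placeId, List<String> values) {
        List<String> out = new ArrayList<>();
        String placeName = values.get(0);
        for (String photo : values.subList(1, values.size())) {
            out.add(placeId + "\t" + photo + "\t" + placeName);
        }
        return out;
    }

    static List<String> join(List<String> photos, List<String> geos) {
        // Stand-in for the shuffle: group values by place_id, Geo values first.
        Map<String, List<String>> groups = new TreeMap<>();
        for (String g : geos) {
            String[] kv = mapGeo(g);
            groups.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String p : photos) {
            String[] kv = mapPhoto(p);
            List<String> values = groups.get(kv[0]);
            if (values != null) {
                values.add(kv[1]); // photos with an unknown place_id are dropped in this sketch
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> photos = Arrays.asList("u1\td1\tp1", "u2\td2\tp2", "u1\td2\tp2");
        List<String> geos = Arrays.asList("p1\tw1\t0\t0\tSydney\turl1", "p2\tw2\t0\t0\tBoston\turl2");
        for (String row : join(photos, geos)) {
            System.out.println(row);
        }
    }
}
```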

7 Several new features
How do we implement a join in Hadoop?
  - Multiple inputs with different formats, and hence different Mappers
  - There should be a way to identify which input each value comes from in the value list passed to the same reduce function
  - One easy way is to order the values in a particular way
  - The value from the Geo table, which contains the place name, should be first in the value list
Multiple inputs are supported in the Hadoop framework
Value lists are grouped based on key
  - Need a way to tag the key to differentiate keys from different input data
  - Need special handling to ensure the tagged keys keep their original partition position and the values from the same key are grouped as a list and sent to the same reduce function

8 Example of join execution
Mapper 1 (a partition of the photo input): u1,d1,p1; u2,d2,p2; u3,d3,p3
  - Output: p1: u1,d1; p2: u2,d2; p3: u3,d3
Mapper 2 (a partition of the photo input): u1,d2,p2; u4,d1,p3; u2,d4,p4
  - Output: p2: u1,d2; p3: u4,d1; p4: u2,d4
Mapper 3 (the Geo input): p1: Sydney; p2: Boston; p3: Beijing
  - Output: p1: Sydney; p2: Boston; p3: Beijing
Reducer 0 (ideal order of input): p1: Sydney; p1: u1,d1; p2: Boston; p2: u2,d2; p2: u1,d2
  - The p1 pairs form a grouping for one call of the reduce function; the p2 pairs form another grouping for another call
  - Output: p1: u1,d1,Sydney; p2: u2,d2,Boston; p2: u1,d2,Boston
Reducer 1 (ideal order of input): p3: Beijing; p3: u3,d3; p3: u4,d1; p4: u2,d4
  - Output: p3: u3,d3,Beijing; p3: u4,d1,Beijing; p4: u2,d4,NULL
The place name arriving first for each key is the ideal order to be received by all reducers.

9 Composite key
We want to sort the map output based on the key as well as where it comes from
  - Tag the key with additional information to indicate where the (key, value) pair comes from
Map output with tagged keys:
  - Mapper 1: p1,1: u1,d1; p2,1: u2,d2; p3,1: u3,d3
  - Mapper 3: p1,0: Sydney; p2,0: Boston; p3,0: Beijing
Sort by original key, then by the tag:
  - p1,0: Sydney; p1,1: u1,d1; p2,0: Boston; p2,1: u2,d2; p2,1: u1,d2
Using the default partitioner, there is no guarantee that composite keys (p1,0) and (p1,1) go to the same partition, or that composite keys (p2,0) and (p2,1) go to the same partition.
The order of reducer input is determined by the output key's Comparator.
The allocation of output keys to partitions is determined by a given or default Partitioner.
The grouping of key-value pairs for each call of the reduce function is determined by a grouping comparator; if none is specified, the output key's Comparator is used.

10 Custom Partitioner and Grouping Comparator
We should sort all input based on the composite key's comparator, which takes into account both the original key and the additional tag.
We should partition and group based on the original key only.
Pipeline: Partitioner (key) -> Sorting (key + tag) -> Grouping (key) -> reduce function calls in the reducer
  - Map output: Mapper 1: p1,1: u1,d1; p2,1: u2,d2; p3,1: u3,d3; Mapper 3: p1,0: Sydney; p2,0: Boston; p3,0: Beijing
  - Reducer input after sorting: p1,0: Sydney; p1,1: u1,d1; p2,0: Boston; p2,1: u2,d2; p2,1: u1,d2
  - Reduce function calls: p1,0 -> (Sydney, <u1,d1>); p2,0 -> (Boston, <u2,d2>, <u1,d2>)
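The three roles (partition on the original key, sort on key plus tag, group on the original key) can be simulated with plain collections. The sketch below is an assumption-laden illustration, not the Hadoop mechanism itself: `SecondarySortSketch`, `Tagged`, and `reducerInput` are hypothetical names, and grouping consecutive equal keys is emulated with a `LinkedHashMap` over the sorted list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of partition / sort / group on a composite key.
class SecondarySortSketch {
    static final class Tagged {
        final String key;   // original join key, e.g. "p1"
        final int tag;      // 0 = Geo record, 1 = photo record
        final String value;
        Tagged(String key, int tag, String value) {
            this.key = key;
            this.tag = tag;
            this.value = value;
        }
    }

    // Partition on the original key only, so (p1,0) and (p1,1) land together.
    static int partition(Tagged t, int numPartitions) {
        return (t.key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Sort on the original key first, then on the tag.
    static final Comparator<Tagged> SORT_ORDER =
            Comparator.comparing((Tagged t) -> t.key).thenComparingInt(t -> t.tag);

    // Returns, for one partition, the value list of each reduce call in call order.
    static Map<String, List<String>> reducerInput(List<Tagged> mapOutput, int partition, int numPartitions) {
        List<Tagged> mine = new ArrayList<>();
        for (Tagged t : mapOutput) {
            if (partition(t, numPartitions) == partition) {
                mine.add(t);
            }
        }
        mine.sort(SORT_ORDER); // stable sort: pairs with equal key and tag keep their order
        // Group on the original key only; the tag is ignored when forming groups.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Tagged t : mine) {
            groups.computeIfAbsent(t.key, k -> new ArrayList<>()).add(t.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Tagged> mapOutput = Arrays.asList(
                new Tagged("p1", 1, "u1,d1"), new Tagged("p2", 1, "u2,d2"),
                new Tagged("p1", 0, "Sydney"), new Tagged("p2", 0, "Boston"),
                new Tagged("p2", 1, "u1,d2"));
        System.out.println(reducerInput(mapOutput, 0, 1));
    }
}
```

With a single partition the example yields one value list per place_id, each starting with the place name, which matches the reduce calls shown on the slide.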

11 The Composite Key
    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class TextIntPair implements WritableComparable<TextIntPair> {
        private Text key;
        private IntWritable order;
        // standard accessor methods for the private fields are omitted;
        // default constructor is omitted

        public TextIntPair(String key, int order) {
            this.key = new Text(key);
            this.order = new IntWritable(order);
        }

        // serializing and deserializing methods
        @Override
        public void readFields(DataInput in) throws IOException {
            key.readFields(in);
            order.readFields(in);
        }

        @Override
        public void write(DataOutput out) throws IOException {
            key.write(out);
            order.write(out);
        }

        // methods to compare two keys: original key first, then the tag
        @Override
        public int compareTo(TextIntPair other) {
            int cmp = key.compareTo(other.key);
            if (cmp != 0) {
                return cmp;
            }
            return order.compareTo(other.order);
        }

        @Override
        public int hashCode() {
            // The slide multiplied by order.get(), which maps every key with order 0 to hash 0
            return key.hashCode() * 163 + order.get();
        }

        @Override
        public boolean equals(Object other) {
            if (other instanceof TextIntPair) {
                TextIntPair tip = (TextIntPair) other;
                return key.equals(tip.key) && order.equals(tip.order);
            }
            return false;
        }
    }

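12 Partitioner and Group Comparator
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Partitioner;

    // Each public class goes in its own source file
    public class JoinPartitioner implements Partitioner<TextIntPair, Text> {
        @Override
        public int getPartition(TextIntPair key, Text value, int numPartition) {
            // Partition on the original key only; mask the sign bit so the index is never negative
            return ((key.getKey().hashCode() * 123) & Integer.MAX_VALUE) % numPartition;
        }

        @Override
        public void configure(JobConf arg0) {
        }
    }

    public class JoinGroupComparator extends WritableComparator {
        protected JoinGroupComparator() {
            super(TextIntPair.class, true);
        }

        /**
         * Only compare the key when grouping reducer input together
         */
        @Override
        public int compare(WritableComparable w1, WritableComparable w2) {
            TextIntPair tip1 = (TextIntPair) w1;
            TextIntPair tip2 = (TextIntPair) w2;
            return tip1.getKey().compareTo(tip2.getKey());
        }
    }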
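One detail of a hash-based partitioner is worth checking by hand: in Java the `%` operator takes the sign of its left operand, so a negative hash code (or a product that overflows) gives a negative partition index. Hadoop's default HashPartitioner guards against this by masking with `Integer.MAX_VALUE`. The sketch below compares the two forms; the class `PartitionIndexSketch` and its method names are hypothetical.

```java
// Hypothetical helper comparing an unguarded and a guarded partition index.
class PartitionIndexSketch {
    // Unguarded form: may return a negative index.
    static int naive(int hash, int numPartitions) {
        return (hash * 123) % numPartitions;
    }

    // Guarded form: clearing the sign bit keeps the result in [0, numPartitions).
    static int safe(int hash, int numPartitions) {
        return ((hash * 123) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println(naive(-7, 4)); // -861 % 4, negative
        System.out.println(safe(-7, 4));  // non-negative and below 4
    }
}
```

Both forms depend only on the original key's hash, so all composite keys sharing a place_id still go to the same partition.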

13 Auxiliary job data
Distributing auxiliary job data
  - In general, a small file containing common background knowledge for processing map jobs
  - E.g. the stop word list for word counting, the dictionary for spell checking
  - All mappers need to read it
  - The file is small enough to fit in the memory of mappers
Hadoop provides a mechanism for this purpose called the distributed cache.
  - In the old API, use the class called DistributedCache to distribute those files to all tasktrackers
  - In the new API, use methods in the Job class to set and get cache files
The distributed cache can be used to provide an alternative, more efficient join if one join table is small enough to fit in memory
  - Only the map stage is needed; this is sometimes called a replicated join

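14 DistributedCache
    // Imports needed by the listing on slides 14-16:
    // java.io.*, java.util.Hashtable, org.apache.hadoop.fs.Path, org.apache.hadoop.io.Text,
    // org.apache.hadoop.filecache.DistributedCache, org.apache.hadoop.mapred.*
    public static class ReplicatedJoinMapper extends MapReduceBase
            implements Mapper<Text, Text, Text, Text> {
        private enum Place { AUSTRALIA, OTHER }
        private Hashtable<String, String> placeTable = new Hashtable<String, String>();

        // Runs once per map task: load the cached place file into memory
        @Override
        public void configure(JobConf job) {
            try {
                Path[] cacheFiles = DistributedCache.getLocalCacheFiles(job);
                if (cacheFiles != null && cacheFiles.length > 0) {
                    String line;
                    String[] tokens;
                    BufferedReader placeReader =
                            new BufferedReader(new FileReader(cacheFiles[0].toString()));
                    try {
                        while ((line = placeReader.readLine()) != null) {
                            tokens = line.split("\t");
                            if (tokens.length > 4) { // skip incomplete records
                                placeTable.put(tokens[0], tokens[4]); // place_id -> place_name
                            }
                        }
                        System.out.println("size of the place table is: " + placeTable.size());
                    } finally {
                        placeReader.close();
                    }
                }
            } catch (IOException e) {
                System.err.println("Exception reading DistributedCache:" + e);
            }
        }
        // map() continues on the next slide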

15 DistributedCache
        // Continuation of ReplicatedJoinMapper
        @Override
        public void map(Text key, Text value, OutputCollector<Text, Text> output,
                Reporter reporter) throws IOException {
            String[] dataArray = value.toString().split("\t");
            String placeId = dataArray[1];
            // Join by looking the place_id up in the in-memory table
            String placeName = placeTable.get(placeId);
            if (placeName != null) {
                output.collect(key, new Text(dataArray[0] + "\t" + placeName));
                reporter.incrCounter(Place.AUSTRALIA, 1);
            } else {
                reporter.incrCounter(Place.OTHER, 1);
            }
        }
    }

16 DistributedCache
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        JobConf job = new JobConf(conf, ReplicatedJoin.class);
        // Register the small table so it is copied to every task node
        DistributedCache.addCacheFile(new Path("/user/zhouy/australia.txt").toUri(), job);
        Path in = new Path(args[0]);
        Path out = new Path(args[1]);
        FileInputFormat.setInputPaths(job, in);
        FileOutputFormat.setOutputPath(job, out);
        job.setJobName("replicated Join with DistributedCache");
        job.setMapperClass(ReplicatedJoinMapper.class);
        job.setNumReduceTasks(0); // map-only job
        job.setInputFormat(KeyValueTextInputFormat.class);
        job.setOutputFormat(TextOutputFormat.class);
        JobClient.runJob(job);
        return 0;
    }
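Stripped of the Hadoop plumbing, the replicated join is a hash lookup: the small table is loaded into memory once, and every record of the large table is joined by a single probe. A minimal plain-Java sketch follows, assuming the photo and Geo record layouts from the table join example; `ReplicatedJoinSketch` and its method names are hypothetical, and lists of strings stand in for the input files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory version of a replicated (map-only) join.
class ReplicatedJoinSketch {
    // Corresponds to configure(): build place_id -> place_name from the small table.
    static Map<String, String> loadPlaceTable(List<String> geoLines) {
        Map<String, String> table = new HashMap<>();
        for (String line : geoLines) {
            String[] tokens = line.split("\t");
            if (tokens.length > 4) { // skip incomplete records
                table.put(tokens[0], tokens[4]);
            }
        }
        return table;
    }

    // Corresponds to map(): one lookup per photo record, no shuffle or reduce.
    static List<String> join(List<String> photoLines, Map<String, String> placeTable) {
        List<String> out = new ArrayList<>();
        for (String line : photoLines) {
            String[] f = line.split("\t"); // user, date, place_id
            if (f.length < 3) {
                continue;
            }
            String placeName = placeTable.get(f[2]);
            if (placeName != null) {
                out.add(f[0] + "\t" + f[1] + "\t" + placeName);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> places = loadPlaceTable(
                Arrays.asList("p1\tw1\t0\t0\tSydney\turl1", "p2\tw2\t0\t0\tBoston\turl2"));
        System.out.println(join(Arrays.asList("u1\td1\tp1", "u2\td4\tp4"), places));
    }
}
```

The approach only works while the small table fits in each mapper's memory, which is the condition stated for the distributed cache join.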

17 Map-side join
A map-side join works by performing the join before the data reaches the map function.
Special requirements for the input data
  - Each input dataset must be divided into the same number of partitions
  - It must be sorted by the same key (the join key) in each source
  - All the records for a particular key must reside in the same partition
A map-side join can be used to join the outputs of several jobs that had the same number of reducers, the same keys, and output files that are not splittable.
There are special Hadoop classes for handling this type of join
  - CompositeInputFormat from the org.apache.hadoop.mapred.join package
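The input requirements make sense once the join is seen as a merge of two sorted streams: when matching partitions are sorted on the join key, one forward pass over both is enough. The sketch below illustrates that idea in plain Java and is not how CompositeInputFormat is implemented; `MergeJoinSketch` is a hypothetical name, each record is a two-element array of join key and value, and the place side is assumed to hold each key at most once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical merge join over one pair of matching, sorted partitions.
class MergeJoinSketch {
    // Each record is {joinKey, value}; both lists must be sorted by joinKey.
    static List<String> mergeJoin(List<String[]> photos, List<String[]> places) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < photos.size() && j < places.size()) {
            int cmp = photos.get(i)[0].compareTo(places.get(j)[0]);
            if (cmp < 0) {
                i++; // photo whose place_id has no place record
            } else if (cmp > 0) {
                j++; // place with no remaining photos
            } else {
                out.add(photos.get(i)[0] + "\t" + photos.get(i)[1] + "\t" + places.get(j)[1]);
                i++; // keep j: the next photo may share this place_id
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> photos = Arrays.asList(
                new String[] { "p1", "u1" }, new String[] { "p2", "u2" },
                new String[] { "p2", "u3" }, new String[] { "p4", "u4" });
        List<String[]> places = Arrays.asList(
                new String[] { "p1", "Sydney" }, new String[] { "p2", "Boston" },
                new String[] { "p3", "Beijing" });
        System.out.println(mergeJoin(photos, places));
    }
}
```

If the two sides were partitioned differently, a key's records could sit in non-matching partitions and the single pass would miss them, hence the requirement that all records for a key reside in the same partition.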

18 Fault Tolerance and other features
Worker failure
  - Detected by master through periodic pings
  - Handled via re-execution
  - Redo in-progress or completed map tasks
  - Redo in-progress reduce tasks
  - Map/reduce tasks committed through master
Master failure
  - Not covered in original implementation
  - Could be detected by user program or monitor
  - Could recover persistent state from disk

19 Fault Tolerance and other features
Backup tasks
  - Straggler: a worker that takes unusually long to finish a task
  - Possible causes include bad disks, network issues, overloaded machines
  - Near the end of the map/reduce phase, master spawns backup copies of remaining tasks
  - Use workers that completed their task already
  - Whichever finishes first wins

20 Your turn
Add a counter to the join program to count
  - the number of place_ids that appear in the photo table but not the place table
  - the number of place_ids that appear in the place table but not the photo table
Find all photos in n08_all.txt that were taken in China
  - Try the reduce-side join option
  - Try the distributed cache option
    - Extract all place_ids related to China
    - Distribute the result to all mapper tasks
  - Compare the performance