Algorithm Design for MapReduce
1 Algorithm Design for MapReduce CprE 419X, Spring 2015 Iowa State University Srikanta Tirthapura 1/29/15 CprE 419 X, Srikanta Tirthapura 1
2 Problem 0: Sum. Find the sum and average of many input integers. Write the map and reduce pseudocode. What are the map cost, reduce cost, per-reducer cost, and communication cost of your method?
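One possible answer to Problem 0, sketched in Python rather than on a cluster. The names `map_sum`, `reduce_sum`, and `sum_and_average` are hypothetical, the shuffle is simulated with a dictionary, and the sketch assumes that each mapper pre-aggregates its own input split and that all partial results go to a single reducer.

```python
from collections import defaultdict

def map_sum(chunk):
    """Mapper (assumed design): pre-aggregate one input split into a (sum, count) pair."""
    yield ("total", (sum(chunk), len(chunk)))

def reduce_sum(key, values):
    """Reducer: combine the partial (sum, count) pairs into (sum, average)."""
    total = sum(s for s, _ in values)
    count = sum(c for _, c in values)
    return total, (total / count if count else None)

def sum_and_average(splits):
    groups = defaultdict(list)  # stands in for the shuffle phase
    for chunk in splits:
        for key, value in map_sum(chunk):
            groups[key].append(value)
    return reduce_sum("total", groups["total"])

print(sum_and_average([[1, 2, 3], [4, 5], [6]]))  # (21, 3.5)
```

Under these assumptions the total map cost stays Theta(n), while each mapper sends a single pair, so the communication and the single reducer's cost grow with the number of splits rather than with n.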
3 Problem 1: Set Difference. Input: two sets A and B. Output: the set difference (A - B), i.e., all elements that are present in A but not in B. Example: find all IP addresses that appeared in one log but not the other. What are the map cost, reduce cost, per-reducer cost, and communication cost of your method?
4 Set Difference MR Algorithm
map(key = set_name, value = elementid)
    emit (key = elementid, value = set_name)
reduce(key = elementid, values)
    if (A in values) and (B not in values):
        emit (key = elementid, value = 1)
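The pseudocode above can be simulated locally as follows. This is a minimal Python sketch, not a Hadoop job: the function names are hypothetical, the shuffle is replaced by an in-memory dictionary, and the IP addresses in the usage line are made-up sample data.

```python
from collections import defaultdict

def map_set_difference(set_name, element):
    """Mapper: re-key each record by the element, remembering which set it came from."""
    yield (element, set_name)

def reduce_set_difference(element, set_names):
    """Reducer: keep the element only if it was seen in A and never in B."""
    if "A" in set_names and "B" not in set_names:
        yield (element, 1)

def set_difference(a, b):
    groups = defaultdict(list)  # simulated shuffle: element -> list of set names
    for name, elements in (("A", a), ("B", b)):
        for element in elements:
            for key, value in map_set_difference(name, element):
                groups[key].append(value)
    result = []
    for element, names in groups.items():
        result.extend(key for key, _ in reduce_set_difference(element, names))
    return sorted(result)

print(set_difference(["10.0.0.1", "10.0.0.2", "10.0.0.3"], ["10.0.0.2"]))
```

Because the reducer only tests membership, repeated elements in either input do not change the result.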
5 Observation: the algorithm also works if A and B are multisets (i.e., the same value may appear multiple times within A and B).
6 Analysis. Is the set difference algorithm good? How much does it cost? This is not an easy question to answer. Consider: per-mapper cost and total map cost; per-reducer cost and total reduce cost; total bytes of communication.
7 Set Difference
Let n = input size (|A| + |B|), M = number of mappers, R = number of reducers.
Total Map Cost = Theta(n)
Per Mapper Cost = Theta(n/M)
Total Reduce Cost = Theta(n)
Per Reducer Cost = Theta(n/R)
Total Communication Cost = Theta(n)
8 Problem 2: Matrix-Vector Multiplication
Input:
    An n x n matrix M. Rows are numbered 1..n, and columns similarly. The element at row i and column j is denoted M[i,j] and is provided as (i, j, m[i,j]) within an HDFS file.
    An n x 1 vector A. A[j] is provided as (j, a[j]) within an HDFS file.
Output: the vector B = M A, where B[i] = M[i,1] * A[1] + M[i,2] * A[2] + ... + M[i,n] * A[n]
9 Matrix-Vector Using MapReduce. Suppose the reduce key is i, for i = 1..n. The reduce function for key i computes B[i]; it needs the values M[i,1]*A[1], M[i,2]*A[2], ... How do we compute these? To compute M[i,j] * A[j], use j as the key in one MapReduce round.
10 Matrix-Vector MR Round 1
map1(key, value = (i, j, m[i,j]))
    emit (key = j, value = (i, j, m[i,j]))
map1(key, value = (j, a[j]))
    emit (key = j, value = A[j])
reduce1(key = j, values = [A[j], M[1,j], M[2,j], ...])
    for (i = 1 to n):
        emit (i, M[i,j] * A[j])
11 Matrix-Vector MR Round 2
map2(key = i, value)
    emit (key = i, value)
reduce2(key = i, values)
    B[i] = 0
    for (v in values):
        B[i] += v
    emit (key = i, value = B[i])
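The two rounds can be traced end to end with a small in-memory simulation. This is a sketch under stated assumptions: the function names and the `shuffle` helper are hypothetical, matrix and vector records are told apart by their length, and values are tagged "M" or "A" so that the first reducer can distinguish them (the slides do not say how this is done). Every column index is assumed to have a vector entry.

```python
from collections import defaultdict

def map1(record):
    """Key matrix entries (i, j, m_ij) and vector entries (j, a_j) by column j."""
    if len(record) == 3:
        i, j, m_ij = record
        yield (j, ("M", i, m_ij))
    else:
        j, a_j = record
        yield (j, ("A", a_j))

def reduce1(j, values):
    """Multiply every M[i,j] in column j by A[j] and emit the product under key i."""
    a_j = next(v[1] for v in values if v[0] == "A")
    for v in values:
        if v[0] == "M":
            yield (v[1], v[2] * a_j)

def reduce2(i, values):
    """Sum the partial products that belong to row i."""
    return (i, sum(values))

def shuffle(pairs):
    """Stand-in for the framework's group-by-key step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def matrix_vector(matrix_entries, vector_entries):
    records = list(matrix_entries) + list(vector_entries)
    groups1 = shuffle(kv for rec in records for kv in map1(rec))
    products = [kv for j, vals in groups1.items() for kv in reduce1(j, vals)]
    groups2 = shuffle(products)  # map2 is the identity
    return dict(reduce2(i, vals) for i, vals in groups2.items())

# M = [[1, 2], [3, 4]], A = [5, 6]
print(matrix_vector([(1, 1, 1), (1, 2, 2), (2, 1, 3), (2, 2, 4)], [(1, 5), (2, 6)]))
# {1: 17, 2: 39}
```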
12 Matrix-Vector Analysis
First MR round (with M mappers and R reducers):
Total Map Cost = O(n^2 + n) = O(n^2)
Per Mapper Cost = O(n^2/M)
Total Reduce Cost = O(n^2)
Per Reducer Cost = O(n^2/R)
Communication = O(n^2)
The second round is similar.
Q: Is there a one-round algorithm for matrix-vector multiplication using MapReduce?
13 Problem 3: Graph Processing. Input: a graph G presented as a sequence of edges e1 = (v11, v12), e2 = (v21, v22), e3 = (v31, v32), ... Output: the set of all triangles in G. A triangle is a set of three vertices (u,v,w) such that all three edges (u,v), (v,w), and (u,w) exist in G.
14 Enumerating Triangles: Solution Idea
[Figure: the input edges e1 = (a,b), e2 = (b,c), e3 = (b,d), e4 = (a,d) pass through MR Round 1, which produces the adjacency lists a: (b,d), b: (a,c,d), c: (b), d: (b,a). MR Round 2 then regroups these adjacency lists by vertex.]
15 MapReduce: Triangles, Round 1
map1(String key, String edge):
    v1, v2 = vertices in edge
    emit (v1, v2)
    emit (v2, v1)
reduce1(String vertexid, Iterator values):
    neighbor_list = []
    for other_vertex in values:
        append other_vertex to neighbor_list
    emit (vertexid, neighbor_list)
16 MapReduce: Triangles, Round 2
map2(String vertexid, String neighbor_list):
    for each vertex v in neighbor_list:
        emit (v, (vertexid, neighbor_list))
reduce2(String vertexid, Iterator values):
    construct 2-neighborhood graph of vertexid
    enumerate all triangles in this graph
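A compact Python sketch of the two triangle rounds, simulated in memory. The function name `triangles` and the dictionary-based shuffle are assumptions for illustration. The rule that a triangle is reported only by its smallest vertex (u < v < w) is also an addition, used here so that each triangle appears once; the slides leave the enumeration step open.

```python
from collections import defaultdict

def triangles(edges):
    # Round 1: map emits (v1, v2) and (v2, v1); reduce collects each vertex's neighbors.
    adjacency = defaultdict(set)
    for v1, v2 in edges:
        adjacency[v1].add(v2)
        adjacency[v2].add(v1)

    # Round 2 map: send (vertex, neighbor list) to every neighbor of that vertex.
    inbox = defaultdict(dict)
    for v, neighbors in adjacency.items():
        for u in neighbors:
            inbox[u][v] = neighbors

    # Round 2 reduce: u holds the neighbor list of each of its neighbors.
    # (u, v, w) is a triangle when v and w are both neighbors of u and w is in N(v).
    found = set()
    for u, neighbor_lists in inbox.items():
        for v, neighbors_of_v in neighbor_lists.items():
            for w in neighbors_of_v:
                if w in neighbor_lists and u < v < w:
                    found.add((u, v, w))
    return sorted(found)

# The four edges used in the "Solution Idea" slide.
print(triangles([("a", "b"), ("b", "c"), ("b", "d"), ("a", "d")]))
# [('a', 'b', 'd')]
```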
17 Problem 4: Length-2 Paths in a Graph. Input: a graph G presented as a list of edges. Output: all paths of length 2 in the graph. Solution: similar to the problem of enumerating triangles in a graph.
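One way to read "similar to triangles" is to reuse the first triangle round: group edges by vertex, then let the reducer for vertex v pair up its neighbors. The Python sketch below follows that reading. It assumes an undirected simple graph, reports each path u - v - w once with u < w, and uses the hypothetical name `length_two_paths`.

```python
from collections import defaultdict
from itertools import combinations

def length_two_paths(edges):
    # Map: emit (v1, v2) and (v2, v1); the shuffle groups neighbors by vertex.
    adjacency = defaultdict(set)
    for v1, v2 in edges:
        if v1 != v2:  # ignore self-loops
            adjacency[v1].add(v2)
            adjacency[v2].add(v1)

    # Reduce: every pair of distinct neighbors of `middle` forms a path of length 2.
    paths = []
    for middle, neighbors in adjacency.items():
        for u, w in combinations(sorted(neighbors), 2):
            paths.append((u, middle, w))
    return sorted(paths)

for path in length_two_paths([("a", "b"), ("b", "c"), ("b", "d"), ("a", "d")]):
    print(path)
```

A vertex of degree d contributes d(d-1)/2 paths, so the cost of its reducer is quadratic in its degree.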
18 Problem 5: Finding Pairs of Nearby Bit Strings. Input: a set of bit strings, each of length k. Output: all pairs of bit strings such that the two strings in the pair differ at no more than two positions, i.e., are at a Hamming distance of no more than 2.
19 Example: Nearby Bit Strings
Input: 10010, 00010, 11000, 11110, 00000
Output: (10010, 00010), (10010, 11000), (10010, 11110), (10010, 00000), (00010, 00000), (11000, 11110), (11000, 00000)
20 Algorithm 1
Compare all pairs of strings and check whether they are within a Hamming distance of 2.
Mapper: for each string b, send a copy of b to each reducer.
Reducer i: receives the entire set S, computes a subset S_i, and examines all pairs in S_i x S.
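A minimal Python sketch of Algorithm 1, run sequentially. The slides do not say how reducer i chooses S_i; the sketch assumes a round-robin split of the sorted input. The function names are hypothetical, and the test s < t is an added convention so that each unordered pair is reported once.

```python
def hamming(x, y):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))

def reducer_all_pairs(i, num_reducers, strings, max_dist=2):
    """Reducer i: pick its subset S_i, then compare S_i against the full set S."""
    own = strings[i::num_reducers]  # assumed round-robin choice of S_i
    for s in own:
        for t in strings:
            if s < t and hamming(s, t) <= max_dist:
                yield (s, t)

def nearby_pairs_broadcast(strings, num_reducers=3):
    strings = sorted(set(strings))  # every reducer receives this whole list
    pairs = []
    for i in range(num_reducers):
        pairs.extend(reducer_all_pairs(i, num_reducers, strings))
    return sorted(pairs)

for pair in nearby_pairs_broadcast(["10010", "00010", "11000", "11110", "00000"]):
    print(pair)
```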
21 Algorithm 1 Analysis
Let n = total number of input strings, M = number of mappers, R = number of reducers.
Total Map Cost = O(nkR)
Per Mapper Cost = O(nkR/M)
Total Reduce Cost = O(n^2 k)
Per Reducer Cost = O(n^2 k/R)
Total Communication = O(nkR)
22 Algorithm 2. For each bit string b, there are only (k choose 2) strings at Hamming distance 2 and (k choose 1) strings at Hamming distance 1. Search all these possibilities.
23 Algorithm 2
Map(key, value = bitstring b):
    emit (key = b, value = b)
    for each b' formed by flipping 1 bit of b:
        emit (key = b', value = b)
Reduce(key = bitstring b, values):
    for any two strings b1, b2 in values:
        emit (key = (b1, b2), value = 1)
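The map and reduce steps above can be checked on a small input with the following Python sketch. The names `map_flips` and `nearby_pairs` are hypothetical and the shuffle is an in-memory dictionary. Two strings at distance at most 2 always share a key that is within one bit flip of both, so they meet in some reducer. The same pair can meet at more than one key, so the sketch collects results in a set; that deduplication is an addition to the pseudocode.

```python
from collections import defaultdict
from itertools import combinations

def map_flips(b):
    """Mapper: emit b under its own key and under every string one bit flip away."""
    yield (b, b)
    for pos in range(len(b)):
        flipped_bit = "1" if b[pos] == "0" else "0"
        yield (b[:pos] + flipped_bit + b[pos + 1:], b)

def nearby_pairs(strings):
    groups = defaultdict(set)  # simulated shuffle: key -> input strings sent to it
    for b in set(strings):
        for key, value in map_flips(b):
            groups[key].add(value)

    pairs = set()  # a pair may be produced at several keys
    for values in groups.values():
        for b1, b2 in combinations(sorted(values), 2):
            pairs.add((b1, b2))
    return sorted(pairs)

for pair in nearby_pairs(["10010", "00010", "11000", "11110", "00000"]):
    print(pair)
```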
24 Analysis of Algorithm 2
Let n = total number of input strings, M = number of mappers, R = number of reducers.
Total Map Cost = O(nk^2)
Per Mapper Cost = O(nk^2/M)
Total Reduce Cost = O(kn + output size)
Per Reducer Cost = O(kn/R + output size)
Total Communication = O(nk^2)
25 Generalization: list all pairs of bit strings that are at a Hamming distance of t or less.
26 Problem 6: Sorting. Given a large file of strings, one per line, sort them in alphabetical order.
27 Sorting in MapReduce. The key-value paradigm does not help here. Use the fact that in MapReduce, each reducer is provided its keys in sorted order. Also use a splitter that sends data to reducers not according to a hash, but using a different method.
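The slide does not name the "different method". One common choice, assumed here, is range partitioning: reducer 0 receives the smallest keys, reducer 1 the next range, and so on, so that concatenating the reducers' sorted outputs gives a globally sorted file. The Python sketch below illustrates the idea. The boundary values are picked by hand, the function names are hypothetical, and `sorted` stands in for the framework's per-reducer key ordering.

```python
import bisect

def range_partition(key, boundaries):
    """Splitter: index of the reducer whose key range contains `key`."""
    return bisect.bisect_right(boundaries, key)

def mapreduce_sort(lines, boundaries):
    # Map: each line becomes a key; the splitter routes it to a reducer by range.
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for line in lines:
        buckets[range_partition(line, boundaries)].append(line)

    # Reduce: each reducer sees its keys in sorted order and writes them out.
    output = []
    for bucket in buckets:
        output.extend(sorted(bucket))
    return output

print(mapreduce_sort(["pear", "apple", "fig", "kiwi", "banana"], ["g", "n"]))
# ['apple', 'banana', 'fig', 'kiwi', 'pear']
```

How evenly the work is spread depends on the boundaries; badly chosen boundaries send most keys to one reducer.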
28 Problem 7: Graph Connectivity. Input: a graph, presented as a list of edges. Output: Yes if the graph is connected, and No if it is not.