E6893 Big Data Analytics Lecture 10: Linked Big Data Graph Computing (II)


1 E6893 Big Data Analytics Lecture 10: Linked Big Data Graph Computing (II) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and Big Data Analytics, IBM Watson Research Center November 6th, 2014

2 Course Structure
Class Date | Number | Topics Covered
09/04/14 | 1 | Introduction to Big Data Analytics
09/11/14 | 2 | Big Data Analytics Platforms
09/18/14 | 3 | Big Data Storage and Processing
09/25/14 | 4 | Big Data Analytics Algorithms -- I
10/02/14 | 5 | Big Data Analytics Algorithms -- II (recommendation)
10/09/14 | 6 | Big Data Analytics Algorithms III (clustering)
10/16/14 | 7 | Big Data Analytics Algorithms IV (classification)
10/23/14 | 8 | Big Data Analytics Algorithms V (classification & clustering)
10/30/14 | 9 | Linked Big Data Graph Computing I (Graph DB)
11/06/14 | 10 | Linked Big Data Graph Computing II (Graph Analytics)
11/13/14 | 11 | Big Data on Hardware, Processors, and Cluster Platforms
11/20/14 | 12 | Final Project First Presentations
11/27/14 | | Thanksgiving Holiday
12/04/14 | 13 | Next Stage of Big Data Analytics
12/11/14 | 14 | Big Data Analytics Workshop Final Project Presentations

3 Final Project Proposal (First) Presentation
Date/Time: November 20, 7pm - 9:30pm
Each team has about 3 minutes to cover:
1. Team members and expected contributions of each member;
2. Motivation of your project (the problem you would like to solve);
3. Dataset, algorithm, and tools for your project;
4. Current status of your project.
Please update your team info on the Project webpage by November 11. The presentation schedule will be announced on November 13. The website will be opened to allow you to upload your slides by November 20. If a project is purely by CVN students, please submit your slides without oral presentation.

4 ScaleGraph: an open source version of IBM System G

5 ScaleGraph algorithms ranked #1 in the Graph 500 benchmark. Source: Dr. Toyotaro Suzumura, ICPE2014 keynote

6 Graph Definitions and Concepts
A graph: $G = (V, E)$, where $V$ = vertices or nodes and $E$ = edges or links.
The number of vertices (order): $N_v = |V|$
The number of edges (size): $N_e = |E|$

7 Subgraph A graph H is a subgraph of another graph G if $V_H \subseteq V_G$ and $E_H \subseteq E_G$.

8 Families of Graphs Complete Graph: every vertex is linked to every other vertex. Clique: a complete subgraph.

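9 Multi-Graph vs. Simple Graph A multi-graph may contain loops (edges joining a vertex to itself) and multi-edges (more than one edge between the same pair of vertices); a simple graph has neither.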

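10 Directed Graph vs. Undirected Graph Directed edges = arcs: in a directed graph each edge $\{u, v\}$ has a direction, pointing from $u$ to $v$. Mutual arcs: a pair of arcs joining $u$ and $v$ in both directions.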

11 Adjacency Two vertices $u$ and $v$ are adjacent if joined by an edge in $E$. Two edges $e_1$ and $e_2$ are adjacent if joined by a common endpoint in $V$.

12 Decorated Graph Weighted Edges

13 Incident and Degree A vertex $v \in V$ is incident on an edge $e \in E$ if $v$ is an endpoint of $e$. The degree of a vertex $v$, say $d_v$, is defined as the number of edges incident on $v$. In the slide's example, $d_v = 2$.

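14 In-degrees and out-degrees For directed graphs, the in-degree of a vertex counts the arcs pointing into it and the out-degree counts the arcs pointing out of it. In the slide's example, in-degree = 8 and out-degree = 8.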

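15 Degree Distribution Example: Power-Law Network
A. Barabási and E. Bonabeau, Scale-free Networks, Scientific American 288: p.50-59.
Poisson degree distribution with mean $m$: $p_k = e^{-m} \frac{m^k}{k!}$
Power law with exponential cutoff: $p_k = C k^{-\tau} e^{-k/\kappa}$ (Newman, Strogatz and Watts, 2001)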

16 Another example of complex network: Small-World Network Six degrees of separation: by adding long-range links, a regular graph can be transformed into a small-world network, in which the average number of degrees of separation between two nodes becomes small. From Watts and Strogatz: C is the clustering coefficient and L is the path length; (C(0), L(0)) are (C, L) for a regular graph; (C(p), L(p)) are (C, L) for a small-world graph with randomness p.

17 Indication of "Small" A graph is "small", which usually indicates that the average distance between distinct vertices is small:
$$\bar{l} = \frac{1}{N_v (N_v + 1)/2} \sum_{u \neq v \in V} \mathrm{dist}(u, v)$$
For instance, a protein interaction network would be considered to have the small-world property, as there is an average distance of 3.68 among the 5,128 vertices in its giant component.
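The average-distance formula above can be sketched in Python with one breadth-first search per vertex. This is a minimal illustration, not part of the lecture: the adjacency-dict representation and the names `bfs_distances` and `average_distance` are assumptions, the graph is assumed connected and unweighted, and the normalization follows the slide's $N_v (N_v + 1)/2$ (dividing by the number of distinct pairs, $N_v (N_v - 1)/2$, is a common alternative).

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_distance(adj):
    """Sum of dist(u, v) over unordered pairs, divided by N_v (N_v + 1) / 2."""
    vertices = list(adj)
    n = len(vertices)
    total = 0
    for i, u in enumerate(vertices):
        dist = bfs_distances(adj, u)
        for v in vertices[i + 1:]:
            total += dist[v]  # raises KeyError if the graph is not connected
    return total / (n * (n + 1) / 2)

# Hypothetical 3-vertex path a - b - c: pair distances are 1, 1 and 2.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(average_distance(path))
```

For a graph with several components, the same function could be applied to the giant component only, as in the protein-network example.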

18 Some examples of Degree Distribution (a) scientist collaboration: biologists (circle), physicists (square), (b) collaboration of movie actors, (d) network of directors of Fortune 1000 companies

19 Degree Distribution Kolaczyk, Statistical Analysis of Network Data: Methods and Models, Springer

20 ScaleGraph Analytics Algorithms

21 Centrality
"There is certainly no unanimity on exactly what centrality is or its conceptual foundations, and there is little agreement on the procedure of its measurement." (Freeman)
Degree (centrality)
Closeness (centrality)
Betweenness (centrality)
Eigenvector (centrality)

22 Conceptual Descriptions of Three Centrality Measurements Kolaczyk, Statistical Analysis of Network Data: Methods and Models, Springer

23 Distance Distance of two vertices: the length of the shortest path between the vertices. Geodesic: another name for shortest path. Diameter: the value of the longest distance in a graph.

24 Closeness Closeness measures how close a vertex is to the other vertices:
$$c_{Cl}(v) = \frac{1}{\sum_{u \in V} \mathrm{dist}(v, u)}$$
where dist(v, u) is the geodesic distance between vertices v and u.
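As a rough illustration of the closeness formula, the sketch below runs one breadth-first search from v and inverts the sum of distances. The adjacency-dict input and the function name `closeness` are assumptions made for this example; the graph is assumed to be connected, unweighted, and to have at least two vertices.

```python
from collections import deque

def closeness(adj, v):
    """c_Cl(v) = 1 / sum over u of dist(v, u), using hop counts as distances."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return 1.0 / sum(dist.values())

# Hypothetical star graph: the hub is one hop from every leaf.
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(closeness(star, "hub"), closeness(star, "x"))
```

The hub has total distance 3 and each leaf has total distance 1 + 2 + 2 = 5, so the hub scores higher, matching the intuition that it is the most central vertex.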

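25 Betweenness Betweenness measures are aimed at summarizing the extent to which a vertex is located between other pairs of vertices. Freeman's definition:
$$c_B(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma(s, t \mid v)}{\sigma(s, t)}$$
where $\sigma(s, t)$ is the number of shortest paths between s and t, and $\sigma(s, t \mid v)$ is the number of those paths that pass through v. Calculation of all betweenness centralities requires (1) calculating the lengths of shortest paths among all pairs of vertices, and (2) computing the summation in the above definition for each vertex.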
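The two steps named above can be sketched directly, assuming an undirected, unweighted graph stored as an adjacency dict. The helper names `bfs_counts` and `betweenness` are hypothetical, each unordered pair {s, t} is counted once (a convention chosen here, not stated on the slide), and this brute-force version favors clarity over the faster algorithms used in practice. It relies on the fact that v lies on a shortest s-t path exactly when dist(s, v) + dist(v, t) = dist(s, t), in which case $\sigma(s, t \mid v) = \sigma(s, v)\,\sigma(v, t)$.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): shortest-path lengths and shortest-path counts."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:           # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Freeman betweenness, summing over unordered pairs {s, t}."""
    info = {s: bfs_counts(adj, s) for s in adj}   # step (1): all-pairs BFS
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):             # step (2): the summation
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are in different components
        for v in adj:
            if v == s or v == t or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

# Hypothetical 4-cycle: opposite corners are joined by two shortest paths.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
print(betweenness(cycle))
```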

26 Betweenness ==> Bridges Example: Healthcare experts in the world. Connections between different divisions. Example: Healthcare experts in the U.S. Key social bridges.

27 Network Value Analysis: First Large-Scale Economical Social Network Study
Productivity effect from network variables:
An additional person in network size ~ $986 revenue per year
Each person that can be reached in 3 steps ~ $0.163 in revenue per month
A link to manager ~ $1074 in revenue per month
1 standard deviation of network diversity (1 - constraint) ~ $758
1 standard deviation of betweenness ~ -$300K
1 strong link ~ -$7.9 per month
Structurally diverse networks with an abundance of structural holes are associated with higher performance. Having diverse friends helps.
Betweenness is negatively correlated with performance for people but highly positively correlated for projects. Being a bridge between a lot of people is a bottleneck. Being a bridge between a lot of projects is good.
Network reach is highly correlated with performance: the number of people reachable in 3 steps is positively correlated with higher performance.
Having too many strong links (the same set of people one communicates with frequently) is negatively correlated with performance. Perhaps frequent communication with the same person implies redundant information exchange.

28 Eigenvector Centrality Tries to capture status, prestige, or rank: the more central the neighbors of a vertex are, the more central the vertex itself is.
$$c_{Ei}(v) = \alpha \sum_{\{u, v\} \in E} c_{Ei}(u)$$
The vector $c_{Ei} = (c_{Ei}(1), \ldots, c_{Ei}(N_v))^T$ is the solution of the eigenvalue problem
$$A \, c_{Ei} = \alpha^{-1} c_{Ei}$$
where $A$ is the adjacency matrix.
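One common way to approximate the solution of this eigenvalue problem is power iteration; the sketch below is an illustration under stated assumptions rather than the method used in ScaleGraph. It assumes a connected, undirected graph given as an adjacency dict, and it iterates with $A + I$ instead of $A$ (an assumption of this example: the shift leaves the eigenvectors unchanged but avoids oscillation on bipartite graphs). The function name `eigenvector_centrality` and the iteration count are hypothetical.

```python
import math

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I; returns a unit-length centrality vector."""
    c = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # (A + I) c: each vertex keeps its own score and adds its neighbors'.
        nxt = {v: c[v] + sum(c[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in nxt.values()))
        c = {v: x / norm for v, x in nxt.items()}
    return c

# Hypothetical path a - b - c: the middle vertex should be the most central.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
centrality = eigenvector_centrality(path)
print(centrality)
```

On this path the dominant eigenvector of $A$ is proportional to $(1, \sqrt{2}, 1)$, so the middle vertex ends up $\sqrt{2}$ times as central as either endpoint.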

29 PageRank Algorithm (Simplified)

30 PageRank Steps Example: Simplified
Initial state: $R(A) = R(B) = R(C) = R(D) = 0.25$
Iterative procedure: $R(A) = R(B)/2 + R(C)/1 + R(D)/3$
General form:
$$R(u) = d \sum_{v \in B_u} \frac{R(v)}{N_v} + e$$
where
$F_u$: the set of pages u points to
$B_u$: the set of pages that point to u
$N_u = |F_u|$: number of links from u
$d$: normalization / damping factor
$e = \frac{1 - d}{N}$
In general, d=
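The update rule can be tried out on a small example. In the sketch below, the link structure is hypothetical: it is chosen only so that B has 2 out-links, C has 1, and D has 3, all pointing to A, which reproduces the slide's update for R(A). The function name `pagerank_step` is also an assumption, as is the damping value 0.85 used in the second run (a commonly quoted choice, not taken from the slide). Pages without out-links are not treated specially here.

```python
def pagerank_step(links, rank, d):
    """One update R(u) = d * sum_{v in B_u} R(v) / N_v + e, with e = (1 - d) / N."""
    n = len(links)
    e = (1.0 - d) / n
    new_rank = {}
    for u in links:
        # B_u: pages v that link to u; N_v = len(links[v]) is v's out-link count.
        incoming = sum(rank[v] / len(links[v]) for v in links if u in links[v])
        new_rank[u] = d * incoming + e
    return new_rank

# Hypothetical link structure: page -> list of pages it points to.
links = {"A": [], "B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]}
rank = {page: 0.25 for page in links}

# Simplified step (d = 1, so e = 0): R(A) = R(B)/2 + R(C)/1 + R(D)/3.
simplified = pagerank_step(links, rank, d=1.0)
print(simplified["A"])

# Damped iteration with an assumed d = 0.85.
damped = rank
for _ in range(50):
    damped = pagerank_step(links, damped, d=0.85)
print(damped)
```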

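31 Solution of PageRank The PageRank values are the entries of the dominant eigenvector of the modified adjacency matrix:
$$\mathbf{R} = \begin{bmatrix} R(p_1) \\ R(p_2) \\ \vdots \\ R(p_N) \end{bmatrix}$$
where $\mathbf{R}$ is the solution of the equation
$$\mathbf{R} = \begin{bmatrix} (1-d)/N \\ \vdots \\ (1-d)/N \end{bmatrix} + d \begin{bmatrix} l(p_1, p_1) & \cdots & l(p_1, p_N) \\ \vdots & \ddots & \vdots \\ l(p_N, p_1) & \cdots & l(p_N, p_N) \end{bmatrix} \mathbf{R}$$
and the adjacency function $l(p_i, p_j)$ is 0 if page $p_j$ does not link to $p_i$, and is normalized such that for each $j$, $\sum_{i=1}^{N} l(p_i, p_j) = 1$.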

32 Walk A walk on a graph G, from $v_0$ to $v_l$, is an alternating sequence $\{v_0, e_1, v_1, e_2, \ldots, v_{l-1}, e_l, v_l\}$. The length of this walk is $l$. A walk may be a trail (no repeated edges) or a path (a trail without repeated vertices).

33 Connectivity of Graph A measure related to the flow of information in the graph. Connected: every vertex is reachable from every other. A connected component of a graph is a maximally connected subgraph. A graph usually has one component that dominates the others in magnitude: the giant component.

34 Reachable, Connected, Component
Reachable: A vertex v in a graph G is said to be reachable from another vertex u if there exists a walk from u to v.
Connected: A graph is said to be connected if every vertex is reachable from every other.
Component: A component of a graph is a maximally connected subgraph.
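These three definitions translate almost directly into a graph search. The sketch below is an illustration only: it assumes an undirected graph stored as an adjacency dict, and the function name `connected_components` is hypothetical. Each search collects everything reachable from a start vertex, which is exactly one component; the largest one plays the role of the giant component.

```python
def connected_components(adj):
    """Return the components of an undirected graph as a list of vertex sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:                      # collect everything reachable from start
            u = stack.pop()
            for w in adj[u]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        components.append(component)
    return components

# Hypothetical graph with three components of sizes 3, 2 and 1.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
components = connected_components(graph)
giant = max(components, key=len)
is_connected = len(components) == 1
print(components, giant, is_connected)
```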

35 Local Density A coherent subset of nodes should be locally dense. Cliques: e.g., 3-cliques. A sufficient condition for a clique of size n to exist in G is:
$$N_e > \frac{N_v^2}{2} \cdot \frac{n - 2}{n - 1}$$

36 Weakened Versions of Cliques -- Plexes A subgraph H consisting of m vertices is called an n-plex, for m > n, if no vertex has degree less than m - n. (Figure: two 1-plex examples.) No vertex is missing more than 1 of its possible m-1 edges.

37 Another Weakened Version of Cliques -- Cores A k-core of a graph G is a subgraph H in which all vertices have degree at least k (figure example: a 3-core). Batagelj et al.: a maximal k-core subgraph may be computed in as little as $O(N_v + N_e)$ time. The algorithm computes the shell indices for every vertex in the graph. Shell index of v = the largest value, say c, such that v belongs to the c-core of G but not its (c+1)-core. For a given vertex, those neighbors with lesser degree lead to a decrease in the potential shell index of that vertex.
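The shell-index idea can be illustrated with a simple peeling procedure: repeatedly remove a vertex of minimum remaining degree and lower the degrees of its neighbors. The sketch below is not the linear-time algorithm cited above; it picks the minimum with a plain scan, so it runs in roughly $O(N_v^2)$ time, and the function name `shell_indices` and the adjacency-dict input are assumptions made for this example.

```python
def shell_indices(adj):
    """Shell index (core number) of every vertex, by min-degree peeling."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)   # vertex of lowest remaining degree
        k = max(k, degree[v])                # the shell level never decreases
        shell[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1               # removing v lowers its neighbors
    return shell

# Hypothetical graph: triangle a-b-c with a pendant vertex d attached to a.
graph = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
print(shell_indices(graph))
```

The vertices with shell index at least k form the maximal k-core; here the triangle is the 2-core and the pendant vertex only belongs to the 1-core.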

38 Density measurement The density of a subgraph $H = (V_H, E_H)$ is:
$$\mathrm{den}(H) = \frac{|E_H|}{|V_H| (|V_H| - 1)/2}$$
Range of density: $0 \le \mathrm{den}(H) \le 1$, and
$$\mathrm{den}(H) = \frac{\bar{d}(H)}{|V_H| - 1}$$
where $\bar{d}(H)$ is the average degree of H.

39 Use of the density measure Density of a graph: let H = G. Clustering of edges local to v: let H = H_v, the subgraph formed by the neighbors of a vertex v and the edges between them. Clustering coefficient of a graph: the average of den(H_v) over all vertices.

40 An insight of clustering coefficient A triangle is a complete subgraph of order three. A connected triple is a subgraph of three vertices connected by two edges (regardless of how the other two nodes connect). The local clustering coefficient can be expressed as:
$$\mathrm{den}(H_v) = \mathrm{cl}(v) = \frac{\tau_\Delta(v)}{\tau_3(v)}$$
where $\tau_\Delta(v)$ is the number of triangles containing v and $\tau_3(v)$ is the number of connected triples for which the 2 edges are both incident to v. The clustering coefficient of G is then:
$$\mathrm{cl}(G) = \frac{1}{|V'|} \sum_{v \in V'} \mathrm{cl}(v)$$
where $V' \subseteq V$ is the set of vertices v with $d_v \ge 2$.
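A small sketch of the two formulas, assuming an undirected graph whose adjacency dict maps each vertex to a set of neighbors. The function names `local_clustering` and `clustering_coefficient` are hypothetical. For a vertex with $d_v$ neighbors, $\tau_3(v) = d_v (d_v - 1)/2$, and $\tau_\Delta(v)$ is the number of neighbor pairs that are themselves joined by an edge.

```python
from itertools import combinations

def local_clustering(adj, v):
    """cl(v) = triangles through v / connected triples centered at v (needs d_v >= 2)."""
    neighbors = adj[v]
    triples = len(neighbors) * (len(neighbors) - 1) // 2
    triangles = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return triangles / triples

def clustering_coefficient(adj):
    """cl(G): average of cl(v) over the vertices with degree at least 2."""
    eligible = [v for v in adj if len(adj[v]) >= 2]
    return sum(local_clustering(adj, v) for v in eligible) / len(eligible)

# Hypothetical graph: triangle a-b-c with a pendant vertex d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(local_clustering(graph, "c"), clustering_coefficient(graph))
```

Here cl(a) = cl(b) = 1 and cl(c) = 1/3, so cl(G) = 7/9; the pendant vertex d is excluded because its degree is below 2.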

41 An example

42 Transitivity of a graph A variation of the clustering coefficient takes a weighted average:
$$\mathrm{cl}_T(G) = \frac{\sum_{v \in V'} \tau_3(v)\, \mathrm{cl}(v)}{\sum_{v \in V'} \tau_3(v)} = \frac{3 \tau_\Delta(G)}{\tau_3(G)}$$
where $\tau_\Delta(G) = \frac{1}{3} \sum_{v \in V} \tau_\Delta(v)$ is the number of triangles in the graph and $\tau_3(G) = \sum_{v \in V} \tau_3(v)$ is the number of connected triples. Transitivity captures the idea that "the friend of your friend is also a friend of yours". Clustering coefficients have become a standard quantity for network structure analysis, but it is important to report which clustering coefficient is used.
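To see why it matters which coefficient is reported, the sketch below computes the transitivity $3 \tau_\Delta(G) / \tau_3(G)$ for a small graph. It assumes an undirected graph whose adjacency dict maps vertices to neighbor sets; the function name `transitivity` is hypothetical. Summing the per-vertex triangle counts visits every triangle once per corner, which supplies the factor of 3 in the numerator.

```python
from itertools import combinations

def transitivity(adj):
    """cl_T(G) = 3 * (number of triangles) / (number of connected triples)."""
    triangle_corners = 0   # each triangle is counted once per corner, i.e. 3 times
    triples = 0
    for v, neighbors in adj.items():
        k = len(neighbors)
        triples += k * (k - 1) // 2
        triangle_corners += sum(
            1 for a, b in combinations(neighbors, 2) if b in adj[a]
        )
    return triangle_corners / triples

# Hypothetical graph: triangle a-b-c with a pendant vertex d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(transitivity(graph))
```

This graph has 1 triangle and 5 connected triples, giving a transitivity of 3/5, whereas averaging the local coefficients over its degree-2-or-higher vertices gives 7/9.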

43 Vertex / Edge Connectivity If an arbitrary subset of k vertices or edges is removed from a graph, is the remaining subgraph connected? A graph G is called k-vertex-connected if (1) $N_v > k$, and (2) the removal of any subset of vertices $X \subset V$ of cardinality $|X| < k$ leaves a subgraph $G - X$ that is connected. The vertex connectivity of G is the largest integer k such that G is k-vertex-connected. A similar measurement applies to edge connectivity.

44 Vertex / Edge Cut If the removal of a particular set of vertices in G disconnects the graph, that set is called a vertex cut. For a given pair of vertices (u, v), a u-v-cut is a partition of V into two disjoint non-empty subsets, $S$ and $\bar{S}$, where u is in $S$ and v is in $\bar{S}$. Minimum u-v-cut: the sum of the weights on edges connecting vertices in $S$ to vertices in $\bar{S}$ is a minimum.

45 Minimum cut and flow Finding a minimum u-v-cut is equivalent to maximizing a measure of flow on the edges of a derived directed graph (Ford and Fulkerson, Max-Flow Min-Cut theorem).
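As an illustration of this equivalence, the sketch below computes a maximum flow with breadth-first augmenting paths (the Edmonds-Karp variant of the Ford-Fulkerson method, chosen here for brevity). By the Max-Flow Min-Cut theorem the returned value equals the weight of a minimum u-v-cut. The nested-dict capacity format, the function name `max_flow`, and the example network are assumptions for this example; every vertex is expected to appear as a key.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Maximum flow value from source to sink (equals the minimum cut weight)."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow   # no augmenting path left: the flow is maximal
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Hypothetical directed network with edge weights as capacities.
capacity = {"u": {"a": 3, "b": 2}, "a": {"v": 2, "b": 1}, "b": {"v": 3}, "v": {}}
print(max_flow(capacity, "u", "v"))
```

In this network the cut separating u from all other vertices has weight 3 + 2 = 5, which matches the maximum flow.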

46 Graph Partitioning Many uses of graph partitioning, e.g., community structure in social networks. A cohesive subset of vertices generally is taken to refer to a subset of vertices that (1) are well connected among themselves, and (2) are relatively well separated from the remaining vertices. Graph partitioning algorithms typically seek a partition of the vertex set of a graph in such a manner that the sets $E(C_k, C_{k'})$ of edges connecting vertices in $C_k$ to vertices in $C_{k'}$ are relatively small in size compared to the sets $E(C_k) = E(C_k, C_k)$ of edges connecting vertices within $C_k$.

47 Classify the nodes

48 Example: Karate Club Network

49 Hierarchical Clustering Two types: agglomerative and divisive. In agglomerative algorithms, given two sets of vertices C1 and C2, two standard approaches to assigning a dissimilarity value to this pair of sets are to use the minimum (called single linkage) or the maximum (called complete linkage) of the dissimilarity $x_{ij}$ over all pairs.
$$x_{ij} = \frac{|\mathcal{N}_{v_i} \,\Delta\, \mathcal{N}_{v_j}|}{d_{(N_v)} + d_{(N_v - 1)}}$$
$x_{ij}$ is the normalized number of neighbors of $v_i$ and $v_j$ that are not shared.

50 Hierarchical Clustering Algorithms Types Primarily differ in [Jain et al. 1999]: (1) how they evaluate the quality of proposed clusters, and (2) the algorithms by which they seek to optimize that quality. Agglomerative: successive coarsening of partitions through the process of merging. Divisive: successive refinement of partitions through the process of splitting. At each stage, the current candidate partition is modified in a way that minimizes a specific measure of cost. In agglomerative methods, the least costly merge of two previously existing partition elements is executed. In divisive methods, it is the least costly split of a single existing partition element into two that is executed.

51 Hierarchical Clustering The resulting hierarchy typically is represented in the form of a tree, called a dendrogram. The measure of cost incorporated into a hierarchical clustering method used in graph partitioning should reflect our sense of what defines a cohesive subset of vertices. In agglomerative algorithms, given two sets of vertices C1 and C2, two standard approaches to assigning a dissimilarity value to this pair of sets are to use the minimum (called single linkage) or the maximum (called complete linkage) of the dissimilarity $x_{ij}$ over all pairs. Dissimilarities for subsets of vertices were calculated from the $x_{ij}$ using the extension of Ward's (1963) method, and the lengths of the branches in the dendrogram are in relative proportion to the changes in dissimilarity.
$$x_{ij} = \frac{|\mathcal{N}_{v_i} \,\Delta\, \mathcal{N}_{v_j}|}{d_{(N_v)} + d_{(N_v - 1)}}$$
$\mathcal{N}_v$ is the set of neighbors of a vertex $v$, and $d_{(N_v)}$ and $d_{(N_v - 1)}$ are the largest and second-largest degrees in the graph. $\Delta$ is the symmetric difference of two sets, which is the set of elements that are in one or the other but not both. $x_{ij}$ is the normalized number of neighbors of $v_i$ and $v_j$ that are not shared.
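The dissimilarity $x_{ij}$ is easy to compute with set operations. The sketch below is a minimal illustration, assuming an undirected graph with at least two vertices whose adjacency dict maps each vertex to a set of neighbors; the function name `neighbor_dissimilarity` is hypothetical, and the normalizer is taken to be the sum of the two largest degrees in the graph.

```python
def neighbor_dissimilarity(adj, vi, vj):
    """x_ij: size of the symmetric difference of the two neighbor sets,
    divided by the sum of the two largest degrees in the graph."""
    degrees = sorted(len(neighbors) for neighbors in adj.values())
    normalizer = degrees[-1] + degrees[-2]
    return len(adj[vi] ^ adj[vj]) / normalizer   # ^ is set symmetric difference

# Hypothetical graph: triangle a-b-c with a pendant vertex d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(neighbor_dissimilarity(graph, "a", "b"))
print(neighbor_dissimilarity(graph, "c", "d"))
```

The two largest degrees here are 3 and 2, so the normalizer is 5. Vertices a and b differ only in each other (x = 2/5), while c and d share no neighbors at all (x = 4/5), so an agglomerative method would merge a and b first.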

52 Other dissimilarity measures There are various other common choices of dissimilarity measures, such as:
$$x_{ij} = \sqrt{\sum_{k \neq i, j} (A_{ik} - A_{jk})^2}$$
Hierarchical clustering algorithms based on dissimilarities of this sort are reasonably efficient, running in $O(N_v^2 \log N_v)$ time.

53 Hierarchical Clustering Example

54 Several Open Source Graph Tools Titan is a native Blueprints-enabled graph database.

55 Graph Language

56 Performance Comparison of Titan and others
Dataset: 12.2 million edges, 2.2 million vertices.
Goal: Find paths in a property graph. One of the vertex properties is called TYPE. In this scenario, the user provides either a particular vertex or a set of particular vertices of the same TYPE (say, "DRUG"). In addition, the user also provides another TYPE (say, "TARGET"). Then we need to find all the paths from the starting vertex to a vertex of TYPE TARGET. Therefore, we need to 1) find the paths using graph traversal; 2) keep track of the paths, so that we can list them after the traversal. Even among shortest paths, there can be multiple paths between two nodes, such as: drug->assay->target, drug->moa->target.
Implementation | First test (cold start) | Avg time (100 tests), requested depth 5 traversal | Avg time (100 tests), requested full depth traversal
IBM System G (NativeStore C++) | 39 sec | 3.0 sec | 4.2 sec
IBM System G (NativeStore JNI) | 57 sec | 4.0 sec | 6.2 sec
Java Neo4j (Blueprints 2.4) | 105 sec | 5.9 sec | 8.3 sec
Titan (Berkeley DB) | 3861 sec | 641 sec | 794 sec
Titan (HBase) | 3046 sec | 1597 sec | 2682 sec
First full test: full depth 23. All data pulled from disk. Nothing initially cached.
Modes: All tests in default modes of each graph implementation. Titan can only be run in transactional mode. Other implementations do not default to transactional mode.
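The benchmark task itself (traverse from a start vertex, record every path that ends at a vertex of the requested TYPE) can be sketched in a few lines of plain Python. This is only an illustration of the query, not the System G, Neo4j, or Titan implementation. The function name `paths_to_type`, the dict-based graph, the vertex names, and the choice to stop each path at the first vertex of the target TYPE are all assumptions made for this example.

```python
def paths_to_type(adj, vertex_type, start, target_type, max_depth):
    """List every simple path from start to a vertex of target_type,
    using at most max_depth edges."""
    found = []

    def extend(path):
        node = path[-1]
        if node != start and vertex_type[node] == target_type:
            found.append(list(path))      # keep track of the path for listing
            return                        # assumption: stop at the first target
        if len(path) - 1 >= max_depth:
            return                        # requested depth reached
        for nxt in adj.get(node, ()):
            if nxt not in path:           # avoid revisiting a vertex
                path.append(nxt)
                extend(path)
                path.pop()

    extend([start])
    return found

# Hypothetical property graph mirroring the drug -> assay/moa -> target example.
adj = {
    "drug1": ["assay1", "moa1"],
    "assay1": ["target1"],
    "moa1": ["target1"],
    "target1": [],
}
vertex_type = {"drug1": "DRUG", "assay1": "ASSAY", "moa1": "MOA", "target1": "TARGET"}
for path in paths_to_type(adj, vertex_type, "drug1", "TARGET", max_depth=5):
    print("->".join(path))
```

Both shortest paths between the drug and the target are reported, which is the situation the slide describes; lowering `max_depth` to 1 returns no paths because the target is two hops away.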

57 ScaleGraph DB: System G DB's open source version
Prereqs:
Linux Intel 64
OpenJDK 6 or higher
Maven - maven-in-five-minutes.html

58 ScaleGraph DB (a.k.a. PropelGraph) Installation
1a) git clone https://github.com/scalegraph/propelgraph.git
or
1b) wget https://github.com/scalegraph/propelgraph/archive/master.zip ; unzip master.zip
2) cd propelgraph/propelgraph-gremlin
3) ./makepackage.sh

59 ScaleGraph DB Trying It Out
3) cd propelgraph-gremlin
4) bin/gremlin.sh
5) optional: read a gremlin tutorial
6) g = CreateGraph.openGraph("nativemem_authors","awesome")
7) new LoadCSV().populateFromVertexFile(g, "data/movies.movies.v.csv", "movies", )
8) new LoadCSV().populateFromVertexFile(g, "data/movies.appearances.e.csv", "appearances", )
9) g.v(20).both.bothv
10) Analytics.collaborativeFilter(g, 20, "appearance", Direction.OUT, "appearance", Direction.IN)

60 ScaleGraph DB Help https://github.com/scalegraph/scalegraph/propelgraph

61 Questions?

! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I)

! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I) ! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and

More information

Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing. October 29th, 2015

Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing. October 29th, 2015 E6893 Big Data Analytics Lecture 8: Spark Streams and Graph Computing (I) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science IBM Chief Scientist, Graph Computing

More information

Social Media Mining. Graph Essentials

Social Media Mining. Graph Essentials Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures

More information

Complex Networks Analysis: Clustering Methods

Complex Networks Analysis: Clustering Methods Complex Networks Analysis: Clustering Methods Nikolai Nefedov Spring 2013 ISI ETH Zurich nefedov@isi.ee.ethz.ch 1 Outline Purpose to give an overview of modern graph-clustering methods and their applications

More information

Social Media Mining. Network Measures

Social Media Mining. Network Measures Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users

More information

Course on Social Network Analysis Graphs and Networks

Course on Social Network Analysis Graphs and Networks Course on Social Network Analysis Graphs and Networks Vladimir Batagelj University of Ljubljana Slovenia V. Batagelj: Social Network Analysis / Graphs and Networks 1 Outline 1 Graph...............................

More information

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014 Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

More information

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

More information

Part 2: Community Detection

Part 2: Community Detection Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

More information

! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II

! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II ! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and

More information

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network , pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

More information

V. Adamchik 1. Graph Theory. Victor Adamchik. Fall of 2005

V. Adamchik 1. Graph Theory. Victor Adamchik. Fall of 2005 V. Adamchik 1 Graph Theory Victor Adamchik Fall of 2005 Plan 1. Basic Vocabulary 2. Regular graph 3. Connectivity 4. Representing Graphs Introduction A.Aho and J.Ulman acknowledge that Fundamentally, computer

More information

General Network Analysis: Graph-theoretic. COMP572 Fall 2009

General Network Analysis: Graph-theoretic. COMP572 Fall 2009 General Network Analysis: Graph-theoretic Techniques COMP572 Fall 2009 Networks (aka Graphs) A network is a set of vertices, or nodes, and edges that connect pairs of vertices Example: a network with 5

More information

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II. Matrix Algorithms DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

More information

Graphs and Network Flows IE411 Lecture 1

Graphs and Network Flows IE411 Lecture 1 Graphs and Network Flows IE411 Lecture 1 Dr. Ted Ralphs IE411 Lecture 1 1 References for Today s Lecture Required reading Sections 17.1, 19.1 References AMO Chapter 1 and Section 2.1 and 2.2 IE411 Lecture

More information

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R. 5. Link Analysis Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

More information

GRAPH THEORY and APPLICATIONS. Trees

GRAPH THEORY and APPLICATIONS. Trees GRAPH THEORY and APPLICATIONS Trees Properties Tree: a connected graph with no cycle (acyclic) Forest: a graph with no cycle Paths are trees. Star: A tree consisting of one vertex adjacent to all the others.

More information

Strong and Weak Ties

Strong and Weak Ties Strong and Weak Ties Web Science (VU) (707.000) Elisabeth Lex KTI, TU Graz April 11, 2016 Elisabeth Lex (KTI, TU Graz) Networks April 11, 2016 1 / 66 Outline 1 Repetition 2 Strong and Weak Ties 3 General

More information

Why graph clustering is useful?

Why graph clustering is useful? Graph Clustering Why graph clustering is useful? Distance matrices are graphs as useful as any other clustering Identification of communities in social networks Webpage clustering for better data management

More information

Graph models for the Web and the Internet. Elias Koutsoupias University of Athens and UCLA. Crete, July 2003

Graph models for the Web and the Internet. Elias Koutsoupias University of Athens and UCLA. Crete, July 2003 Graph models for the Web and the Internet Elias Koutsoupias University of Athens and UCLA Crete, July 2003 Outline of the lecture Small world phenomenon The shape of the Web graph Searching and navigation

More information

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Kousha Etessami U. of Edinburgh, UK Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 6) 1 / 13 Overview Graphs and Graph

More information

A scalable multilevel algorithm for graph clustering and community structure detection

A scalable multilevel algorithm for graph clustering and community structure detection A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures

More information

SCAN: A Structural Clustering Algorithm for Networks

SCAN: A Structural Clustering Algorithm for Networks SCAN: A Structural Clustering Algorithm for Networks Xiaowei Xu, Nurcan Yuruk, Zhidan Feng (University of Arkansas at Little Rock) Thomas A. J. Schweiger (Acxiom Corporation) Networks scaling: #edges connected

More information

Lesson 3. Algebraic graph theory. Sergio Barbarossa. Rome - February 2010

Lesson 3. Algebraic graph theory. Sergio Barbarossa. Rome - February 2010 Lesson 3 Algebraic graph theory Sergio Barbarossa Basic notions Definition: A directed graph (or digraph) composed by a set of vertices and a set of edges We adopt the convention that the information flows

More information

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis Graph Theory and Complex Networks: An Introduction Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.0, steen@cs.vu.nl Chapter 06: Network analysis Version: April 8, 04 / 3 Contents Chapter

More information

Network/Graph Theory. What is a Network? What is network theory? Graph-based representations. Friendship Network. What makes a problem graph-like?

Network/Graph Theory. What is a Network? What is network theory? Graph-based representations. Friendship Network. What makes a problem graph-like? What is a Network? Network/Graph Theory Network = graph Informally a graph is a set of nodes joined by a set of lines or arrows. 1 1 2 3 2 3 4 5 6 4 5 6 Graph-based representations Representing a problem

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Graph definition Degree, in, out degree, oriented graph. Complete, regular, bipartite graph. Graph representation, connectivity, adjacency.

Graph definition Degree, in, out degree, oriented graph. Complete, regular, bipartite graph. Graph representation, connectivity, adjacency. Mária Markošová Graph definition Degree, in, out degree, oriented graph. Complete, regular, bipartite graph. Graph representation, connectivity, adjacency. Isomorphism of graphs. Paths, cycles, trials.

More information

Walk-Based Centrality and Communicability Measures for Network Analysis

Walk-Based Centrality and Communicability Measures for Network Analysis Walk-Based Centrality and Communicability Measures for Network Analysis Michele Benzi Department of Mathematics and Computer Science Emory University Atlanta, Georgia, USA Workshop on Innovative Clustering

More information

A discussion of Statistical Mechanics of Complex Networks P. Part I

A discussion of Statistical Mechanics of Complex Networks P. Part I A discussion of Statistical Mechanics of Complex Networks Part I Review of Modern Physics, Vol. 74, 2002 Small Word Networks Clustering Coefficient Scale-Free Networks Erdös-Rényi model cover only parts

More information

Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS

Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS Homework MA 725 Spring, 2012 C. Huneke SELECTED ANSWERS 1.1.25 Prove that the Petersen graph has no cycle of length 7. Solution: There are 10 vertices in the Petersen graph G. Assume there is a cycle C

More information

SGL: Stata graph library for network analysis

SGL: Stata graph library for network analysis SGL: Stata graph library for network analysis Hirotaka Miura Federal Reserve Bank of San Francisco Stata Conference Chicago 2011 The views presented here are my own and do not necessarily represent the

More information

Handout #Ch7 San Skulrattanakulchai Gustavus Adolphus College Dec 6, 2010. Chapter 7: Digraphs

Handout #Ch7 San Skulrattanakulchai Gustavus Adolphus College Dec 6, 2010. Chapter 7: Digraphs MCS-236: Graph Theory Handout #Ch7 San Skulrattanakulchai Gustavus Adolphus College Dec 6, 2010 Chapter 7: Digraphs Strong Digraphs Definitions. A digraph is an ordered pair (V, E), where V is the set

More information

2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines 2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

More information

1 Basic Definitions and Concepts in Graph Theory

1 Basic Definitions and Concepts in Graph Theory CME 305: Discrete Mathematics and Algorithms 1 Basic Definitions and Concepts in Graph Theory A graph G(V, E) is a set V of vertices and a set E of edges. In an undirected graph, an edge is an unordered

More information

Chapter 2. Basic Terminology and Preliminaries

Chapter 2. Basic Terminology and Preliminaries Chapter 2 Basic Terminology and Preliminaries 6 Chapter 2. Basic Terminology and Preliminaries 7 2.1 Introduction This chapter is intended to provide all the fundamental terminology and notations which

More information

Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations

Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU 1 Introduction What can we do with graphs? What patterns

More information

Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

More information

5.1 Bipartite Matching

5.1 Bipartite Matching CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson

More information

Network Analysis and Visualization of Staphylococcus aureus. by Russ Gibson

Network analysis: based on graph theory; probabilistic models (random graphs) developed by Erdős and Rényi in 1959; theory and tools

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

Rakesh Nagi, Department of Industrial Engineering, University at Buffalo (SUNY). *Lecture notes from Network Flows by Ahuja, Magnanti

Approximation Algorithms

or: How I Learned to Stop Worrying and Deal with NP-Completeness. Ong Jit Sheng, Jonathan (A0073924B), March 2012. Overview: Key Results (I). General techniques: Greedy algorithms

Gephi Network Statistics

Google Summer of Code 2009 Project Proposal. Patrick J. McSweeney, pjmcswee@syr.edu. 1 Introduction: My name is Patrick J. McSweeney and I am a fourth year PhD candidate in computer

6. If there is no improvement of the categories after several steps, then choose new seeds using another criterion (e.g. the objects near the edge of

Clustering is an unsupervised learning method: there is no target value (class label) to be predicted, the goal is finding common patterns or grouping similar examples. Differences between models/algorithms

Data Mining Cluster Analysis: Advanced Concepts and Algorithms. ref. Chapter 9. Introduction to Data Mining

ref. Chapter 9, Introduction to Data Mining by Tan, Steinbach, Kumar. Outline: Prototype-based; Fuzzy c-means; Mixture Model Clustering; Density-based

NETZCOPE - a tool to analyze and display complex R&D collaboration networks

The Task; Concepts from Spectral Graph Theory; EU R&D Network Analysis; Netzcope Screenshots. L. Streit & O. Strogan, BiBoS, Univ.

Introduction to Networks and Business Intelligence

Prof. Dr. Daning Hu, Department of Informatics, University of Zurich, Sep 17th, 2015. Outline: Network Science; A Random History; Network Analysis; Network Topological

Tools and Techniques for Social Network Analysis

Pajek: Program for Analysis and Visualization of Large Networks. Pajek: What is it? Pajek is a program for Windows and Linux (via Wine). Developers: Vladimir

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Chiara Renso, KDD-LAB, ISTI-CNR, Pisa, Italy. What is cluster analysis? Finding groups of objects such that the objects in a group will be similar

Mining Social-Network Graphs

Chapter 10. There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is

Some questions... Graphs

Uni Innsbruck Informatik. Peer-to-peer Systems: Analysis of unstructured P2P systems. How scalable is Gnutella? How robust is Gnutella? Why does FreeNet

Math 4707: Introduction to Combinatorics and Graph Theory

Lecture Addendum, November 3rd and 8th, 200. Counting Closed Walks and Spanning Trees in Graphs via Linear Algebra and Matrices. Adjacency Matrices

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar. What is Cluster Analysis? Finding groups of objects such that the objects in a group will

Zachary Monaco Georgia College Olympic Coloring: Go For The Gold

Coloring the vertices or edges of a graph leads to a variety of interesting applications in graph theory. These applications include various

Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

Outline: NP-completeness. Examples of Easy vs. Hard problems: Euler circuit vs. Hamiltonian circuit; Shortest Path vs. Longest Path; 2-pairs sum vs. general Subset Sum. Reducing one problem to another. Clique

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Lecture Notes for Chapter 8, Introduction to Data Mining by Tan, Steinbach, Kumar, 4/8/2004. Hierarchical

Graph Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

To accompany the text Introduction to Parallel Computing, Addison Wesley, 2003. Topic Overview: Definitions and Representation; Minimum

Data Structures in Java. Session 16 Instructor: Bert Huang

http://www.cs.columbia.edu/~bert/courses/3134 Announcements: Homework 4 due next class. Remaining grades: hw4, hw5, hw6 25%; Final exam 30%; Midterm

Graph Theory and Complex Networks: An Introduction. Chapter 06: Network analysis. Contents. Introduction. Maarten van Steen. Version: April 28, 2014

Maarten van Steen, VU Amsterdam, Dept. Computer Science, Room R.0, steen@cs.vu.nl. Chapter 06: Network analysis. Version: April 28, 2014. Contents (Chapter, Description): Introduction

Electrical Resistances in Products of Graphs

By Shelley Welke, under the direction of Dr. John S. Caughman. In partial fulfillment of the requirements for the degree of Masters of Science in Teaching Mathematics

SPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH NATIONAL UNIVERSITY OF SINGAPORE

2012. NGO THI TU ANH (M.Sc., SFU, Russia). A THESIS

GRAPH MINING APPLICATIONS TO SOCIAL NETWORK ANALYSIS

Chapter 16. Lei Tang and Huan Liu, Computer Science & Engineering, Arizona State University, L.Tang@asu.edu, Huan.Liu@asu.edu. Abstract: The prosperity of

Hadoop SNS. renren.com. Saturday, December 3, 11


Follow links for Class Use and other Permissions. For more information send email to: permissions@press.princeton.edu

COPYRIGHT NOTICE: Matthew O. Jackson: Social and Economic Networks is published by Princeton University Press and copyrighted, 2008, by Princeton University Press. All rights reserved. No part of this

Lecture Notes on Spanning Trees

15-122: Principles of Imperative Computation, Frank Pfenning. Lecture 26, April 26, 2011. 1 Introduction: In this lecture we introduce graphs. Graphs provide a uniform model

CELLULAR MANUFACTURING

Grouping machines logically so that material handling (move time, wait time for moves and using smaller batch sizes) and setup (part family tooling and sequencing) can be minimized.

An Introduction to APGL

Charanpal Dhanjal, February 2012. Abstract: Another Python Graph Library (APGL) is a graph library written using pure Python, NumPy and SciPy. Users new to the library can gain an

Parallel Algorithms for Small-world Network. David A. Bader and Kamesh Madduri

Parallel Algorithms for Small-world Network Analysis and Partitioning (SNAP). David A. Bader and Kamesh Madduri. Overview: Informatics networks, small-world topology; Community Identification/Graph

The origins of graph theory are humble, even frivolous. (N. Biggs, E. K. Lloyd, and R. J. Wilson)

Chapter 11: Graph Theory. Let us start with a formal definition of what is a graph. Definition 72. A graph

Lecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS-621 Theory Gems October 17, 2012

CS-621 Theory Gems, October 17, 2012. Lecture 9. Lecturer: Aleksander Mądry. Scribes: Dorina Thanou, Xiaowen Dong. 1 Introduction: Over the next couple of lectures, our focus will be on graphs. Graphs are one of

Predicting Influentials in Online Social Networks

Rumi Ghosh, Kristina Lerman, USC Information Sciences Institute. Who is important? Characteristics: Topology; Dynamic Processes / Nature of flow. What are the

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarification of zonation procedure described on pp. 238-239

By John C. Davis. Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

Section Summary. Introduction to Graphs Graph Taxonomy Graph Models

Chapter 10. Chapter Summary: Graphs and Graph Models; Graph Terminology and Special Types of Graphs; Representing Graphs and Graph Isomorphism; Connectivity; Euler and Hamiltonian Graphs; Shortest-Path Problems

Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks

Imre Varga. Abstract: In this paper I propose a novel method to model real online social networks where the growing

Mining Social Network Graphs

Debapriyo Majumdar, Data Mining, Fall 2014, Indian Statistical Institute Kolkata, November 13, 17, 2014. Social Network: No introduction required. Really? We still need to understand

Data Mining Clustering (2). Sheets are based on those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining

Toon Calders. Outline: Partitional Clustering; Distance-based: K-means, K-medoids,

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004. More Clustering Methods: Prototype-based clustering; Density-based clustering; Graph-based

Solutions to Exercises 8

Discrete Mathematics, Lent 2009, MA210. (1) Suppose that G is a graph in which every vertex has degree at least k, where k ≥ 1, and in which every cycle contains at least 4 vertices.

Graph Theory. Introduction. Distance in Graphs. Trees. Isabela Drămnesc UVT. Computer Science Department, West University of Timişoara, Romania

November 2016. Isabela Drămnesc, UVT. Graph Theory and Combinatorics

Applying Social Network Analysis to the Information in CVS Repositories

Luis Lopez-Fernandez, Gregorio Robles, Jesus M. Gonzalez-Barahona. GSyC, Universidad Rey Juan Carlos. {llopez,grex,jgb}@gsyc.escet.urjc.es

Network (Tree) Topology Inference Based on Prüfer Sequence

C. Vanniarajan and Kamala Krithivasan, Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036. vanniarajanc@hcl.in,

E6895 Advanced Big Data Analytics Lecture 3: Spark and Data Analytics

Ching-Yung Lin, Ph.D., Adjunct Professor, Dept. of Electrical Engineering and Computer Science; Mgr., Dept. of Network Science and Big

Random graphs and complex networks

Remco van der Hofstad. Honours Class, spring 2008. Complex networks. Figure 2: Yeast protein interaction network. A map of protein-protein interac

Fig. 1 A typical Knowledge Discovery process [2]

Volume 4, Issue 7, July 2014. ISSN: 2277 128X. International Journal of Advanced Research in Computer Science and Software Engineering. Research Paper. Available online at: www.ijarcsse.com. A Review on Clustering

SoSe 2014: M-TANI: Big Data Analytics

Lecture 4, 21/05/2014. Sead Izberovic, Dr. Nikolaos Korfiatis. Agenda: Recap from the previous session; Clustering; Introduction; Distance measures; Hierarchical Clustering

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling

Chapter 11: Approximation Algorithms. Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

(a) (b) (c) Figure 1 : Graphs, multigraphs and digraphs. If the vertices of the leftmost figure are labelled {1, 2, 3, 4} in clockwise order from

4 Graph Theory. Throughout these notes, a graph G is a pair (V, E) where V is a set and E is a set of unordered pairs of elements of V. The elements of V are called vertices and the elements of E are called

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Lecture Notes for Chapter 8, Introduction to Data Mining by Tan, Steinbach, Kumar. What is Cluster

1. Relevant standard graph theory

Color identical pairs in 4-chromatic graphs. Asbjørn Brændeland. I argue that, given a 4-chromatic graph G and a pair of vertices {u, v} in G, if the color of u equals the color of v in every 4-coloring

Structural and functional analytics for community detection in large-scale complex networks

Chopade and Zhan, Journal of Big Data, DOI 10.1186/s40537-015-0019-y. Research, Open Access. Pravin Chopade and

Solutions to Homework 6

Debasish Das, EECS Department, Northwestern University, ddas@northwestern.edu. 1 Problem 5.24: We want to find light spanning trees with certain special properties. Given is one example

Applied Algorithm Design Lecture 5

Pietro Michiardi, Eurecom. Approximation Algorithms

Small Maximal Independent Sets and Faster Exact Graph Coloring

David Eppstein, Univ. of California, Irvine, Dept. of Information and Computer Science. The Exact Graph Coloring Problem: Given an undirected

Finding and counting given length cycles

Noga Alon, Raphael Yuster, Uri Zwick. Abstract: We present an assortment of methods for finding and counting simple cycles of a given length in directed and undirected

6.042/18.062J Mathematics for Computer Science October 3, 2006 Tom Leighton and Ronitt Rubinfeld. Graph Theory III

Lecture Notes. Draft: please check back in a couple of days for a modified version of these

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction. Using the Internet (sending email, transferring files, use of search engines, online

Graph Mining and Social Network Analysis

Data Mining and Text Mining (UIC 583 @ Politecnico di Milano). References: Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

Nimble Algorithms for Cloud Computing. Ravi Kannan, Santosh Vempala and David Woodruff

Cloud computing: Data is distributed arbitrarily on many servers. Parallel algorithms: time. Streaming algorithms: sublinear

Analysis of Algorithms, I

CSOR W4231.002. Eleni Drinea, Computer Science Department, Columbia University. Thursday, February 26, 2015. Outline: 1 Recap; 2 Representing graphs; 3 Breadth-first search (BFS); 4 Applications

136 CHAPTER 4. INDUCTION, GRAPHS AND TREES

4.3 Graphs. In this chapter we introduce a fundamental structural idea of discrete mathematics, that of a graph. Many situations in the applications of discrete mathematics