Graph theory and network analysis
Devika Subramanian
Comp 140, Fall 2008
The bridges of Königsberg
Source: Wikipedia
The city of Königsberg in Prussia was set on both sides of the Pregel River and included two large islands, which were connected to each other and to the mainland by seven bridges. Leonhard Euler posed the following problem: can we find a walk through the city that crosses each bridge once and only once, and begins and ends at the same point?
Rules: The islands cannot be reached by any route other than the bridges, and every bridge must be crossed completely every time (one cannot walk halfway onto a bridge, turn around, and later cross the other half from the other side).
A schematic of the seven bridges problem
[Figure: schematic of land masses A, B, C and D joined by bridges b1 to b7]
First paper on graph theory
Leonhard Euler presented a solution to the St. Petersburg Academy on August 26, 1735.
Solutio problematis ad geometriam situs pertinentis (The solution of a problem relating to the geometry of position), Commentarii academiae scientiarum Petropolitanae, 1741.
Abstract representation
[Figure: land masses A, B, C and D drawn as points, joined by lines for bridges b1 to b7]
1. Only land masses and the bridges connecting them matter!
2. Shapes of land masses and lengths of bridges are not relevant. Relative distances between land masses are also not relevant.
3. Topological connectivity is the only relevant aspect for solving the problem.
4. The structure shown alongside makes only the relevant factors of the problem explicit.
Euler's insight
When one enters a land mass (that is not the start or the end of the tour) by a bridge, one leaves it by a different bridge. If each bridge is to be traversed exactly once, then each land mass that is not the start or the end needs to have an even number of bridges touching it.
Land mass A has five bridges touching it; land masses B, C and D each have three bridges touching them. Only the start and the end could be exceptions to the even-number rule, yet all four land masses have an odd number of bridges.
So a tour that starts and ends on any of these land masses and crosses each bridge exactly once is not possible.
Elements of graph theory
[Figure: the bridge graph with vertices A, B, C, D and edges b1 to b7]
Land masses are vertices. Bridges are edges. The problem is represented as an undirected multigraph.
The degree of a vertex is the number of edges on it. All vertices in this problem have odd degree.
Euler's insight: An Eulerian tour in a connected graph is possible only if all vertices in it have even degree.
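The degree argument can be checked mechanically. The sketch below is a minimal illustration, not Euler's own method: the bridge endpoints are an assumption read off the schematic (chosen to match the stated degrees, A with five bridges and B, C, D with three each), and the names BRIDGES, degrees and passes_even_degree_test are hypothetical.

```python
from collections import Counter

# Each bridge as (name, endpoint, endpoint).
# ASSUMPTION: endpoints are read off the schematic; they reproduce the
# degrees stated on the slides (A: 5, B: 3, C: 3, D: 3).
BRIDGES = [
    ("b1", "A", "C"),
    ("b2", "A", "C"),
    ("b3", "B", "C"),
    ("b4", "A", "D"),
    ("b5", "A", "D"),
    ("b6", "B", "D"),
    ("b7", "A", "B"),
]


def degrees(edges):
    """Count how many edge ends touch each vertex of a multigraph."""
    deg = Counter()
    for _name, u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def passes_even_degree_test(edges):
    """Necessary condition for an Eulerian tour: every degree is even.

    This only tests degrees; it does not check connectivity and it does
    not construct a tour.
    """
    return all(d % 2 == 0 for d in degrees(edges).values())


konigsberg_degrees = degrees(BRIDGES)
print(dict(konigsberg_degrees))
print(passes_even_degree_test(BRIDGES))
```

A list of edges is used instead of a set so that parallel bridges such as b1 and b2 are both counted, which is what makes this a multigraph.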
Some definitions
A graph G is a pair of sets V and E, written G = (V, E).
V is a non-empty set of vertices.
E is a set of pairs of vertices.
Example:
V = {A,B,C,D,E,F}
E = {{A,B},{A,D},{B,C},{B,E},{C,D},{C,E},{E,F}}
[Figure: drawing of the example graph on vertices A to F]
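The definition maps directly onto Python sets. This is one possible encoding, assuming each undirected edge is stored as a frozenset of two vertex names; is_graph and degree are hypothetical helper names, not part of the course material.

```python
# The example graph: V is a set of vertex names, E a set of unordered pairs.
V = {"A", "B", "C", "D", "E", "F"}
E = {
    frozenset(pair)
    for pair in [
        ("A", "B"), ("A", "D"), ("B", "C"), ("B", "E"),
        ("C", "D"), ("C", "E"), ("E", "F"),
    ]
}


def is_graph(vertices, edges):
    """V must be non-empty and every edge must be a pair of vertices of V."""
    return bool(vertices) and all(
        len(edge) == 2 and edge <= vertices for edge in edges
    )


def degree(vertex, edges):
    """Number of edges that touch the given vertex."""
    return sum(1 for edge in edges if vertex in edge)


print(is_graph(V, E))
print({v: degree(v, E) for v in sorted(V)})
```

Because E is a set, this encoding cannot hold two edges between the same pair of vertices; a multigraph needs a list or a counter instead.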
Subgraphs
Deleting some vertices or edges from a graph leaves a subgraph.
Formally, G' = (V', E') is a subgraph of G = (V, E) if
V' is a non-empty subset of V
E' is a subset of E
Since G' is itself a graph, every edge in E' joins two vertices of V'.
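A small sketch of the subgraph test, using the example graph from the definitions slide with edges stored as frozensets (an assumed encoding). The function names are hypothetical. The check that edges stay inside V' is left implicit by the two listed conditions and is included here because G' must itself be a graph.

```python
V = {"A", "B", "C", "D", "E", "F"}
E = {
    frozenset(pair)
    for pair in [
        ("A", "B"), ("A", "D"), ("B", "C"), ("B", "E"),
        ("C", "D"), ("C", "E"), ("E", "F"),
    ]
}


def is_subgraph(sub_v, sub_e, v, e):
    """Return True if (sub_v, sub_e) is a subgraph of (v, e)."""
    return (
        bool(sub_v)                               # V' is non-empty
        and sub_v <= v                            # V' is a subset of V
        and sub_e <= e                            # E' is a subset of E
        and all(edge <= sub_v for edge in sub_e)  # ASSUMPTION: edges stay inside V'
    )


def delete_vertices(v, e, removed):
    """Delete the given vertices together with every edge touching them."""
    kept_v = v - removed
    kept_e = {edge for edge in e if edge <= kept_v}
    return kept_v, kept_e


sub_v, sub_e = delete_vertices(V, E, {"E", "F"})
print(sorted(sub_v), len(sub_e))
print(is_subgraph(sub_v, sub_e, V, E))
```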
A computer scientist reads the paper
A 1994 University of Chicago study entitled The Social Organization of Sexuality found that on average men have 74% more opposite-gender partners than women.
Mapping to graph theory
[Figure: graph whose vertices are labeled Men and Women]
Analysis
Every edge in this graph connects an M vertex to a W vertex. So the sum of the degrees of the M vertices must equal the sum of the degrees of the W vertices.
$$\sum_{x \in M} \deg(x) = \sum_{y \in W} \deg(y)$$
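Analysis contd.
Dividing both sides of the degree-sum equality by $|M| \cdot |W|$:
$$\frac{\sum_{x \in M} \deg(x)}{|M|} \cdot \frac{1}{|W|} = \frac{\sum_{y \in W} \deg(y)}{|W|} \cdot \frac{1}{|M|}$$
$$\frac{\text{Avg. deg in } M}{|W|} = \frac{\text{Avg. deg in } W}{|M|}$$
$$\text{Avg. deg in } M = \frac{|W|}{|M|} \cdot \text{Avg. deg in } W$$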
Analysis contd.
The Census Bureau reports that |W| / |M| is about 1.035.
Therefore, on average men have 3.5% more opposite-gender partners than women.
The University of Chicago study has problematic data.
The ratio of the average numbers of opposite-gender partners is completely determined by |W| / |M|.
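The argument holds for any graph whose edges all join an M vertex to a W vertex, which a short check makes concrete. The toy data below is invented purely for illustration (it is not from the study or the Census Bureau), and average_degrees is a hypothetical helper. Exact fractions are used so the equality is not blurred by floating-point rounding.

```python
from fractions import Fraction


def average_degrees(men, women, edges):
    """Average degree on each side of a graph whose edges are (man, woman) pairs."""
    deg = {person: 0 for person in men | women}
    for m, w in edges:
        deg[m] += 1
        deg[w] += 1
    avg_m = Fraction(sum(deg[m] for m in men), len(men))
    avg_w = Fraction(sum(deg[w] for w in women), len(women))
    return avg_m, avg_w


# ASSUMPTION: made-up example with |M| = 3 and |W| = 4.
men = {"m1", "m2", "m3"}
women = {"w1", "w2", "w3", "w4"}
edges = [
    ("m1", "w1"), ("m1", "w2"), ("m2", "w2"),
    ("m3", "w3"), ("m3", "w4"), ("m2", "w4"),
]

avg_m, avg_w = average_degrees(men, women, edges)
print(avg_m, avg_w, avg_m / avg_w, Fraction(len(women), len(men)))
```

Adding or removing edges changes both averages but never their ratio, which stays at |W| / |M| as long as at least one edge exists.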
Graph variations
Multigraph: more than one edge between a pair of vertices.
Directed graph: edges have direction.
The edges of a directed graph are ordered pairs of vertices.
The indegree of a vertex is the number of edges directed into it.
The outdegree of a vertex is the number of edges directed out of it.
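Both variations can be sketched with a list of ordered pairs: the order of each pair gives the direction, and a list (rather than a set) allows repeated edges. The example edges and the name in_out_degrees are hypothetical.

```python
from collections import Counter


def in_out_degrees(edges):
    """Indegree and outdegree of every vertex in a directed (multi)graph.

    Each edge is an ordered pair (u, v) meaning an edge directed from u to v.
    """
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg


# ASSUMPTION: a made-up directed multigraph with two parallel edges A -> B.
edges = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]

indeg, outdeg = in_out_degrees(edges)
print(dict(indeg))
print(dict(outdeg))
```

Every edge leaves one vertex and enters one vertex, so the indegrees and the outdegrees both sum to the number of edges.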
Problems that map to graphs
Social networks: nodes are people, edges represent the is-friends-with relation.
Terrorist networks: nodes are terrorist groups or individuals, edges represent participated-in-an-incident-with.
Conflict networks: nodes are countries, edges represent cooperate-with or conflict-with.
2 weeks prior to Desert Storm
[Figure: network diagram]
The SHSU database
A human-curated database of global terrorist incidents from 1/22/1990 to 12/31/2007.
31,199 incidents
1257 groups
Very detailed information on incidents (e.g. weapons used, fatalities) and some information on the groups.
(c) Devika Subramanian 2008
Pre-Bali network
[Figure: network with clusters labeled Palestine groups; Kashmir groups; Colombia; Al Qaeda; US terror groups (KKK etc.); Irish groups; Philippines, Indonesian groups; Hamas]
Post-Bali network
[Figure: network with clusters labeled Bangladesh; Al Qaeda; US environmental terror groups. All the rest are fragments of networks from the previous slide.]
Splintering of the terror network into smaller, more decentralized pieces.
More problems
The web: each vertex is a page, and directed edges between vertices represent hyperlinks. This link structure is used to rank pages, for example by computing hubs and authorities or by Google's PageRank.
Modeling the spread of infection in a community: vertices are people, and edges represent contact between them.
Routing messages on the Internet: vertices are end hosts and routers, and edges join vertices that are directly linked.