Lecture Notes on Spanning Trees

15-122: Principles of Imperative Computation
Frank Pfenning
Lecture 26
April 26, 2011

1 Introduction

In this lecture we introduce graphs. Graphs provide a uniform model for many structures, for example, maps with distances or Facebook relationships. Algorithms on graphs are therefore important to many applications. They will be a central subject in the algorithms courses later in the curriculum; here we only provide a very small sample of graph algorithms.

2 Spanning Trees

We start with undirected graphs, which consist of a set V of vertices (also called nodes) and a set E of edges, each connecting two different vertices. A graph is connected if we can reach any vertex from any other vertex by following edges in either direction. In a directed graph, edges provide a connection from one node to another, but not necessarily in the opposite direction. More mathematically, we say that the edge relation between vertices is symmetric for undirected graphs. In this lecture we only discuss undirected graphs, although directed graphs also play an important role in many applications. The following is a simple example of a connected, undirected graph with 5 vertices (A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE).

In this lecture we are particularly interested in the problem of computing a spanning tree for a connected graph. What is a tree here? Trees here are a bit different from the binary search trees we considered earlier. One simple definition is that a tree is a connected graph with no cycles, where a cycle lets you go from a node back to itself without repeating an edge. A spanning tree for a connected graph G is a tree containing all the vertices of G. Below are two examples of spanning trees for our original example graph.

When dealing with a new kind of data structure, it is a good strategy to try to think of as many different characterizations as we can. This is somewhat similar to the problem of coming up with good representations of the data; different ones may be appropriate for different purposes. Here are some alternative characterizations the class came up with:

1. Connected graph with no cycle (original).

2. Connected graph where no two neighbors are otherwise connected. Neighbors are vertices connected directly by an edge; otherwise connected means connected without using the connecting edge.

3. Two trees connected by a single edge. This is a recursive characterization. The base case is a single node, with the empty tree (no vertices) as a possible special case.

4. A connected graph with exactly n-1 edges, where n is the number of vertices.

5. A graph with exactly one path between any two distinct vertices, where a path is a sequence of distinct vertices in which each is connected to the next by an edge.

When considering asymptotic complexity it is often useful to categorize graphs as dense or sparse. Dense graphs have a lot of edges compared to the number of vertices. Writing n = |V| for the number of vertices (which will be our notation in the rest of the lecture), we note that there can be at most n(n-1)/2 edges: every node can be connected to every other node (n(n-1) ordered pairs), but in an undirected graph each edge is counted twice this way (hence n(n-1)/2). If we write e for the number of edges, we have e = O(n^2). For our example graph, n = 5, so there can be at most 10 edges; it has 6. By comparison, a tree is sparse because e = n-1 = O(n).

3 Computing a Spanning Tree

There are many algorithms to compute a spanning tree for a connected graph. The first is an example of a vertex-centric algorithm.

1. Pick an arbitrary node and mark it as being in the tree.

2. Repeat until all nodes are marked as in the tree:

(a) Pick an arbitrary node u in the tree with an edge e to a node w not in the tree. Add e to the spanning tree and mark w as in the tree.

We iterate n-1 times in Step 2, because there are n-1 vertices that have to be added to the tree. The efficiency of the algorithm is determined by how efficiently we can find a qualifying w. (A small sketch of this algorithm appears below, after the second algorithm.)

The second algorithm is edge-centric.

1. Start with the collection of singleton trees, each with exactly one node.

2. As long as we have more than one tree, connect two trees together with an edge in the graph.
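As a concrete illustration, here is a small sketch in C of the vertex-centric algorithm. The notes themselves give no code for it; the adjacency-matrix representation, the fixed size N, and the encoding of vertices A through E as 0 through 4 are assumptions made just for this example. The sketch simply scans for a marked vertex with an edge to an unmarked one, so it is not tuned for efficiency.

```c
#include <stdbool.h>
#include <stdio.h>

#define N 5  /* number of vertices; 0..4 stand for A..E in this example */

/* Vertex-centric spanning tree: repeatedly attach an unmarked vertex that
   has an edge to some marked vertex.  Assumes adj describes a connected,
   undirected graph.  The tree edges are reported by printing them; a real
   implementation might store them instead. */
void spanning_tree(bool adj[N][N]) {
  bool in_tree[N] = { false };
  in_tree[0] = true;                              /* step 1: arbitrary start vertex */
  for (int added = 0; added < N - 1; added++) {   /* step 2: n-1 iterations */
    for (int u = 0; u < N; u++) {
      if (!in_tree[u]) continue;
      bool done = false;
      for (int w = 0; w < N; w++) {
        if (adj[u][w] && !in_tree[w]) {
          printf("add edge %c-%c\n", 'A' + u, 'A' + w); /* add e = (u,w) */
          in_tree[w] = true;                            /* mark w as in the tree */
          done = true;
          break;
        }
      }
      if (done) break;
    }
  }
}

int main(void) {
  /* the example graph: edges AB, BC, CD, AE, BE, CE */
  bool adj[N][N] = {{ false }};
  int edges[6][2] = { {0,1}, {1,2}, {2,3}, {0,4}, {1,4}, {2,4} };
  for (int i = 0; i < 6; i++) {
    adj[edges[i][0]][edges[i][1]] = true;   /* undirected: record both directions */
    adj[edges[i][1]][edges[i][0]] = true;
  }
  spanning_tree(adj);
  return 0;
}
```

Each of the n-1 iterations may scan all vertex pairs, so this naive version takes O(n^3) time in the worst case; finding a qualifying w more quickly is exactly what would make the algorithm efficient.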

This second algorithm also performs n-1 steps, because it has to add n-1 edges to the trees until we have a spanning tree. Its efficiency is determined by how quickly we can tell if an edge would connect two trees or would connect two nodes already in the same tree, a question we come back to in the next lecture.

Let's try this algorithm on our first graph, considering edges in the listed order (AB, BC, CD, AE, BE, CE). The first graph is the given graph; the completely disconnected graph is the starting point for this algorithm. At the bottom right we have computed the spanning tree, which we know because we have added n-1 = 4 edges. If we tried to continue, the next edge BE could not be added because it does not connect two trees, and neither could CE. The spanning tree is complete.

4 Creating a Random Maze

We can use the algorithm to compute a spanning tree for creating a random maze. We start with the graph where the vertices are the cells and the edges represent the neighbors we can move to in the maze. In this graph, all potential neighbors are connected. A spanning tree will be defined by a subset of the edges in which all cells in the maze are still connected by some (unique) path. Because a spanning tree connects all cells, we can arbitrarily decide on the starting point and end point after we have computed it.
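To make the maze construction concrete, here is a hedged C sketch (not part of the original notes) that builds this grid graph for a small rectangular maze and carves the corridors with the edge-centric algorithm. It decides whether an edge connects two different trees by keeping a component label per cell and relabeling one component on every merge, a deliberately naive stand-in for the union-find structure of the next lecture; the grid dimensions W and H and the cell numbering are assumptions for the example. The edges are considered here in a fixed order, and the next paragraph explains why a random order is what makes the maze random.

```c
#include <stdio.h>

#define W 4              /* maze width in cells (hypothetical example size) */
#define H 3              /* maze height in cells */
#define CELLS (W * H)

struct edge { int u, v; };   /* an edge between two neighboring cells */

int main(void) {
  /* Build the graph: vertices are cells, edges connect horizontal and
     vertical neighbors, so every potential corridor is present. */
  struct edge edges[2 * W * H];
  int e = 0;
  for (int y = 0; y < H; y++) {
    for (int x = 0; x < W; x++) {
      int c = y * W + x;
      if (x + 1 < W) edges[e++] = (struct edge){ c, c + 1 };   /* right neighbor */
      if (y + 1 < H) edges[e++] = (struct edge){ c, c + W };   /* bottom neighbor */
    }
  }

  /* Edge-centric spanning tree: each cell starts as its own tree; an edge
     is kept as a corridor only if it connects two different trees.
     comp[c] is the label of the tree containing cell c; merging relabels
     one side, which is slow (O(CELLS) per merge) but easy to follow. */
  int comp[CELLS];
  for (int c = 0; c < CELLS; c++) comp[c] = c;

  for (int i = 0; i < e; i++) {          /* edges in the given (fixed) order */
    int cu = comp[edges[i].u], cv = comp[edges[i].v];
    if (cu != cv) {                      /* connects two different trees */
      printf("corridor between cell %d and cell %d\n", edges[i].u, edges[i].v);
      for (int c = 0; c < CELLS; c++)    /* merge: relabel one component */
        if (comp[c] == cv) comp[c] = cu;
    }
  }
  return 0;
}
```

With W = 4 and H = 3 the grid has 12 cells and 17 neighbor edges, and the sketch keeps exactly 11 of them as corridors, the n-1 edges of a spanning tree.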

How would we ensure that the maze is random? The idea is to generate a random permutation (see Exercise 1) of the edges and then consider the edges in that fixed order. Each edge is either added (if it connects two disconnected parts of the maze) or not (if the two vertices are already connected).

5 Minimum Weight Spanning Trees

In many applications of graphs, there is some measure associated with the edges. For example, when the vertices are locations then the edge weights could be distances. We might then be interested in not just any spanning tree, but one whose total edge weight is minimal among all the possible spanning trees, a so-called minimum weight spanning tree (MST). An MST is not necessarily unique. For example, all the edge weights could be identical, in which case any spanning tree will be minimal. We annotate the edges in our running example with edge weights as shown on the left below. On the right is the minimum weight spanning tree, which has weight 9.

Before we develop a refinement of our edge-centric algorithm for spanning trees to take edge weights into account, we discuss a basic property it is based on.

Cycle Property. Let C ⊆ E be a cycle, and let e be an edge of maximal weight in C. Then e does not need to be in an MST.

How do we convince ourselves of this property? Assume we have a spanning tree, and that edge e from the cycle property connects vertices u and w. If e is not in the spanning tree, then, indeed, we don't need it. If e is in the spanning tree, we will construct another spanning tree without e. Edge e splits the spanning tree into two subtrees. There must be another edge e' from C connecting the two subtrees. Removing e and adding e' instead yields another spanning tree, one which does not contain e. It has equal or lower weight than the first one, since e' must have less or equal weight than e.

The cycle property is the basis for Kruskal's algorithm.

1. Sort all edges in increasing weight order.

2. Consider the edges in order. If the edge does not create a cycle, add it to the spanning tree; otherwise discard it. Stop when n-1 edges have been added, because then we must have a spanning tree.

Why does this create a minimum-weight spanning tree? It is a straightforward application of the cycle property (see Exercise 2). Sorting the edges will take O(e log(e)) steps with the most appropriate sorting algorithms. The complexity of the second part of the algorithm depends on how efficiently we can check whether adding an edge will create a cycle or not. As we will see in Lecture 27, this can be O(n log(n)) or even more efficient if we use a so-called union-find data structure.

Illustrating the algorithm on our example, we first sort the edges. There is some ambiguity; say we obtain the following list:

AE 2
BE 2
CE 2
BC 3
CD 3
AB 3

We now add the edges in order, making sure we do not create a cycle. We first add AE, BE, and CE. At this point we consider BC. However, this edge would create a cycle BCE, since it connects two vertices in the same tree instead of two different trees. We therefore do not add it to the spanning tree. Next we consider CD, which does connect two trees. At this point we have a minimum spanning tree. We do not consider the last edge, AB, because we have already added n-1 = 4 edges.

In the next lecture we will analyze the problem of incrementally adding edges to a tree in a way that allows us to quickly determine if an edge would create a cycle.
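As a preview of how the pieces fit together, the following is a hedged C sketch of Kruskal's algorithm on this weighted example. It uses a minimal union-find for the cycle check; since the notes defer that data structure to Lecture 27, the find and union_trees helpers here are illustrative assumptions rather than the course's implementation, and vertices A through E are again encoded as 0 through 4.

```c
#include <stdio.h>
#include <stdlib.h>

#define N 5   /* vertices A..E encoded as 0..4 */

struct edge { int u, v, weight; };

/* Minimal union-find: find returns the representative of a vertex's tree
   (with path compression); union_trees merges two trees. */
static int parent[N];

int find(int x) {
  if (parent[x] != x) parent[x] = find(parent[x]);
  return parent[x];
}

void union_trees(int x, int y) {
  parent[find(x)] = find(y);
}

int cmp_weight(const void *a, const void *b) {
  return ((const struct edge *)a)->weight - ((const struct edge *)b)->weight;
}

int main(void) {
  /* the running example with its edge weights */
  struct edge edges[6] = {
    {0,1,3}, {1,2,3}, {2,3,3}, {0,4,2}, {1,4,2}, {2,4,2}
  };

  qsort(edges, 6, sizeof(struct edge), cmp_weight);  /* step 1: sort by weight */

  for (int v = 0; v < N; v++) parent[v] = v;  /* every vertex is its own tree */

  int added = 0, total = 0;
  for (int i = 0; i < 6 && added < N - 1; i++) {   /* step 2: consider in order */
    if (find(edges[i].u) != find(edges[i].v)) {    /* no cycle: different trees */
      union_trees(edges[i].u, edges[i].v);
      printf("add %c-%c (weight %d)\n",
             'A' + edges[i].u, 'A' + edges[i].v, edges[i].weight);
      added++;
      total += edges[i].weight;
    }                                              /* otherwise discard the edge */
  }
  printf("total weight %d\n", total);
  return 0;
}
```

Running it adds AE, BE, CE (in some order among the equal-weight edges) and then CD, and reports total weight 9, matching the tree found by hand above.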

Exercises

Exercise 1 Write a function to generate a random permutation of a given array, using a random number generator with the interface in the standard rand library. What is the asymptotic complexity of your function?

Exercise 2 Prove that the cycle property implies the correctness of Kruskal's algorithm.