Network Flow Problems Arising From Evacuation Planning
Network Flow Problems Arising From Evacuation Planning

Submitted by Diplom-Mathematiker Daniel Dressler (Speyer). Dissertation accepted by the Fakultät II - Mathematik und Naturwissenschaften of the Technische Universität Berlin in fulfillment of the requirements for the degree of Doktor der Naturwissenschaften (Dr. rer. nat.).

Doctoral committee:
Chair: Prof. Dr. Peter Bank
Reviewer: Prof. Dr. Martin Skutella
Reviewer: Prof. Dr. Ekkehard Köhler

Date of the scientific defense: 7 August 2012
Berlin 2012
D 83
Contents

Chapter 0: Introduction
    Outline and Contributions
    Collaborations and Previous Publications
    Acknowledgements
Chapter 1: Fundamentals
    Complexity Theory and Approximations
    Graphs
    Shortest Paths
    Static Flows: Static Flow Problems; Working with Static Flows; Multi-commodity Flows
    Flows Over Time: Δ-constant Flows; The Time-Expanded Network
Chapter 2: Earliest Arrival Flows and Their Computation
    Background: Modeling and Simulation; Connection to MATSim
    Related Work
    Our Contribution
    Preliminaries
    Interval-Based Successive Shortest Paths: Assumptions; Storing Intervals; Shortest Paths in the Time-Expanded Network; Propagating Intervals; Improvements; Pseudo-Code; Implementation Details
    Computational Results: Rounding; Instances; Settings for Our Algorithm; Algorithms for Comparison; Results Against Established Algorithms; Results for the SSP Algorithms; Conclusion from Computational Results
    Using the Solution
    Outlook: Meta-Heuristics; Holdover; Time-dependent Capacities and Travel Times; Warm-Starting the Path Search; Alternatives to the Path Search; Shelters
Chapter 3: Flows with Aggregate Arc Capacities
    Related Work
    Our Contribution
    Preliminaries: Aggregate Arc Capacities; Discrete Bridge Flows and Time-Expansion
    Complexity and Integrality
    Approximation Scheme with Resource Augmentation
    Outlook
Chapter 4: Confluent Flows
    Related Work
    Our Contribution
    Preliminaries: Outerplanar Graphs; Tree Decompositions and Treewidth
    Bounding the Complexity
    Polynomial Algorithm for Outerplanar Graphs
    Approximation Scheme for Bounded Treewidth: Dynamic Program; Computing the Tables; Deriving the Approximation Scheme
    Outlook
Bibliography
Index
Chapter 0

Introduction

We study how network flow theory can aid in evacuation planning. Our goal is to optimize the routes taken out of the endangered areas such that the evacuation can finish more quickly. This is of utmost importance for scenarios like tsunami warnings, where the disaster might be only minutes away. The mathematical perspective can help direct people to safety while avoiding bottlenecks in the infrastructure. Similar considerations apply to large public buildings, which must be evacuated quickly in case of a fire or other threats. Note that we strictly consider an evacuation whose goal is to empty a certain area, even though in some cases disaster response forces might need to enter the area at the same time. Also, the goal is always to find an optimal plan in advance, as opposed to managing an ongoing evacuation.

The models we consider come from network flow theory, a sub-field of combinatorial optimization. Flows have a long and successful history in logistics. On the sliding scale of intricate versus simple traffic models, they are about as simple as possible. Their advantage is that, as everywhere in combinatorial optimization, this simplicity makes it possible to find a globally optimal solution, which accurately analyzes the big picture. It is important to realize that essentially any traffic model contains the flow model. If we do not understand flows and how to deal with them, complex models might degrade to simulations which offer little optimization potential. On the other hand, the apparent simplicity of flows hides their full potential. Many interpretations of flows have been suggested that add features from more elaborate traffic models, while still enabling global optimization.

Flows require a network describing the instance. In the case of a city, this is typically a graph that closely resembles a map of the city. The arcs represent streets. They are modeled with transit times derived from the lengths of the streets and with capacities depending on their widths. The vertices of the graph are the intersections. The sources are special vertices that represent the initial positions of the evacuees. The sinks are vertices for areas that are considered safe. We assume that this network is already available; in our case, we obtained real-world data from related projects.

A flow over time moves flow units from the sources to the sinks along the arcs of the network. A flow unit entering an arc reaches the other end of the arc after the transit time has passed. The capacities limit the number of flow
units entering an arc at the same time. Because a flow over time tracks the flow units in time as well, it can differentiate between flow units reaching a bottleneck simultaneously or one after another. As we said above, one goal is to find a flow over time that minimizes the time until all flow units have reached the sink, that is, until all evacuees are in safe areas. A classic result shows that there can be an even better answer to this evacuation problem: The best imaginable flow over time is one that ensures that, at every point in time, the maximum possible number of flow units has already reached the sinks. Such a flow has the following properties: The first evacuees arrive in safe places as early as possible; the average time to safety is minimized; and the evacuation finishes as early as possible. The full effect of such a plan can also be summarized like this: Should the disaster strike at any time before the evacuation is complete, the maximum possible number of people will have reached safety by then. Solutions of this kind are called earliest arrival flows in the mathematical literature. Their existence is only guaranteed if all safe areas can accept sufficiently many evacuees. This is the case, for example, when people exit a building.

We develop a fast algorithm for computing earliest arrival flows under this assumption that all safe areas are sufficiently large. This algorithm is well suited for inclusion in evacuation planning software because its performance allows accurate models without technical compromises. When modeling buildings, the algorithm is fast enough to give almost immediate feedback to the user. Another benefit of its increased performance over existing algorithms is that more variations of the same scenario can be evaluated, for example, to study the effects of a fire breaking out at varying places, if such data is available.

We now turn to variant flow models which add desirable options for evacuation planning. However, these models pose algorithmic challenges that still prevent their use in practice. Therefore, our focus is clearly on their mathematical analysis and on developing first algorithms, which form the basis for possible further advances.

The first of these variations is a recently proposed model for flows over time called flows with aggregate arc capacities. This new model allows, among other things, bounding how much flow can be traveling on an arc at the same time. For example, this can represent a load limit on a bridge or a limit on the number of cars simultaneously in a tunnel, as opposed to treating these structures like any other street. We discuss the substantial differences between flows with aggregate arc capacities and standard flows over time. We also provide a fully polynomial-time approximation scheme, that is, an algorithm that can solve the considered flow problem with any desired accuracy in polynomial time. However, due to many remaining algorithmic challenges, this model can probably not be applied to large-scale instances just yet.

Finally, we want to express the effects of emergency exit signs on the flow of evacuees if all evacuees observing a sign move into the same direction. This leads us to confluent flows. In such a confluent flow, all flow units arriving at the same vertex have to continue into the same arc. The challenge
then lies in choosing the right arc at each vertex, or equivalently, the right direction for the exit sign. Similar principles are commonly used for routing in telecommunication networks. Specifically, we want to find confluent flows that send the maximum amount of flow, but in the simpler model without time. We address this problem first because it is a subproblem needed to determine a confluent flow over time that minimizes the time needed. Progressing through complexity considerations for this problem, we first present brute-force algorithms for the less challenging cases. We then develop a novel polynomial-time algorithm for networks with a certain structure, namely outerplanar networks (planar graphs in which all vertices lie on the outer face). This is significant because this case borders on other network structures for which no polynomial-time algorithm is expected to exist. We complement this with an approximation scheme that can solve the problem on much larger graph classes with arbitrary precision.

Outline and Contributions

Chapter 1 contains the Fundamentals, mostly flow theory, that are used throughout this entire work. Its main purpose is to state definitions and introduce the notation we use. Many professional readers will be able to skip large parts of it. Should the definition of a symbol be needed, the Index helps to locate it. The definitions unique to a chapter are given in the Preliminaries section of that chapter, along with further material, like complexity considerations, to familiarize everyone with the topics at hand.

Chapter 2 deals with Earliest Arrival Flows and Their Computation. Earliest arrival flows are flows over time with minimum total travel time and other desirable properties. As this model is well-studied from a theoretical point of view [49], we focus on the development of an algorithm for large real-world instances. The algorithm is based on the successive shortest path algorithm, specialized to the repeating nature of the time-expanded network and to typical instances. One central idea is to consider copies of the same vertex across multiple time layers at once when determining a shortest path. The computational results show that we achieved our design goals: Our algorithm works very well on real-world instances, and the larger, the better. We match the performance of all competitors on the largest real-world instances modeling cities, and significantly outperform them on the subset of instances modeling large buildings. This makes our algorithm particularly suitable for inclusion in evacuation planning software for buildings, where our performance gains allow near-instantaneous feedback or evaluating multiple scenarios of the same instance.

Chapter 3 is on Flows with Aggregate Arc Capacities, which are flows over time where capacities bound the flow entering each arc within a sliding time window. This model was recently introduced by Melkonian [70], who showed first NP-hardness results. We generalize the model by uncoupling the length of the sliding windows from the transit times of the arcs, and show how this model generalizes standard flows over time.
We first discuss weak and strong flow conservation for our generalized model, as well as the complexity of the problem under various settings. Our findings differ considerably from those for the standard flow over time model. The main contribution in this chapter is our fully polynomial-time approximation scheme (FPTAS), which computes solutions that violate the capacities and the time horizon by at most the approximation factor. The arc costs and the lengths of the sliding windows are respected exactly, though. The correctness proof requires adapting the flow globally to rounded transit times. This is an extension of a technique by Fleischer and Skutella [27].

Chapter 4 discusses Confluent Flows, which impose a natural restriction on static flows by allowing only one outgoing arc to carry flow at each vertex. We consider the maximum confluent flow problem, because we regard maximizing the flow value as the most fundamental problem. Recent works by Chen et al. [12, 13] explore minimizing the congestion of a confluent flow, but they omit arc capacities from their analysis. We allow arbitrary arc capacities, which is much more typical for flow problems. Our complexity discussion lets us isolate interesting and challenging cases of the maximum confluent flow problem. We then develop a polynomial algorithm for computing the maximum confluent flow value on outerplanar graphs with a single sink. We argue that this is an important non-trivial case. While presented as a dynamic program, the tables require pseudo-polynomially many entries, for which we find an efficient encoding. We also detail an FPTAS for maximum confluent flows with arbitrary terminals on graphs with bounded treewidth. We conjecture that a combination of the two approaches results in polynomial algorithms for all graph classes for which the maximum confluent flow problem with a single sink is tractable.

Collaborations and Previous Publications

Several parts of this work were jointly developed and have been previously published in some form. Chapter 2 is mostly unpublished. Preliminary results appeared in a short conference submission with Gunnar Flötteröd, Gregor Lämmel, Kai Nagel, and Martin Skutella [20]. Some of the general ideas on evacuation modeling and on how to use the computed solution can also be found in [21], which has a different focus than this chapter, though.

The approximation scheme for flows with aggregate arc capacities in Chapter 3 was joint work with Martin Skutella and has been published in [22]. Our presentation here extends the complexity analysis of this new flow model and aims to clarify the main proof.

The results in Chapter 4 on confluent flows were joint work with Martin Strehler. The chapter is based on our publication [24] and a pending submission [23]. These were also included in his dissertation thesis [88]. Various changes have been made here; in particular, the approximation scheme presented in [24] was completely overhauled to simplify the technical proofs.
Acknowledgements

This thesis only exists thanks to a lot of people and their support. My advisor/supervisor Martin Skutella was a source of knowledge and helpful nudges. Ekkehard Köhler took it upon himself to assess this thesis, too. Martin Strehler has been an indispensable coauthor since the first calculus course. Actually, everyone else from the Adaptive Verkehrssteuerung (Adaptive Traffic Control) project [1] was great, too. The project meetings were a wonderful opportunity to exchange ideas. This project, granted by the German Federal Ministry of Education and Research, was also the main source of my funding.

I'm also indebted to my students, in particular: Manuel Schneider wrote a lot of the code for the algorithm in Chapter 2. Matthias Rost involuntarily ended up debugging our code while writing his Bachelor thesis. He also provided the Berlin instance for testing. With Martin Günther, the work towards the bridge flow model in Chapter 3 began in earnest. Finally, Stefan Müller was a great partner for discussing confluent flows and related models.

My proofreaders had the hardest part near the end, though: José Verschae read the bridge flow article in the frantic hours before the submission deadline and later Chapter 3 of this thesis. He also had many insightful comments on the introduction. Jan-Philipp Kappmeier and Martin Groß checked Chapter 2 and have been a source of evacuation knowledge for a couple of years now. Jannik Matuschke took on the quite technical Chapter 4. I hope our friendship did not suffer too much.

Of course, many more people helped me "endure" the daily academic life (and occasionally without quotation marks). There are so many to name, for so many good reasons, that I could not shorten the list to a reasonable length. Thank you all.

To my family, I say: Thanks a lot for your support and the many ways in which you always encouraged me over the last three decades, and especially the last year. Finally, Janina Müttel proofread the entire thesis. There is more to say, but nothing she would not already know.
Chapter 1

Fundamentals

1.1 Complexity Theory and Approximations

We will make use of the following notions: For running times of algorithms, we consider the categories polynomial, strongly polynomial, pseudo-polynomial, and exponential. We consider our algorithms in the model of a Random Access Machine. For our purposes, this model is similar enough to a modern PC that we do not distinguish the two. For precise definitions, we refer to the book by Schrijver [82]. From complexity theory, we use the classes P and NP extensively, as well as the notions NP-hard, NP-complete, and strongly or weakly NP-complete. A thorough introduction to complexity theory concerning NP is the book by Garey and Johnson [36].

We study approximation algorithms, and we say that an algorithm for a maximization problem is an α-approximation if it always produces a solution of value α·OPT or better, where OPT is the maximum value for the given instance. A fully polynomial-time approximation scheme (FPTAS) is an algorithm that, for every parameter ε > 0, is a (1 − ε)-approximation running in time polynomial in the input size and 1/ε. One of the many existing complexity results that we want to point out for later use is that linear programs can be solved in polynomial time, first shown in [55].

1.2 Graphs

Throughout this work we consider finite graphs. Usually, we denote a directed graph by G = (V, A). For an arc a ∈ A, let tail(a) denote its start-vertex and head(a) its end-vertex. Loops and parallel arcs are allowed. Because of the latter, the arc set is technically a multi-set, but we still refer to it as the arc set and to arcs by their tail and head, (v, w). It will be clear from the context which arc is meant. For v ∈ V, we use δ^in(v) := {a ∈ A : head(a) = v} and δ^out(v) := {a ∈ A : tail(a) = v} for the sets of arcs entering and leaving v, respectively. Undirected graphs have an edge set E instead of the arc set A, and the set of incident edges is simply δ(v).
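To fix these conventions in code, the following minimal sketch stores a directed multigraph together with the incidence lists δ^in and δ^out. This is hypothetical helper code for illustration only, not part of the thesis; the later sketches in this chapter build on it.

    from collections import defaultdict

    class Digraph:
        """A finite directed multigraph. Arcs are stored as (tail, head) pairs
        under an arc id, so loops and parallel arcs pose no problem."""
        def __init__(self):
            self.arcs = {}                      # arc id -> (tail(a), head(a))
            self.delta_in = defaultdict(list)   # v -> [a : head(a) = v]
            self.delta_out = defaultdict(list)  # v -> [a : tail(a) = v]

        def add_arc(self, arc_id, tail, head):
            self.arcs[arc_id] = (tail, head)
            self.delta_out[tail].append(arc_id)
            self.delta_in[head].append(arc_id)

    G = Digraph()
    G.add_arc("a1", "v", "w")
    G.add_arc("a2", "v", "w")   # a parallel arc is simply a second arc id
    assert len(G.delta_out["v"]) == 2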
The undirected version of a directed graph is obtained by removing the directions of the arcs and unifying parallel edges. A directed version of an undirected graph introduces a direction for each edge. The bidirected version of an undirected graph introduces two opposing arcs for each edge. The induced subgraph of G = (V, A) on a vertex set V′ ⊆ V is denoted by G[V′], and we use A[V′] for the arc set of G[V′].

A walk W = (a₁, …, a_k) is a finite sequence of arcs in which consecutive arcs fit head to tail. We use tail(W) and head(W) as shortcuts for the tail of the first and the head of the last arc, respectively. A walk is said to be closed if its tail and head coincide. A path is a walk that passes through each vertex at most once, while a cycle is a closed walk that uses each vertex at most once, with the exception of the joint, which occurs exactly as the tail and the head. We use V(W) and A(W) for the (multi-)sets of vertices and arcs occurring in W.

1.3 Shortest Paths

A fundamental graph problem asks for a shortest connection from one vertex to another. For this, the length (or cost) of the arcs is given by a function c : A → R, and the cost c(W) of a walk W is the sum of the costs of the arcs along the walk, that is, c(W) := Σ_{a ∈ A(W)} c(a). Given a start vertex v and a target vertex w, one asks for a shortest path P from v to w. Assuming that there is a path from v to w at all, such a shortest path always exists because there can only be a finite number of paths in a finite graph. However, the complexity of finding such a shortest path depends on the cost function and ranges from NP-hard to easy enough to be ubiquitous in modern life.

The fundamental distinction for determining the complexity is whether the cost function admits a negative directed cycle. It is easy to see that a cost function assigning −1 to all arcs captures the problem of finding a path with the maximum number of arcs between v and w, which is NP-hard [36]. If no negative cycle exists, the cost function is called conservative. While conservative costs make the problem tractable, many algorithms are designed for the further restricted case of non-negative cost functions.

We will only need to compute distances from one vertex to many others here, as opposed to the all-pairs shortest path problem. Thus, we denote the shortest path distance from the starting vertex to some vertex w by dist(w) without explicitly mentioning the starting vertex, as it will be clear from the context.

Dijkstra's algorithm [17] is the classical algorithm for computing shortest paths from a single starting vertex when the cost function is non-negative. If the cost function can have negative entries, the Moore-Bellman-Ford algorithm [7, 28, 74] either detects a negative cycle or returns a shortest path. Both run in polynomial time, and Dijkstra's algorithm in its modern version even achieves a very respectable running time of O(|V| log |V| + |A|) [32].
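As an illustration, here is a compact sketch of Dijkstra's algorithm on the Digraph class above, written so that the vertex labels discussed next are explicit. The binary-heap variant shown runs in O(|A| log |V|), not the Fibonacci-heap bound cited above.

    import heapq

    def dijkstra(G, cost, source):
        """Shortest path distances from `source` for non-negative arc costs.
        `cost` maps arc ids to floats >= 0; returns the final labels."""
        label = {source: 0.0}        # upper bounds on dist(.)
        exact = set()                # vertices whose label equals dist(.)
        heap = [(0.0, source)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in exact:
                continue             # stale heap entry, skip it
            exact.add(v)             # from now on, label(v) = dist(v)
            for a in G.delta_out[v]:
                w = G.arcs[a][1]
                if d + cost[a] < label.get(w, float("inf")):
                    label[w] = d + cost[a]    # adjust the label downwards
                    heapq.heappush(heap, (label[w], w))
        return label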
Both algorithms maintain labels label(·) on the vertices, which are upper bounds on the shortest path distances dist(·) and get adjusted during the execution. If label(w) = dist(w), this label is called exact. The algorithms end when the labels of all target vertices are known to be exact. Shortest path computation is an entire sub-field of computer science in its own right, though, and many improved algorithms exist. See Delling et al. [16] for an introduction to this field with an emphasis on road networks. But we have no need for typical shortest path algorithms in this work: Our successive shortest path algorithm in Chapter 2, which would be the obvious application, is quite specialized and needs to solve only a simpler problem than computing shortest paths.

1.4 Static Flows

The classic book on flows is the one by Ford and Fulkerson [30]; a more modern introduction is given by Ahuja et al. [2]. For a broad theoretical treatment, see Schrijver [82].

While there is general agreement on what properties a flow should have, we sometimes need to use objects that are not quite flows. Hence, the following definition of a flow is modular. In its most general form for our purposes, a flow function is simply a real-valued function on the arc set, say x : A → R. Most settings consider non-negative arc capacities u : A → R_{≥0}. Often a graph with its associated capacities (and maybe additional parameters) is called a network. A flow function is feasible if 0 ≤ x(a) ≤ u(a) for all arcs a ∈ A. From this, we can derive the inflow into a vertex, the outflow, and its balance, which is the net flow out of the vertex:

inflow(v) := Σ_{a ∈ δ^in(v)} x(a),    outflow(v) := Σ_{a ∈ δ^out(v)} x(a),    bal(v) := outflow(v) − inflow(v).

A flow function satisfies flow conservation at v if bal(v) = 0, and this is the usual assumption for a vertex. But a source vertex may also have a positive balance (i.e., it may send new flow into the network), while a sink vertex may have a negative balance (it may remove flow from the network). We denote the set of sources by S⁺ and the set of sinks by S⁻. Combined, they form the set of terminals. Assuming S⁺ ∩ S⁻ = ∅ is only natural and lets us bypass some exceptional cases later on. We then call a feasible flow function that satisfies flow conservation at all non-terminals and has non-negative balance at each source and non-positive balance at each sink a flow. The value of a flow x is the amount of flow reaching the sinks, i.e., val(x) := −Σ_{v ∈ S⁻} bal(v). Due to flow conservation at the non-terminals, one can also express this as val(x) = Σ_{v ∈ S⁺} bal(v).
[Figure 1.1 legend: a directed arc; antiparallel arcs; an undirected edge; a non-terminal; sources with supplies; a sink with a demand; a flow of 3 out of capacity 7, written 3/7; a residual backward arc. Panels (a) and (b).]

Figure 1.1: Our standard drawing style for graphs and flows. Note that, unless otherwise noted, we consider single-commodity flows. Sources are only distinguished by colors to trace flow units more easily. We will omit various aspects, for example exact supplies/demands or residual arcs, when they do not contribute to the drawing. Costs and other parameters will be added as necessary.

Often we will want to further limit the balance of the terminals. For this, we introduce a supply/demand function d : V → R. All sources v ∈ S⁺ must have non-negative values d(v) ≥ 0, while all sinks v ∈ S⁻ must have non-positive d(v) ≤ 0. We will often talk about the supply of a source and the demand of a sink and mean the absolute values in that case. Non-terminals must have supply/demand 0. We say that a flow function satisfies the supplies/demands d if d(v) = bal(v) holds for all vertices v. Flow conservation implies that a flow can only satisfy supplies/demands d if Σ_{v ∈ V} d(v) = 0, i.e., when the sources want to send exactly as much flow as the sinks can accept. We call such a flow a transshipment for the supply/demand function d.

A weaker condition only requires that sources do not exceed their supplies and that sinks do not exceed their demands. We say a flow obeys the supplies/demands d if the flow satisfies some supplies/demands d′ with |d′(v)| ≤ |d(v)| for all v ∈ V, without changing the roles of sources and sinks. Conversely, a flow always obeys some infinite supplies/demands, where infinite can be replaced by trivial bounds on the flow value in all our applications.

Finally, the graph might have associated arc costs c : A → R, which define a cost per unit of flow. The cost of a flow function is then cost(x) := Σ_{a ∈ A} c(a)x(a).
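The definitions so far translate directly into a few checks. The following sketch (again hypothetical helper code on top of the Digraph class) computes balances, verifies whether a flow function is a flow, and evaluates value and cost.

    def balance(G, x, v):
        """bal(v) = outflow(v) - inflow(v) for a flow function x: arc id -> value."""
        return (sum(x.get(a, 0.0) for a in G.delta_out[v])
                - sum(x.get(a, 0.0) for a in G.delta_in[v]))

    def is_flow(G, x, u, sources, sinks, eps=1e-9):
        """Feasibility (0 <= x(a) <= u(a)), conservation at non-terminals,
        bal >= 0 at sources and bal <= 0 at sinks."""
        if any(not (-eps <= x.get(a, 0.0) <= u[a] + eps) for a in G.arcs):
            return False
        for v in set(G.delta_in) | set(G.delta_out):
            b = balance(G, x, v)
            if v in sources:
                ok = b >= -eps
            elif v in sinks:
                ok = b <= eps
            else:
                ok = abs(b) <= eps          # flow conservation
            if not ok:
                return False
        return True

    def value_and_cost(G, x, c, sinks):
        """val(x) = -sum of balances over the sinks; cost(x) = sum c(a) x(a)."""
        val = -sum(balance(G, x, v) for v in sinks)
        total_cost = sum(c[a] * x.get(a, 0.0) for a in G.arcs)
        return val, total_cost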
1.4.1 Static Flow Problems

With this notation available, we can define some common flow problems.

Problem 1.1 (MaxFlow). Input: A digraph G, arc capacities u, sources S⁺, sinks S⁻, and a matching supply/demand function d. Question: What is the maximum value of a flow obeying d?

Problem 1.2 (MinCostFlow). Input: As for MaxFlow, plus arc costs c. Question: What is the minimum cost of a maximum flow?

Instead of asking for a maximum flow, one could also ask for the existence of a transshipment or for a transshipment of minimum cost. If the set of sources is a singleton or if the demands imply only a single source, we call this the single-source version of the problem. Similarly, there are single-sink versions. Actually, the single-source single-sink MinCostFlow problem is general enough to model the other problems by introducing a supersource and a supersink. The supersource is a new vertex s⁺⁺ with supply equal to the total supply of the sources. For each original source s⁺, we introduce a new arc (s⁺⁺, s⁺) with capacity d(s⁺) and cost 0. Then the supply of s⁺ is set to 0. Similarly, a supersink s⁻⁻ captures all flow from the various sinks, which become normal vertices. It is easy to see that the maximum value of a flow remains unchanged, and the costs of the solutions are also not affected.

The MinCostFlow problem becomes NP-hard under the same condition that makes shortest paths NP-hard, namely when the cost function admits a negative directed cycle. Otherwise, these static flow problems can be solved in polynomial time, even very quickly in theory and practice. Indeed, they have become routine in many application areas. See Table 1.1 for some results on possible running times; many other specialized algorithms exist. We will give some intuition into the classical algorithms in the following section.
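The supersource/supersink reduction just described is mechanical; the following sketch mutates the hypothetical Digraph in place, with made-up arc ids and vertex names for illustration.

    def add_super_terminals(G, u, c, d, sources, sinks):
        """Reduce a MinCostFlow instance with many terminals to the
        single-source single-sink case via a supersource s++ and supersink s--."""
        for s in sources:
            aid = ("s++", s)
            G.add_arc(aid, "s++", s)
            u[aid], c[aid] = d[s], 0.0      # capacity = supply d(s), cost 0
            d[s] = 0                        # s becomes a normal vertex
        for s in sinks:
            aid = (s, "s--")
            G.add_arc(aid, s, "s--")
            u[aid], c[aid] = -d[s], 0.0     # capacity = demand |d(s)|, cost 0
            d[s] = 0
        d["s++"] = sum(u[("s++", s)] for s in sources)   # total supply
        d["s--"] = -d["s++"]
        return "s++", "s--"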
Table 1.1: Some algorithmic results for static flow problems. For brevity, we denote n = |V|, m = |A|, C = the maximum cost, and val = the maximum flow value. All data must be integral.

MaxFlow problem:
    algorithm                      running time                 comment
    path augmentation              O(val · m)                   Ford, Fulkerson [30]
    shortest path augmentation     O(nm²)                       Edmonds, Karp [25]
    blocking flow                  O(n²m)                       Dinic [18] (improvements exist)
    push-relabel                   better than O(nm log n)      e.g., King et al. [56]

MinCostFlow problem:
    algorithm                      running time                 comment
    successive shortest path       O(nm + val·(m + n log n))    Edmonds, Karp [25]
    minimum-mean cycle canceling   O(m³n² log n)                Goldberg, Tarjan [40]
    cost-scaling                   O(nm log(n²/m) log(nC))      Goldberg, Tarjan [41]
    cs2 (scaling push-relabel)     O(n²m log(nC))               Goldberg [39]
    mcf (network simplex)          practical, not polynomial    Löbel [64, 65] (polynomial variants exist)

1.4.2 Working with Static Flows

Path decompositions and related concepts are a key ingredient in large parts of this work. Given a path P = (a₁, …, a_k) and some non-negative d_P ∈ R, we can send d_P units of flow along P. This defines a flow function

x_P(a) := d_P if a ∈ A(P), and x_P(a) := 0 otherwise.

Since we used a path (in contrast to a walk), each arc of P occurs only once, and therefore the balance of tail(P) is d_P, that of head(P) is −d_P, and every other vertex has balance 0. Assuming feasibility, we call x_P a path flow. Similarly, for a cycle C, one can define a cycle flow x_C as d_C on the arcs of C and 0 elsewhere. This is a flow that satisfies flow conservation at all vertices and has no source or sink. Path and cycle flows let us trace the flow units in the network. They are the essential building blocks of flows, as demonstrated by the following theorem:

Theorem 1.3 (Ford and Fulkerson [30]). Let G = (V, A) be a graph with capacities u, and let x be a flow. Then there is a set of paths 𝒫 with positive constants d_P ∈ R for P ∈ 𝒫, and a set of cycles 𝒞 with positive constants d_C ∈ R for C ∈ 𝒞, such that |𝒫| + |𝒞| ≤ |{a ∈ A : x(a) > 0}| and

x = Σ_{P ∈ 𝒫} x_P + Σ_{C ∈ 𝒞} x_C.

The paths can be chosen such that tail(P) is a source and head(P) a sink of x for all P ∈ 𝒫.

The proof is constructive, repeatedly subtracting a path flow for some source-sink pair from the flow. Any remaining flow cannot have a source or sink and can thus be similarly decomposed into cycle flows.

There is also an opposite modification: One can add a path or cycle flow to an existing flow, provided that the arc capacities allow this. More generally, the balance of a vertex depends linearly on the flow function, so it is trivial to add, subtract, or scale flow functions and determine the effect of this on the balances and feasibility, as well as on the costs. So the path decomposition also suggests a way to construct a flow: Simply add path and cycle flows to an existing flow until the desired flow is reached. This approach, however, depends on the ability to undo flow (and "bad decisions") when adding a path, which is not required for decompositions.
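Returning to Theorem 1.3, its constructive proof translates into a simple greedy procedure. The sketch below (building on the `balance` helper from earlier) repeatedly walks along arcs carrying positive flow and peels off a path or cycle with its bottleneck amount; it is illustrative, not the thesis's code.

    def decompose(G, x, eps=1e-9):
        """Greedy path/cycle decomposition of a flow, following the
        constructive proof. Returns lists of (arc sequence, amount)."""
        x = dict(x)  # work on a copy
        paths, cycles = [], []

        def extract(start):
            """Walk forward on arcs with positive flow; stop at a sink of the
            remaining flow (a path) or when a vertex repeats (a cycle)."""
            arcs, pos, v = [], {start: 0}, start
            while True:
                a = next((a for a in G.delta_out[v] if x.get(a, 0.0) > eps), None)
                if a is None:
                    return arcs, False             # path: ends at a sink of x
                arcs.append(a)
                v = G.arcs[a][1]
                if v in pos:
                    return arcs[pos[v]:], True     # cycle found along the way
                pos[v] = len(arcs)

        def peel(start):
            arcs, is_cycle = extract(start)
            amount = min(x[a] for a in arcs)       # bottleneck amount
            for a in arcs:
                x[a] -= amount
            (cycles if is_cycle else paths).append((arcs, amount))

        for v in list(G.delta_out):                # drain every source of x
            while balance(G, x, v) > eps:
                peel(v)
        for a0 in list(G.arcs):                    # leftover flow: circulations
            while x.get(a0, 0.0) > eps:
                peel(G.arcs[a0][0])
        return paths, cycles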
The notion of the residual network allows just that. Given a feasible flow function x, the residual network G_x = (V, A_x) of G has the same vertex set V. We will define its arc set shortly. First, let us introduce forward and backward arcs: For each arc a = (v, w) ∈ A, there can be a forward copy, which is a itself, and a backward copy ←a = (w, v). (This backward arc is meant to be distinct from any existing arc (w, v).) Together, they form the sets of possible forward arcs A→ and backward arcs A←, respectively. The cost function is extended to the backward arcs by c(←a) := −c(a). The capacities of these arcs are only defined with respect to some feasible flow function x, resulting in the residual capacities u_x: For a forward arc, u_x(a) := u(a) − x(a), while the residual capacity of a backward arc is u_x(←a) := x(a). The residual network then uses these capacities u_x and the extended costs c on the arc (multi-)set A_x := {a ∈ A→ ∪ A← : u_x(a) > 0}.

A new flow function x′ on the residual network represents changes that can be applied to the flow x: Flow on forward arcs can be added to x, while flow on backward arcs can be subtracted from x. This is called augmenting x with x′. There are well-known optimality conditions based upon augmentation, which further specify that we need only look for augmenting paths or cycles. Similar theorems can be found in many textbooks.

Theorem 1.4 (Ford and Fulkerson [30], Edmonds and Karp [25]). Let G = (V, A) be a graph with capacities u and costs c, and let x be a flow for the sources S⁺ and sinks S⁻.
1. The flow x is a maximum flow in G for some obeyed supply/demand function d if and only if there is no path in G_x from a source s⁺ ∈ S⁺ with bal(s⁺) < d(s⁺) to a sink s⁻ ∈ S⁻ with bal(s⁻) > d(s⁻).
2. The flow x has minimum cost among all flows satisfying the same supplies/demands as x if and only if there is no directed cycle with negative total cost in G_x.

If such a path or cycle exists, however, we can send some flow x′ along it and change the flow x towards optimality. The maximum amount of flow we can send is the bottleneck capacity of the path or cycle with arc set A′ ⊆ A→ ∪ A←, which we define as u_min(A′) := min_{a ∈ A′} u_x(a). The actual augmentation is demonstrated in the following theorem:

Theorem 1.5 (Ford and Fulkerson [30]). Let G = (V, A) be a graph with capacities u and costs c. Let x be a flow satisfying some supplies/demands d. Let x′ be a flow in G_x satisfying some supplies/demands d′ with the same sources and sinks as x. Extend x′ with 0 to all possible forward and backward arcs. Let x̂ be x augmented by x′ as follows:

x̂(a) := x(a) + x′(a) − x′(←a).

Then x̂ is a flow satisfying the supplies/demands d + d′. If x is a minimum cost flow for d in G and x′ a minimum cost flow for d′ in G_x, then x̂ has minimum cost for d + d′ in G as well.
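In code, the residual network and the augmentation step of Theorem 1.5 look as follows. This sketch tags forward copies with '+' and backward copies with '-' (a made-up encoding) and reuses the earlier helpers.

    def residual_arcs(G, x, u):
        """Arc (multi-)set A_x of the residual network with residual capacities.
        Keys are (arc id, '+') for forward and (arc id, '-') for backward copies."""
        res = {}
        for a in G.arcs:
            xa = x.get(a, 0.0)
            if u[a] - xa > 0:
                res[(a, '+')] = u[a] - xa    # u_x(a) = u(a) - x(a)
            if xa > 0:
                res[(a, '-')] = xa           # u_x(backward a) = x(a)
        return res

    def augment(x, residual_path, amount):
        """Augment x along residual arcs: add on forward, subtract on backward."""
        for (a, direction) in residual_path:
            x[a] = x.get(a, 0.0) + (amount if direction == '+' else -amount)
        return x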
This theorem proves the correctness of the classical Ford-Fulkerson algorithm [30] for the MaxFlow problem: It iteratively looks for a path between a suitable source and sink and augments flow along it. It only has a pseudo-polynomial running time. The Edmonds-Karp algorithm [25] improves upon this by augmenting along a path with the minimum number of arcs in each iteration, which can be found by a breadth-first search. It achieves a polynomial running time.

The classic successive shortest path (SSP) algorithm extends these maximum flow algorithms to solve the MinCostFlow problem with non-negative costs and forms the basis for our work in Chapter 2. Recall that the single-source single-sink MinCostFlow problem is general enough to represent all the problems we mentioned. There are actually two goals in it: maximizing the amount of flow and minimizing the cost. The SSP algorithm pursues both goals simultaneously, iteratively increasing the flow value while maintaining that the flow has minimum cost for its current value. It starts with the zero flow x ≡ 0. By Theorem 1.4, if the flow x does not have maximum value, we can find a path P that can be added to it. If one chooses a shortest path P in the residual network, any path flow x′ along P is also a minimum cost flow for its value. Thus, if we augment x by the path flow x′, we obtain a flow with greater value and still with minimum cost. The basic version of this is shown as Algorithm 1.1.

The correctness depends on the initial zero flow x being a minimum cost flow for supply 0. This is true because there are no backward arcs in the residual network of the zero flow, and there are no negative cycles because all arc costs are non-negative. Then, in each iteration, the flow is increased along a shortest path in G_x without exceeding the supplies/demands or capacities. This maintains optimality for the current flow value, as described in Theorem 1.5. The following theorem summarizes the properties of the SSP algorithm that we will need:

Theorem 1.6 (Edmonds and Karp [25]). Let the input of the SSP algorithm consist of a digraph G = (V, A), capacities u : A → Z_{≥0}, costs c : A → Z_{≥0}, and a supersource s⁺⁺ and supersink s⁻⁻ with supply, respectively demand, d ∈ Z_{≥0}. Then the following hold true:
1. The shortest path distances from s⁺⁺ to any other vertex in the respective residual networks never decrease from one iteration to the next. In particular, the length of the augmenting paths never decreases from one iteration to the next.
2. There are at most d iterations, and the running time is dominated by the shortest path computations in the residual network.
3. The algorithm is correct.
Algorithm 1.1: The basic successive shortest path algorithm.
    input : Digraph G = (V, A), capacities u : A → Z_{≥0}, costs c : A → Z_{≥0}, supersource s⁺⁺ with supply d ∈ Z_{≥0}, supersink s⁻⁻ with demand −d
    output: Flow x with minimum cost among all maximum value flows obeying d
    1  x := 0                        // start with the empty flow
    2  while d > 0 do
    3      P := shortest s⁺⁺-s⁻⁻-path in G_x
    4      if P = ∅ then return x    // cannot send all supply
    5      γ := u_min(P)             // bottleneck capacity of P in G_x
    6      γ := min{γ, d}
    7      augment x by γ along P
    8      d := d − γ
    9  return x                      // supply satisfied

The running time of this algorithm, however, is only pseudo-polynomial: While the shortest path computation can be done efficiently, because there can never be any negative cycles in G_x if x has minimum cost, the algorithm only guarantees to augment a single flow unit per iteration given integral capacities. Instances are known that exhibit this pseudo-polynomial running time [91].

Further improvements to the SSP algorithm achieve a polynomial running time using the general concept of capacity scaling. Observe that an exponential running time can only occur in the SSP algorithm if the network contains paths with exponential capacities. By considering only arcs with large capacities, such paths can be found efficiently. Later iterations then also consider residual arcs with smaller and smaller capacities. The actual algorithms can be found in many textbooks [2, 62]. We do not delve any deeper into this because capacity scaling is unlikely to help with the instances we consider in Chapter 2, as described in our assumptions there and in the description of the instances for the computational results.

This leaves the shortest path computation as the core component of the SSP algorithm. This is an open invitation to adapt this part to the task at hand, and we will do so extensively in Chapter 2. Note that from this algorithm one can also see that the flow problems considered so far always have integral optimum solutions:

Theorem 1.7 (Dantzig and Fulkerson [15]). The MaxFlow and MinCostFlow problems with integral input data have an integral optimum solution.
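Combining the helpers sketched so far gives a runnable rendering of Algorithm 1.1. To keep the sketch short, it finds shortest residual paths by Bellman-Ford-style relaxation, which tolerates the negative backward arcs that appear after augmentations; a production version would use node potentials and Dijkstra instead.

    def ssp_min_cost_flow(G, u, c, s_plus, s_minus, d):
        """Basic successive shortest path scheme (Algorithm 1.1), as a sketch."""
        x = {a: 0.0 for a in G.arcs}
        while d > 0:
            res = residual_arcs(G, x, u)            # sketched earlier
            dist, pred = {s_plus: 0.0}, {}
            for _ in range(2 * len(G.arcs) + 2):    # relaxation rounds
                changed = False
                for (a, dirn) in res:
                    tail, head = G.arcs[a]
                    v, w = (tail, head) if dirn == '+' else (head, tail)
                    step = c[a] if dirn == '+' else -c[a]
                    if v in dist and dist[v] + step < dist.get(w, float('inf')):
                        dist[w], pred[w] = dist[v] + step, (a, dirn, v)
                        changed = True
                if not changed:
                    break
            if s_minus not in dist:
                return x                            # cannot send all supply
            path, w = [], s_minus                   # trace the path backwards
            while w != s_plus:
                a, dirn, v = pred[w]
                path.append((a, dirn))
                w = v
            gamma = min(min(res[e] for e in path), d)   # bottleneck capacity
            augment(x, path, gamma)                 # sketched earlier
            d -= gamma
        return x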
1.4.3 Multi-commodity Flows

Many applications distinguish multiple types of flow units, for example, various goods, or people with different origins and destinations. This often cannot be expressed as a single flow, not even with multiple sources and sinks, as there is no rule enforcing that flow units from a certain source head to a certain sink. The common solution is to consider multi-commodity flows. This simply splits the problem into multiple flows x_k, one for each commodity k in some set K. Typically, each commodity also has its own set of sources S⁺_k, sinks S⁻_k, and supplies/demands d_k. There may be commodity-dependent costs and arc capacities as well. There is essentially only one thing linking these flows: The sum of the flow functions has to obey certain arc capacities. Except for this condition, the problem could be decomposed into independent flow problems.

From an algorithmic point of view, multi-commodity flow problems are harder than static flow problems, but they can still be solved with linear programming in polynomial time, even in strongly polynomial time because of their special structure. They do exhibit properties different from single-commodity flows, though. For instance, the integrality guarantee of Theorem 1.7 does not hold in the setting with multiple commodities. The textbook by Schrijver [82] includes many examples where multi-commodity flows behave differently from single-commodity flows.

1.5 Flows Over Time

As we have seen, a static flow only describes which paths are taken. But in many applications it is necessary to examine more closely how flow moves along these paths: One needs to describe the flow at each moment and how it develops over time. This concept of a flow over time goes back to Ford and Fulkerson [29] (who used the term dynamic flows). Skutella gives a compact introduction to this field [86].

Given a graph G = (V, A), a flow over time function is a function f : A × R_{≥0} → R that denotes the flow rate entering each arc at each point in time. For notational simplicity, we always assume that the domain of the flow functions is extended to negative times with value 0. Capacities in the flow over time setting limit these flow rates on the arcs; given enough time, an arbitrary amount of flow can still pass through an arc. For our general model, we allow these capacities to change over time, that is, the arc capacities are of the form u : A × R_{≥0} → R_{≥0}. A flow over time function is feasible if 0 ≤ f(a, t) ≤ u(a, t) for all a ∈ A and all t ∈ R_{≥0}. For technical reasons, we also require that f(a, ·) and u(a, ·) are Lebesgue-measurable for every arc a ∈ A; we will soon see that we can usually restrict ourselves to simpler flow functions anyway.

The most important rules governing how a flow over time evolves stem from the transit time (or length) of each arc. These transit times could be time-dependent themselves, but here we assume that they are constant over time and capture them in a parameter τ : A → R_{≥0}. Their interpretation is that if flow enters the tail of arc a at time t at rate f(a, t), then this flow leaves the head of a at time t + τ(a) at the same rate f(a, t). Similar to a static flow, we can define the balance of a vertex v at each time t as the net flow rate out of v:

bal(v, t) := Σ_{a ∈ δ^out(v)} f(a, t) − Σ_{a ∈ δ^in(v)} f(a, t − τ(a)).

It is certainly possible to require that bal(v, t) = 0 for all times t and all non-terminal vertices v ∉ S⁺ ∪ S⁻. This is in concordance with static
flows, and we call this strong flow conservation. There is a viable alternative, though: Flow units might be allowed to wait at vertices. This is called weak flow conservation or holdover. To define it properly, we need the excess of a vertex, which describes how much flow has entered but not yet left the vertex v by time t:

excess(v, t) := −∫_0^t bal(v, θ) dθ.

With strong flow conservation, there can never be any excess at non-terminal vertices. In contrast, for weak flow conservation, we only require excess(v, t) ≥ 0 for all times t and all non-terminals v. That is, a vertex can always send out stored flow but must not send out flow when nothing is stored. The amount of flow that can be stored is infinite by default. (It is natural to put an upper capacity on the amount of storage at a vertex, but we will not use this enough to justify unique notation for it.)

We still need to define the behavior of sources and sinks. As in static flows, flow units originate at sources and vanish at sinks. If we require strong flow conservation for non-terminals, it is appropriate to demand bal(v, t) ≥ 0 for a source v and bal(v, t) ≤ 0 for a sink v, for all times t. In the case of weak flow conservation, a sink may store incoming flow for a while and then send it onwards, so that its balance might first be negative and then become positive. Thus, the balance of a sink can have either sign in the case of weak flow conservation, and the same holds for sources. So there is no immediate restriction on the balance of a terminal under weak flow conservation. For both alternatives, how much flow has already left a source, respectively reached a sink, is expressed by the excess of the terminal. As in static flows, there can be supplies/demands d : V → R, which must be non-negative for sources, non-positive for sinks, and 0 for non-terminals. A source v obeys its supply d(v) if excess(v, t) ≥ −d(v) for all times t, while a sink v with demand d(v) simply has to satisfy excess(v, t) ≥ 0 for all t, just like a non-terminal vertex under weak flow conservation.

When we evaluate the flow at some given time T ∈ R_{≥0}, we can then determine whether the supplies/demands have been satisfied: Is excess(v, T) = −d(v) for all sources and sinks v ∈ S⁺ ∪ S⁻? A problem arises when flow functions continue forever: Whatever we measure at time T is irrelevant shortly after, which is why we want the flow to end after finite time. We say that a flow function has time horizon T ∈ R_{≥0} if the network is empty for all t ≥ T. In particular, no flow may enter an arc at a time t ≥ T, that is, f(a, t) = 0 for all t ≥ T, and no flow may leave an arc at a time t ≥ T, which can be expressed as f(a, t) = 0 for all t ≥ T − τ(a). Additionally, no flow may be left in storage at a non-terminal vertex at the time horizon T, so excess(v, T) must be 0 for all v ∈ V \ (S⁺ ∪ S⁻).

With all this in place, we can define a flow over time: It is a feasible flow over time function that satisfies weak or strong flow conservation (depending on the context). A flow over time with time horizon T satisfies the supplies/demands d if excess(v, T) = −d(v) for all v ∈ V, in which case it is a transshipment over
time for this supply/demand function. It obeys the supply/demand function d if it satisfies some supplies/demands d′ with |d′(v)| ≤ |d(v)| without changing the role of any vertex. The value of a flow with time horizon T is val(f) := Σ_{v ∈ S⁻} excess(v, T).

Note that we assume strong flow conservation in Chapter 2, while the main results in Chapter 3 require weak flow conservation. The following problem definitions can be made independently of this distinction, though, by simply referring to flows over time.

One can also define costs for flow over time functions. As in static flows, there is a cost c : A → R attached to each arc, which has to be paid per unit of flow on that arc. (Some definitions also make c time-dependent.) The cost of a flow over time f is

cost(f) := Σ_{a ∈ A} ∫_0^∞ c(a) f(a, θ) dθ.

We could also attach costs to storage at vertices (the excess), but this leads to some difficulties concerning the sources and sinks and how they interact with excess. For example, a flow over time as defined here does not explicitly track what amount of the excess at a sink is considered to have entered the sink and what might be sent on to another sink at a later point. Such a cost will be added in Section 2.4, though, where waiting has the same cost no matter where it occurs. That section also works on the time-expanded network (see below), which has no such ambiguity.

As in the static case, there are some standard problems considered for flows over time. These extend the static problems by fixing or minimizing the time horizon.

Problem 1.8 (MaxFlowOverTime). Input: A digraph G, transit times τ, time-dependent arc capacities u, sources S⁺ and sinks S⁻ with a supply/demand function d, and a time horizon T. Question: What is the maximum value of a flow over time with time horizon T obeying d?

Problem 1.9 (MinCostFlowOverTime). Input: As for MaxFlowOverTime, plus arc costs c. Question: What is the minimum cost of such a maximum flow?

Problem 1.10 (QuickestTrans). Input: A digraph G, transit times τ, time-dependent arc capacities u, and sources S⁺ and sinks S⁻ with a supply/demand function d. Question: What is the minimum time horizon for a transshipment over time satisfying the given supplies/demands?

Problem 1.11 (MinCostTransOverTime). Input: As for MaxFlowOverTime, plus arc costs c and a cost bound C. Question: Is there a transshipment over time with time horizon T satisfying d at cost at most C?
Note that flows over time also have a natural multi-commodity analogue. Like static multi-commodity flows, multi-commodity flows over time are multiple flow problems linked by an overall capacity constraint on each arc.

We could proceed to define paths over time, the augmentation of flows over time, and all the other analogues of static flow concepts. However, this is not really necessary, as we can reduce flows over time to static flows for almost all of our purposes by a construction called the time-expanded network, which was already used by Ford and Fulkerson [29]. The idea is two-fold: First, for our purposes we can replace a flow over time function by one with discrete time steps. Second, a copy of the original network is created for each time step, and the arcs transport flow from one layer to the next according to the transit times. These copies are called time layers, and an arc a with transit time τ(a) transports flow from the time layer representing t to the one representing t + τ(a). We will define this in more detail in the next sections.

With time-expanded networks, almost all flow over time problems can be reduced to pseudo-polynomially sized linear programs and often to static flow problems. This is probably the most common approach in practice as well. Fleischer and Skutella [27] give an FPTAS for MinCostTransOverTime that reduces the size of the time-expanded network. Problems that can be solved without the time-expanded network include the single-source single-sink MaxFlowOverTime problem, which can be reduced to a static MinCostFlow instance. Multiple sources or sinks change the problem (unlike in the static case), but there is a highly non-trivial polynomial-time algorithm by Hoppe and Tardos [49, 50]. Its running time is not practical, though. The problem variants that include costs are (weakly) NP-complete even on series-parallel graphs [60], so time-expansion is a reasonable approach.

1.5.1 Δ-constant Flows

To formally define the time-expanded network, we start with the discretization step. We assume that we are given a desired step size Δ ∈ R_{>0} for the discretization that is suitable for the input data (as detailed below). We quickly fix some notation for rounding: For r ∈ R, we use ⌈r⌉_Δ to denote r rounded up to the next multiple of Δ. Analogously, ⌊r⌋_Δ rounds down to multiples of Δ. If Δ is omitted, we assume Δ = 1. A function g : R → R is called Δ-constant if g restricted to [iΔ, (i + 1)Δ) is constant for all i ∈ Z. A flow over time function f is Δ-constant if each f(a, ·) is. Thus, we need only T/Δ values per arc to describe a Δ-constant flow function with time horizon T. We call Δ admissible if the transit times τ(·) and the time horizon T (if given) are integral multiples of Δ; the time-dependent arc capacities u(a, ·) have to be Δ-constant functions as well. When dealing with Δ-constant flows for an admissible Δ, we can replace integration by summation in all expressions so far. The essential equality is

∫_{iΔ}^{jΔ} f(v, θ) dθ = Δ · Σ_{k=i}^{j−1} f(v, kΔ) for all i, j ∈ Z.
General intervals from t₁ to t₂ can be linearly interpolated:

∫_{t₁}^{t₂} f(v, θ) dθ = (⌈t₁⌉_Δ − t₁) f(v, t₁) + ∫_{⌈t₁⌉_Δ}^{⌊t₂⌋_Δ} f(v, θ) dθ + (t₂ − ⌊t₂⌋_Δ) f(v, t₂),

with the remaining integral to be replaced by the sum above.

As a side note, this setting of Δ-constant flows is essentially equivalent to discrete flows over time. However, in the interpretation of a discrete flow, the flow units travel in packets or impulses that take 0 time to enter or leave an arc (at what would be an infinite rate) but may only do so at integral time steps. This introduces a few technical and notational differences compared to flows over time as we have introduced them, whereas Δ-constant flows do not need their own notation while still enabling the same algorithmic ideas as discrete flows.

Let us now show that restricting ourselves from general flows over time to Δ-constant ones does not affect the solution quality of our problems if Δ is admissible. Let f : A × R_{≥0} → R_{≥0} be a flow over time function. Then we can define a flow f_Δ that sends the average of what f sends in each time step:

f_Δ(a, t) := (1/Δ) ∫_{⌊t⌋_Δ}^{⌊t⌋_Δ + Δ} f(a, θ) dθ for all t ∈ R_{≥0}, a ∈ A.

This is clearly a flow over time function again. More importantly, it is feasible for the same capacities as f:

f_Δ(a, t) = (1/Δ) ∫_{⌊t⌋_Δ}^{⌊t⌋_Δ + Δ} f(a, θ) dθ ≤ (1/Δ) ∫_{⌊t⌋_Δ}^{⌊t⌋_Δ + Δ} u(a, θ) dθ =* (1/Δ) ∫_{⌊t⌋_Δ}^{⌊t⌋_Δ + Δ} u(a, ⌊t⌋_Δ) dθ = u(a, ⌊t⌋_Δ) =* u(a, t).

The equalities marked with an asterisk follow directly from u(a, ·) being constant within the interval [⌊t⌋_Δ, ⌊t⌋_Δ + Δ), which is required if Δ is admissible.
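Numerically, the averaging step looks as follows. This is an illustrative sketch (the function name and the sampling-based integration are our own, assuming f is given as a callable that accepts numpy arrays), not code from the thesis.

    import numpy as np

    def average_to_delta_constant(f, a, T, delta, samples_per_step=1000):
        """Average a flow rate function into a Delta-constant one, one value
        per interval [i*delta, (i+1)*delta). Sampling stands in for the
        integral; `f` is assumed to be a callable f(a, ts) on numpy arrays."""
        steps = int(round(T / delta))
        rates = np.empty(steps)
        for i in range(steps):
            ts = np.linspace(i * delta, (i + 1) * delta, samples_per_step,
                             endpoint=False)
            rates[i] = f(a, ts).mean()   # ~ (1/delta) * integral over the step
        return rates                     # rates[i] holds f_Delta on step i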
Note that the behavior of the balance functions of f may be quite erratic in the case of weak flow conservation, changing sign very quickly. The balances induced by f_Δ can still exhibit such sign changes, but they are at least Δ-constant. Due to the averaging, there need not be any relation between bal_f(v, t) and bal_{f_Δ}(v, t) for single values of t. However, the excess functions of the vertices remain unchanged at each time t = iΔ: that is, excess_{f_Δ}(v, iΔ) = excess_f(v, iΔ) for all i ∈ Z. Since general integrals for f_Δ can be linearly interpolated, the new flow f_Δ satisfies the (linear) conditions on the excess of the vertices at time t if it does so at times ⌊t⌋_Δ and ⌈t⌉_Δ, which is equivalent to asking this of f at these two points. Therefore, weak or strong flow conservation also carries over from f to f_Δ. Note that if the original flow f has time horizon T and Δ is admissible, then the averaged flow f_Δ has time horizon T as well. By the same arguments, f_Δ satisfies (or obeys) the exact same supply/demand function at time T that f satisfies (or obeys) at time T. If a cost function is given, the cost is also unaffected by the averaging. In summary, assuming Δ is admissible, any flow over time f implies a Δ-constant flow f_Δ which is feasible if f is and which inherits the flow conservation and the time horizon. The flow f_Δ also satisfies the same supplies/demands as f, at the same cost.

Note that if Δ is admissible and the flow is Δ-constant, the entire time axis can be rescaled such that Δ = 1. Besides the obvious changes to the transit times, from τ(a) to τ(a)/Δ, and to the time horizon, this only affects the capacities: The flow rates, being amounts of flow per time unit, change by a factor of Δ. Thus, the arc capacity that applies during the old interval [t, t + Δ) is scaled to Δ·u(a, t) per rescaled time unit. Alternatively, one could scale the supply/demand function and cost bounds instead of the flow rates.

1.5.2 The Time-Expanded Network

With Δ-constant flows, one can reduce arbitrarily detailed flows over time to simpler constructs that can be expressed by linear (in-)equalities that only have to hold at each iΔ. Thus, they can be expressed and solved as a linear program (for finite time horizons). This by itself can be useful and has complexity implications. However, a more elegant and efficient method to deal with flows over time is the classic idea of the time-expanded network, which reduces many flow over time problems to static flows.

Consider a graph G = (V, A) with transit times τ : A → R_{≥0}, time-dependent arc capacities u : A × R_{≥0} → R_{≥0}, and a time horizon T. Choose an admissible Δ > 0. We can then define the time-expanded network of G with time horizon T and discretization Δ, which we write as G^{T/Δ} = (V^{T/Δ}, A^{T/Δ}), or simply as G^T = (V^T, A^T) if Δ = 1. The "T/Δ" is purely notational, not a division. We construct the time-expanded network such that there is a correspondence between Δ-constant flows over time with time horizon T on G and static flows on G^{T/Δ}. This correspondence is a bijection for strong flow conservation. With only weak flow conservation, the flow on the time-expanded network is not uniquely determined with respect to the balances of the terminals. We will only give the construction for Δ = 1 in the following, but it can easily be scaled to other values of Δ.

The vertex set V^T of the time-expanded network is obtained as follows: For each t ∈ [0, T) ∩ Z, there is a copy of v ∈ V named v^t. Together, the vertices v^t, v ∈ V, form the time layer V^t at time t. For each source s⁺ ∈ S⁺ there is a vertex s⁺⁺, together forming the set S⁺⁺. Similarly, for each sink s⁻ ∈ S⁻ there is a vertex s⁻⁻ in the set S⁻⁻. The union of all the V^t, together with S⁺⁺ and S⁻⁻, forms V^T.
The arc set A^T consists of multiple groups of arcs as well: Each original arc a ∈ A gives rise to a copy a^t for each t ∈ [0, T) ∩ Z such that t + τ(a) ∈ [0, T). If a = (v, w), then the arc a^t points from v^t to w^{t+τ(a)}. The arcs starting in V^t form the set A^t. The principal idea is that the static flow on the arc a^t = (v^t, w^{t+τ(a)}) represents the flow over time entering the arc a at v at time t, which then arrives at w at time t + τ(a).

We need an additional treatment of the terminals. For each source s⁺ ∈ S⁺ and each t ∈ [0, T) ∩ Z, there is an arc a^t_{s⁺} pointing from the corresponding s⁺⁺ to the copy of s⁺ at time t, (s⁺)^t. Similarly, for each sink s⁻ ∈ S⁻ and the same values of t, there is an arc a^t_{s⁻} from (s⁻)^t to s⁻⁻. The arcs from S⁺⁺ form A⁺⁺, while the arcs to S⁻⁻ form A⁻⁻. The static flow on these arcs represents the amount of supply released or demand satisfied at each time step. Figure 1.2 shows an example of a time-expanded network.

The (static) flow conservation at a vertex v^t corresponds to strong flow conservation in the flow over time. To model weak flow conservation, we must introduce holdover arcs that carry the excess of a vertex v at time t over to time t + 1. For this, the set of holdover arcs A^T_h contains an arc a^t_v for all vertices v ∈ V and all t ∈ [0, T − 1) ∩ Z, which points from v^t to v^{t+1}. Then A^T consists precisely of the union of the A^t for all t ∈ [0, T) ∩ Z together with A⁺⁺ and A⁻⁻, and optionally A^T_h if weak flow conservation is allowed.

Since the arcs correspond to certain properties of the flow over time, they should reflect the capacities and other bounds on the flow over time. We define the capacity function u^T : A^T → R_{≥0} of the time-expanded network as follows: For an arc a^t, the capacity is u(a, t), and therefore we set u^T(a^t) := u(a, t). Because they are equal, we will often write the simpler u(a, t) instead of u^T(a^t). The arcs connecting the new sources S⁺⁺ and sinks S⁻⁻ to the time-expanded vertices have infinite capacity, although this could be limited to the supply/demand of the respective terminal. We also assume that holdover, if it is allowed, is not bounded; thus, all of A^T_h also has infinite capacity. (Instead of infinite capacity one could use, e.g., the total capacity of all capacitated arcs in the time-expanded network.)

The supplies and demands of the time-expanded network are shifted from S⁺ to S⁺⁺ and from S⁻ to S⁻⁻, respectively. Their values do not have to be changed, so d^T(s⁺⁺) := d(s⁺) for all s⁺ ∈ S⁺, and similarly for the sinks. These new vertices are the only terminals in the time-expanded network. Note that we could simplify the construction if we assume (infinite) holdover: In this case, (s⁺)^0 could be the source and (s⁻)^{T−1} the sink for s⁺ and s⁻, respectively, and the holdover arcs would provide the functionality of the arcs in A⁺⁺ and A⁻⁻. This is, however, not suitable for the case without holdover and has no advantage for our purposes.

Finally, we can relate the costs of the flow over time to the time-expanded network: The cost of an arc a^t is directly inherited from the arc a, so c(a^t) := c(a). All other arcs have cost 0, in line with the standard flow model. But since we now explicitly account for the flow entering or leaving the terminals, as well as for the excess of a vertex, it is easy to attach a cost to these arcs in the time-expanded network if needed.
[Figure 1.2: a small graph G with a source s^+ (supply d(s^+)), an intermediate vertex v, arcs with τ = 2 and τ = 1, and a sink s^- (demand d(s^-)), next to its time-expanded network for T = 4 and Δ = 1 with time layers for t ∈ [0, 1) up to t ∈ [3, 4), the vertices s^{++} and s^{--}, the arc sets A^{++} and A^{--}, and the capacities u(a, 0), u(a, 1) and u(b, 0), u(b, 1), u(b, 2) of the copies of the arcs a and b.]

Figure 1.2: (a) A graph G and the time-expanded network G^T with the labels of the vertices, the new supply/demand function and the time layers. For our purposes it is consistent to consider the new sources S^{++} as part of the earliest time layer because the supply is already available at time 0. The new sinks S^{--}, however, do not fit into any single time layer because we think of flow units as disappearing as soon as they arrive there. (b) The capacities of the arcs in the time-expanded network. The vertical arcs are the holdover arcs A^T_h, which are only included if weak flow conservation is allowed. The arcs A^{++}, A^{--}, and A^T_h have no direct correspondence in a flow over time but must be deduced.
The following lemma summarizes the correctness of the entire construction:

Lemma. Consider a flow over time instance on a graph G with some time horizon T ∈ Z_{≥0}, and construct the time-expanded network G^T for it.

1. Then f is a flow over time on G if and only if x with x(a_t) := f(a, t) can be extended to a static flow on G^T. That is, if and only if suitable non-negative flow values can be found for the holdover arcs and the arcs out of the sources and into the sinks.
2. If and only if f has strong flow conservation, one can choose x such that no flow is on the holdover arcs A^T_h.
3. The cost of these flows is the same, and they satisfy the same supply/demand function under the discussed mapping between the terminals.

Proof. The principal construction should be clear. However, there is some freedom in the static flow values on the holdover arcs of terminals. This can be resolved by greedily using as much of the supply/demand of the terminals as early as possible. Since all of these arcs have zero cost attached, this is as good as any other scheme.

So the time-expanded network is a reduction of flows over time to static flows. Thus, we can use the terminology, theorems and algorithms for static flows to discuss flows over time. For example, many flow over time problems can be solved using linear programming techniques. However, static flow problems are special linear programming problems and, thus, more efficient algorithms can be used on the time-expanded network.

We can also use path decompositions for flows over time if they are represented on a time-expanded network. From Theorem 1.3 we then obtain a set of paths and cycles on the time-expanded network with associated flow values. If we consider a single path flow x_P with value d_P on a time-expanded path from this decomposition, then it has a direct interpretation as a flow over time: It is a flow that sends d_P flow units, uniformly distributed over one time step, along the path. (Flow into and out of terminals, as well as flow on holdover arcs, is not explicitly visible but only affects the balance functions.)

There is a straightforward generalization of such path flows in the time-expanded network: We can use an arbitrary function for the flow rate entering the path, possibly for longer or shorter than a single time step. If this function is Δ-constant again, the resulting flow over time is also expressible in the same time-expanded network, but this is not necessary. To define such a path flow over time properly, let us start with a path over time. Let W = (a_1, ..., a_k) be a walk in the original network G, and associate a vector of starting times h ∈ R^k_{≥0} with it. We use h_i to designate when the walk enters a_i. This walk is compatible with the transit times τ if h_i + τ(a_i) ≤ h_{i+1}, that is, the walk does not continue into the next arc before it arrives. While arcs may occur multiple times in a walk, P := (W, h) is still considered a path over time if all the pairs (tail(a_i), h_i) as well as the final (head(a_k), h_k + τ(a_k)) are distinct. These pairs correspond to the vertices that the walk visits in the time-expanded network.
Let d_P : R → R_{≥0} be the function of flow into the path, extended to all of R by 0. Then this defines a flow over time f_P as follows:

    f_P(a, t) := Σ_{i : a = a_i} d_P(t - h_i).

Assuming enough capacity on the arcs, this forms a flow over time. If the starting times even satisfy the equality h_i + τ(a_i) = h_{i+1} for all i < k, then this flow satisfies strong flow conservation. It is also easy to derive such path flows over time from the path decomposition of the static flow on the time-expanded network by a simple projection to the original arc set. The missing starting times can be read off from the time layer at which the static path started using each arc. The corresponding flow function into the path is d_P(t) = b_P · χ_[0,1)(t) for some constant b_P. (Here, χ denotes the characteristic function.)
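For illustration (our own snippet, not part of the text), the defining sum can be evaluated literally; the example sends b_P = 2 units at rate 2 during [0, 1) into a single-arc walk:

    def make_path_flow(walk, h, tau, d_P):
        """Flow over time f_P induced by a path over time P = (walk, h).

        walk: arcs a_1, ..., a_k (arcs may repeat); h[i]: time at which
        the walk enters walk[i].  Compatibility requires
        h[i] + tau[walk[i]] <= h[i+1].
        """
        assert all(h[i] + tau[walk[i]] <= h[i + 1] for i in range(len(walk) - 1))
        def f_P(a, t):
            # sum over all occurrences i of a in the walk of d_P(t - h_i)
            return sum(d_P(t - h[i]) for i, b in enumerate(walk) if b == a)
        return f_P

    tau = {('v', 'w'): 3}
    d_P = lambda t: 2.0 if 0 <= t < 1 else 0.0        # d_P = b_P * chi_[0,1)
    f = make_path_flow([('v', 'w')], [0.0], tau, d_P)
    assert f(('v', 'w'), 0.5) == 2.0 and f(('v', 'w'), 1.5) == 0.0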
Chapter 2

Earliest Arrival Flows and Their Computation

Some problems can only be modeled if their temporal aspects are considered. Without time, the theoretical solution would not be applicable in the real world. But all modeled aspects are potentially part of the objective function, and minimizing the time taken is a natural goal. Indeed, time is the only cost in the mathematical setting of this chapter, where we study flows over time with minimum travel time (MinTravelTime).

While mathematical methods are universal, we base our discussion in the following on the overall theme of evacuating a city or a building. This allows us to point out advantages and disadvantages with respect to this specific application. The study of evacuations within the research grant Adaptive Verkehrssteuerung (Adaptive Traffic Control) [1] by the German Federal Ministry of Education and Research also gave rise to this work in the first place.

We will mostly consider the case of the MinTravelTime problem with a single supersink, which means that the solutions are earliest arrival flows (EAF): A flow over time is an earliest arrival flow if it simultaneously maximizes the number of flow units that have reached the sink at each time step. Such a flow exists if there is a single sink. Earliest arrival flows are of particular interest for evacuation modeling because they simultaneously minimize the average travel time as well as the egress time. See Section 2.4 for more details on earliest arrival flows.

2.1 Background

2.1.1 Modeling and Simulation

We start out with an informal discussion of the model that we consider. We need a representation of the area to be evacuated and of the areas that are considered safe. This is usually given as geometric data in the form of a street map or the floor plan of a building. To use this information we need to extract a graph from it.

This graph will depend on the scale of the problem and the desired resolution: In a
macroscopic view, each edge of the graph can represent a street and vertices are intersections. Associated with each arc are the length of the street and a measure of its flow rate capacity, which usually correlates with the width of the street. This is the kind of information we have about the city of Padang, Indonesia, which also happens to be one of the largest instances we consider. See Section 2.6.2 for details on this instance, including a map of the area (Figure 2.15) and the network (Figure 2.16).

If we model a building instead, each vertex of the graph could represent a room or a hallway, and arcs simply represent the doors between them. This might be suitable for simple layouts, but a single vertex per room or hallway is in general too coarse, considering that rooms can contain large obstacles, from furniture to decoration, and hallways bend around corners, etc. Instead, in a microscopic model only small unproblematic areas are represented by a vertex, maybe a handful of square meters each at most. The arcs then join vertices which share a common border and have a length equal to the distance between the centers of these areas. Capacities are derived from the size of the shared border. This is the approach taken in the pedestrian simulator ZET [21, 92], from which we also obtain instances. The example in Figure 2.1 contains much less detail so that it remains readable.

It might seem odd to model quite small areas in a building, but treat intersections as unimportant in a city. This is mainly due to the way the distances are set up. In a typical street map, the streets define the routes and the lengths, while the exact way through an intersection can mostly be neglected. In a building, though, the size of the intersections (the rooms) is quite large compared to the size of the straight segments which do not require decisions, say the distance between two doors in a hallway. This is only a rule of thumb, though. For example, large open urban areas (plazas) should be treated like rooms that need to be filled with vertices, each representing only a part of the area.

Another key ingredient is the initial position of the population (at least of those in endangered areas) and a designation of which areas are considered safe. In some settings the size of the safe area is essentially infinite: people leaving a building are likely to find a spot outside on the street, just like people leaving a neighborhood can probably fit into the surrounding neighborhoods. However, people seeking shelter in a few selected buildings may find them overcrowded, so these safe areas should be modeled with a limited capacity.

Given this input, the natural goal of an evacuation is to minimize the time it takes for everyone to reach safety. There are a few conflicting objectives here, though. For example, there could be a trade-off between minimizing the average travel time and the egress time (the time until the area is fully evacuated). A well-known result, which we will discuss in Section 2.4, states that minimizing the average travel time also minimizes the egress time if all safe areas have sufficient capacities. In any case, the desired result is an optimal flow over time that represents the movement of the people.

Note that network flows are not the end-all approach to evacuation planning. They certainly are not a simulation of human behavior. They help to
evaluate the potential of the most orderly evacuation, though, and can guide decision making. See Section 2.7 for interpretations of the optimum solution. For a more detailed discussion of this and other models, we refer to a survey by Hamacher and Tjandra [47].

[Figure 2.1: a drawing of a fictitious building (a) and a graph model of it (b).]

Figure 2.1: (a) A fictitious building with occupants. In case of an evacuation, they should assemble outside on the gray area. (b) A graph with medium detail for this building. The vertices represent several square meters each and furniture is not considered. The sources can represent one or multiple evacuees, in which case the number is denoted next to the sources. The sink is located at the top. The arcs are all bidirected. The transit times can be derived from the Euclidean distances. The capacities are typically large for arcs connecting vertices in the same room or outside, and small for arcs leading through doors.

2.1.2 Connection to MATSim

While it is already useful to determine an earliest arrival flow, this was only one step in the Adaptive Verkehrssteuerung project. The computed movement should then be tested and refined in the simulator MATSim [68], which stands for Multi-Agent Transport Simulation Toolkit. MATSim operates on the same data as mentioned above. Each individual is called an agent, and each agent has a plan, that is, a path to some destination. In the case of an evacuation, the destination is always safety (a supersink), which can be reached through any safe area. The foundations of this approach were laid out by Gawron [38], and then refined by Simon et al. [85]. Lämmel et al. [63] detail the specific application and adaptation of MATSim to evacuation scenarios, with more pointers to the literature as well. We only give a brief outline of MATSim's operation in the following.
MATSim performs many iterations, each consisting of two phases, aiming for an approximate equilibrium in the end. In the first phase of each iteration, called network loading, every agent simply executes its plan by stubbornly following its path. This leads to conflicts, which have to be resolved by MATSim. This happens according to the deterministic queuing model: When an agent enters an arc at the tail, it reaches the head of the arc after the arc's transit time has passed. But instead of being allowed to immediately leave the arc at the head, the agent is placed in a queue. This queue works according to the first-in first-out principle and resembles a small traffic jam on the arc. Agents leave the queue according to the capacity of the arc, but they may enter at an arbitrary rate. This explains why a queue can build up in the first place.

There is also the special case that the queue of an arc is saturated. Each arc has a queuing capacity which is proportional to the entire surface area of the street that the arc represents. A full queue means that the traffic jam fills the entire street. When this happens, no more agents may even enter the arc and thus have to wait on the arcs they are coming from, which blocks the queues there. This is called spillback.

Note that the underlying principle of the network loading in MATSim has many similarities to flows over time. Flow units take a constant time to traverse an arc and there are capacities limiting the flow rate out of the arcs. One can also imagine such queues controlling the entry into an arc, which would then just mean that flow units wait in vertices. That would be holdover. But a difference lies in the FIFO condition within the queue. A flow over time can send out any flow unit waiting in the storage at a vertex, and they are all the same. The agents have a predefined order in which they leave the queue, and they differ because they have individual plans. (They generally have the same destination in our setting, though.) Another difference is spillback, which is not explicitly modeled in network flows. But this is unlikely to pose a major problem when the agents agree on a general direction. They just have to wait at different places than without spillback.

After network loading comes the second phase, called replanning. Each agent evaluates the performance of its plan, in our case the time at which safety was reached. Some randomly selected agents, typically 10 percent in MATSim, then decide on new plans based on the observed traffic situation in the network loading phase. The others keep their plans. This finishes a single iteration, and the next iteration starts with the network loading again. But this time, the agents that were selected in replanning will take different routes to avoid the previously encountered traffic jams. This also alleviates the traffic situation. Intuitively, this resembles the behavior of commuters who know which routes to take and which to avoid based on many days of experience.

From a theoretical point of view, such a system leads to an equilibrium state under the assumption that in each iteration only an ε-fraction of all agents is allowed to replan. In game theory, this would be the classical idea of players choosing a best response to the strategies of the other players. However, MATSim is designed for large-scale problems where each iteration is costly. Thus, a higher percentage of agents is selected for replanning to reduce the number of iterations required. This and some other compromises have to be made in practice.
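The queue dynamics of the network loading are easy to sketch in code. The following toy step function is our own illustration of the described rules (MATSim's actual implementation differs, and for simplicity all agents here continue onto the same next arc); it advances one arc's queue by one time step:

    from collections import deque

    def step_arc(queue, t, cap_out, tau_next, next_queue, next_storage):
        """Advance one arc's FIFO queue by a single time step t.

        queue: deque of (agent, ready_time); an agent entered the arc
        tau time units before its ready_time and may only leave once
        t >= ready_time.  cap_out: at most this many agents leave per
        step.  next_queue/next_storage: queue and queuing capacity of
        the next arc; a full next queue blocks departures (spillback).
        """
        left = 0
        while queue and left < cap_out:
            agent, ready = queue[0]
            if t < ready:
                break                  # head of the queue is still in transit
            if len(next_queue) >= next_storage:
                break                  # spillback: the next street is jammed
            queue.popleft()
            next_queue.append((agent, t + tau_next))
            left += 1
        return left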
Keep in mind that MATSim is an independent project. However, knowing that the output is meant for MATSim had consequences for the research direction of this chapter. First of all, it is clear that we will need a path for each agent, that is, for each flow unit. From a complexity point of view, this is already atypical because we are looking for an answer whose size is pseudo-polynomial in the flow value. Also, MATSim has to reroute each agent when it is selected for replanning. Thus, it performs somewhat expensive operations for each agent as well. In line with this, our algorithm also routes each agent individually, but using the mathematically sound framework of the successive shortest path (SSP) algorithm, which this chapter focuses on. We will see how our solutions affect MATSim in Section 2.7.

2.2 Related Work

Earliest arrival flows (under the name universal maximal flows) were first studied by Gale [34], shortly after the initial work of Ford and Fulkerson on flows over time [29]. Gale proves the existence of earliest arrival flows in the time-expanded network, even when transit times or capacities are time-dependent, as long as there is a single sink (see Section 2.4). Almost all results on earliest arrival flows assume a single sink to guarantee existence. Algorithms for computing a single-source single-sink EAF were independently developed by Minieka [71] and Wilkinson [90] more than a decade later. These use the successive shortest path (SSP) algorithm on the static network and record the found paths, which are then repeated over time to construct the earliest arrival flow. But these algorithms are not polynomial because the number of necessary paths can be exponential, as shown by Zadeh [91].

Jarvis and Ratliff [51] proved what is known as the triple-optimization result (see Theorem 2.3): In networks with a single sink, earliest arrival flows are exactly those flows that minimize the overall travel time, which in turn is equivalent to minimizing the average travel time per flow unit. This makes computing an earliest arrival flow much easier in practice because it can be reduced to a MinCostFlow problem on the time-expanded network. This is also the direction we will follow.

The modern resource for the theory of earliest arrival flows with a single source and sink is Hoppe's PhD thesis [49]. Besides an excellent introduction and summary of existing results, it provides an approximation scheme for this case as well as a polynomial algorithm to compute the flow value on a single arc at a specific point in time. His work also includes a major contribution to the field of flows over time: Hoppe and Tardos [50] deal with the QuickestTrans problem of minimizing the time horizon for given supplies/demands with arbitrary terminals, and prove that it can be solved in strongly polynomial time. However, the algorithm depends (among other things) on the minimization of submodular functions. Known algorithms for this problem have an impractical running time, as discussed in a survey on submodular function minimization by McCormick [69].
One special case for which an EAF can be computed in polynomial time are graphs where all transit times are zero. This greatly simplifies the structure of the solution. Namely, the solution is a sequence of temporally repeated maximum (static) flows. While this was first observed by Hajek and Ogier [45], various improvements exist. See Fleischer [26] for a version that can also incorporate piece-wise constant time-dependent capacities, as long as not too many changes occur.

Another special case are series-parallel graphs. Ruzika et al. [81] show that an earliest arrival flow on a series-parallel graph with a single source and a single sink can be found in polynomial time. For this, they make use of an existing result that the SSP algorithm on series-parallel graphs does not need to consider residual arcs and therefore runs in polynomial time [6].

The earliest arrival pattern is the function that maps a point in time to the number of flow units that have already reached the sink by that time. Baumann and Skutella [5] consider the problem of computing the earliest arrival pattern for multiple sources and a single sink. Because the desired output is the earliest arrival pattern as a piece-wise linear function, which may require an exponential description, a polynomial algorithm in the typical sense cannot exist. However, they give an algorithm that produces the arrival pattern in time polynomial in the input and the number of pieces of the arrival pattern. Their approach requires submodular function minimization as well. Once the earliest arrival pattern is known, they also show how to reduce computing an EAF to a QuickestTrans problem, for which they can employ the algorithm by Hoppe and Tardos [50] again.

Fleischer and Skutella [27] give an FPTAS for the MinCostTransOverTime problem with a slightly relaxed time horizon. Their approach uses a time-expanded network but balances the discretization step size against the accuracy of the solution to achieve a polynomial-sized network. As we will discuss in Section 2.5.1, our instances do not have exponentially large time horizons, though, and therefore do not benefit from this theory. (Chapter 3 expands this approximation scheme, though.)

Tjandra [89] and Tjandra and Hamacher [46] compute earliest arrival flows by adapting the SSP algorithm to the time-expanded network with travel times as the cost function. Their general algorithm is very similar to ours, in that they observe the same simplification of the shortest path computation to a reachability problem on the time-expanded network (see Section 2.5.3). They also give a pseudo-polynomial bound on the running time of their algorithm, which depends on the time horizon because they use a time-expanded network. But the similarities do not last long. A fundamental difference between their approach and ours is that they explicitly assume the arc parameters to change at every time step, while our algorithm is built around the assumption of repetition in the time-expanded network and in the solution. This allows us to perform various steps more efficiently. Additionally, neither the description of their algorithm nor their implementation lodyfa [66] makes use of the many improvements that we detail in Section 2.5.5. We think that these improvements are crucial to making the algorithm perform well, as shown in Section 2.6.6, where we also compare our algorithm to lodyfa.
Because any algorithm for the MinCostFlow problem can be used on the time-expanded network to compute an earliest arrival flow, we compare our algorithm against existing MinCostFlow algorithms. The main competitors that we settled on are the network simplex and a scaling push-relabel algorithm. The network simplex is an adaptation of the simplex algorithm to MinCostFlow instances; see, e.g., Ahuja et al. [2] for a description. The scaling push-relabel algorithm iteratively approximates the desired cost function, and in each iteration uses a push-relabel (also called preflow-push) algorithm to reestablish a feasible flow which is optimal for the current cost function. See Goldberg [39] for the details and also for heuristic improvements. See Section 2.6.4 for the technical details of the implementations we chose.

2.3 Our Contribution

We have already introduced some typical models for evacuations and the MinTravelTime problem. In the following, we discuss the ramifications of the MinTravelTime problem and the properties of earliest arrival flows. Our main focus then lies on developing a fast algorithm to compute earliest arrival flows. The central point is to exploit the repeating structure of the time-expanded network, which previous algorithms do not explicitly consider. We discuss and evaluate various improvements which ultimately lead to our algorithm performing best on the largest real-world instances that we have. In particular, our algorithm outperforms the state-of-the-art competitor on highly detailed models of buildings by a factor of up to 20. It is therefore ideal for evacuation planning tools like ZET [92] that focus on such instances. The performance improvements can be used to evaluate multiple scenarios for the same instance, for example, with added hazards or different populations. We point out technical considerations as well, so that future implementations can more easily benefit from the same performance level. Possible additional features are also discussed, so that the algorithm can be applied to a larger range of models.

Finally, we show how earliest arrival flows can benefit simulation-based approaches like MATSim. We also reduce the rather detailed network flow to an exit assignment. Displayed like this, the solution can be interpreted as a first draft for an evacuation plan and quickly lets us identify areas where, even in the optimum, bottlenecks limit the evacuation.

2.4 Preliminaries

So the goal is to compute a flow that minimizes the total travel time, and we will then see that this also minimizes the egress time if the problem has a single supersink or unlimited sinks. Recall that a flow over time f can have an associated arc cost c and a total cost(f), which is given by

    cost(f) = Σ_{a ∈ A} ∫_0^T c(a) f(a, t) dt.

Setting the cost equal to the transit time, i.e., c := τ, at first glance captures the transit times as desired. However, this only counts flow in transit, not flow waiting at intermediate vertices or sources. Thus, this kind of cost is more accurately described as total
time moving, or as total travel distance (assuming that distance and travel time are proportional). Travel time, however, should include time spent waiting, in particular the time spent waiting due to congestion in the network.

We want to capture this total travel time in the time-expanded network (for Δ = 1). For this, we put a cost on every arc that corresponds to how much this arc displaces flow in time. For normal arcs, this is the transit time. Holdover arcs, if they are allowed, take flow from one time layer to the next and should therefore have cost 1. The arcs A^{++} from the sources S^{++} to the normal time-expanded vertices have a cost equal to the time layer at which they end. The supply is already present at time 0 and can wait in the source, but this is waiting time just like holdover. Once a flow unit reaches a time-expanded copy of a sink, it is immediately thought of as having arrived. In the time-expanded network the flow still has to continue through the arcs A^{--} to the sinks S^{--}, but these should not produce any more travel time, so they have cost 0.

For most of this chapter we only consider strong flow conservation, and defer the discussion of adding holdover to Section 2.8.2. Holdover can always be added in the form of loops with transit time 1, though. With this, we can finalize the definition of the MinTravelTime problem we want to consider in this chapter:

Problem 2.1 (MinTravelTime).
Input: A digraph G = (V, A), time-dependent 1-constant capacities u : A × R_{≥0} → Z_{≥0}, transit times τ : A → Z_{≥0}. Sources S^+, sinks S^-, and a matching supply/demand function d : V → Z. A time horizon T ∈ Z_{>0}. The cost function is constructed as above for the time-expanded network G^T.
Question: What is the minimum cost of a maximum flow in G^T (without holdover arcs)?

Note that this is close to, but not the same as, asking for a minimum cost maximum 1-constant flow over time with these parameters. Besides the fact that the domain of the cost function does not include arcs that represent waiting times, which could be redefined, the flow units in a 1-constant flow over time leave the sources on average half a time unit later than the cost function in the time-expanded network suggests. This is because a time layer represents an entire interval [t, t + 1), but the cost function only charges according to the lower bound t. One could easily adjust for this discrepancy in post-processing without affecting optimality, though. We work with the given definition of the cost function on the time-expanded network throughout this chapter because this is the one commonly used by other authors and also more suitable for algorithms.

The MinTravelTime problem can also be expressed with another, very convenient cost function. The alternative is to simply put all costs on the arcs A^{--} leading to the sinks S^{--}, as if one were waiting at the finish line with a stop watch. The artificial arcs leading to the new sink at time t will then have cost t. (Again, the same half-time-unit offset as above would apply.) Any other arc has cost 0. The two cost functions are also shown in Figure 2.2.
[Figure 2.2: two drawings of a small time-expanded network with arcs a and b, τ(a) = 2 and τ(b) = 1; the labels show the costs assigned to the arc copies, holdover arcs, and sink arcs under the two variants.]

Figure 2.2: (a) The cost function for MinTravelTime that puts the cost on the arcs that incur the travel time. (b) The cost function that puts the cost on the arcs A^{--} to the sinks. Note that the holdover arcs will not be present in most of what follows.
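Both cost variants are easy to state on top of the earlier expand() sketch. Again this is our own illustration using the same hypothetical vertex encoding; variant (a) charges each arc its displacement in time, variant (b) charges only the arcs into s^{--}:

    def travel_time_costs(cap, on_sink_arcs=False):
        """Arc costs of G^T for MinTravelTime.

        cap: the arc -> capacity dict produced by expand().  Variant (a):
        a copy a_t costs tau(a), a holdover arc costs 1, an arc out of
        s^{++} into layer t costs t, and arcs into s^{--} cost 0.
        Variant (b): only the arc ((s, t), ('snk', s)) costs t.
        """
        cost = {}
        for (x, y) in cap:
            into_sink = y[0] == 'snk'
            if on_sink_arcs:                  # variant (b)
                cost[(x, y)] = x[1] if into_sink else 0
            elif x[0] == 'src':               # arc of A^{++}, ends in layer y[1]
                cost[(x, y)] = y[1]
            elif into_sink:                   # arc of A^{--}
                cost[(x, y)] = 0
            else:                             # copy a_t or holdover arc
                cost[(x, y)] = y[1] - x[1]    # displacement in time
        return cost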
Earliest Arrival Flows

Earliest arrival flows are a central concept in the mathematical literature dealing with evacuations. They settle the perceived conflict between minimizing the total (or average) travel time and the time horizon by showing that there is no conflict if there is only a single sink. (This single sink could also be a supersink for multiple unlimited sinks.)

For a flow x on the time-expanded network and t ∈ Z_{≥0}, let arrival(t) be the amount of flow that reaches any sink s^{--} ∈ S^{--} through an arc starting at a time layer 0, ..., t. So arrival(T - 1) is the total amount of flow reaching the sinks within the time horizon. The function t ↦ arrival(t) is called the arrival pattern of this flow. We use arrival′(t) := arrival(t) - arrival(t - 1) (with arrival(-1) := 0) to measure the flow arriving exactly from time layer t in the time-expanded network. When computing the total travel time, we note that cost(x) = Σ_{t ∈ Z_{≥0}} t · arrival′(t).

One can imagine what an ideal arrival pattern would be like: At each point in time, arrival(t) is the maximum over all feasible flows obeying the given supplies. Because flow units arriving later can be omitted, this is equivalent to requiring that arrival(t) is the maximum flow value for a flow over time with time horizon t + 1. In particular, this implies that the first and the last flow unit reach the sink as early as possible. A flow having this ideal arrival pattern is called an earliest arrival flow.

Definition 2.2 (Earliest Arrival Flow (EAF)). A flow on the time-expanded network G^T is called an earliest arrival flow if for all t ∈ {0, 1, ..., T - 1} its arrival pattern arrival(t) equals the maximum flow value in G^{t+1}.

Let us show the existence of an earliest arrival flow using the SSP algorithm. The proof also shows how the SSP algorithm works when we use it to compute an EAF in the following sections.

Theorem 2.3 (Gale [34], Jarvis and Ratliff [51]). Let G = (V, A) be a digraph with time-dependent 1-constant capacities u : A × R_{≥0} → Z_{≥0}, transit times τ : A → Z_{≥0}, time horizon T ∈ Z_{>0}, and supplies/demands d : V → Z with only a single sink. Then a flow is an earliest arrival flow if and only if it is an optimum solution to the MinTravelTime problem.

Proof. Consider the time-expanded network G^T. The time-expanded network also introduces a single sink, s^{--}. (There is no need for a supersink.) We use the equivalent cost function that places the cost only on the arcs A^{--} leading directly to the new sink in the time-expanded network. Recall that for a flow x with maximum value, cost(x) = Σ_t t · arrival′(t). We first show that an earliest arrival flow minimizes this cost function and is therefore an optimum solution to the MinTravelTime problem; this direction works without any flow theory. Let val be the maximum flow value. We can rewrite the cost function by charging the cost of each flow unit arriving at time t to the t earlier time steps {0, ..., t - 1}. Then
    cost(x) = Σ_{t ≥ 0} t · arrival′(t)
            = Σ_{t ≥ 0} Σ_{θ = 0}^{t - 1} arrival′(t)
            = Σ_{θ ≥ 0} Σ_{t ≥ θ + 1} arrival′(t)
            = Σ_{θ ≥ 0} (val - arrival(θ)).

But an earliest arrival flow minimizes each term in the last sum independently, by definition. Thus, it has minimum cost.

For the other direction, we need to show that arrival(t) is the maximum flow value for the respective time horizon t + 1. Let us run the SSP algorithm on G^T and record the paths P_1, ..., P_k it produces in its iterations, together with the amounts of flow augmented, d_1, ..., d_k. By Theorem 1.6, the costs of the paths increase monotonically. The SSP algorithm yields a flow x with minimum total travel time, solving MinTravelTime.

Now consider some t ∈ {0, 1, ..., T - 1} for which we want to show the maximality of the arrival pattern of x. When we want to compute the maximum flow value for time horizon t + 1, we could use the Ford-Fulkerson algorithm on the time-expanded network G^{t+1}. Let P_{k′} be the last path before the first path that uses an arc (or its residual) not contained in the smaller time-expanded network. (If there is no such path, the SSP algorithm already shows the maximality of the flow.) Then the paths P_1, ..., P_{k′} with the corresponding augmentation values are valid choices for the Ford-Fulkerson algorithm. However, the path P_{k′+1} cannot arrive within time horizon t + 1 because it is the first path that uses an arc in G^T that is not in G^{t+1}. Thus, it would require a residual arc to return to the earlier parts of G^T. The flow established so far has time horizon t + 1, though, so there is no residual arc beyond time layer [t, t + 1). So the cost of P_{k′+1} is at least t + 1, meaning that there is no further path in the residual network G^{t+1}_x according to the SSP algorithm. Thus, the Ford-Fulkerson algorithm could terminate after P_{k′}, and the flow found has maximum value for that time horizon.

So we have constructed some flow x that minimizes the total travel time and is an earliest arrival flow. Because the cost of a flow depends only on its arrival pattern, and the earliest arrival pattern is the unique cost-minimizing pattern, any other optimum flow must also have the earliest arrival pattern.
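The rewriting in the proof is just an exchange of the order of summation. A quick numerical check of the identity on an arbitrary arrival pattern (our own illustration):

    # arrival(t) for t = 0, ..., T-1 of some flow with val = 9 and T = 5
    arrival = [0, 2, 5, 5, 9]
    val = arrival[-1]
    deriv = [a - b for a, b in zip(arrival, [0] + arrival[:-1])]  # arrival'(t)
    lhs = sum(t * d for t, d in enumerate(deriv))                 # sum t * arrival'(t)
    rhs = sum(val - a for a in arrival)                           # sum (val - arrival(theta))
    assert lhs == rhs == 24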
While most parameters in the theorem above can be chosen quite freely, the crucial condition for the existence of an earliest arrival flow is that there is only a single sink. Even the small example in Figure 2.3 does not admit an earliest arrival flow. The same instance shows that minimizing the total travel time in the case of limited sinks does not guarantee that the time horizon is minimal. From this it should be clear that instances with a single sink have special properties.

[Figure 2.3: a small network with unit capacities, transit times τ on the arcs, and supplies/demands d on the vertices; the two integral solutions achieve total travel time 4 with time horizon 3, and total travel time 3 with time horizon 4, respectively.]

Figure 2.3: (a) An example with multiple limited sinks that does not admit an earliest arrival flow. The solution to the MinTravelTime problem does not have the minimum time horizon either. Note that such travel times can arise from Euclidean distances. (b) The two integral solutions for this example. The earliest arrival pattern would consist of one flow unit arriving immediately and one after two time steps. But this combination is impossible.

It is worth noting that single-source single-sink instances admit an even more structured solution. In this case, the SSP algorithm can be applied to the static network with the transit time as the cost function. The paths found in the residual networks are recorded with their bottlenecks, and then flow is sent through them repeatedly, this time considering the temporal dimension. That is, if a path has length τ(P), then flow at a rate of the recorded bottleneck is sent into this path during the interval [0, T - τ(P)). By construction, the last flow units arrive just within the time horizon T. The sum of all these path flows over time is then an earliest arrival flow [71, 90].

2.5 Interval-Based Successive Shortest Paths

The purpose of the following sections is to specialize the existing SSP algorithm to compute earliest arrival flows in less time and with less memory than standard approaches, in particular for fine discretizations and large time horizons. The main ingredient is to store the required data not in an array with an entry for each time step, but as intervals of constant values in a suitable data structure.
This enables a different algorithmic approach which performs the shortest path search directly on these intervals. Efficient data structures and the addition of various heuristic improvements lead to an acceptable algorithm. Which refinements we tried is the topic of Section 2.5.5, while the computational results in Section 2.6 summarize their effectiveness.

2.5.1 Assumptions

Our algorithm builds upon several assumptions and has a few restrictions as well. We want to make these clear in this section.

Most importantly, we restrict ourselves to the basic setting of single-commodity network flows over time as introduced in Section 1.5. We further restrict ourselves to instances with a single supersink or, equivalently, multiple unlimited sinks. However, Section 2.8.6 shows how to use many of the ideas of our algorithm in a setting with limited sinks.

There are a few quantitative assumptions as well. These try to capture what a typical instance of an evacuation model is like, and the algorithm is tailored towards them, but they are not actual restrictions on the input. While there can only be a single supersink, we assume that there are multiple sources, as these represent the starting positions of the evacuees.

We also assume that the arcs have constant transit times and capacities over time, with the important exception that each arc only exists during a (possibly infinite) interval. We call an arc available during this interval. To put this into our framework of flows over time, an arc a with constant capacity u that only exists during the window W(a) = [t_1, t_2), with t_2 ∈ R_{≥0} ∪ {+∞}, has a capacity u(a, t) = u for t ∈ W(a) and 0 elsewhere. Consequently, we can model arcs where the transit times or capacities change over time, but each change requires an additional arc in the input. We assume that there are few of those changes. For example, every arc might have a time after which it becomes slower to traverse due to spreading hazards. Note that the data structures can be changed to handle more changes per arc more efficiently. This is touched upon in Section 2.8.3.

We argue that realistic instances contain mostly sparse graphs, that is, |A| ∈ O(|V|). This is intuitively true for road maps, where intersections rarely join more than 4 streets. Floor plans of buildings, no matter whether they are represented as a vertex per room or as a fine grid, are planar graphs with only few exceptions to link different levels of the building. Planar graphs in general have an average degree of at most 6 (in the undirected sense and without parallel arcs). Such instances are sparse as well.

Furthermore, the total supply, Σ_{v : d(v) > 0} d(v), should not be more than O(|V|^2) in most instances. Note that we use the O-notation loosely here to put the supply in perspective with the input size. This assumption deals with a suitable scale for the problem, which is also the topic of Section 2.6.1. For now, it suffices to realize that (interesting) instances probably have at least a thousand vertices. The entire population of the earth is then at most 7|V|^3, which fits well into O(|V|^3).
But no global-scale model with only a thousand vertices could claim to describe every single person in detail. Considering evacuees in groups of thousands would probably be just as precise and actually a more useful amount of output. This results in a total supply that one can describe as O(|V|^2). A similar observation holds for a city. The finer the model, the fewer people live there per vertex. (A city of millions likely has thousands of intersections.) But the rougher the model, the less need there is to model individuals; the geography in such a rougher model is abstracted as well, so the movement of a single person cannot be tracked precisely anyway. Note that instances describing buildings are even likely to describe only a few thousand evacuees, a linear term in the graph size.

The assumption of a reasonably small total supply is crucial for the SSP algorithm because its worst-case running time contains the total supply as a factor. But given these numbers, it is hard to imagine a real-world instance where a truly exponential number of evacuees starts on a vertex. A side effect of the small supply is that the time horizon is typically not exponential either. Given that severe bottlenecks occur in most evacuation scenarios, once flow units start to reach the sink, at least some flow will arrive at every following time step as well. This roughly bounds the time horizon by the distance from some source to the sink plus the total supply. This is favorable for existing algorithms that depend on the full time-expanded network.

2.5.2 Storing Intervals

There are multiple types of time-dependent data that we need to store for our purposes, for example, flow rates and vertex labels in the time-expanded network. Let us consider some 1-constant data D : R_{≥0} → W that either ends or is constant after some time horizon T. In general, we can group such data into piecewise constant parts [t_1, t_2) ↦ w. Since the intervals partition R_{≥0}, we just need to store the lower bound of each interval and the associated value and can then reconstruct the entire function.

From an algorithmic point of view, the uncompressed data would usually be stored in an array, which allows for O(1) read/write access. This speed cannot be matched when the function is stored in a condensed form, and the performance must decrease as the number of stored intervals increases. In this chapter, we will use |D| to denote the number of intervals used to store the function D, which is trivially bounded by |D| ≤ T.

There are several well-known data structures that can be used to provide O(log |D|) access to a data collection containing key-value pairs (t, w); in particular, search trees can do so. Our most common operation, however, is to find the interval [t_1, t_2) containing a given t. Search trees can easily be modified to find the element with the largest key t_1 ≤ t. Insertions and deletions in the tree can also be done in logarithmic time. These happen when a change to the function splits an interval into up to 3 resulting parts, or joins up to 3 consecutive intervals into one. An important aspect for us is also the ability to iterate over the intervals in their natural order. Each step to the next interval can be done in amortized constant time in search trees.
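Such a structure can be prototyped on top of any sorted-map primitive. The following sketch is our own; it uses a plain sorted list in place of a balanced search tree, so updates cost O(|D|) rather than O(log |D|), which keeps the code short while showing the interface:

    import bisect

    class IntervalMap:
        """Piecewise constant function on R_{>=0}, stored as breakpoints.

        keys[i] is the lower bound of the interval [keys[i], keys[i+1]),
        vals[i] its value; the last interval extends to infinity.
        """
        def __init__(self, default):
            self.keys, self.vals = [0], [default]

        def at(self, t):
            """Value at t: the entry with the largest key <= t."""
            return self.vals[bisect.bisect_right(self.keys, t) - 1]

        def set(self, t1, t2, value):
            """Assign value on [t1, t2), splitting and merging intervals."""
            assert t1 < t2
            tail = self.at(t2)                 # value that resumes at t2
            i = bisect.bisect_left(self.keys, t1)
            j = bisect.bisect_left(self.keys, t2)
            self.keys[i:j] = [t1, t2]
            self.vals[i:j] = [value, tail]
            k = 1                              # drop redundant breakpoints
            while k < len(self.keys):
                if self.vals[k] == self.vals[k - 1]:
                    del self.keys[k]; del self.vals[k]
                else:
                    k += 1

For the flow on an arc a, at(t) plays the role of f(a, t), set() changes the flow on a whole interval of copies a_t at once, and |D| is len(self.keys).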
Intuitively, storing the flow over time function as a piecewise constant function for each arc trades computation time for memory: Most likely, the flow will not change at every time step, so there are significantly fewer than T intervals to store. But we pay for this with logarithmic access times. This by itself would probably not be a good trade-off in most situations. But storing data in intervals enables many of the following ideas.

2.5.3 Shortest Paths in the Time-Expanded Network

Given the special relationship between the time-expanded network and the travel time cost function on it, we can simplify the shortest path computation. For notational reasons, let us consider the natural cost function which places costs on each arc except the arcs A^{--} leading to the sinks. Then the distance from the supersource s^∗ to a time-expanded vertex v_t is always t in the time-expanded network, if v_t is reachable at all. This is the direct effect of the cost being equal to the travel time. However, this may not be true in the residual network, even for optimum flows (without negative cycles) as they occur in the SSP algorithm, if multiple limited sinks exist. In the original network, the distance of a sink in S^{--} cannot affect the distance of any v_t because there are no arcs from the sinks back into the network. In the residual network, however, such arcs may exist. But these arcs do not need to be considered during the SSP algorithm if there is a single sink. The following lemma simply excludes sinks altogether:

Lemma 2.4. Consider the time-expanded network G^T with cost equal to the travel time on all arcs. For a flow x on the time-expanded network, consider a walk W in the residual network G^T_x from the supersource s^∗ to a vertex v that does not use a vertex in S^{--}. If v ∈ S^{++}, then cost(W) = 0. If v ∈ V_t for some t, then cost(W) = t.

Proof. This can be proven by induction along the arcs of the walk. The residual arcs have to be considered as well. In particular, not all sources in S^{++} may be directly reachable from s^∗, but any backward arc of an arc in A^{++} goes from (s^+)_t to s^{++} with cost -t.

A more detailed explanation of why the residual arcs pointing away from the sinks S^{--} are never used in the SSP algorithm with unlimited demands is in order:

Lemma 2.5. Suppose x is a minimum cost flow on the time-expanded network for the travel time cost function. Suppose the shortest s^∗-s^=-path in G^T_x contains a sink s^{--} but not the arc a = (s^{--}, s^=). Then either the path can be shortened using arc a without increasing the cost, or x(a) = u^T(a), meaning that the demand of the sink has been satisfied.

Proof. Let P be this path. Suppose that the arc a is not saturated. Since it has cost 0, any s^∗-s^{--}-path gives rise to an s^∗-s^=-path of the same cost by appending a. If P has at least this cost, then P can be shortened. In the
other case, P has a strictly lower cost than the shortest s^∗-s^{--}-path, so its tail from s^{--} to s^= must have negative cost. This tail must use a residual arc from s^{--} back to some (s^-)_t, so there already is flow coming into s^{--}, which can only continue along a towards s^=. Thus, u^T_x(←a) > 0. This closes a negative cycle from s^{--} along P to s^= and then back to s^{--} in the residual network, which contradicts the optimality of x.

If we assume that all sinks have unlimited demands, the algorithm will never have to consider a path that leaves s^{--} again. Thus, Lemma 2.4 applies to all considered paths and the shortest path computation becomes a mere reachability problem: A vertex either has the distance given by the lemma or is not reachable at all from s^∗ without passing through a sink. One still needs to keep track of the distances of the sinks, but this is the same as tracking the earliest time t at which (s^-)_t is reachable. Of course, the same effect can be seen if one puts the costs on the arcs to the sinks: The paths admitted in the lemma use only arcs with cost 0 in that case.

2.5.4 Propagating Intervals

The most expensive operation in the SSP algorithm is the shortest path computation, which also happens very often. We have already seen the properties of the special cost function and how this turns the problem into a reachability problem. We can also store data in intervals; in particular, the flow on the time-expanded arcs is stored in intervals. We now try to improve the speed of this search by putting these pieces together.

Forward Search

For what we call the forward search, we assume that the search starts at the supersource and propagates reachability in the time-expanded network, for example, using a breadth first search. Our algorithm maintains for each vertex v in the time-expanded network a distance label label(v) ∈ Z_{≥0} ∪ {∞}, which represents an upper bound on its distance dist(v) from s^∗ in G^T_x. We store these labels in intervals for all time-expanded copies of the same vertex. Note that to represent this as a piecewise constant value we cannot store the distances t, t + 1, t + 2, ... themselves, as they change with every time step; instead, we store a boolean flag indicating whether the vertex is reachable, which suffices by Lemma 2.4 except for the sinks S^{--} and the supersink s^=. For those labels, we do store integer values. This is easy because these vertices have no time-expanded copies.

The vertices from which we want to propagate make up a task for the search. In our setting, a task is either an interval of time-expanded vertices v_{t_1} to v_{t_2}, which we write v@[t_1, t_2), or one of the special vertices v ∈ {s^∗, s^=} ∪ S^{++} ∪ S^{--}. A task always signifies some vertices that have been marked as reachable and are therefore eligible for propagation to other vertices. There never occurs a task for s^= in the forward search, though.

For example, consider the task v@[t_1, t_2). For brevity, let I = [t_1, t_2). The task implies that v_t has been reached for all t ∈ I ∩ Z. If there is an arc a = (v, w) in G with transit time τ(a), then reachability can be propagated along a_t = (v_t, w_{t+τ(a)}) for all t ∈ I ∩ Z, subject to the residual capacity of this arc.
We only need to determine the intervals describing all t for which u_x(a, t) > 0. (Recall that we avoid the cumbersome notation u^T_x(a_t) for regular arcs.) The data structure storing x(a_t), or equivalently f(a, t), should support this operation. We have to consider the availability window W(a) as well. Say the capacity is positive for some intervals I^u_1, ..., I^u_k. Then the intersection of I with ∪_i I^u_i yields the times at which the arc can be entered. Shifting all these intervals up by τ(a) results in the times at which w is reachable due to v being reachable at I. The distance label of w can then be set to reachable wherever it was not before. Similarly, one can propagate reachability across the backward arcs ←a_t by subtracting the travel time from the reachable interval.

The other kinds of arcs in the time-expanded network behave slightly differently because they do not propagate an interval to an interval, but rather between a single distance label and an interval, e.g., between a source s^{++} and the time-expanded vertices {(s^+)_t}_t. All these cases are listed in Table 2.1, and the major ones are also illustrated in Figures 2.4, 2.5, and 2.6.

    class             task      destination    arc type   capacity             comment
    -----------------------------------------------------------------------------------------------
    from supersource  s^∗       s^{++}         forwards   unused supply        backward arcs never used
    from/to sources   s^{++}    s^+@[0, ∞)     forwards   ∞
                      s^+@t     s^{++}         backwards  flow out of s^{++}   undoes flow out of the
                                                                               source; breadcrumb not
                                                                               unique
    regular arcs      v@t       w@(t + τ(a))   forwards   u_x(a, t)            with a = (v, w)
                      w@t       v@(t - τ(a))   backwards  x(a, t - τ(a))       with a = (v, w)
    to sinks          s^-@t     s^{--}         forwards   ∞                    label(s^{--}) decreases to
                                                                               t; backwards never used
                                                                               with unlimited sinks
    to supersink      s^{--}    s^=            forwards   ∞                    label(s^=) decreases to
                                                                               label(s^{--}); backwards
                                                                               never used

Table 2.1: Propagation of reachability in the forward search for the various classes of arcs in G^T_x. The regular arcs and the steps into a not yet reached (and therefore empty) source work on an entire interval at once. However, stepping into a source requires only a single time at which flow can be undone.

Note that tasks for time-expanded vertices often involve multiple cases, like propagating along various arcs and back into an empty source.
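The interval propagation along one regular arc can be written directly against the IntervalMap sketch from Section 2.5.2, reusing its bisect import (again our own code; the residual capacities u_x(a, ·), with the availability window already folded in, and the reachability flags of the head vertex are both interval maps). The check against already-reached parts is deliberately crude here; a full version would also split the task on the reached map:

    def propagate_forward(task, tau_a, residual, reached, T):
        """Propagate a task v@[t1, t2) over the copies a_t of a = (v, w).

        residual: IntervalMap of u_x(a, .); reached: IntervalMap of
        boolean reachability flags of w.  Returns newly reached
        intervals of w as new tasks.
        """
        t1, t2 = task
        new_tasks = []
        t = t1
        while t < min(t2, T - tau_a):
            # maximal piece of constant residual capacity starting at t
            i = bisect.bisect_right(residual.keys, t)
            nxt = residual.keys[i] if i < len(residual.keys) else T - tau_a
            seg_end = min(nxt, t2, T - tau_a)
            if residual.at(t) > 0 and not reached.at(t + tau_a):
                reached.set(t + tau_a, seg_end + tau_a, True)   # shift by tau(a)
                new_tasks.append((t + tau_a, seg_end + tau_a))
            t = seg_end
        return new_tasks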
[Figure 2.4: two diagrams of interval propagation between vertices v and w joined by an arc with transit time τ(a); the task interval [t_1, t_2) at v is shifted to [t_1, t_2) + τ(a) at w in (a), and the task [t_1, t_2) at w is shifted to [t_1, t_2) - τ(a) at v in (b).]

Figure 2.4: How the forward search propagates along (a) time-expanded arcs and (b) their backward arcs, using the respective residual capacities.

[Figure 2.5: (a) propagation from s^{++} onto the copies of s^+ at [0, ∞); (b) propagation from a copy of s^+ back into s^{++} at a single time.]

Figure 2.5: (a) The forward search propagates out of a source to an unlimited interval. (b) Undoing flow leaving a source only matters if that source has run out of supply. In this case we arbitrarily choose to reverse the earliest flow leaving the source.
[Figure 2.6: a task (s^-)@[t_1, t_2) updating the label of the sink s^{--}.]

Figure 2.6: The forward search updates a sink s^{--} with the earliest time t_1 in the task (s^-)@[t_1, t_2).

Propagating intervals like this can still result in a regular breadth first search. Assuming that intervals are processed according to a first-in first-out queue, all time-expanded vertices in the same interval were reached from the same interval, and thus have the same distance in the number of arcs to the supersource, which characterizes a breadth first search.

Note that to obey a time horizon one only needs to limit the upper end of all intervals. The given time horizon can also be very large, essentially infinite. We generally rely on the intervals to quickly fill the time-expanded network past the current arrival time, or on finding a feasible path, to bound the length of the shortest path. (See Section 2.5.5, Ending the Search Early or Late, on this.) This has worked well so far, but one can easily craft instances that lead to arbitrarily long paths when arcs are unavailable for a long time, see Figure 2.7. The same idea shows that infinite loops on infeasible instances could occur, were it not for the time horizon.

A better approach would be to use a smaller time horizon for each shortest path search that is still large enough not to impair the search. For example, if all arcs are available throughout time, one can use the most recent arrival time plus the length of a shortest path between a not yet depleted source and a sink in the empty network as a time horizon. Theoretically, if T is the optimum time horizon, this gives a bound of 2T on the number of layers in the time-expanded network that ever need to be considered. For real-world instances, adding a constant to the current arrival time should be just as good. If arcs can become unavailable, similar bounds can be derived from the time when the last arc becomes unavailable and the sum of the arc lengths. Such a bound does not relate to T, though.

We want to add more information to the vertex labels: Each label can have an additional boolean flag scanned that describes whether it has been propagated yet. This is not necessary in a breadth first search, as the unscanned labels are exactly those stored in the queue, but we deviate from this concept later on. We also store a reference to where the label was reached from,
which we call a breadcrumb. Typically, this is the arc a that was used to reach the vertex in the search; the exact time-expanded copy used can be inferred. Breadcrumbs are only used to facilitate backtracking once the sink has been reached. Note that the previous vertex is not uniquely determined in all cases. For example, undoing flow out of a source propagates an entire interval of equivalent candidates onto a single vertex, compare Figure 2.5b. During backtracking, the predecessor could then be freely chosen from all possible vertices in the interval. We choose the earliest candidate.

[Figure 2.7: a ladder-like network whose first arc is only available during W = [0, 1) and whose last arc only during W = [T - 1, T).]

Figure 2.7: The first and last arc are mostly unavailable. This trivially causes arbitrarily long paths like the yellow one. Note that we omitted the many task intervals in the ladder. If the final arc never became available again, this could potentially lead to an infinite loop if the time horizon is unlimited. Allowing holdover would help a lot in this case, though. See Figure 2.12 for a similar example that does not rely on the limited availability of arcs and where holdover gives no benefit either.

The basic steps of the forward search are summarized in Algorithm 2.1, which is essentially a generic search with a lot of case distinctions to handle all kinds of arcs in the time-expanded network. However, it serves as a basis for later when we add improvements. Note that the propagation step along arcs (and also along backward arcs) in line 12 consists of two operations, as discussed above: the intersection of the task interval with the intervals describing the residual capacity of the arc, and then again intersecting this with the not-yet-reached intervals of the target vertex. This is where most of the work happens. Everything else contributes little to the running time of the algorithm.

Let us point out the running time of the propagation along a single arc of the original network in the following lemma. This depends on how many intervals are needed to describe the flow on that arc, how the reachability of its tail vertex is discovered in the search, and similarly for the head vertex. This information is only available after the path search.
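Backtracking along breadcrumbs is a plain walk backwards through the time-expanded network. The sketch below is our own; for simplicity it keys the hypothetical breadcrumb map per time-expanded copy, whereas an implementation would store breadcrumbs per label interval:

    def backtrack(breadcrumbs, tau, node):
        """Walk breadcrumbs back from a reached copy node = (v, t).

        breadcrumbs[(v, t)] = (a, forward): if forward, a = (u, v) and
        the copy of a entered at time t - tau(a) was used; otherwise
        a = (v, w) and the backward arc of a_t was used.  Stops when no
        breadcrumb is stored, i.e. at a source copy.
        """
        path = []
        while node in breadcrumbs:
            (u, w), forward = breadcrumbs[node]
            v, t = node
            node = (u, t - tau[(u, w)]) if forward else (w, t + tau[(u, w)])
            path.append(((u, w), forward))
        path.reverse()
        return path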
Algorithm 2.1: The basic forward search from the supersource.
    input : Time-expanded network G^T and flow x.
    output: A shortest path from s^∗ to s^= if it exists, or ∅.
 1  Initialize task queue Q := {s^∗}
 2  label(s^∗) := 0, all others := ∞
 3  while Q ≠ ∅ do
 4      get task from Q
 5      switch kind of task do
 6          case task is s^∗
 7              label(s^{++}) := 0 for all s^{++} ∈ S^{++} with supply left
 8          case task is s^{++}
 9              propagate to s^+@[0, ∞)
10          case task is v@[t_1, t_2)
11              forall the a = (v, w) ∈ A do
12                  propagate to w@[t_1, t_2) + τ(a), obeying u_x(a, ·)
13              forall the a = (w, v) ∈ A do
14                  propagate to w@[t_1, t_2) - τ(a), obeying u_x(←a, ·)
15              if v = s^+ ∈ S^+ then
16                  propagate to s^{++} if flow leaves s^{++} in [t_1, t_2)
17              if v = s^- ∈ S^- then
18                  propagate to s^{--} with distance t_1
19          case task is s^{--}
20              propagate to s^= with distance label(s^{--})
21      Q := Q ∪ {newly relabeled intervals}
22  if label(s^=) = ∞ then return ∅
23  P := backtrack path from s^=
24  return P

Lemma 2.6. Consider a = (v, w) ∈ A. Let K_a := |f(a, ·)|. Assume that there are K_v tasks for v during the run of the path search, and K_w for w, respectively. Then the algorithm for propagating the intervals along the time-expanded copies of a can be designed to have an amortized running time of O(K_v + K_w + K_a + K_v(log(K_a) + log(K_w))). (The same holds for ←a.)

Proof. We can break the work into the different aspects of the propagation. First of all, the data structure for the vertex labels of v does not need to be accessed to propagate [t_1, t_2) from v. The interval itself is already known. (If we still want to access the label, the task could include a reference to the vertex label so that it can be accessed in O(1) without needing to search for it in the data structure.) There certainly is some constant effort to start the propagation of the interval, which sums up to K_v over all tasks. Then we need to determine for which subset of [t_1, t_2) the arc has residual
This requires accessing the flow at time t_1 and searching from there until t_2. The access requires O(log K_a), so O(K_v log K_a) over all tasks. The search until t_2 can be done in amortized linear time because the tasks are for disjoint intervals, as each vertex label is only propagated once. So this takes O(K_a) in total. A similar argument shows that we need O(K_w + K_v log K_w) to access and scan through the label of the head vertex, and potentially create new intervals and set reachability flags during the scanning.

We do not give an overall worst-case running time for the search, though. While it is common knowledge that a breadth-first search runs in a linear number of propagation steps, note that here this means linear in the size of the time-expanded network. A good way to think of the breadth-first search is that it needs exponentially many steps in the length of the path to be found, as it guarantees to check all shorter paths first, and we have seen that paths can be arbitrarily long. The considerations for the bound on the time horizon and thus the time-expanded network still hold, though. We consider alternative search strategies in Section 2.5.5 under Guided Search. Let us note that on many instances the breadth-first search nevertheless performs reasonably well.

Reverse Search

In the forward search, the search always starts at the supersource. As a variant, we can instead start searching from the supersink s^= using a reverse search. The goal is to reach the supersource from the supersink with a shortest path which uses all arcs in their opposite directions. If there is an arc (v, w) and w is reachable, then the reverse propagation marks v as reachable, and the path will later use (v, w).

Thinking strictly in terms of reachability, both the forward and the reverse search can find a lot of undesired paths between the supersource and the supersink, namely those that travel through too-late time layers. The forward search corrects this by updating labels on the sinks regarding the shortest path found so far. The reverse search could manage integral distance labels as well, but it would have to keep track of such labels on every vertex in the time-expanded network except the sinks. This is undesirable compared to the much simpler forward search. The desired meaning of reachability in the reverse search is that there is a path to the supersink that arrives at the correct time. The conundrum is that we need to know the correct shortest path distance, dist(s^=), before the search starts. To circumvent this problem, we work with a guessed value of the shortest path distance, which we denote by t^? and call the estimated arrival time. Recall from Theorem 1.6 the property of the SSP algorithm that the shortest path distance never decreases from one iteration to the next. Thus, we assume that t^? is no less than this lower bound on the arrival time. By default, we even assume that the shortest path distance has not changed since the last iteration. Consequently, whenever the distance has increased, the reverse search will not return a path because there is none that arrives at t^?.
In that case, one could increase t^? or even run a combination of a binary or geometric search to determine the correct value of t^?. One can also fall back to the forward search to determine the correct arrival time, as this requires only a single additional search, which either finds the correct arrival time or proves that the flow has maximum value. Also, the paths returned by the forward search can be used just as well. We decide what will happen with a parameter: after a certain number of consecutive increments of t^?, the forward search will be used if still no path could be found. This parameter can also be set to trigger immediately after the first failed reverse search.

The decisive step to use the estimated arrival time t^? is when a vertex s^-- ∈ S^-- propagates back to a time-expanded copy of its sink, (s^-)_t. Intuitively, only (s^-)_t for t = t^? needs to be marked reachable. This is equivalent to the almost trivial interval [t^?, t^? + 1). If any earlier copy of s^- contributed to a path, this path would arrive at the sink earlier than the arrival time from the most recent iteration, which is not possible. However, this proved troublesome because of a certain fill-in effect: suppose there is a short cycle, say of length 1, consisting of only forward arcs. Since the propagation started out with an interval of length 1, the propagation would traverse this cycle in reverse for all time steps, one time step at a time, until it reaches time 0. Not only is this rather inefficient with the way the intervals are stored, it is also unnecessary. If we start out by marking the sinks reachable in the entire interval [0, t^? + 1), such a cycle would run into already labeled vertices after the first round. Therefore, we define a vertex as reachable for the reverse search when there is a path from this vertex to the supersink that arrives no later than t^?.

The actual propagation is then rather similar to the forward search. The general guideline is that the reverse search treats an arc a like the forward search would treat the reversed arc ←a, and vice versa, except for the capacities. We refer to Table 2.2 and Figures 2.8, 2.9, and 2.10 for details on the various propagation steps. The cases are assembled in Algorithm 2.2. Note that we write label′ to denote the labels in the reverse search.
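To make the control flow around t^? concrete, the following Java sketch isolates the fallback logic described above. All class, method, and parameter names are ours, chosen for illustration only; the actual implementation differs.

import java.util.List;
import java.util.function.IntFunction;
import java.util.function.Supplier;

/** Control loop deciding between reverse searches with a guessed arrival
 *  time t? and a fallback forward search. P is the path type. */
final class ArrivalTimeControl<P> {
    private final int maxFailures; // consecutive failed reverse searches before fallback
    private int failures = 0;
    private int tStar;             // estimated arrival time t?; never decreases (Theorem 1.6)

    ArrivalTimeControl(int initialArrival, int maxFailures) {
        this.tStar = initialArrival;
        this.maxFailures = maxFailures;
    }

    /** Tries the reverse search, incrementing t? on failure; after maxFailures
     *  consecutive misses the forward search determines the true arrival time. */
    List<P> nextPaths(IntFunction<List<P>> reverseSearch, Supplier<List<P>> forwardSearch) {
        while (failures < maxFailures) {
            List<P> paths = reverseSearch.apply(tStar);
            if (!paths.isEmpty()) { failures = 0; return paths; }
            tStar++;   // one could also increase t? geometrically or by binary search
            failures++;
        }
        failures = 0;
        return forwardSearch.get(); // finds shortest paths or proves the flow maximum
    }
}

Setting maxFailures to 1 corresponds to triggering the forward search immediately after the first failed reverse search.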
class           | propagation               | arc type  | capacity               | comment
from supersink  | s^= → s^--                | forwards  | —                      | label′(s^--) := t^?
                |                           | backwards | —                      | never used
from sinks      | s^-- → s^-@[0, t^? + 1)   | forwards  | ∞ (unlimited sinks)    | t^? = label′(s^--)
                |                           | backwards | —                      | never used
regular arcs    | w@t → v@(t − τ(a))        | forwards  | u_x(a, t − τ(a))       | with a = (v, w)
                | v@t → w@(t + τ(a))        | backwards | x(a, t)                | with a = (v, w)
from/to sources | s^+@t → s^++              | forwards  | ∞                      | breadcrumb not unique
                | s^++ → s^+@t              | backwards | flow out of s^+        | undoes flow out of the source
to supersource  | s^++ → s                  | forwards  | unused supply of s^++  | —
                |                           | backwards | —                      | never used

Table 2.2: Propagation of reachability for the various classes of arcs in G^T_x in the reverse search. The regular arcs and stepping into a source propagate an entire interval accordingly, although for stepping into a source any point in the interval suffices.

Figure 2.8: The reverse search propagates out of a sink to the interval [0, t^? + 1), where t^? is the estimated arrival time. Actually, [t^?, t^? + 1) would suffice but tends to be slower in practice.
Figure 2.9: The propagation in the reverse search along time-expanded arcs. (a) A forward arc can be used according to its residual capacity at the relevant time at the destination vertex v. (b) Backward arcs use the time at the task vertex v to determine their capacity.

Figure 2.10: (a) When the reverse search steps into a source, it can always propagate the earliest time of the interval because these arcs have unlimited capacities (but any other time would also work). (b) When stepping out of the source, it has to undo existing flow.
Algorithm 2.2: The basic reverse search from the supersink.
input : Time-expanded network G^T and flow x, arrival time t^?.
output: A path from s to s^= with length t^? if it exists, or ∅.
 1  initialize task queue Q := {s^=}
 2  label′(s^=) := t^?, all others ∞
 3  while Q ≠ ∅ do
 4      get task from Q
 5      switch kind of task do
 6          case task is s:  // nothing to do
 7          case task is s^++:
 8              if s^++ has supply left then
 9                  if s not yet reached, label′(s) := 0
10              else
11                  propagate to s^+@[t, t + 1) for all t when flow leaves s^+
12          case task is v@[t_1, t_2):
13              forall the a = (w, v) ∈ A do
14                  propagate to w@([t_1, t_2) − τ(a)), obeying u_x(a, ·)
15              forall the a = (v, w) ∈ A do
16                  propagate to w@([t_1, t_2) + τ(a)), obeying u_x(←a, ·)
17              if v = s^+ ∈ S^+ then
18                  propagate to s^++ with infinite capacity
19          case task is s^--:
20              propagate to s^-@[0, t^? + 1)
21          case task is s^=:
22              label′(s^--) := t^? for all s^-- ∈ S^--
23      Q := Q ∪ {tasks for the newly relabeled intervals}
24  if label′(s) = ∞ then return ∅
25  P := backtrack path from s
26  return P
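Both Algorithm 2.1 and Algorithm 2.2 spend most of their time in the propagate steps. The following self-contained Java sketch shows the two interval intersections involved: shifting the task interval along the arc, restricting it to the times with residual capacity, and keeping only the not-yet-reached parts of the target label. The real data structures are those of Section 2.5.7; all names here are illustrative.

import java.util.ArrayList;
import java.util.List;

final class Propagation {

    record Interval(int from, int to) {}    // represents the half-open interval [from, to)

    /** Intersects two sorted, disjoint interval lists in linear time. */
    static List<Interval> intersect(List<Interval> xs, List<Interval> ys) {
        List<Interval> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < xs.size() && j < ys.size()) {
            Interval a = xs.get(i), b = ys.get(j);
            int lo = Math.max(a.from(), b.from()), hi = Math.min(a.to(), b.to());
            if (lo < hi) out.add(new Interval(lo, hi));
            if (a.to() < b.to()) i++; else j++;   // advance the interval that ends first
        }
        return out;
    }

    /** Propagates task [t1, t2) over a forward arc with transit time tau. */
    static List<Interval> propagate(int t1, int t2, int tau,
                                    List<Interval> residualWindows,
                                    List<Interval> unreachedAtHead) {
        List<Interval> shifted = List.of(new Interval(t1 + tau, t2 + tau));
        return intersect(intersect(shifted, residualWindows), unreachedAtHead);
    }

    public static void main(String[] args) {
        // Task [3, 9) over an arc with tau = 2, residual capacity on [0, 7) and [8, 20),
        // head vertex not yet reached on [6, 15): yields [6, 7) and [8, 11).
        System.out.println(propagate(3, 9, 2,
                List.of(new Interval(0, 7), new Interval(8, 20)),
                List.of(new Interval(6, 15))));
    }
}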
Mixed Search

Finally, there is one more possibility that we want to examine: one can search simultaneously from the supersource and the supersink, running the forward and reverse search interleaved. We call this the mixed search, though it is usually called bidirectional search in the literature. Again, we need a guess of the correct arrival time for the reverse search to be useful. (Technically speaking, for a wrong estimate this is not a bidirectional search, as the search does not start at the goal vertex, but it remains a mix of two search strategies.) Each vertex also needs two labels, one for the reachability of each search. The searches could be performed transparently to each other, but when a vertex is labeled as reachable by both searches, this closes a shortest path. Therefore, no further tasks are created from such an overlap. If this does not happen (because the reverse search was supplied a wrong arrival time t^?), the forward search will eventually reach the supersink and update the label there. At this point in the algorithm, the reverse search is not restarted; rather, the forward search finishes as usual, and this supposed bidirectional search degrades to a forward search and an unsuccessful reverse search.

The algorithm for the mixed search contains nothing new, except that it has to handle all cases from the forward and reverse search. It also needs some more elaborate labeling on each vertex and in the task queue to distinguish the searches and to detect when an interval has been labeled by both searches. The final algorithm that we present combines the forward and reverse search in Algorithms 2.3 to 2.6.

2.5.5 Improvements

These three basic search algorithms produce correct results in the sense that they produce a shortest path (at least with the correctly estimated arrival time in the case of the reverse search), but there is still plenty of room for improvements. We study the effects the following options have on the running times in Section 2.6, but already hint at their respective usefulness here.

Multiple Paths

A generally important technique is to try to produce multiple shortest paths with each call of the shortest path search. These may overlap, so it is not clear whether flow can be augmented on all of them. However, when the bottleneck of a path is determined in the SSP algorithm, it is trivial to omit paths on which no flow can be augmented. This allows us to try out many different paths, hoping to augment more flow per iteration and to reduce the number of iterations of the SSP algorithm.

One way to create more paths is through the search itself. The search algorithms are all written in such a way that they terminate when the queue is empty, not when the first shortest path is found (but see the quick cutoff option below). Therefore, the forward search creates a tree rooted at the supersource that spans the vertices that were marked reachable. While there is only one supersink, multiple sinks in S^- could be contained in this tree, and many of them might have the correct minimum arrival time label(s^--) = dist(s^=).
It is easy to find these and backtrack paths from them. So in the best case, one could find one path per sink. Therefore, looking at the sinks individually is highly recommended. If paths originate from different sources in S^++, they are guaranteed to be disjoint because those sources are themselves roots of their own subtrees. Similarly, the reverse search spans the reachable vertices rooted at the supersink. This allows the algorithm to find up to one path per source. Since we usually assume that there are more sources than sinks, this can be quite beneficial. The mixed search also produces multiple paths, namely one per vertex that was labeled as reachable by both search directions. In this light, one can also see the forward search that looks for multiple sinks as a mixed search doing only a single step of the reverse search, namely from s^= to S^--. Similarly, the reverse search that tries to find a path for each source does a single step of the forward search from s to S^++.

Finding multiple paths in this way is useful because it cuts down on the expensive calls of the shortest path computation. But the usefulness typically decreases throughout the execution of the algorithm as more and more terminals become inactive. Also, the mixed search tends to find too many overlapping paths, while flow can only be augmented on a small subset of them. This increases the time spent on determining the bottleneck capacities, which can then become a small concern for the running time instead of being almost negligible. Finally, all candidate paths can be sorted by their number of arcs before the flow is augmented along them. Long paths are generally a bit less desirable because they might be unnecessarily convoluted or overlap with too many other paths. This tries to bring back the behavior of the breadth-first search as used in the Edmonds–Karp algorithm. The effect of sorting paths like this is only relevant if an abundance of paths can be found at the same time. As the computational results will show, it is often not effective, though.

Repeated Paths

Not related to the actual search algorithms, the SSP algorithm itself can also produce new shortest paths by exploiting the repeating structure of the time-expanded network. For this, it has to remember the paths that have been successfully augmented for the current shortest path length. Once the shortest path search indicates that the length of the shortest path has increased, the stored paths can be delayed at each vertex to arrive at the correct time. Such new paths are repeated paths, and they have the right cost by construction. They might have 0 residual capacity for various reasons, though, e. g., the source running out of supply. After the algorithm tries to augment along the repeated paths, the paths returned by the actual shortest path search are used as usual. But they might no longer be feasible if some of the repeated paths were augmented successfully. Of course, any path that is augmented successfully is stored again.
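Delaying a stored path is a purely syntactic operation: every time-expanded vertex v@t on the path is shifted to v@(t + delay). A minimal Java sketch follows; the Node record is illustrative, and the residual capacity of the shifted path still has to be checked before augmenting, as described above.

import java.util.List;

final class RepeatedPaths {

    record Node(int vertex, int time) {}     // time-expanded copy v@t

    /** Shifts every node of the path forward in time by the given delay. */
    static List<Node> delay(List<Node> path, int delay) {
        return path.stream().map(n -> new Node(n.vertex(), n.time() + delay)).toList();
    }

    public static void main(String[] args) {
        // A path arriving at time 5, repeated for arrival time 8 (delay 3):
        List<Node> p = List.of(new Node(0, 0), new Node(1, 3), new Node(2, 5));
        System.out.println(delay(p, 3));     // nodes now at times 3, 6, 8
    }
}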
Therefore, once a path has been found, delayed copies of it will be considered for all later arrival times until its bottleneck is determined to be 0. This approach is inspired by the behavior of the single-source single-sink algorithm for finding earliest arrival flows, which uses the SSP algorithm in the static network. As mentioned in Section 2.4, that algorithm simply sends flow continuously along the paths that are found during the execution of the SSP algorithm but interprets them as paths over time. While this approach is too simple for multiple sources, making use of repeated paths improves the speed of our algorithm on instances with large supplies.

Specifically, for suitable instances (single source, single sink, all arcs available all the time), our approach with repeated paths reverts nicely to a behavior that closely resembles the single-source single-sink algorithm, even though it requires extra work to do so. (Some of this extra work is unavoidable, as the flow is constructed explicitly.) Let us explain the similarities: our algorithm needs one call of the shortest path computation in the time-expanded network to find each shortest path that could also be found by the algorithm working on the static network. Additionally, there is a shortest path search for each arrival time that determines that no more paths of this length exist. In other words, assuming consistency in the way the shortest path searches work, the set of paths to be repeated for arrival time t is equivalent to the paths of length less than t found by the single-source single-sink algorithm. In the following iterations for time t (until the next increase in the arrival time), our algorithm can find and add the paths over time that correspond to paths found by the SSP algorithm in the static network with the same length t. Whether this ideal of the two algorithms producing essentially the same set of paths actually materializes depends on the shortest path search in the time-expanded network, though.

Nevertheless, on our instances, repeating paths can nicely improve the algorithm if there are sources with large supplies that have little interaction with other sources. It also seems likely that solutions produced using repeated paths are more regular than those without explicit repetition. This can help to keep the representation of the time-expanded network small. As a bonus for the reverse search, a successfully repeated path also indicates the arrival time t^?. Since unsuccessful repeated paths are no longer considered, the cost of this technique is at most one additional bottleneck computation for each path included in the final flow. Thus, it is computationally cheap, often effective, and there is little reason not to use it.

Ending the Search Early or Late

While the previous techniques happen at a high level, mostly independent of the shortest path search, the search itself can be modified as well. Typically, one wants to avoid running the shortest path computation for longer than necessary, at the risk of not finding as many paths as possible. Whenever it is detected that a shortest path has been found before the queue has been emptied (because its length matches the length from the last iteration), the decision has to be made whether the search should be stopped.
To control this with a bit more than just a binary switch, a parameter called quick cutoff determines how many more propagation steps are performed in the shortest path search. This is relative to the number of propagation steps already performed in this invocation of the search, so that the running time can be conveniently bounded. A typical search that stops immediately when the first shortest path has been found can be achieved by setting this value to 0, while ∞ would always run until the queue is empty. In the benchmarks, we either use 50% more steps when we hope for many paths, or 10% for variants where we expect almost no gain from this and want to sacrifice little time. (A small sketch of this bookkeeping follows Algorithm 2.4.)

When the forward search (or the forward part of the mixed search) has already found any path to the supersink with some length label(s^=), then the search can treat the network as if the time horizon were T = label(s^=) + 1. Because label(s^=) ≥ dist(s^=) and [dist(s^=), dist(s^=) + 1) is the latest time at which the network can contain any flow so far, there are no residual arcs that start later than this. So labeling these layers of the time-expanded network can never result in a shorter path than the one already found. One could also exclude [label(s^=), label(s^=) + 1) from the rest of the search and try to find only paths shorter than label(s^=). However, whenever label(s^=) is the true distance dist(s^=), we would thereby exclude further paths arriving at this time, which is contrary to the goal of finding multiple paths. Note that the reverse search does not have such a provision, as it only searches for paths of the given length anyway.

Vertex Label Clean-up

So far the idea was that the path search retrieves a task from the queue and propagates from the associated interval [t_1, t_2). However, Lemma 2.6 shows that each propagation step involves reaching once into the data structures of the flow and (usually) of the target vertex, which incurs a logarithmic cost, no matter how small the interval is. Iterating through the data structures is faster, though, with amortized constant time per interval. This suggests finding a larger interval [t_0, t_3) containing [t_1, t_2) that can be propagated, which we call vertex label clean-up, or vertex clean-up for short. Of course, one should not propagate an interval twice, so the algorithm considers the scanned flag of each interval. With this, we can move backward and forward in time from t_1 and t_2, respectively, to find the largest continuous interval that has already been reached but not scanned. Call it [t_0, t_3). Then [t_0, t_3) is substituted for [t_1, t_2) in the propagation, and all contained intervals are flagged as scanned. Note that all intervals in [t_0, t_1) and [t_2, t_3) will be processed earlier than they would be in the breadth-first search, causing more and more deviation from it. Also, this leaves tasks in the queue that will eventually be canceled immediately by the scanned flag. From a theoretical point of view, all of this is a constant overhead for a little logarithmic gain: the additional binary flag is needed on all vertex labels; tasks are checked whether they are already fulfilled; and intervals are enlarged by stepping forwards and backwards, both of which operations can run in overall linear time with appropriate data structures.

Note that our implementation is less perfect than this theoretical discussion:
The intervals associated with a task do not contain a reference to the actual interval in the vertex labels because our data structures could not guarantee their validity. (We needed this flexibility for trying out data structures and for the case of multiple limited sinks; see the outlook section on shelters.) Vertex clean-up is still very beneficial, because it also tends to produce larger continuous intervals at the target vertex and thus fewer subsequent tasks. The upcoming Figure 2.11 also highlights the effect of this clean-up procedure, where it reduces exponentially many propagation steps to a linear number. (A sketch of the enlargement step follows Algorithm 2.5.)

Guided Search

One common approach to speed up shortest path searches is to use a guiding heuristic that helps to find a shortest path more quickly. This has a lot of potential if quick cutoff is used. For example, GPS-based navigation systems use extreme levels of pre-processing to find shortest paths in road networks on a continental scale [16]. We want to make use of these ideas as well, but note that some of the typical assumptions do not hold: the residual network changes significantly during the run of the algorithm, so extensive pre-processing is probably not the correct approach. But we do explore various heuristics for guiding the search based on the distances in the static network, which are much quicker to compute than distances in the time-expanded network. The other factors the heuristics consider are the intervals to be propagated. Of course, such a search no longer resembles a breadth-first search. This has practical and theoretical drawbacks.

Note that all such a heuristic does is determine the order in which the various time-expanded vertices are examined. Recall that all these choices have the same distance from the source in the forward search, namely 0, if we assume that the costs reside only on the arcs to the sink. So all of the heuristics below can be thought of as specifications to resolve the many ties in Dijkstra's algorithm. Similarly, in the reverse search, all propagated intervals have the same distance in that they all arrive at the correct time (or cannot be part of a path). While all of our heuristics produce a shortest path if it exists, there is the caveat that with an infinite time horizon, bad heuristics can get stuck in a loop where they consider the same set of vertices at later and later times without ever finding a path, similar to the problem shown in Figure 2.7. Unlike this example for the breadth-first search, though, heuristics can encounter this problem as soon as a finite interval is created and its resulting tasks are prioritized despite being very late. Thus, a heuristic should guarantee that every time layer is eventually entirely scanned. Our heuristics have this property.

The following is phrased in terms of the forward search. Our heuristics all give highest priority to tasks for sources (in S^++) because they are required to start the search, and even empty sources produce intervals of the kind [0, ∞), and large intervals are generally desirable. Sinks are also immediately processed because this just needs an update of the supersink label but produces no new tasks. Thus, when we discuss these heuristics, we only need to discuss how a time-expanded vertex is treated.
Suppose we perform a forward search and the task is v@[t_1, t_2). Let l denote the distance from v to the closest sink in the static network.

1. Greedy guidance favors tasks at early times, sorting them by the lower bound t_1 of the interval in increasing order. Projected to the static network, this approach imitates Dijkstra's algorithm, at least for the initially empty time-expanded network.

2. Dynamic guidance uses the lower bound of the interval plus the static distance to the nearest sink, that is, t_1 + l, sorted in increasing order. It is similar to the classic A*-search [48], which guides according to a lower bound on the estimated arrival time.

3. Seeking guidance requires an estimated arrival time t^? in the forward search as well. It gives high (and equal) priority to tasks with intervals [t_1, t_2) such that t^? ∈ [t_1 + l, t_2 + l), as these are intervals that (in an empty network) contain a part of a shortest path. They are on time. If t_2 + l ≤ t^?, the interval is too early. Such intervals are processed after all intervals that are on time, sorted by t_1 as in the greedy guidance. In the remaining case, the interval is too late, that is, t_1 + l > t^?. Such an interval has lower priority than the intervals that are on time or too early. Among themselves, the too-late intervals are sorted by their expected arrival time t_1 + l, just like in the dynamic guidance.

Note that ties are always resolved according to FIFO, so that this aspect of the breadth-first search is kept intact. (A sketch of these ratings as integer priorities follows Algorithm 2.6.) Of course, the reverse search needs a slightly different treatment, so that guidance aims towards sources instead of sinks. However, the reverse search showed less benefit from all these approaches than the forward search, and we eventually settled on not using guidance for the reverse search at all.

In addition, the mixed search needs to know how to choose between forward and reverse tasks. We maintain two independent priority queues for the forward and reverse tasks and interleave them according to the sequence in which forward and reverse tasks were created. In the absence of further modifications, this interleaves the phases of the two breadth-first searches so that all vertices of the same distance to the origins of the searches are processed together. Note that an alternating task queue (one forward task, one reverse task) seems to be a bit more effective; however, we made this discovery only after the computational results. We also note that our implementation of the mixed search is currently hard-coded to use breadth-first search for its reverse component, although the code supports the full range of options. Just like the reverse search, the mixed search generally worked best with the reverse part as a breadth-first search, so there was little incentive to add more parameters to the algorithm.

In effect, all of these heuristics can help on some instances. For example, a simple breadth-first search can have a worst-case exponential running time in the graph in Figure 2.11, where greedy guidance would always result in the efficient Dijkstra-like exploration of the graph.
Figure 2.11: A worst case for the interval-based forward BFS (time horizon T = 2^n, arcs of lengths 1, 2, 4, ..., 2^(n−1), 2^n). If for each task the arc of length 2^i happens to be processed first, the intervals will be created top to bottom and lose half their length from each vertex to the next. This corresponds to propagating each task along all the drawn arcs and their later copies. Thus, pseudo-polynomially many intervals will be created even in an empty network where all arcs are available all the time. All of our guidance schemes would prevent this, though. Vertex label clean-up would also ensure that [0, T) would be propagated at each vertex even in the BFS.

On the other hand, Figure 2.12 shows a situation where the only possible path consists of Θ(T) arcs in the time-expanded network, so there cannot be a polynomial guarantee for any search strategy. One might think that even if a guidance scheme does not improve the search, it incurs at most a logarithmic overhead to manage a priority queue or a similar data structure. But all these guided search strategies suffer from a very serious drawback that is not obvious: they tend to produce fewer parallel paths, increasing the number of iterations needed to determine the maximum flow for each time step. Also, they all assume the original network, that is, not the current residual network. This can lead to situations where large parts of the network are examined before finding the shortest path because it depends on a residual arc far from the shortest path in the static network. This situation is more likely to occur if the network is almost saturated with flow.

From these options, we chose the seeking guidance as the best-of combination for our tests, as it handles various problems that can come up in an instance. It often performs very well but can become very slow on some instances. This happens most often when the flow is dense, that is, when the problem has a large total supply arriving within a small time horizon.

The reliance on static distance information raises the question why the distance to the closest sink is not made time-dependent and updated throughout the course of the algorithm. The major problem is that this information is not readily available.
Figure 2.12: An example where a pseudo-polynomially long path must be augmented. There are unit capacities, and the transit times are denoted on the arcs. The red source has supply equal to the time horizon T, the blue source has supply 1. The red and blue paths are the already established flow, which could arise naturally in many search strategies. The yellow arcs represent the unique shortest path tree established by any search strategy (that propagates all arcs leaving S^++ at once) in the current iteration. The only augmenting path consists of 3T + 2 arcs. The problem is that to increase the flow by the final unit, the red source must switch to not using the arcs with length 1, which already requires T − 1 changes to the flow. But any change requires an arc in the augmenting path. Note that this problem could have been avoided if the red source had not used those arcs of length 1 in the first place. But many strategies could conceivably fall into this trap, in particular greedy ones.

Essentially, we would need to run an actual shortest path search from the supersink in the residual network that determines the earliest time at which each vertex can reach a sink. This is about the same as what we already try to do, but for all vertices instead of just the sources. However, we do employ a similar idea successfully for the reverse search, which we discuss next.

Track Unreachable

The reverse search differs from the forward search in that it needs to know the correct arrival time to produce a shortest path. With this, it classifies the vertices of the time-expanded network into those that can reach the sink by that time and those that cannot. The forward search, however, reaches all vertices that are reachable at all from the supersource, even beyond the correct arrival time (unless this is prevented).
Furthermore, the distance from the supersource to a vertex monotonically increases during the SSP algorithm, as stated in Theorem 1.6. Because a time-expanded vertex is either reachable or not, the latter being equivalent to infinite distance, it can never become reachable again after it was found to be unreachable from a source in some iteration. Thus, it can never be part of an augmenting path again. This is valuable information, and we can explicitly stop propagating such a time-expanded vertex. This does not matter at all for the forward search, because by construction it can never encounter such a vertex again. However, the reverse search still can.

For memory reasons and simplicity, we just maintain (a lower bound on) the earliest time at which a vertex can be reached from a source. Most likely, it can also be reached at later time steps (which is even guaranteed if unlimited holdover is allowed), so there is little point in keeping a more detailed structure. All propagated intervals on this vertex in the reverse search are then simply cut off to start no earlier than the lower bound from the forward search. We call this option track unreachable. It has a tremendous positive effect on the performance of the reverse search. Actually, without it, the reverse search is often not competitive with the forward search, despite being so similar. However, we need to update these lower bounds on the earliest reachable copy of a vertex from time to time. A complete forward search (one that ends only when the queue is empty and never discards tasks) can accomplish this and is therefore run regularly. (The heuristic in use enforces a forward search whenever the arrival time has increased by more than 3 steps since the last forward search.) Note that the forward search may already be used to find the correct arrival time when it cannot be guessed easily, so adding an iteration with a forward search between iterations of the reverse search is not a drastic change.

2.5.6 Pseudo-Code

In Algorithms 2.3 to 2.6 we provide the pseudo-code for our adapted SSP algorithm with the improvements described above. This is purposefully kept at a rather abstract level. In particular, we gloss over the data structures required. (Suitable implementations are discussed in Section 2.5.7, though.) The main goal is to show the key points where the improvements were added to the basic Algorithms 2.1 and 2.2 described above. If only the forward search is desired, the cases for the reverse search and the specialties of the mixed search can be removed.
Algorithm 2.3: The outer loop of our adapted SSP algorithm.
input : Graph, transit times, capacities, availability windows, arbitrary supplies at the sources, infinite demands at the sinks, time horizon T ∈ Z_{≥0} ∪ {∞}, all integral; options forward/reverse/mixed search, ...
output: Earliest arrival flow x on the time-expanded network
 1  add supersource s, supersink s^=
 2  x := empty flow (in intervals)
 3  P_old := ∅                                // the paths to repeat
 4  recentArrival := 0                        // current arrival time
 5  reverseFailed := 0                        // force forward search?
 6  while true do
 7      if reverseFailed is too large or track unreachable is outdated then
 8          P := run full forward search      // Alg. 2.4
 9          reverseFailed := 0
10      else
11          P := run search with given parameters   // Alg. 2.4
12      if P = ∅ then
13          if search was reverse search then
14              newArrival := recentArrival + 1
15              reverseFailed := reverseFailed + 1
16          else
17              return x                      // forward or mixed search determined the end
18      else
19          reverseFailed := 0
20          newArrival := length of the paths in P
21      if newArrival ≥ T then
22          return x                          // time horizon exceeded
23      if newArrival > recentArrival then    // apply repeated paths
24          R := delay paths in P_old to newArrival
25          P_old := ∅
26          for p ∈ R with positive bottleneck in G^T_x do
27              augment bottleneck along p and adjust x
28              P_old := P_old ∪ {p}
29              reverseFailed := 0            // found arrival time
30          recentArrival := newArrival
31      sort P by number of arcs
32      for p ∈ P with positive bottleneck in G^T_x do
33          augment bottleneck along p and adjust x
34          P_old := P_old ∪ {p}
Algorithm 2.4: The shortest path search in our adapted SSP algorithm.
input : Graph, transit times, capacities, availability windows, supersource s, supersink s^=, supplies, time horizon T ∈ Z_{≥0} ∪ {∞}, flow x on G^T, estimated arrival time t^? ∈ Z_{≥0}, all integral; options forward/reverse/mixed search, quick cutoff, guidance
output: A set of shortest s-s^= paths; empty if there is none or, only for the reverse search, if they all arrive later than t^?
 1  initialize (priority) queue Q             // depends on guidance
 2  if forward or mixed search then
 3      label :≡ ∞                            // initialize forward labels
 4      label(s) := 0
 5      insert s into Q
 6  if reverse or mixed search then
 7      label′ :≡ ∞                           // initialize reverse labels
 8      label′(s^=) := t^?
 9      insert s^= into Q
10  while Q ≠ ∅ and maximum polls not reached do
11      get task from Q                       // according to priority/guidance
12      if forward task then
13          J := process forward task         // Alg. 2.5
14      else
15          J := process reverse task         // Alg. 2.6
16      if a path was found and not doing a full forward search then
17          T := min{T, length of found path + 1}
18          if its length is t^? and this happens for the first time then
19              maximum polls := current polls · (1 + quick cutoff)
20      insert new tasks J into Q
21  P := ∅
22  if mixed search then
23      P := backtrack paths for all vertices reachable both from s and s^=   // only use lowest value in each interval
24  if forward search or mixed search produced no results then
25      if label(s^=) < ∞ then
26          P := backtrack paths for all s^-- ∈ S^-- with label(s^--) = label(s^=)
27  else if reverse search then
28      if label′(s) < ∞ then
29          P := backtrack paths for all s^++ ∈ S^++ with label′(s^++) < ∞ and supply left
30  if full forward search then
31      update track unreachable
32  return P
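Line 19 of Algorithm 2.4 hides a small piece of bookkeeping, which the following Java sketch isolates; the names are ours, for illustration only.

/** Quick-cutoff rule of Algorithm 2.4: once the first shortest path of the
 *  expected length is detected, the search may poll at most (1 + cutoff)
 *  times the number of tasks processed so far. A cutoff of 0 stops
 *  immediately; an unbounded cutoff empties the queue. */
final class QuickCutoff {
    private final double cutoff;          // e.g. 0.5 or 0.1 in the benchmarks
    private long polls = 0;
    private long maxPolls = Long.MAX_VALUE;

    QuickCutoff(double cutoff) { this.cutoff = cutoff; }

    /** Called before every poll of the task queue; false means stop the search. */
    boolean mayPoll() { return polls < maxPolls; }

    /** Called after a task has been taken from the queue. */
    void recordPoll() { polls++; }

    /** Called when the first path of the expected length t? is found. */
    void pathFound() {
        if (maxPolls == Long.MAX_VALUE)   // only the first time this happens
            maxPolls = (long) Math.ceil(polls * (1.0 + cutoff));
    }
}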
Algorithm 2.5: The steps of forward propagation.
input : General instance input, a forward task
output: Resulting tasks
 1  switch kind of task do
 2      case forward from s:
 3          propagate to all s^++ with supply left
 4          return a task for each such s^++
 5      case forward from s^++:
 6          propagate to s^+@[0, T)
 7          return task for s^+@[0, T)
 8      case forward from v@[t_1, t_2):
 9          if t_1 ≥ T then
10              return ∅
11          if [t_1, t_2) was already scanned then
12              return ∅
13          enlarge [t_1, t_2) to adjacent reachable but unscanned intervals   // vertex clean-up
14          forall the a = (v, w) ∈ A do
15              propagate to w@[t_1 + τ(a), t_2 + τ(a)), obeying u_x(a, ·)
16          forall the a = (w, v) ∈ A do
17              propagate to w@[t_1 − τ(a), t_2 − τ(a)), obeying u_x(←a, ·)
18          if v = s^+ ∈ S^+ then
19              propagate to s^++ if flow leaves s^+ in [t_1, t_2)
20          if v = s^- ∈ S^- then
21              propagate to s^-- with distance t_1
22          return tasks for all successful propagations
23      case forward from s^--:
24          propagate to s^= with distance label(s^--)
25          return ∅
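The enlargement step in line 13 is the vertex clean-up of Section 2.5.5. The following Java sketch shows it on a flat sorted list of label intervals, where the implementation uses a search tree; all names are illustrative.

import java.util.List;

final class VertexCleanup {

    /** One interval [from, to) of a vertex label with its two boolean flags. */
    static final class LabelInterval {
        final int from, to;
        boolean reached, scanned;
        LabelInterval(int from, int to, boolean reached) {
            this.from = from; this.to = to; this.reached = reached;
        }
    }

    /** Enlarges the interval at index i (the task's own interval, assumed
     *  reached) to the maximal run of adjacent reached-but-unscanned
     *  intervals, flags them all as scanned, and returns [t0, t3). */
    static int[] enlarge(List<LabelInterval> label, int i) {
        int lo = i, hi = i;
        while (lo > 0 && usable(label.get(lo - 1))
                && label.get(lo - 1).to == label.get(lo).from) lo--;
        while (hi + 1 < label.size() && usable(label.get(hi + 1))
                && label.get(hi).to == label.get(hi + 1).from) hi++;
        for (int k = lo; k <= hi; k++) label.get(k).scanned = true;
        return new int[]{label.get(lo).from, label.get(hi).to};
    }

    private static boolean usable(LabelInterval iv) {
        return iv.reached && !iv.scanned;
    }
}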
Algorithm 2.6: The steps of reverse propagation.
input : General instance input, a reverse task, estimated arrival time t^?
output: Resulting tasks
 1  switch kind of task do
 2      case reverse from s^++:
 3          if s^++ has supply left then
 4              if s not yet reached, label′(s) := 0
 5              return ∅                      // found a shortest path
 6          else
 7              propagate to s^+@[t, t + 1) for all t when flow leaves s^+
 8      case reverse from v@[t_1, t_2):
 9          if [t_1, t_2) was already scanned then
10              return ∅
11          if track unreachable then
12              t_1 := max{t_1, first reachable time of v}
13              if t_1 ≥ t_2 then
14                  return ∅
15          enlarge [t_1, t_2) to adjacent reachable but unscanned intervals   // vertex clean-up
16          forall the a = (w, v) ∈ A do
17              propagate to w@[t_1 − τ(a), t_2 − τ(a)), obeying u_x(a, ·)
18          forall the a = (v, w) ∈ A do
19              propagate to w@[t_1 + τ(a), t_2 + τ(a)), obeying u_x(←a, ·)
20          if v = s^+ ∈ S^+ then
21              propagate to s^++ with infinite capacity
22      case reverse from s^--:
23          propagate to s^-@[0, t^? + 1)
24      case reverse from s^=:
25          label′(s^--) := t^? for all s^-- ∈ S^--
26  return tasks for all successful propagations
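The guidance schemes of Section 2.5.5 only influence the order in which tasks leave the queue, so they reduce to an integer rating per task. The following Java sketch shows one possible encoding (smaller rating = higher priority); the packing of the three seeking classes into one integer range is our own illustration, not the thesis code.

/** Ratings for the guidance heuristics; l is the static distance from the
 *  task's vertex to the nearest sink, tStar the estimated arrival time t?,
 *  and T the time horizon. */
final class Guidance {
    enum Mode { GREEDY, DYNAMIC, SEEKING }

    /** Rates a task v@[t1, t2); ratings fit into O(T) buckets. */
    static int rate(Mode mode, int t1, int t2, int l, int tStar, int T) {
        switch (mode) {
            case GREEDY:
                return t1;                          // imitates Dijkstra on empty networks
            case DYNAMIC:
                return t1 + l;                      // A*-like lower bound on the arrival time
            case SEEKING:
                if (t1 + l <= tStar && tStar < t2 + l)
                    return 0;                       // "on time": contains part of a shortest path
                if (t2 + l <= tStar)
                    return 1 + t1;                  // "too early": after the on-time tasks, by t1
                return 2 + T + (t1 + l);            // "too late": last, by expected arrival
            default:
                throw new AssertionError();
        }
    }
}

Ties within one rating are resolved in FIFO order, e.g., by the bucket queue of Section 2.5.7, which preserves the breadth-first flavor of the search.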
2.5.7 Implementation Details

The performance of the algorithm depends on a suitable implementation of the interval data structures, which are essentially search trees. Self-balancing trees are the typical choice for this, but there are many variants. In the end, we did not use a single type of data structure for all interval-based data, but chose the most suitable one for each application. Some of these are actually not optimal in theory.

Vertex Labels

For the labels on the vertices, we use treaps [84], which are binary search trees that achieve balance through (pseudo-)randomization and achieve the desired logarithmic access times on average. We chose not to join consecutive intervals transparently in the data structure. Doing so would be beneficial if read access to the labels were far more common than write access. However, the number of incident arcs determines how often a label is read (at each time step) in the worst case. Since we assume sparse graphs, we do not prioritize reads over writes and rather use the simpler vertex clean-up as described in Section 2.5.5. Each label only stores the two boolean values for each interval: whether the interval is reachable and whether it has already been scanned. The breadcrumbs are stored in a linear data structure (an array), as these are only written to (but not referenced) during the search. During the backtracking phase, it suffices to scan these in linear time to find the corresponding breadcrumb for a single point in time.

Before settling on treaps, we had also tried AVL trees, red-black trees, and skip lists. The first two showed very good performance close to that of treaps, while our implementation of skip lists was only half as fast. Descriptions of these data structures can be found in many textbooks. Note that even after a lot of optimization, reading and writing the labels make up roughly 70% of the entire execution time, so speed is crucial.

Flow Values

To store the flow values, we still use an AVL tree. However, these are only needed for the flow augmentations. Our shortest path search just needs to determine whether an arc has residual capacity or not. Thus, we project the residual capacity functions down to a boolean function. To store this function, we use a sorted array denoting the points in time when the boolean value changes. Each two consecutive values denote an interval (adjacent intervals share one value). The first interval always denotes true, which is upheld by prepending an interval from [−1, 0) if necessary. Thus, binary search can be used to find an interval, and the index of its starting point modulo 2 yields its boolean value. (A sketch of this lookup follows at the end of this section.) This is almost equivalent to storing the beginning and end of each interval that is true, but our approach is more symmetric in that the distinction between true and false only happens in the very last step. This results in very small data structures that are very fast for read access. The downside is that we need linear time for insertions and deletions.
These happen when intervals are split or merged (a mere shifting of a border causes no problem). However, the only write access to the flow function happens during the augmentation of flow, which is far less frequent than the reads during the search. It is also plausible that not all changes to the flow function affect the boolean value.

Task Queue

When we use selection rules for tasks, we typically incur a logarithmic cost for maintaining a heap or similar data structure to determine the next task (in contrast to the constant time for a simple queue). However, in practice this overhead becomes almost negligible because all of our guidance heuristics can be boiled down to a rating of the tasks within O(T). Thus, we use a bucket queue, which uses an array large enough to fit all possible ratings and maintains a simple queue for each possible rank of a task. In the end, the time spent on the priority queue was only about 2.5 times that of the default (unprioritized) queue implementation in Java. This translates to very roughly 12%, respectively 5%, of the overall time spent in the search.

Memory Consumption, Memory Management and the Java Virtual Machine

It is time to discuss the typical memory consumption pattern of our implementation, and which problems arise from it. There are two aspects to consider: the actual amount of memory required (space) and the time overhead for allocation and deallocation of objects. In the following, we argue that it is beneficial to manage the memory manually rather than leave it to Java and its garbage collection.

When running the benchmarks on some of the larger instances, we observed that the entire Java virtual machine (JVM) could consume up to 160% CPU (1.6 cores), despite our algorithm being single-threaded. (On many instances, this stayed below 120%, though.) We do not know what exactly causes this behavior, but strongly suspect the huge number of (de-)allocations in our implementation and Java's garbage collection to be responsible for it. The garbage collection is the only part of the Java virtual machine that can run in parallel, as far as we know. When choosing an older single-threaded garbage collector (option -XX:+UseSerialGC), the JVM only used 100% CPU in total, which further supports our theory that the garbage collector is doing a lot of work in parallel for our algorithm. Because we report wall-clock time for our algorithm, less than the full CPU time shows up in the benchmarks. Thus, we benefit from parallelization while all competing algorithms run on a single core. We could have performed the benchmarks with the aforementioned older garbage collector as well. This would have increased the running times (wall-clock times) by about a tenth, while utilizing only a single core. (The other Java program in our tests, ZET, uses only approximately 100% CPU overall even with the default garbage collector.)

On the other hand, we believe that the memory management in our application is far from ideal because we create unnecessarily many small objects.
In general, modern garbage collectors can be similarly fast as traditional memory management, see, e. g., the Garbage Collection FAQ [35]. So there is no serious inherent drawback. But this combination of unusual memory requirements and the strange behavior of the JVM prompts us to address it nonetheless.

In an ideal implementation, our algorithm would hardly need complicated memory management, thereby cutting down on the overhead and resolving the dilemma. Most data structures simply grow in size as the algorithm progresses. In the shortest path search, these are most notably the vertex labels and the breadcrumbs. Tasks are produced at the same rate as intervals for the vertex labels, so the queue also grows at this rate. Given this relationship to the vertex labels, there seems to be little point in freeing tasks from memory once they have been executed. This allows us to rely on a rather simple memory management for parts of our code: we allocate arrays to store the growing data structures. With a suitable initial size and re-allocation of the arrays at twice the needed size, this adds a constant factor to the memory requirements. To store, say, 2^k + 1 elements, we use an array of size 2^(k+1) and might not be able to re-use the previously used arrays with total size 2^(k+1) due to memory fragmentation. Thus, we need at most 4 times as much memory as we have data. Since the overall memory consumption of our algorithm is good compared to an approach that works on the full time-expanded network, this is tolerable. The advantage is that this drastically reduces the number of objects that need to be allocated or freed during the run of the search. Also, by storing an entire vertex label including its search tree within the same array, as opposed to a collection of objects spread out over the memory space, we hope to benefit from increased cache efficiency.

Note, though, that Java is less than ideal for such an implementation. As it lacks a simple flat data structure like C's struct and the associated pointer arithmetic, such an array-based approach is rather inelegant in Java. Additionally, Java enforces bounds checks on arrays, which can be detrimental to performance. Due to these coding restrictions, the structures for which we implemented this are only the vertex labels (but not the breadcrumbs), the queue itself (but not the tasks), and the projections of the flow functions (but not the flow functions). Each iteration of the search allocates new arrays, although the old ones could be reused. Because memory cannot be explicitly freed in Java, this leads to a sawtooth pattern for memory consumption: more and more memory is allocated over the course of multiple iterations until it reaches the set memory allowance (2 GB for our code). Then Java performs a full garbage collection and the memory footprint drops sharply. One can also invoke the full garbage collection manually. Doing so after each iteration gives a better picture of the memory usage of our code but is slower. See Figure 2.13 for the memory consumption pattern under the standard behavior and when the garbage collection is invoked after each iteration. The latter also resembles the expected memory consumption if this program were written in a language without garbage collection.
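The boolean projection of the flow functions described under Flow Values reduces each residual-capacity lookup to a binary search plus a parity test. A minimal self-contained Java sketch (illustrative names, not the actual implementation):

import java.util.Arrays;

/** Boolean projection of a residual capacity function: a sorted array of
 *  breakpoints at which the value flips, with the convention that the
 *  first interval is true (prepend a breakpoint at −1 if necessary). */
final class BooleanProjection {
    private final int[] breakpoints;   // ascending; breakpoints[0] starts a true interval

    BooleanProjection(int... breakpoints) { this.breakpoints = breakpoints; }

    /** True iff the arc has residual capacity at time t (assumes t >= breakpoints[0]). */
    boolean hasCapacity(int t) {
        int i = Arrays.binarySearch(breakpoints, t);
        if (i < 0) i = -i - 2;         // index of the last breakpoint <= t
        return i % 2 == 0;             // even index: inside a "true" interval
    }

    public static void main(String[] args) {
        // Residual capacity available on [0, 5) and [9, 12), unavailable otherwise:
        BooleanProjection p = new BooleanProjection(0, 5, 9, 12);
        System.out.println(p.hasCapacity(3));   // true
        System.out.println(p.hasCapacity(7));   // false
        System.out.println(p.hasCapacity(9));   // true
        System.out.println(p.hasCapacity(12));  // false
    }
}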
Figure 2.13: The heap usage of our Java program (a) with the default behavior of the garbage collection, and (b) when the garbage collection is called after every iteration of the SSP algorithm to remove obsolete data. This heavy-handed experiment increases the overall running time by about 30%, but shows the actually rather low memory requirements of our code. The instance used is Padang-1s-1g, one of our largest; see Section 2.6.2.
2.6 Computational Results

In the following, we show the computational results for our algorithm using various settings on real-world and artificial instances. As a comparison, we provide results for existing algorithms, only some of which are aware of the properties of the time-expanded network.

2.6.1 Rounding

Before we discuss the instances, we remark that for real-world instances with non-integral parameters a suitable discretization ∆ has to be found. Somewhat surprisingly, a finer discretization is not always better. The problem lies with the capacities, which have to scale with the discretization step. Once ∆ is sufficiently small, fractional capacities will introduce severe rounding errors. (Because we want to keep each evacuee intact, we have to find an integral solution.) Thus, we have to decide between an accurate representation of the capacities and of the transit times. There is no general rule for this, and there are theoretical instances that defy simple rounding schemes, see Figure 2.14. For practical purposes, we suggest considering the average relative error introduced in both parameters and choosing a ∆ that results in about equal errors. Randomized rounding might also prevent artifacts from developing in real instances. Of course, all algorithms benefit from a rougher discretization, and ours is no exception, although the difference might not be as pronounced.

In addition, any SSP algorithm benefits from rounding the supply/demand function and the capacities to a multiple of some g ∈ Z_{>0}, which we call the group size because groups of that many flow units travel together. This ensures that the lowest amount of flow augmented in each iteration is g, reducing the worst-case running time of the SSP algorithm by a factor of g. The downside is that this generally increases the error in the capacities by the same factor.

This implies a convenient way to decrease the running time for instances, which should benefit most algorithms: an instance that might already have a suitable scale can be discretized with some ∆ and, simultaneously, the supply/demand function and the capacities are rounded to multiples of g = ∆. Thus, the capacity precision is unchanged, while the precision in time has become coarser, and the smaller number of time steps and the smaller supplies allow for a quick estimate of the solution.

Recall that from a theoretical point of view, there is a result by Fleischer and Skutella [27] that essentially shows that an ε-approximation to the MinCostFlowOverTime problem can be determined by rounding the network to O(|V|²/ε) time layers. (The approximation exceeds the time horizon by a factor of ε.) This is only advantageous if the time horizon is larger than this term to begin with, which is not the case in our instances, though.
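The suggestion above can be turned into a small experiment: for each candidate step size ∆ (and group size g), compute the average relative rounding errors of the transit times and of the per-step capacities, and pick a ∆ where the two roughly coincide. The following Java sketch uses made-up data, and rounding to the nearest positive multiple is our assumption; it is an illustration, not the tool used for Table 2.3.

/** Compares average relative rounding errors for candidate discretizations. */
final class DiscretizationErrors {

    /** Relative error of rounding value to the nearest positive multiple of unit. */
    static double relativeError(double value, double unit) {
        double rounded = Math.max(unit, Math.round(value / unit) * unit);
        return Math.abs(rounded - value) / value;
    }

    /** Average relative errors (transit times, capacities) for one candidate. */
    static double[] errors(double[] transitTimes, double[] capacities,
                           double delta, int groupSize) {
        double eTau = 0, eU = 0;
        for (double tau : transitTimes) eTau += relativeError(tau, delta);
        for (double u : capacities)     eU  += relativeError(u * delta, groupSize);
        return new double[]{eTau / transitTimes.length, eU / capacities.length};
    }

    public static void main(String[] args) {
        double[] tau = {12.4, 3.7, 55.0, 8.1};   // transit times in seconds (made-up)
        double[] u   = {1.6, 0.8, 4.2, 2.3};     // capacities in people/second (made-up)
        for (double delta : new double[]{1, 2, 5, 10}) {
            double[] e = errors(tau, u, delta, 1);
            System.out.printf("delta=%4.1fs  err(tau)=%5.2f%%  err(u)=%5.2f%%%n",
                    delta, 100 * e[0], 100 * e[1]);
        }
    }
}

As in Table 2.3, the error in the transit times grows with ∆ while the error in the per-step capacities shrinks, so an intermediate ∆ balances the two.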
Figure 2.14: For a fractional flow, each of the upper half (k parallel arcs with τ = l and u = 1/k) and the lower half (l arcs in series with τ = 1 and u = 1) of the network could be replaced by a single arc with τ = l and u = 1. However, rounding to ∆ < k/2 yields total capacity 0 instead of 1 on the upper half, while rounding to ∆ > 2 yields an effective transit time 0 instead of l on the lower half.

2.6.2 Instances

We describe the origins and properties of the instances in the following. The accompanying diagrams show, for each instance, the number of arcs in the time-expanded network and the approximate number of paths used in an optimum solution. They are useful to roughly compare the difficulty of the instances. The reasoning is that the size of the time-expanded network is an important factor in the running times of all algorithms, while the number of paths used has a significant impact on the running times of SSP algorithms. For more numbers on the instances, like the size of the graph before time-expansion, see Tables 2.5 and 2.6 at the end of this section.

When we compare our approach to other algorithms, we need to provide the time-expanded network explicitly. We choose the time horizon a few percent larger than the optimum for these algorithms, as one would do if one had a good estimate of the correct time horizon. The specific values are noted in the overview tables. We also collected results for both equivalent formulations of the cost functions, that is, whether the costs are on most arcs (the default) or just on the arcs leading to the sinks, named cost on sinks or cost on A.

Real-world Instances: Padang

The first is the city of Padang, Indonesia, or more precisely the part of downtown Padang most threatened by tsunamis. In case of a tsunami, the population should flee towards higher ground away from the coast line. Because safe areas are not modeled in detail, the network consists of a thin strip along the coast line. See Figure 2.15 for a map of Padang and Figure 2.16 for the graph.
The total area modeled covers roughly 32 square kilometers. The population of this area is estimated at approximately 250,000, and all evacuees start at their respective homes; this represents a night-time scenario. Actually, as the map suggests, no agents start in the northernmost part of the map, parallel to about two thirds of the airstrip. Still, very many vertices are sources. The sinks are those vertices located on the long side opposite of the coast line. We were also given simulation data on how a tsunami would flood the streets. This gives rise to end-times for some arcs, after which they are no longer usable, as shown in Figure 2.17. Having fewer arcs available increases the necessary time horizon by about 20%. We consider the Padang instance with and without the flood simulation. A more detailed description of the scenario is given by Lämmel et al. [63], who also provided us with the data.

As these are large real-world instances, we use both scenarios of Padang in various discretizations and also with rounded supply/demand functions. See Figure 2.18 for the sizes of the resulting instances. Table 2.3 gives an overview of the relative rounding errors, as defined in Section 2.6.1. By construction, the time step ∆ determines the error in the transit times τ, while ∆ divided by the group size g determines the error in the capacities u. That the errors follow such a flawless reciprocal progression stems from the distribution of the real-world parameters, which offer an almost continuous range of values. A sweet spot for accuracy is around ∆ = 2s and g = 1 (no further rounding of the supplies), while the roughest discretization with ∆ = 10s even with g = 10 has the same error in the capacities as the much finer (arguably too fine) case with ∆ = 1s. Of course, there is an additional error in the transit times, but the lack of accuracy in the other parameters cannot be undone.
Figure 2.15: The coast-line part of Padang. The area to be evacuated is marked by the yellow dashed line. (Map from OpenStreetMap.org, CC-BY-SA-2.0)
Figure 2.16: The network used for Padang, along with some large features of the city. The gray area is a fenced-off airport. Note the waterways, which will be inundated particularly early during a tsunami.
Figure 2.17: The Padang network with the arcs shaded depending on when they become inaccessible. The first effects are noticeable at t = 600s and occur along the waterways shown in Figure 2.16. The optimum time horizon is roughly T = 3000s, and all arcs affected later than this have the same black background. The remaining arcs stay available.
Figure 2.18: The sizes of the various discretizations for the Padang instances Padang-Δs-gg, for group sizes g ∈ {1, 2, 5, 10}, plotted as the number of arcs in the time-expanded network (in millions) against the approximate number of paths in the solution. The instances with simulated flooding have about 20% more arcs but almost the same number of paths.

Δ     g    rel. error τ   rel. error u   T/Δ   total cost
1s    1    …%             5.53%          …     …,726,726
2s    1    …%             2.75%          …     …,657,782
5s    1    …%             1.10%          …     …,593,140
10s   1    …%             0.54%          …     …,533,810
2s    2    …%             5.53%          …     …,097,236
5s    2    …%             2.24%          …     …,191,240
10s   2    …%             1.10%          …     …,139,220
5s    5    …%             5.53%          …     …,266,175
10s   5    …%             2.75%          …     …,378,050
10s   10   …%             5.53%          …     …,212,100

Table 2.3: The rounding errors for the Padang instance at various discretizations for the step size Δ and group size g, and the resulting differences in the objective.
Figure 2.19: The discretizations of the Berlin instances Berlin-Δs-g. Note that the group size g has almost no effect on the number of paths in the solution.

Real-world Instances: Berlin

The second large instance represents a circular cutout of a part of inner Berlin, comprising about 44 square kilometers. See Figure 2.20 for the modeled area and Figure 2.21 for the network. The outside area is considered safe, and one quickly reaches a sink when walking in any direction. This instance was created by Rost for his Bachelor thesis [79]. The street network is converted from OpenStreetMap [75] data from 2009, which contains accurate lengths of the streets. A walking speed of 1.3 m/s is assumed. However, the capacities of the streets are only very roughly categorized in the underlying OpenStreetMap data and had to be estimated. This led to capacities of 12, 16, 20, and so on, people per second for the various classes of streets in the data. At the given walking speed, the smallest capacity corresponds to about 12 people walking side by side.

The overall supply is given as more than 800,000, distributed over the area and assigned to the nearest vertices. (From our estimates, the number of residents in that area is only around 500,000, though.)

We had to modify the instance slightly for our comparisons. Since all capacities are divisible by 4 and the supplies at each vertex are multiples of 10, the instances for group sizes g = 1 and g = 2 would not differ at all. We therefore randomly and uniformly perturbed the capacity of each arc by up to ±50% to counter such effects (with the downside that opposing arcs now have different capacities). We consider the same discretizations as for Padang. However, the large capacities on every possible path make grouping almost negligible, as can be seen from the number of paths in Figure 2.19 and Table 2.5 changing little.
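A minimal sketch of such a perturbation, under our own assumptions (hypothetical names; the exact rounding of the perturbed values and the integrality of the capacities are not specified in the text):

    import java.util.Random;

    /** Perturbs each arc capacity uniformly by up to +/-50% to break
     *  the divisibility pattern of the OpenStreetMap-derived data.
     *  Each arc is perturbed independently, so opposing arcs end up
     *  with different capacities, as noted above. */
    final class CapacityPerturbation {
        static int[] perturb(int[] capacities, long seed) {
            Random rng = new Random(seed);
            int[] result = new int[capacities.length];
            for (int i = 0; i < capacities.length; i++) {
                // factor drawn uniformly from [0.5, 1.5)
                double factor = 0.5 + rng.nextDouble();
                // round to the nearest integer, keep at least capacity 1
                result[i] = Math.max(1, (int) Math.round(capacities[i] * factor));
            }
            return result;
        }
    }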
Figure 2.20: The inner city of Berlin. The area to be evacuated is determined by the circle. (Map from OpenStreetMap.org, CC-BY-SA-2.0)

Figure 2.21: The network used for the model of Berlin, with some landmarks (Tiergarten, Tempelhof). Note that Tempelhof Airport had already been closed, but was not yet open to the public, when the network was created in 2009.
Real-world Instances: Buildings

The third set of instances stems from our collaboration with the ZET evacuation simulator [92]. It is mostly used to model buildings in great detail, with a vertex typically representing a handful of small squares. These squares are also the building blocks for ZET's cellular automaton for simulating pedestrian flows. Thus, the software offers both network flow algorithms and a simulation approach.

One building modeled is the Telefunken-Hochhaus in Berlin, Germany, a slender 20-story building containing mostly offices. Most of the upper floors can only reach the ground through two staircases, one on each side of the building. The model was created from plans of the building. The number of occupants on each floor was determined through a poll, and there are 369 evacuees in total. Another instance from ZET is that of a full lecture hall, the Audimax at the Technische Universität Dortmund, Germany, with 601 occupants. And finally, there is another building of TU Dortmund, OH14 (Otto-Hahn-Straße 14), which mostly consists of offices and small seminar rooms. The latter instance has many (emergency) exits.

The graphs obtained from the ZET editor already consist of integral data. Because the model is so detailed, the arcs also have very small capacities and transit times. Rounding these any further would introduce unacceptably large errors.

Real-world Instances: Airplanes

The fourth and final set of instances from real-world data contains models of airplanes, a Boeing 747 and an Airbus 380, which we received from Schulz [83]. These models were created from the seat layouts of the planes. The graphs are similar to those from ZET in that each vertex represents a very small area, e.g., a seat. The arcs are correspondingly short and have low capacities. The global structure of the models is simple: they contain two aisles from which the seats are accessible, and symmetrical emergency exits along their length. The set also contains an artificial instance, dubbed A380plus, which was created by modifying the layout of the Airbus 380 to include more rows. It holds 972 people. Variations of these instances close all but one set of the emergency exits. Such a change increases the time horizon significantly.

Figure 2.22 and Table 2.5 summarize the properties of the building and airplane instances.

Artificial Instances: Netgen

The details for all artificial instances are given in Table 2.6, and their sizes for the purposes of the algorithms are shown in Figure 2.25. For the artificial instances, we first looked at the classic netgen program [58]. It creates instances for the MinCostFlow problem. To turn these into instances of the MinTravelTime problem, we interpreted the costs of the arcs as transit times and the capacities as capacities per time step. However, the problems that netgen creates are feasible instances for static flows,
that is, the supply/demand function can be satisfied by using the capacities just once. This hardly creates interesting problems for flows over time. We remedy this by choosing suitable parameters and by changing the instances a bit.

Figure 2.22: The airplane instances and the building instances have much lower supplies than the city instances. With the exception of the Telefunken instance, their time-expanded networks are also much smaller.

The input to netgen consists of the desired number of vertices and arcs, the number of sources and sinks, and the total supply at the sources. The supply is randomly distributed among the sources, and an equal demand among the sinks. The capacities are chosen uniformly from a given interval. If needed, netgen ensures the feasibility of the instance by adding a path from each source to some sink with capacity equal to the supply of the source, even if this exceeds the desired maximum capacity. We change the capacity of all arcs exceeding the maximum capacity to the rounded-down average of the capacity limits. Choosing a large supply and small capacities, we then obtain an instance which is not feasible for a static flow.

We tried to create large instances with netgen, but all of them turned out to be easy to solve, with low time horizons. The set of parameters chosen for τ and u does not reflect a specific model either, although they are in the general vicinity of the other instances. Given the unknown and random structure of the netgen instances, it is difficult to draw any conclusions from them, but we include them in the results anyway.
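A minimal sketch of this post-processing step (hypothetical names; only the capping rule itself is stated explicitly above, everything else is our assumption):

    /** Post-processes a netgen instance for the MinTravelTime setting:
     *  the artificial feasibility arcs that netgen adds, whose capacity
     *  exceeds the requested maximum, are capped to the rounded-down
     *  average of the requested capacity limits. */
    final class NetgenPostprocess {
        static void capCapacities(int[] capacity, int minCap, int maxCap) {
            // integer division gives the rounded-down average of the limits
            int cap = (minCap + maxCap) / 2;
            for (int i = 0; i < capacity.length; i++) {
                if (capacity[i] > maxCap) {
                    capacity[i] = cap;
                }
            }
        }
    }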
Figure 2.23: (a) The short and wide instances Grid-short-D consist of a 30x100 grid, and the total supply D ∈ {4k, 20k, 100k, 250k} is varied. (b) The long and narrow instances Grid-long-D, D ∈ {4k, 20k, 100k, 250k}, consist of a 100x30 grid, which causes much longer paths and more congestion. (The figures only show 3x10 and 10x3 grids, though.)

Artificial Instances: Short vs. Long Grids

Our own generator creates instances where the graph is a bidirected grid of a specified length and width. A given number of sinks are randomly placed on the left-most column. A parameter determines the first column (counted from the left) that may contain a source. The chosen number of sources are then distributed randomly in those columns that are not too close to the sinks. A total supply is distributed by choosing a source for each flow unit. The capacities are derived from a normal distribution with a given average and variance, by rounding and setting negative values to 0; a sketch of this sampling rule is given below.

Despite its simplicity, this approach has many promising features. The basic instances of a city or a model from ZET have a lot in common with grids. The models in ZET are derived from a grid and, in areas with high detail, still are grids. A city is often, at least in parts, a grid as well. Unlike in our generator, there might be multiple ends with sinks, e.g., when evacuating a city district by going into all neighboring districts. But the various directions are mostly independent of each other. Thus, we chose to isolate one of those directions by having the entire flow go from the right to the left of the grid. The parameter that blocks some of the left-most columns from the sources was added because the breadth first search is at an advantage if the paths from sources to sinks are short in the number of arcs. Overall, this makes it easy to create large instances that contain sparse graphs with given supplies, and to impose various difficulties.
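A minimal sketch of the capacity sampling just described (hypothetical names, not the generator's actual code). Following the convention noted in Table 2.6, the standard deviation is half the expected value:

    import java.util.Random;

    /** Draws grid-arc capacities from a normal distribution with the
     *  given mean, rounds to integers, and clamps negatives to 0.
     *  Table 2.6 uses sigma = E/2 throughout. */
    final class GridCapacities {
        static int sample(Random rng, double mean) {
            double sigma = mean / 2.0;
            double value = mean + sigma * rng.nextGaussian();
            return Math.max(0, (int) Math.round(value));
        }
    }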
We use this to create three different sets of grid instances that all consist of grids with one dimension being 30 and the other being 100 vertices, and with 10 sinks on one of the sides. The majority of vertices are sources.

Figure 2.24: (a) The increasingly long set Grid-longer-n, n ∈ {1, 2, 3, 4, 5}, consists of a 100x30 grid, but with the sources only in the far half of the network. The transit times τ are increased, resulting in time horizons from about 500 to 8,000. (b) The increasingly fine set Grid-fine-n, n ∈ {1, 2, 3}, starts out with an instance similar to Grid-long-20k using a 100x30 grid, but then doubles the grid dimensions to 200x60 and again to 400x120. The overall time horizon is kept between about 300 and 400 by adjusting τ and u.

In the short and wide set, the long edge contains the sinks, as shown in Figure 2.23a. Essentially all other vertices are sources, so the average distance (in arcs) to the sinks is short, and there can be many parallel paths between sources and sinks. The arcs themselves are rather short, too, with an average length of 4, and they have low capacities, also with an average of 4. These settings are inspired by the coast line of Padang. However, we vary the total supply from 4,000 to 20,000, 100,000, and 250,000. This is not the same as (un-)grouping because the capacities remain unchanged. Increasing the supply thus saturates the network more and more, and also increases the time horizon. The instances are named after the length of the paths and their supplies: Grid-short-4k, Grid-short-20k, Grid-short-100k, and Grid-short-250k.

The long and narrow set places the sinks on the short side of the grid, but no other parameter is changed; see Figure 2.23b. Consequently, the flow units have a longer path to the sinks on average. The optimum time horizon is larger due to this, and there is also more congestion on the paths to the sinks than in the short and wide set. With a large total supply, again between 4,000 and 250,000, as well as long paths, this creates the most challenging instances for the SSP algorithm. The instances are named analogously to the first set: Grid-long-4k, Grid-long-20k, Grid-long-100k, and Grid-long-250k.
Artificial Instances: Large Time Horizons

We use the increasingly long set, Grid-longer-1 to Grid-longer-5, to demonstrate how our algorithm deals with large time horizons when they stem from increased distances rather than from more supply. For this, we place the sinks on the short side, as in the long and narrow instances, but do not place any sources in the half of the grid that is close to the sinks; see Figure 2.24a. We also fix the total supply at 20,000, originating from the other half of the grid. As before, the capacity is normally distributed with an average of 4. The arc lengths, however, are varied from the previous average of 4 to 8, 20, 50, and 100. The largest of these instances requires a time horizon of nearly 8,000 and makes up our largest instance overall. However, this instance is strange in that the 20,000 flow units trickle into the 10 sinks between times 3,700 and 8,000. At many time steps, some sinks are not reached by any flow units. Judging from this, there seems to be little competition between the flow units. This is different in the instances with shorter transit times.

Artificial Instances: Large Graphs

The last set of grid instances we consider is the increasingly fine set, Grid-fine-1 to Grid-fine-3. Here, we increase the size of the grid from 100x30 to 200x60 and 400x120 vertices, as shown in Figure 2.24b. The supply stays at 20,000. The transit times and capacities are adjusted so that the time horizon does not vary too much but stays between 287 and 385. With its base graph containing 48,000 vertices, Grid-fine-3 still yields the second largest time-expanded network in our set.
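As a rough sanity check (our own reasoning, ignoring congestion and the randomness of the arc lengths): the free-flow traversal time of the grid scales with the number of columns times the expected transit time, which the parameters in Table 2.6 keep constant across the set,

\[
100 \cdot 4 \;=\; 200 \cdot 2 \;=\; 400 \cdot 1 \;=\; 400,
\]

consistent with the reported time horizons of 287 to 385, since the sources are spread over the grid rather than sitting at the far edge.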
Figure 2.25: The various artificial instances. The ones created using netgen form no particular series. But the sets Grid-short-D and Grid-long-D, with the total supply D ∈ {4k, 20k, 100k, 250k}, are designed to contrast each other because they mostly differ in the orientation in which the grid is traversed. Also, Grid-longer-n, n ∈ {1, 2, 3, 4, 5}, and Grid-fine-n, n ∈ {1, 2, 3}, are designed as counterparts: the former increases the size of the time-expanded network by increasing the transit times on the arcs, while the latter subdivides the grid.
Padang: |V| = 4,444, |A| = 12,436, |S⁺| = 3,…, supply …,970/g; group size g ∈ {1, 2, 5, 10}

Instance          paths used   T/Δ opt   T/Δ used   |A^{T/Δ}|
Padang-1s-1g      …k           2,422     2,…        … M
Padang-2s-1g      90k          1,214     1,…        … M
Padang-5s-1g      50k          …         …          … M
Padang-10s-1g     30k          …         …          … M
Padang-2s-2g      70k          1,213     1,…        … M
Padang-5s-2g      40k          …         …          … M
Padang-10s-2g     27k          …         …          … M
Padang-5s-5g      28k          …         …          … M
Padang-10s-5g     20k          …         …          … M
Padang-10s-10g    15k          …         …          … M

Padang flood: |V| = 4,444, |A| = 12,436, |S⁺| = 3,…, supply …,970/g; wave arrives at t = 600s

Instance               paths used   T/Δ opt   T/Δ used   |A^{T/Δ}|
Padang-1s-1g-flood     …k           …         …          … M
Padang-2s-1g-flood     91k          2,050     2,…        … M
Padang-5s-1g-flood     50k          …         …          … M
Padang-10s-1g-flood    29k          …         …          … M
Padang-2s-2g-flood     69k          2,126     2,…        … M
Padang-5s-2g-flood     40k          …         …          … M
Padang-10s-2g-flood    25k          …         …          … M
Padang-5s-5g-flood     28k          …         …          … M
Padang-10s-5g-flood    19k          …         …          … M
Padang-10s-10g-flood   15k          …         …          … M

Table 2.4: The real-world instances of the city of Padang that we use for testing.
Berlin: |V| = 4,174, |A| = 12,814, |S⁺| = 3,…, supply …,950/g; group size g ∈ {1, 2, 5, 10}

Instance         paths used   T/Δ opt   T/Δ used   |A^{T/Δ}|
Berlin-1s-1g     …k           2,…       …          … M
Berlin-2s-1g     36k          1,…       …          … M
Berlin-5s-1g     16k          …         …          … M
Berlin-10s-1g    9k           …         …          … M
Berlin-2s-2g     34k          1,…       …          … M
Berlin-5s-2g     15k          …         …          … M
Berlin-10s-2g    9k           …         …          … M
Berlin-5s-5g     15k          …         …          … M
Berlin-10s-5g    9k           …         …          … M
Berlin-10s-10g   8k           …         …          … M

Buildings

Instance     |V|     |A|     supply   |A^{T/Δ}|
Audimax      1,493   5,…     …        … M
OH14         3,516   14,…    …        … M
Telefunken   5,966   21,…    …        … M

Airplanes

Instance         |V|     |A|    supply   |A^{T/Δ}|   remarks
A380             …       …      …        … M
A380-front       932     1,…    …        … M         only front exits open
B747             …       …      …        … M
B747-front       687     1,…    …        … M         only front exits open
A380plus         1,367   2,…    …        … M         a hypothetical, expanded A380
A380plus-front   1,367   2,…    …        … M         only front exits open

Table 2.5: The real-world instances of Berlin, buildings, and airplanes that we use for testing.
Netgen (all with 10 sinks)

Instance   |V|      |A|     supply   paths used   |A^{T/Δ}|   parameters
Netgen-1   10,000   99,…    …,000    12k          … M         τ ∈ [10, 100], u ∈ [1, 10]
Netgen-2   5,000    25,…    …        …k           … M         τ ∈ [1, 20], u ∈ [1, 10]
Netgen-3   5,000    99,…    …,000    24k          … M         τ ∈ [5, 10], u ∈ [1, 3]
Netgen-4   5,000    99,…    …,000    24k          … M         τ ∈ [10, 20], u ∈ [1, 3]

Grid, short & wide (30x100, E[τ] = E[u] = 4)

Instance          |V|     |A|      |S⁺|    supply    paths used   |A^{T/Δ}|
Grid-short-4k     3,000   11,740   2,170   4,000     4k           … M
Grid-short-20k    3,000   11,740   2,897   20,000    18k          … M
Grid-short-100k   3,000   11,740   2,…     100,000   87k          … M
Grid-short-250k   3,000   11,740   2,…     250,000   …k           … M

Grid, long & narrow (100x30, E[τ] = E[u] = 4)

Instance         |V|     |A|      |S⁺|    supply    paths used   T/Δ opt   T/Δ used   |A^{T/Δ}|
Grid-long-4k     3,000   11,740   2,203   4,000     …k           …         …          … M
Grid-long-20k    3,000   11,740   2,965   20,000    19k          …         …          … M
Grid-long-100k   3,000   11,740   2,…     100,000   95k          1,360     1,…        … M
Grid-long-250k   3,000   11,740   2,…     250,000   …k           3,476     3,…        … M

Grid, increasingly long (100x30, sources in the far half, E[u] = 4)

Instance        |V|     |A|      |S⁺|    supply   paths used   T/Δ opt   T/Δ used   |A^{T/Δ}|   parameters
Grid-longer-1   3,000   11,740   1,500   20,000   20k          …         …          … M         E[τ] = 4
Grid-longer-2   3,000   11,740   1,500   20,000   20k          …         …          … M         E[τ] = 8
Grid-longer-3   3,000   11,740   1,500   20,000   20k          1,752     1,…        … M         E[τ] = 20
Grid-longer-4   3,000   11,740   1,500   20,000   20k          3,965     4,…        … M         E[τ] = 50
Grid-longer-5   3,000   11,740   1,500   20,000   20k          7,991     8,…        … M         E[τ] = 100

Grid, increasingly fine

Instance      |V|      |A|       |S⁺|     supply   paths used   |A^{T/Δ}|   parameters
Grid-fine-1   3,000    11,740    2,895    20,000   18k          … M         E[τ] = 4, E[u] = 3
Grid-fine-2   12,000   47,480    9,653    20,000   20k          … M         E[τ] = 2, E[u] = 2
Grid-fine-3   48,000   190,960   16,242   20,000   20k          … M         E[τ] = 1, E[u] = 1.8

Table 2.6: The artificial instances. For τ and u, netgen draws integral numbers uniformly from the given intervals. The grid instances use normal distributions and then round. The expected values are given in the table; the standard deviation is always half the respective expected value, σ = E/2.
2.6.3 Settings for Our Algorithm

We test our approach with a selection of suitable sets of parameters. All of these use the same data structures, and the basic options like vertex clean-up and repeated paths are enabled, as is making use of multiple sinks (in the forward search; sources in the reverse search) to produce more paths.

1. Forward BFS. A forward search using the breadth first search with a quick cutoff value of 0.5 to obtain many paths.

2. Forward Seeking. A more aggressive forward search using the seeking guidance and a quick cutoff value of 0.1 to achieve faster shortest path searches.

3. Reverse BFS. A reverse search using the breadth first search, again with a quick cutoff value of 0.5. It runs a forward search whenever the reverse search (and the repeated paths) fail to produce a path. It employs the track unreachable option and updates the lower bounds on each vertex whenever the forward search is run. A forward search is also enforced when the arrival time has increased by at least 3 since the last update.

4. Mixed BFS. A mixed (i.e., bidirectional) search, using the breadth first search for its forward and reverse components. The quick cutoff value is 0.5. The reverse search uses track unreachable just like the option above, updating it whenever the mixed search degenerates to a forward search.

5. Mixed Seeking/BFS. This only changes the forward part of Mixed BFS above to seeking guidance and adjusts the quick cutoff value to 0.1. The reverse part remains a breadth first search.

We use a 64-bit Java from the OpenJDK Runtime Environment, run with the -server flag to activate more optimizations in the compiler. The memory allowance (the heap in addition to the memory of the virtual machine itself) was 2 GB.

2.6.4 Algorithms for Comparison

There are many algorithms and implementations to choose from for the MinCostFlow problem, but not that many specialized to the MinTravelTime problem. The fastest solver for MinCostFlow problems that we are aware of is cs2 (version 4.6) by IG Systems, notably developed by Goldberg [39]. It relies on multiple maximum flow computations, for which a push-relabel algorithm is used. Various heuristics are also important. The binary was compiled with gcc through the distributed makefile, and we used the program without any optional parameters. The code is free for non-commercial purposes.

We also include another established algorithm, the network simplex. This is a specialization of the simplex algorithm to MinCostFlow problems, which
replaces the linear algebra steps by graph-theoretic operations. (A generic simplex algorithm is much slower.) We chose the network simplex implementation mcf (version 1.3) by Löbel [64]. We used the binaries provided by the author of the code at their default settings, which employ a primal network simplex. The program is free for academic purposes. We also considered the network simplex implementations net simplex by Jensen and Berthelsen [52] and netopt in the commercial program IBM ILOG CPLEX, but found them slower than mcf on our instances.

To contrast our algorithm with other implementations of the SSP algorithm, we considered the MinCostFlow implementation provided by the commercial library LEDA (version 5.1.1) [3]. According to the documentation, this is an SSP with capacity scaling. However, it turned out to be quite uncooperative, often throwing errors or being stuck for hours on instances while similar ones were solved in seconds. It also failed to solve even one of the larger instances within 24 hours. We compiled it with gcc 4.5.2, but only with the parameter -O2, because the highest optimization level -O3 produced slower binaries. As LEDA often quit instances with errors and is generally quite sensitive to compilers, we also tried gcc 3.4 and 4.0, which LEDA explicitly supports. But it still failed in the same way as with the latest compiler. Being one of the faster SSP implementations, we still try it on a few more instances than the following SSP implementations. But given the unreliability we observed, the results should be taken with more than a grain of salt.

Two more algorithms are specializations of the SSP algorithm to the MinTravelTime problem: lodyfa [66] is a suite of algorithms for network flows over time, which implements an SSP algorithm by Tjandra and Hamacher [46, 89]. But it consists of 32-bit programs for Microsoft Windows, and we could not recompile it for a more suitable target. (The 32-bit memory allowance is too limited for some of our larger instances.) The particular code in that suite works on time-expanded networks where the capacities and transit times may change at every time step. In our terminology, it is a reverse search because the search starts at the supersink.

ZET (version beta, not yet released) [92] also implements a solver for the MinTravelTime problem, which is similar to the algorithm of Tjandra and Hamacher. Being written in Java, it can also run as a 64-bit program with more available memory. We run ZET on Java with the 64-bit HotSpot virtual machine, using the same command-line options as for our implementation. Like lodyfa, we only test it on instances with low total supply. The algorithm would benefit from allowing holdover, but holdover is disabled in these tests.

2.6.5 Results Against Established Algorithms

We first present results comparing our approach against the two established MinCostFlow solvers, cs2 and mcf. All computations were done on a quad-core Intel i5-2500K CPU clocked at a fixed 3.3 GHz with 16 GB RAM, running an Ubuntu-flavored 64-bit Linux with a generic kernel. Because all the algorithms are single-threaded, the light desktop work and occasional fully utilized second core should have little impact on the measured times. These times are
the CPU times reported by cs2 and mcf, which do not include the time spent on reading the problem, and wall-clock time for our implementation, also excluding input. Since the Java virtual machine itself is multi-threaded, reporting wall-clock time for our code hides a bit of the overhead that Java causes, but might also hide a bit of memory management cost. We think that these effects mostly cancel out, but see the discussion on memory management in the section on implementation details.

To begin, we want to summarize the central themes that occur in most of the results.

Observation 2.7.
- The variants with seeking guidance, Forward Seeking and Mixed Seeking, perform best among our configurations on all real-world instances. The Forward BFS variant has the best worst-case behavior of our configurations. A large time horizon has by itself little impact on performance, but often occurs due to large supplies, which do affect our algorithms.
- cs2 performs very well on all of our instances. It is often faster than our implementation when the supply is at least in the thousands or the time horizon is small. Still, our variants are competitive on the real-world instances. The choice of the cost function has little effect on cs2.
- mcf often has the worst performance. Putting the costs on the arcs to the sinks seems to be better overall for mcf, although there are exceptions.

The Diagrams

We visualize the results by pitting the running times against the number of arcs in the time-expanded network. We use log-log-scale diagrams because of the wide range of values. They also exhibit how the algorithms scale to larger instances. We chose the axes such that they always cover the same multiplicative range. Thus, the slopes, which represent the exponent k when assuming a polynomial running time n^k, are comparable across diagrams. The downside is that some diagrams look a bit empty or cramped. Note that for the real-world instances of Padang and Berlin, the x-axis is labeled according to the discretization step size Δ and not the size of the time-expanded network. There is no discernible difference, though, as the step size is inversely proportional to the size of the time-expanded network.

For cs2 and mcf we only display results for the computations where we put the costs on the arcs to the sinks and 0 elsewhere. As can be seen in the full tables, this makes little difference for cs2, while mcf typically profits from it, especially on the Padang and Berlin instances.
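As a reminder of how to read such log-log plots (standard reasoning, not specific to our data): if the running time behaves like t(n) = c n^k in the number n of arcs of the time-expanded network, then log t = log c + k log n, so the exponent can be estimated from any two measurements as

\[
k \;\approx\; \frac{\log t_2 - \log t_1}{\log n_2 - \log n_1},
\]

where n_1, n_2 are the instance sizes and t_1, t_2 the corresponding running times.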
Figure 2.26: The running times of the algorithms on the Padang instances Padang-Δs-1g for Δ ∈ {1, 2, 5, 10}. The Seeking variants cope very well with the finer discretizations.

Real-world Instances: Padang

Let us first consider the results on the various Padang instances. For reference, all numbers are listed in Table 2.7. The series with varying discretizations but no grouping (g = 1) are shown in Figure 2.26 for the instances Padang-Δs-1g, and in Figure 2.27 for the instances Padang-Δs-1g-flood.

Overall, cs2 performs best here, with Forward Seeking a close second. It is evident from the plots that the variants with seeking guidance scale much better to finer discretizations than cs2, a trait that was one of our design goals. The worst scaling among our variants, namely that of Reverse BFS, is very similar to that of cs2, although the absolute running times are about 4 times higher. Note that, as we will see often, mcf scales worse than any other algorithm and is also the slowest in absolute terms.

The instances with the simulated flooding have a larger time horizon due to arcs being removed from the network at certain points in time. The net effect is time-expanded networks that are roughly 25% larger. For cs2 and mcf, this is reflected in the running times. The performance of our algorithm, however, improves by about 30% for all configurations that make use of the breadth first search, and does not change for the seeking variants. This is likely due to the more restricted network making the path search easier, while the slightly increased time horizon has little impact on our approach.

We also compare how the instances can be scaled down or up by adjusting the step size Δ and the group size g simultaneously. As discussed in the section on rounding and captured in Table 2.3, finer discretizations introduce errors in the capacities. By starting with rough capacities and refining them, we obtain
a set of instances that only increases the accuracy of the transit times. The error in the capacities stays at about 5% throughout. The results are shown in Figure 2.28.

One can also view this plot backwards, as a way to speed up the SSP algorithm when necessary. The roughest instance, Padang-10s-10g, has an 8% error in the transit times and 5% in the capacities. Using Padang-10s-1g (without grouping) to reduce the error in the capacities to 0.5% might be unnecessary. In any case, on these instances Padang-Δs-Δg, Forward Seeking is similarly fast as cs2 even for coarse discretizations, on which cs2 previously won due to the small time horizon. The results are analogous on the instances with flooding, again with our algorithms having a slight edge due to the increased time horizons.

Comparing just our configurations among themselves, the seeking guidance has a clear advantage over the breadth first search on the larger instances. On the instances with flooding, the breadth first search catches up, though. This could possibly be because the seeking guidance does not consider the unavailability of arcs.
Figure 2.27: The running times of the algorithms on the Padang instances with the simulated wave, Padang-Δs-1g-flood for Δ ∈ {1, 2, 5, 10}. The worst scaling exhibited by one of our variants (namely Reverse BFS) still matches cs2.

Figure 2.28: The instances Padang-Δs-Δg, for Δ ∈ {1, 2, 5, 10}. By scaling the time axis and the group size together, the rounding error in the capacities remains constant at 5%. Grouping is a good approach to speed up the SSP algorithms on coarse discretizations, and Forward Seeking is similarly fast as cs2 on this series.
                      SSP Forward         SSP Mixed           SSP Reverse   cs2                    mcf
Instance              BFS      Seeking    BFS      Seeking    BFS           cost = τ   cost on A   cost = τ   cost on A
Padang-1s-1g          0:28:18  0:10:11    0:27:21  0:14:02    0:31:55       0:08:33    0:07:18     2:32:10    1:29:52
Padang-2s-1g          0:14:53  0:06:28    0:12:15  0:08:33    0:12:37       0:03:53    0:03:36     1:01:32    0:33:42
Padang-5s-1g          0:06:37  0:03:33    0:04:32  0:04:35    0:04:02       0:01:08    0:01:11     0:16:26    0:09:00
Padang-10s-1g         0:03:04  0:01:56    0:02:12  0:02:41    0:01:39       0:00:31    0:00:31     0:03:58    0:02:10
Padang-2s-2g          0:09:41  0:03:41    0:09:37  0:04:55    0:09:12       0:03:26    0:03:04     0:37:25    0:23:03
Padang-5s-2g          0:04:33  0:02:14    0:03:32  0:02:58    0:03:11       0:01:06    0:01:19     0:11:00    0:06:27
Padang-10s-2g         0:02:22  0:01:25    0:01:44  0:01:58    0:01:21       0:00:28    0:00:34     0:03:06    0:01:47
Padang-5s-5g          0:02:31  0:01:03    0:02:22  0:01:32    0:02:03       0:01:01    0:01:16     0:06:04    0:03:59
Padang-10s-5g         0:01:25  0:00:45    0:01:11  0:01:06    0:00:54       0:00:27    0:00:28     0:01:56    0:01:17
Padang-10s-10g        0:00:54  0:00:28    0:00:52  0:00:42    0:00:43       0:00:28    0:00:28     0:01:28    0:01:12
Padang-1s-1g-flood    0:16:54  0:10:24    0:18:04  0:14:13    0:22:34       0:09:37    0:10:33     2:55:09    1:53:22
Padang-2s-1g-flood    0:10:22  0:06:54    0:08:41  0:08:49    0:10:04       0:04:18    0:05:03     1:27:24    0:42:27
Padang-5s-1g-flood    0:04:38  0:03:30    0:02:53  0:04:51    0:03:08       0:01:36    0:01:33     0:19:48    0:10:25
Padang-10s-1g-flood   0:02:16  0:01:58    0:01:25  0:02:43    0:01:17       0:00:40    0:00:40     0:04:22    0:02:22
Padang-2s-2g-flood    0:06:18  0:03:50    0:05:13  0:05:16    0:06:46       0:04:00    0:04:41     0:43:30    0:27:26
Padang-5s-2g-flood    0:03:19  0:02:16    0:02:18  0:03:06    0:02:30       0:01:38    0:01:50     0:11:19    0:06:48
Padang-10s-2g-flood   0:01:49  0:01:26    0:01:13  0:01:59    0:01:06       0:00:45    0:00:43     0:03:54    0:02:08
Padang-5s-5g-flood    0:01:51  0:01:09    0:01:32  0:01:38    0:01:35       0:01:30    0:01:33     0:06:56    0:04:42
Padang-10s-5g-flood   0:01:10  0:00:48    0:00:54  0:01:12    0:00:48       0:00:41    0:00:45     0:02:35    0:01:37
Padang-10s-10g-flood  0:00:47  0:00:29    0:00:40  0:00:44    0:00:36       0:00:34    0:00:37     0:01:38    0:01:25

Table 2.7: The computational results for all Padang instances.
Figure 2.29: The Berlin instances Berlin-Δs-1g, for Δ ∈ {1, 2, 5, 10}. Our approaches work very well at all discretizations, and also scale slightly better than cs2.

Real-world Instances: Berlin

The results on the Berlin instances are given in Table 2.8, and we also look at the same diagrams that we used for Padang: Figure 2.29 shows the results on Berlin-Δs-1g, while Figure 2.30 shows the results for scaling transit times and capacities at the same time. As we mentioned in the description of the Berlin instances, grouping has almost no effect here because the capacities are large on all paths. Even the finest discretization, Berlin-1s-1g, uses only approximately 66,000 paths for the 823,000 flow units. Thus, the results from these two scaling schemes look almost identical.

The obvious difference to Padang is that our algorithms are all significantly faster than cs2, even at Δ = 10s, and scale a bit better, too. Even mcf performs better on the smaller instances than cs2, but it again exhibits the worst scaling and is the slowest algorithm on the largest instance, which is about the same size as Padang-1s-1g. While our approaches with seeking guidance again perform best, Reverse BFS delivers about the same speed. This is one of the few instance sets where this happens, though. The instance seems to be generally favorable for all of our algorithms because the many and close sinks allow finding multiple paths at the same time without many conflicts.
Figure 2.30: The instances Berlin-Δs-Δg, for Δ ∈ {1, 2, 5, 10}. Again, the time axis and the group size are scaled simultaneously, as in Figure 2.28. Our variants have a small edge in scaling over cs2.

Real-world Instances: Buildings

Next, we look at the results for the set of instances that describe buildings in great detail and stem from the ZET project. These are summarized in Table 2.8 and Figure 2.31. Clearly, our SSP approach performs very well on these instances with low total supply, no matter which variant we employ. (We will see in the next section how other SSP-based approaches perform on these instances.) All our variants significantly outperform the other algorithms on the two larger instances, with mcf being the slowest again. Note that the instances have increasingly large base graphs as well as time horizons, but the largest instance happens to have the smallest supply (369 versus 600 and 661).

We want to point out that we see the good results of our algorithm as a chance to enable interactive support for evaluating buildings based on evacuation routes. With low memory requirements and fast results, our approach is suitable for testing various scenarios with changing positions of the evacuees, environmental hazards, and other aspects that cannot be considered in just a single instance.

Real-world Instances: Airplanes

The instances describing airplanes posed no challenge to any of the algorithms. All algorithms finish in 1 or 2 seconds, except on the largest (artificially enlarged) instance A380plus-front, where cs2 takes 5 seconds and mcf 14 seconds. See Table 2.8. This is due to the time-expanded networks being
smaller than in the other instances, while the supplies are also only moderately sized. Except for mcf, all algorithms could be used for quickly evaluating varying scenarios in the way we suggested above. Indeed, the variants with only the front exits accessible are an example of such a technique.

Figure 2.31: The building instances Audimax, OH14 and Telefunken, sorted by the number of arcs in the respective time-expanded networks. Our algorithms are difficult to distinguish because we only recorded the running times in seconds.
                 SSP Forward         SSP Mixed           SSP Reverse   cs2                    mcf
Instance         BFS      Seeking    BFS      Seeking    BFS           cost = τ   cost on A   cost = τ   cost on A
Berlin
Berlin-1s-1g     0:02:07  0:00:59    0:02:01  0:01:11    0:01:01       0:07:50    0:07:50     1:15:27    0:11:37
Berlin-2s-1g     0:01:14  0:00:31    0:01:10  0:00:38    0:00:33       0:03:36    0:03:43     0:22:06    0:03:19
Berlin-5s-1g     0:00:32  0:00:13    0:00:32  0:00:16    0:00:14       0:01:14    0:01:16     0:02:20    0:00:45
Berlin-10s-1g    0:00:17  0:00:07    0:00:18  0:00:09    0:00:08       0:00:32    0:00:34     0:00:24    0:00:13
Berlin-2s-2g     0:01:06  0:00:30    0:01:06  0:00:36    0:00:31       0:03:23    0:03:29     0:16:03    0:03:24
Berlin-5s-2g     0:00:31  0:00:13    0:00:32  0:00:16    0:00:14       0:01:13    0:01:16     0:02:54    0:00:44
Berlin-10s-2g    0:00:17  0:00:07    0:00:17  0:00:09    0:00:08       0:00:32    0:00:34     0:00:23    0:00:14
Berlin-5s-5g     0:00:28  0:00:12    0:00:29  0:00:15    0:00:13       0:01:07    0:01:11     0:02:46    0:00:43
Berlin-10s-5g    0:00:16  0:00:07    0:00:16  0:00:09    0:00:08       0:00:29    0:00:32     0:00:22    0:00:14
Berlin-10s-10g   0:00:15  0:00:06    0:00:16  0:00:08    0:00:08       0:00:30    0:00:31     0:00:26    0:00:13
Buildings
Audimax          0:00:02  0:00:01    0:00:02  0:00:02    0:00:02       0:00:01    0:00:01     0:00:02    0:00:02
OH14             0:00:03  0:00:02    0:00:03  0:00:02    0:00:02       0:00:11    0:00:11     0:00:21    0:00:17
Telefunken       0:00:04  0:00:04    0:00:04  0:00:04    0:00:04       0:01:10    0:01:25     0:06:04    0:04:40
Airplanes
A380             0:00:01  0:00:01    0:00:01  0:00:01    0:00:01       0:00:01    0:00:00     0:00:00    0:00:01
A380-front       0:00:01  0:00:01    0:00:01  0:00:01    0:00:01       0:00:01    0:00:02     0:00:02    0:00:02
B747             0:00:01  0:00:01    0:00:01  0:00:01    0:00:01       0:00:00    0:00:00     0:00:00    0:00:00
B747-front       0:00:01  0:00:01    0:00:01  0:00:01    0:00:01       0:00:02    0:00:01     0:00:03    0:00:02
A380plus         0:00:01  0:00:01    0:00:02  0:00:01    0:00:02       0:00:02    0:00:02     0:00:02    0:00:02
A380plus-front   0:00:01  0:00:01    0:00:02  0:00:02    0:00:02       0:00:05    0:00:05     0:00:18    0:00:14

Table 2.8: The computational results for the real-world instances except Padang, which has its own Table 2.7.
Figure 2.32: The four netgen instances Netgen-n, n ∈ {1, 2, 3, 4}, do not show a clear picture. Note that there is no justification for interpolating from one instance to another in this plot, because the instances are not a coherent set.

Artificial Instances: Netgen

As noted in the description of the netgen instances, we have little control over their graphs. As it turns out, all of these instances have a low time horizon (only up to 225), but a very large number of arcs in the static network. In the end, the time-expanded networks have a size similar to Padang at Δ = 2s, except for the much smaller Netgen-2 instance. In any case, the performance of all algorithms is better on the netgen instances than on the similarly sized Padang instances; see Figure 2.32 or Table 2.9. Among our set of configurations, Mixed Seeking performs best and even outperforms cs2, despite the supplies of 5,000 to 30,000 and the low time horizons. Forward Seeking is essentially tied with cs2.

Artificial Instances: Short vs. Long Grids

The first set of grid instances, the short and wide ones Grid-short-D, D ∈ {4k, 20k, 100k, 250k}, features a grid where the sources are generally close to the sinks and more parallel paths exist. As the total supply D is increased from 4,000 to 20,000, 100,000, and 250,000, the time horizon increases from 116 to 203, 908, and 2,196, respectively. The number of paths used is about 80% to 90% of the respective total supply. For comparison, the most complicated instances so far, Padang-1s-1g and Padang-1s-1g-flood, require only approximately 135,000 paths.

Looking at the running times in Figure 2.33 or Table 2.9, we see that
all algorithms start out between 8 and 17 seconds for the smallest instance, Grid-short-4k. However, there are already large differences on the second instance, Grid-short-20k. While from then on our approaches and cs2 scale similarly, cs2 is much faster. The seeking variants are again among the best of our configurations, but the simple Forward BFS works similarly well on these instances. Overall, the performance on the largest instance is similar to Padang-1s-1g, which inspired the short and wide grid series. Twice as many paths to augment take their toll on the SSP algorithm, though.

Figure 2.33: The instances Grid-short-4k, Grid-short-20k, Grid-short-100k, and Grid-short-250k increase the total supply from 4,000 to 250,000. These instances have short paths from the sources to the sinks, though, which helps our approach. Already the second instance appears to be significantly more difficult than the first; afterwards, all algorithms but mcf scale similarly.

The long and narrow grid instances have the same parameters as the short and wide grids, but the sinks are positioned at the short side. This increases the distance between sources and sinks, leading to somewhat larger time horizons, but also making the path search (and the augmentation) slower. This is to be expected for an SSP algorithm, and the effects on our variants are severe. Compared to the previous set of short and wide grids, all of our variants are slower and scale worse on this set, as seen in Figure 2.34. It is particularly discouraging to see the seeking variants not finish within 24 hours, because one would expect the guidance to handle the larger distances to the sinks better. Only Forward BFS remains somewhat usable on all of these instances.

Our best interpretation of these results and our log files is that the long grid with random lengths and capacities makes establishing a maximum flow
for each time step an intricate process. The breadth first searches have two related advantages over the seeking guidance in such a situation. First, they find paths with nearly the minimum number of arcs. (Due to vertex clean-up, they do not have to be of truly minimum length.) In the later stages of the algorithm, the paths may consist of about 120 arcs each on average, which is not unusual for a grid that is 100 arcs long. The seeking guidance, however, returns paths with hundreds or even more than 1,000 arcs! Many of these are residual arcs. The explanation lies in the tendency of the guidance scheme to emulate Dijkstra's algorithm and seek out early time layers in the time-expanded network. But if these are already filled with flow (and there are a lot of flow units in these instances), this is a futile attempt. The result is paths that slowly descend to very early time layers and then quickly rise again before they start another descent. This up and down serves no purpose, and the propagated intervals are also likely to be very small. Still, these propagation steps will be prioritized, as they are earlier in the time-expanded network, and will eventually find a steep ascent to the sinks, for example through a depleted source vertex. Tied into this is a second effect: the long and convoluted paths prevent the seeking variants from finding parallel paths. They typically return only a single path per search, while the breadth first searches return about five.

Finally, there is little difference in how cs2 handles these instances compared to the short and wide set. The network simplex mcf is about as lost as the seeking variants, though.

Artificial Instances: Large Time Horizons

The next set is the one with the increasingly long travel times, where the total supply does not change but the time horizon nevertheless increases from 500 to 8,000. This results in some astonishing measurements, shown in Figure 2.35 and Table 2.9. The running times of the reverse search, Reverse BFS, are roughly equal throughout this set, while Forward Seeking, and to a lesser degree Mixed Seeking, solve the larger instances faster than the smaller ones. The worst scaling is exhibited by Forward BFS and Mixed BFS, which take about 3 times longer for the 16 times larger time horizon. On the other hand, cs2 scales roughly with the size of the time-expanded network and again performs very well, beating our algorithms on all instances except the largest ones. The trend is obviously against cs2, though. Note that mcf seems to hit a limit on the size of the network it can work on, because it fails immediately upon reading the largest instance.

Artificial Instances: Large Graphs

The final set of grid instances increases the size of the grid from 100x30 to 200x60 and 400x120. These instances have the same supplies and similarly sized time-expanded networks as the previous set. The results are shown in Figure 2.36 and Table 2.9.
Figure 2.34: The instances Grid-long-4k, Grid-long-20k, Grid-long-100k, and Grid-long-250k have total supplies of 4,000 to 250,000. Presumably, the long grid coupled with the high supply makes establishing a maximum flow very difficult for our algorithms. The Seeking variants and mcf do not finish within 24 hours on the largest instance, while cs2 still performs very well.

Our algorithm performs well on this set, except for Reverse BFS and Mixed BFS; the latter is probably affected by its reverse component. Both of the seeking variants are faster than cs2, although not by much. This highlights how the increasing size of the base graph affects our algorithm just like any other. This is unlike the previous set, where the increasing time horizon made a much smaller difference, if any. Again, mcf does not load the largest instance.
Figure 2.35: The instances Grid-longer-1 to Grid-longer-5 increase the average transit times of the arcs while keeping the total supply constant at 20,000. Note that the seeking variants become faster in absolute terms as the discretization becomes finer. Overall, all of our variants scale very well on this set of instances. mcf did not load the largest instance.

Figure 2.36: The instances Grid-fine-1 to Grid-fine-3 use grids of 100x30, 200x60, and 400x120, respectively, while adjusting the transit times and capacities to keep the time horizon roughly equal. The Seeking variants handle the finer grids well. mcf did not load the largest instance.
                  SSP Forward         SSP Mixed            SSP Reverse   cs2                    mcf
Instance          BFS      Seeking    BFS       Seeking    BFS           cost = τ   cost on A   cost = τ   cost on A
Netgen
Netgen-1          0:02:17  0:00:24    0:01:39   0:00:26    0:00:15       0:00:29    0:00:28     0:01:11    0:00:52
Netgen-2          0:00:03  0:00:01    0:00:03   0:00:02    0:00:01       0:00:02    0:00:02     0:00:02    0:00:02
Netgen-3          0:02:32  0:00:53    0:00:27   0:00:25    0:02:16       0:00:51    0:00:49     0:02:14    0:01:39
Netgen-4          0:02:35  0:00:45    0:00:50   0:00:32    0:03:24       0:00:42    0:00:41     0:01:51    0:01:23
Grid short & wide
Grid-short-4k     0:00:16  0:00:11    0:00:13   0:00:12    0:00:11       0:00:07    0:00:08     0:00:16    0:00:17
Grid-short-20k    0:03:50  0:02:38    0:04:45   0:02:28    0:10:58       0:00:34    0:00:35     0:05:08    0:05:13
Grid-short-100k   0:18:30  0:26:56    0:36:09   0:25:56    2:00:10       0:03:13    0:03:23     2:08:01    2:30:21
Grid-short-250k   0:48:25  0:46:38    2:24:25   0:43:43    6:12:43       0:11:56    0:11:32     9:41:43    13:06:48
Grid long & narrow
Grid-long-4k      0:00:34  0:00:26    0:00:36   0:00:29    0:00:37       0:00:31    0:00:36     0:06:07    0:05:19
Grid-long-20k     0:05:52  0:08:47    0:05:44   0:06:19    0:12:38       0:01:17    0:01:33     0:24:09    0:23:06
Grid-long-100k    1:15:58  6:03:29    2:51:26   21:24:13   3:33:23       0:06:12    0:06:47     7:59:38    7:57:49
Grid-long-250k    8:00:34  >24h       19:39:14  >24h       22:28:43      0:17:32    0:19:51     >24h       >24h
Grid increasingly long
Grid-longer-1     0:17:08  0:32:07    0:16:28   0:22:10    0:42:28       0:01:21    0:01:24     0:42:50    0:41:19
Grid-longer-2     0:25:52  0:30:12    0:24:21   0:20:40    0:45:45       0:02:25    0:02:05     1:08:01    1:06:03
Grid-longer-3     0:31:18  0:18:58    0:27:31   0:16:19    0:49:49       0:05:03    0:04:55     2:13:56    2:11:31
Grid-longer-4     0:46:44  0:19:41    0:45:13   0:19:37    0:50:25       0:11:32    0:11:17     5:25:24    5:15:52
Grid-longer-5     0:49:23  0:17:05    0:45:46   0:18:14    0:44:47       0:20:00    0:18:44     Error      Error
Grid increasingly fine
Grid-fine-1       0:01:12  0:00:38    0:01:17   0:00:21    0:04:34       0:00:44    0:00:51     0:05:21    0:05:19
Grid-fine-2       0:06:26  0:04:53    0:19:05   0:03:09    0:49:52       0:05:08    0:05:35     1:24:29    1:22:49
Grid-fine-3       0:47:02  0:27:01    3:33:00   0:18:49    10:53:58      0:37:15    0:32:35     Error      Error

Table 2.9: The running times for all artificially generated instances.
Instance         Forward Seeking   LEDA, cost = τ   LEDA, cost on A   lodyfa    ZET
Padang-10s-10g   0:00:28           (E: 0:02:10)     >24h              5:08:02   4:30:36
Audimax          0:00:01           >24h             (E: 0:00:05)      0:00:32   0:06:25
A380             0:00:01           0:00:02          0:00:02           0:00:10   0:00:26
Netgen-2         0:00:01           0:00:15          0:00:17           0:06:05   0:01:15
Grid-short-4k    0:00:11           0:00:56          0:00:56           0:26:52   0:06:57

Table 2.10: The results for the other SSP algorithms in contrast to Forward Seeking. Note that LEDA reported errors on some runs, for which we list the time spent up to the error.

2.6.6 Results for the SSP Algorithms

Here we compare our code against other implementations of the SSP algorithm and show where the advantages of our approach stem from. For the hardware and software environment used, see the beginning of the previous section. Just like for our approach, we report wall-clock time for the Java program ZET. For LEDA, we report CPU time without the time spent reading the input. For lodyfa, we report overall CPU time, including reading the input data. We have to run it under the Windows-like environment Wine; being a command-line program with almost no user or system interaction, this should not affect the speed significantly. However, lodyfa crashes upon writing the solution under Wine, which incidentally makes the measurement more useful than if the output were also included.

There clearly are limits to SSP-based approaches, and we only run the other SSP algorithms on a few selected instances that have neither a too large time-expanded network nor an unusually large supply. These are the instances Padang-10s-10g, Audimax, A380, Netgen-2, and Grid-short-4k. For the results, see Table 2.10. Even though LEDA had trouble finishing some of these instances, it was generally the fastest of the other SSP implementations. We looked for larger instances on which it would finish correctly. This proved troublesome, but on those that did work, LEDA performed well; see Table 2.11. It is interesting to see that the general-purpose SSP algorithm in LEDA is faster than the special-purpose algorithms lodyfa and ZET. This at least speaks for the optimization efforts that presumably went into LEDA. But it is still slower than any of the algorithms we tested in the previous section. Besides, benchmarking LEDA was somewhat tiresome because it seems to have another bug that leads not to crashes but to infinite loops on some instances.

The Effect of the Improvements

We want to highlight what makes our approach work significantly faster than the other SSP algorithms. For this, we have summarized an evolution of the additions that we made in our implementation, as described in the preceding sections. This does not necessarily reflect the order in which we developed
those options, but rather their complexity. This is to enable others to achieve good results more quickly by implementing only the necessary features. We show these advances for both the forward and the reverse search. The process is the same for both search directions, except for the last one or two steps, respectively. See Figures 2.37 and 2.38, as well as Table 2.12, for the results on the instances Padang-1s-1g, Telefunken, and Grid-long-20k, which we believe highlight the effects. We also include the number of iterations of the outer loop of the SSP algorithm in the plots, to show which options affect it.

Instance         Forward Seeking   LEDA, cost = τ   LEDA, cost on A
Grid-short-20k   0:02:38           0:11:24          0:11:37
Grid-longer-1    0:32:07           0:54:12          (E: 0:14:19)
B747-front       0:00:01           0:00:10          0:00:10
Netgen-3         0:00:53           0:07:47          0:06:47

Table 2.11: The results for LEDA on some instances which it could finish successfully.

We start with a breadth first search that already uses intervals but only augments a single path in each iteration. (The search stops as soon as a path has been found.) This version uses standard AVL-trees for the data structures. Of course, we cannot completely unroll our development, and so the data structures and the algorithm are not as optimized for these settings as they could be. For example, the vertex labels already include the flag indicating whether an interval has been processed, and this flag is also read and written, even though it serves no purpose with a breadth first search and without vertex label clean-up.

The first big improvement is to let the search find multiple parallel paths per iteration. By design, this can only help instances with multiple flow units traveling between different pairs of terminals at each time step, but this is all too common. Enabling the repetition of paths is the next step, and it helps a lot for instances that have many flow units starting at the same vertex. Even when this is not the case, it is computationally inexpensive, so it should always be considered.

Then we sort paths (but not repeated paths) before we try to augment them. Recall that this is meant to emulate a breadth first search (or the Edmonds-Karp algorithm). However, there is little benefit to this, and it even decreased performance noticeably on some instances. We would therefore have to recommend against it. It might be useful once other settings (like guidance) shift the search away from a breadth first search, but we did not test this further.

Reducing the quick cutoff setting from ∞ to 0.5, so that the search stops after 50% more propagation steps once a shortest path has been found, is mostly a safety switch. It prevents finding too many overlapping paths, which can happen in the reverse (and mixed) search, but usually does not cut off too many paths. Such a setting should also prevent some of the almost-infinite loops that might occur on bad instances even though a path has already been found. Here, the performance of the forward search suffers on the Padang instance, so a larger setting might have worked better overall for the forward breadth first search.
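To make the quick cutoff rule concrete, here is a minimal sketch (hypothetical interfaces, not our actual code) of a search loop that, once the first shortest path has been found, allows at most a fixed fraction of additional propagation steps:

    /** Sketch of the quick cutoff rule: a cutoff of 0.5 permits 50% more
     *  propagation steps after the first path is found, a cutoff of 0
     *  stops immediately, and infinity never cuts the search off. */
    final class QuickCutoffSearch {
        interface Search {
            boolean hasWork();       // pending interval propagations
            void propagateNext();    // perform one propagation step
            boolean pathFound();     // has a shortest path been completed?
        }

        static void run(Search search, double cutoff) {
            long steps = 0;
            long budget = Long.MAX_VALUE;  // unlimited until a path is found
            while (search.hasWork() && steps < budget) {
                search.propagateNext();
                steps++;
                if (budget == Long.MAX_VALUE && search.pathFound()) {
                    budget = steps + (long) (cutoff * steps);
                }
            }
        }
    }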
Figure 2.37: How the options affect the running times of the forward search (red line) and the number of iterations (black line) on the instances Padang-1s-1g, Grid-long-20k, and Telefunken. The steps on the x-axis are: single path per iteration with quick cutoff 0; multiple paths per iteration with quick cutoff ∞; repeated paths; sorting paths before augmentation; quick cutoff 0.5; vertex label clean-up; faster data structures; seeking guidance; quick cutoff 0.1.

A major point is then the option which we called vertex label clean-up. It enlarges the interval to be propagated before the actual propagation. (This could also be performed on-the-fly in the data structures.) This generally halves the running times, and we do not see how such a feature would be possible in a general-purpose SSP algorithm. Faster data structures require a lot more work, but as mentioned before, they are also crucial, and much can be gained by choosing the best solution for each of the interval-based data sets; see the section on implementation details. This results in the settings of Forward BFS in the benchmarks.

At this point, we enable the seeking guidance in the forward search. As we have seen in the previous section, this is not always beneficial. It is, in a way, a step back towards the beginning, towards a single path per iteration and, thus, significantly more iterations overall. The effect is not just in the number of
paths found, though: Despite its sophistication, seeking guidance does not find paths quicker than the breadth first search on some instances with a dense flow, like Grid-long-250k or even Grid-long-20k used here. As we explained for those instances, seeking guidance tends to produce very convoluted paths that run wildly through backward arcs in the early parts of the time-expanded network, only to then climb back to the much later current arrival time at the sink. However, if seeking guidance is employed, one should also adjust the quick cutoff value further down, because fewer paths are found in each iteration anyway. This is Forward Seeking in the benchmarks.

Figure 2.38: How options affect the running times of the reverse search (yellow line) and the number of iterations (black line). Note that the reverse search would initially spend up to 1 second on each iteration.

For the reverse search, though, there is a last major improvement to the breadth first search: Tracking vertices that will never be reachable again from the sources (by an occasional forward search) makes the algorithm twice as fast. This is Reverse BFS in the benchmarks.
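To make these interval mechanics concrete, the following minimal sketch (in Python, with an illustrative list-based layout rather than the AVL-trees of our implementation) shows how the reachable times of a vertex can be kept as disjoint half-open intervals, with the clean-up merge applied on insertion so that the enlarged interval is what gets propagated:

    import bisect

    class ReachabilityLabel:
        """Reachable times of one vertex as sorted, disjoint [lo, hi) intervals."""

        def __init__(self):
            self.intervals = []  # list of (lo, hi) tuples, sorted by lo

        def insert(self, lo, hi):
            """Mark [lo, hi) as reachable, merging with touching intervals."""
            i = bisect.bisect_left(self.intervals, (lo, hi))
            # swallow every earlier interval that touches or overlaps [lo, hi)
            while i > 0 and self.intervals[i - 1][1] >= lo:
                lo = min(lo, self.intervals[i - 1][0])
                hi = max(hi, self.intervals[i - 1][1])
                del self.intervals[i - 1]
                i -= 1
            # swallow every later interval that touches or overlaps [lo, hi)
            while i < len(self.intervals) and self.intervals[i][0] <= hi:
                hi = max(hi, self.intervals[i][1])
                del self.intervals[i]
            self.intervals.insert(i, (lo, hi))
            return lo, hi  # the enlarged interval, ready to be propagated

    label = ReachabilityLabel()
    label.insert(3, 5)
    label.insert(7, 9)
    print(label.insert(5, 7))  # clean-up merges all three into (3, 9)

Returning the merged interval is exactly the vertex label clean-up described above: the propagation then always works on the largest available interval instead of many small fragments.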
Forward Search Settings                      Padang-1s-1g   Telefunken   Grid-long-20k
Single path / iter., quick cutoff 0          5:00:20        0:00:11      1:03:39
Multiple paths / iter., quick cutoff ∞       2:17:16        0:00:09      0:12:34
Repeated paths                               1:26:19        0:00:09      0:11:57
Sort paths before augmentation               1:24:57        0:00:09      0:12:25
Quick cutoff 0.5                             1:35:07        0:00:09      0:12:25
Vertex clean-up                              0:41:04        0:00:04      0:08:08
Faster data structures (= Forward BFS)       0:27:50        0:00:04      0:05:23
Seeking guidance                             0:11:13        0:00:04      0:08:12
Quick cutoff 0.1 (= Forward Seeking)         0:10:39        0:00:04      0:08:12

Reverse Search Settings                      Padang-1s-1g   Telefunken   Grid-long-20k
Single path / iter., quick cutoff 0          17:13:15       0:05:58      2:54:30
Multiple paths / iter., quick cutoff ∞       5:44:13        0:04:14      1:09:06
Repeated paths                               3:40:05        0:04:55      1:08:05
Sort paths before augmentation               3:47:33        0:04:38      1:08:05
Quick cutoff 0.5                             3:46:02        0:04:37      1:06:17
Vertex clean-up                              1:39:13        0:00:10      0:33:11
Faster data structures                       1:03:53        0:00:10      0:21:46
Track unreachable vertices (= Reverse BFS)   0:31:29        0:00:04      0:15:29

Table 2.12: The improvements made to the searches (beyond using intervals).
Figure 2.39: The running times on the Padang instances plotted against the approximate number of paths used by the SSP algorithm. From left to right, the instances are Padang-10s-10g, -5g, -2g, -1g, Padang-5s-5g, -2g, -1g, Padang-2s-2g, -1g, and Padang-1s-1g. Clearly, our algorithm depends on this number of paths, while cs2 depends mostly on the size of the time-expanded network.

The Problem with Large Supplies

The benchmarks showed that large supplies, or more precisely a large number of paths to augment, can pose serious problems for the SSP algorithms, leading to running times much worse than those of cs2. Indeed, if we plot the running times against the number of path augmentations instead of the size of the time-expanded network, a clearer picture emerges. See Figure 2.39 for all Padang discretizations (without flooding), Figure 2.40 for the short and wide grids, and Figure 2.41 for the challenging narrow grid set.

Of course, the number of iterations of the SSP algorithm, which is a major factor in the overall running time, and the number of paths required are correlated, so it is no surprise to see such a dependency on the number of paths. But this explains the sudden jumps in the running times from Grid-short-4k to Grid-short-20k in Figure 2.33, and from Grid-long-4k to Grid-long-20k in Figure 2.34.

As a general guideline for instances of this size, we would be cautious if more than 10,000 paths might occur. Of course, judging this from just the supplies is difficult, but a supply of less than 10,000 should be safe. Larger numbers of paths might work well, e. g., on the Berlin instances. The Padang instances with up to 135,000 paths are also doable, and our algorithms are still among the fastest on them, but some of the grid instances with 20,000 paths already take too long compared to cs2. As mcf behaves similarly to
Figure 2.40: The running times on the short and wide grid instances plotted against the approximate number of paths used by the SSP algorithm. This seems to express the dependency better than Figure 2.33.

Figure 2.41: The running times on the long and narrow grid instances plotted against the approximate number of paths used by the SSP algorithm. Again, this is a better representation than the previous Figure 2.34.
our approaches, we believe that the very efficient push-relabel maximum flow subroutine gives cs2 a strong advantage over other ways to establish a flow.

One property of our approach, and of ZET as well, on instances with large supplies is that the already established flow value increases ever more slowly as the algorithm progresses. It is quite common that about 50% of the total flow value is already established within the first 10% of the execution, while the last 10% of the flow requires 50% of the time. There are three effects causing this, which we try to explain using Figures 2.42 and 2.43.

First, the shortest path search itself becomes slower with each iteration. This is certainly instance dependent, and the shortest path search does not become monotonically slower, but the overall trend is noticeable and seems to be linear or slightly worse in practice. This can be due to the paths from the sources to the sinks becoming longer (nearby sources are depleted sooner), as well as the scanned portion of the time-expanded network and the corresponding data structures becoming larger. Second, the number of paths returned by the search decreases. While in the beginning multiple (parallel) paths are typically found (due to many sources close to the sinks being active, for example), in the long run each search returns only a single path. Third, the average flow augmented per path may decrease as well, which can be seen in the charts where the lines of the paths and flow values coincide in later iterations.

2.6.7 Conclusion from Computational Results

The computational results are a mixed bag for our approach. It was clear from the beginning that an SSP algorithm cannot deal with arbitrarily large supplies efficiently, and that there would be instances where a standard algorithm on the time-expanded network beats our approach. This becomes more likely the smaller the time horizon and the larger the total supply is. However, this happened much sooner than expected because the performance of the existing solver cs2 is on such a high level, in absolute terms as well as in how it scales, that there was no instance on which it could not be used. Its downsides are that it is only free for academic purposes and that it has higher memory requirements than ours or any of the other algorithms that we tested.

On the bright side, our algorithm performs well and often best on the large real-world instances, especially when we employ the guidance heuristic. The most suitable place for our algorithm, though, is in the area of building evacuations. There, detailed networks and small supplies make an efficient SSP-based approach like ours very attractive, achieving speeds that are suitable for an interactive computation of an earliest arrival flow.

For those considering an implementation of their own, we would recommend starting with an interval-based breadth first forward search and then adding the following important options: multiple paths per iteration and repeated paths, vertex clean-up, and the seeking guidance heuristic. A sketch of the resulting outer loop follows.
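The following sketch shows what we mean by that outer loop. All interfaces (find_shortest_paths, bottleneck, shifted_later and so on) are placeholders for this illustration, and the repetition rule shown is a simplification of the repeated-paths option:

    def ssp_earliest_arrival(network, supplies):
        """Skeleton of the recommended SSP loop (illustrative interfaces)."""
        flow = network.empty_flow()
        while supplies.remaining() > 0:
            # interval-based breadth first search in the residual
            # time-expanded network; returns several parallel shortest
            # paths at once instead of a single one
            paths = find_shortest_paths(network, flow, supplies)
            if not paths:
                break  # nothing reaches a sink within the time horizon
            for path in paths:
                while path is not None and path.is_augmenting(flow, supplies):
                    amount = path.bottleneck(flow, supplies)
                    flow.augment(path, amount)
                    supplies.consume(path, amount)
                    # 'repeated paths': retry the same spatial path shifted
                    # one time step later, which often stays augmenting when
                    # many flow units start on the same vertex
                    path = path.shifted_later()
        return flow

The two inner loops correspond exactly to the options that paid off in Table 2.12: many parallel paths per call of the search, and cheap repetitions of each path before the next search is started.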
Figure 2.42: The behavior of Forward Seeking on the instance Padang-1s-1g. The later iterations are slower and return fewer paths with fewer flow units on each of them.

Figure 2.43: The behavior of Forward Seeking on the already challenging instance Grid-long-20k. A similar trend occurs as on the Padang instance.
2.7 Using the Solution

The solution to the MinTravelTime problem is mostly independent of the algorithm used to achieve it. (The actual flows computed can differ, but the effect of this on real-world instances should be marginal.) However, as we already mentioned in the introduction, the results obtained from network flow theory are not the solution to an actual evacuation problem, because the assumptions are too strong: The model cannot account for human behavior, varying speeds or special needs of evacuees, for example. That is not to say that such a solution is without merits, though, as we will see in the following.

For the Adaptive Verkehrssteuerung project, we used the computed solution as the starting solution for MATSim. (See Section 2.1.2 for a short introduction to MATSim.) By default, the starting solution in MATSim sends every agent on a shortest path to the nearest sink. This is typically a configuration of very low quality, leading to much jamming. By contrast, the solution computed using flows over time had a much better objective value when executed in MATSim, although there were small differences because the models are not completely equivalent. In any case, MATSim proceeds to refine the solution iteratively using a best-response approach. This process can and often does end in different states (where the end is defined by a user-specified number of iterations). A sample of such a process (using two distinct strategies within MATSim) is given in Figure 2.44.

Using the optimum network flow solution as a starting solution, compared to starting with shortest paths, has little effect on the quality of the final solution in the Padang instance. This may be due to the large changes that MATSim performs in each iteration, rerouting 10% of the agents on shortest paths given the observed network loading. This quickly degrades the overall quality of the solution in the beginning, as can be seen in the figure. Whether a more careful rerouting would improve the results (after all, the theory calls for an ε-fraction of the agents) or whether this is an unavoidable consequence of the price of anarchy (see [80] for an introduction) is out of the scope of this work.

However, the network flow solution at least provides a good estimate of the social optimum. Without it, judging the quality of the outcome in MATSim would be more difficult, especially with multiple final states. The same holds true for other simulations, which rarely give a lower bound on the optimum egress time or total travel time.

Another example of how to use network flows is given by the authors of ZET: They discuss various methods to assign evacuees to exits [21]. This first requires a path decomposition of the optimum flow. The idea, however, is to ignore the precise paths taken by each evacuee and only consider which person goes to which exit, leaving the choice of the actual path to the evacuees themselves. Because the optimum network flow solution distributes the evacuees optimally among the possible exits (sinks), one can hope to carry this partition over to more complex models like the cellular automaton employed in ZET, or to draw conclusions from it about how emergency exit signs should be placed. (The latter also points in the direction of Chapter 4 on confluent flows, a
network flow model inspired by emergency exit signs.)

Figure 2.44: The quality of the solution over many iterations of MATSim. The earliest arrival flow is the starting configuration and also the reference. The NE line aims for a Nash equilibrium, while MSCb is the social cost approach described in Lämmel et al. [63] and aims for a global (social) optimum.

Such an approach also seems viable for the Padang instance, because flow units starting at similar places are often headed for the same sink. This is a reasonable outcome and could serve as the basis for further planning. Figures 2.45 and 2.46 show how the exit assignments partition Padang and Berlin. We chose similar colors for sinks close to each other on purpose, because there are many sinks in these examples that are only a block apart from each other. With a few exceptions (see the descriptions of the figures), a clear pattern emerges which agents should head into which direction.

Where the assignment is ambiguous, one can expect to find bottlenecks: For this to occur, the paths to the sinks must be saturated enough to make up for the extra time spent traveling to another sink. If that other sink has available capacity, this can simply be seen as a shift in the border between the areas, and the exit assignment would be clear. But if flow units from the same area alternate between various sinks, there is no good choice and traffic jams form everywhere. The same behavior can be seen in a supermarket with many check-out lanes: If all lanes are similarly long, people might end up choosing one randomly instead of the one they were originally closest to. If instead only a single lane is too long, people fan out to the neighboring lanes.

One of the short and wide grids, Grid-short-20k, shown in Figure 2.47, displays a pattern that indicates severe bottlenecks as well, because flow units travel far to reach the sinks. Figure 2.48 shows the flow units assigned to a single sink on the long and narrow grid, Grid-long-20k. Naturally, the further away flow units start from the column of sinks, the less reason there is to choose one sink over another.
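Computing such an exit assignment from a path decomposition is straightforward. The sketch below assumes the decomposition is available as (source, sink, value) triples, a hypothetical layout, and assigns each source to the exit that receives most of its flow; near-ties are exactly the ambiguous cases discussed above:

    from collections import defaultdict

    def exit_assignment(path_decomposition):
        """Assign each source (starting area) to the exit to which the
        optimum flow routes most of its evacuees.

        path_decomposition: iterable of (source, sink, flow_value) triples,
        one per path of the decomposed earliest arrival flow.
        """
        flow_to_exit = defaultdict(lambda: defaultdict(float))
        for source, sink, value in path_decomposition:
            flow_to_exit[source][sink] += value
        # per source: the sink with the largest share; close seconds signal
        # ambiguity and hence potential bottlenecks in front of several exits
        return {
            source: max(sinks.items(), key=lambda kv: kv[1])[0]
            for source, sinks in flow_to_exit.items()
        }

    paths = [("A", "east", 30), ("A", "north", 5), ("B", "north", 20)]
    print(exit_assignment(paths))  # {'A': 'east', 'B': 'north'}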
Figure 2.45: The exit assignment in Padang resulting from an earliest arrival flow. The larger circles are the exits, which are generally on the right edge of the graph. The direction to the exit is usually clear, except around the red/blue part in the middle. This might be an indicator of bottlenecks in front of all sinks in that area. In such a case, flow units starting further away need to wait wherever they go, and they can choose any of those sinks at no additional cost.

Instead of only looking at a single instance, a set of variations of one instance could also be computed. This could account for environmental hazards, or for unknown or unusual occupation patterns. From this, more reliable evacuation plans could be crafted. For models describing large buildings with a number of evacuees in the low thousands, our approach is, to the best of our knowledge, the most suitable to run such comparisons on current desktop computers.
Figure 2.46: The exit assignment in Berlin shows clear directions for almost all vertices. This might be due to the capacities being large enough to avoid bottlenecks. Still, one can see where to draw the border between close-by sinks, or how to balance splitting the north of Tempelhofer Feld.
Figure 2.47: An exit assignment for Grid-short-20k. The sinks are the blobs in the leftmost column. The spotted pattern indicates bottlenecks. The sinks near the top and bottom show more pronounced areas. There is a slight upwards trend in the overall direction, which can be explained by the random placement of the sinks.

Figure 2.48: The exit assignment of a single sink in Grid-long-20k. Flow units traveling from the far right to the left column containing the sinks have little reason to choose a specific sink after such a long and probably congested path. Sinks located closer to the top or bottom have slightly better defined areas from which they draw flow units.
2.8 Outlook

We have seen how our algorithm performs on various instances, and for which it is suitable and for which it is not. Based on this, it would be interesting to include our algorithm in evacuation tools like ZET that use highly detailed models for simulations and optimization. There are a few desirable features still missing from our discussion, though. We implemented some, but not all, of these and try to pass on our experiences with and ideas for them. Note, though, that our current code base does not support these features anymore.

2.8.1 Meta-Heuristics

We see further possible improvement in introducing meta-heuristics to steer the behavior of our algorithm, in particular the shortest path search. For example, one could change the direction of the shortest path search (forward, reverse or mixed) or the guidance scheme at any iteration depending on previous success. This would also make the program applicable to a wider range of problems without needing user input. A larger set of typical instances would be needed to tune such meta-heuristics, though, which we do not have access to at this point.

2.8.2 Holdover

Holdover, or storage at vertices, is not only important for real-world instances but can also improve performance. The holdover arcs in the time-expanded network can be derived by time-expansion of arcs (v, v) (i. e., loops) with transit time 1. Thus, holdover can already be modeled without any special code. However, this does not do justice to the effect that holdover could have on the shortest path search. It is much more efficient to think of consecutive usage of holdover arcs as a single instance of waiting. For example, this easily avoids the unnecessary pseudo-polynomial effort that occurred in Figure 2.7.

The forward propagation for holdover should be from [t1, t2) to [t2, ∞) for infinite holdover capacity or, more generally, up to the first time at which a holdover arc has no residual capacity. Backward holdover arcs behave vice versa, but cannot have positive capacity indefinitely.

It would be technically correct to propagate holdover at some point when a task is processed, but our suggested solution appends the processing of holdover immediately to every propagation step. That is, when the label of a vertex is marked as reachable, holdover is applied immediately. The algorithm is already working through the vertex label at that point and should simply continue to consider the holdover arcs as well. By induction, the program never needs to look past already reachable intervals, as holdover for those has already been considered. This also ensures that when a task is considered, the implications of holdover are already known. Otherwise, one task like this would actually be two tasks, as the original interval label and the interval label created from the holdover arcs are one phase apart in the
breadth first search. Finally, this treatment of holdover creates larger, and thus preferable, intervals in the data structures.

A different effect of unlimited holdover on the shortest path search is that it would allow us to reduce the reachability label to the earliest time at which a vertex can be reached. Such a single reachability label t1 on each vertex would encode being reachable at [t1, ∞). This can already be expressed in the intervals, of course, but might require multiple intervals as opposed to a single integer. However, we argue that such a fundamental change to the algorithm is not justified for most usage cases, and that it is better to maintain the vertex labels in intervals like we already do. First note that unlimited holdover is not realistic in many instances. For example, in the instances from ZET, vertices may represent the place that a single person occupies. Holdover arcs with limited capacity, e. g., a single flow unit, are therefore appropriate. They require time-dependent labels, though.

Besides, with vertex clean-up our suggested approach already emulates the behavior of the algorithm optimized for unlimited holdover. When a vertex is reachable at t1 at the earliest, the union of the not yet scanned intervals must be [t1, t2), where t2 is the lower bound of the most recent task performed for this vertex. All of [t1, t2) would also be marked as reachable by propagating holdover immediately, as we suggested. Vertex clean-up then guarantees that this interval [t1, t2) would be used the next time any part of it is to be scanned. Of course, this still requires larger data structures and more costly access than a single integer on each vertex. But unless one expects only instances with unlimited holdover, ours seems like an acceptable solution. In any case, adding holdover to instances would probably improve the performance of our algorithm due to larger intervals in the propagation steps, while standard approaches would have to consider slightly more arcs.

There also is a result by Möhring et al. [73] which shows that the search with unlimited holdover, but ignoring the residual arcs, can be done in time polynomial in the interval description of the flow. The algorithm is a variant of Dijkstra's algorithm, which could be replicated by our greedy guidance scheme by always selecting the earliest task. However, the polynomiality breaks down under the negative transit times of the residual arcs. The example in Figure 2.12 still applies with holdover, because holdover offers no improvement there.

2.8.3 Time-dependent Capacities and Travel Times

Our original model for flows over time used time-dependent capacities, while in the algorithm arcs have constant capacity on the interval W(a) when they are available at all, and no capacity otherwise. This can describe arcs with changing capacities only inefficiently, by multiple parallel arcs. To avoid this, one can store the capacity of an arc as yet another time-dependent data structure built from intervals and consider these interval boundaries when determining the residual capacities and their projections to boolean functions, too. None of these preparations would have to occur during the search, and they should therefore have little impact on the running time. The search benefits from having to consider fewer arcs, though. A sketch of such a data structure is given below.

Similarly, time-dependent travel times can be modeled by multiple parallel arcs that exist at disjoint times. This, too, can be handled more efficiently by accessing only the needed arcs for each interval. However, multiple time-expanded arcs may arrive at the same time (if τ(a, t) + t is not injective), which makes this more troublesome than time-dependent capacities. If time-dependent travel times occurred commonly in the instances, it would very likely be worth implementing them with appropriate look-up tables for determining the incoming arcs at each point in time. However, neither of these cases arose in our instances, so we did not implement this.
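The sketch below illustrates one possible layout for such an interval-based, time-dependent capacity: breakpoints with piecewise-constant values, plus the query the propagation would use to split its intervals. The class design is only illustrative and not part of our code base:

    import bisect

    class TimeDependentCapacity:
        """Piecewise-constant capacity of one arc, stored as breakpoints.

        The capacity is values[i] on [breakpoints[i], breakpoints[i+1]),
        and values[-1] from the last breakpoint onwards.
        """

        def __init__(self, breakpoints, values):
            assert len(breakpoints) == len(values) and breakpoints[0] == 0
            self.breakpoints = breakpoints
            self.values = values

        def at(self, t):
            """Capacity in effect at time t."""
            i = bisect.bisect_right(self.breakpoints, t) - 1
            return self.values[i]

        def boundaries_within(self, lo, hi):
            """Breakpoints strictly inside (lo, hi); the propagation has
            to split its interval at these times."""
            i = bisect.bisect_right(self.breakpoints, lo)
            j = bisect.bisect_left(self.breakpoints, hi)
            return self.breakpoints[i:j]

    cap = TimeDependentCapacity([0, 10, 20], [4, 2, 4])
    print(cap.at(15), cap.boundaries_within(5, 25))  # 2 [10, 20]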
2.8.4 Warm-Starting the Path Search

Because the shortest path search is the most costly function and is called many times on similar data, a warm-start heuristic could prove helpful. For example, one could use the shortest path labels from the last iteration to guide the propagation. Note that this needs to be done efficiently, as simply replaying the old search in full detail would take similar time as the current search itself. Rather, one should try to condense the gained information into a useful guidance heuristic. At a very early stage, we experimented with storing the earliest time at which a vertex is reachable (like the data for track unreachable) to restart the search. But not having a reasonable guidance framework yet, this was not effective.

One big step further would be the idea of updating the shortest-path tree as residual arcs appear or disappear. Dynamically updated shortest path trees specialized to time-expanded networks with the travel time cost function might even be an interesting research direction in its own right. General data structures for dynamic trees have been known for a while, though; see Sleator and Tarjan [87] for the original work. Sleator and Tarjan already used their data structure to improve Dinic's algorithm [18] for maximum flow computations. So one could also try to use such an approach instead of multiple path searches. This directly leads to the following general idea.

2.8.5 Alternatives to the Path Search

Upon closer inspection, the SSP algorithm does not depend on adding individual paths one at a time. Any sum of path flows, all with minimum cost, could be augmented at once. In our case, the most useful choice would be a maximum flow from the supersource to the supersink through copies of the sinks at the current arrival time. Once this has been added, by definition of the maximum flow, the arrival time has to increase. So the shortest path search discussed above is only one way that will eventually construct this maximum flow. Much faster general-purpose maximum flow algorithms are known, as already seen in Table 1.1. These could be run on the relevant portion of the time-expanded network for each arrival time within the framework of the SSP algorithm. However, we decided to stay closer to the SSP algorithm to explore its potential and because, as stated in the assumptions in Section 2.5.1, we do not expect large total supplies. This is even more relevant here, because
each maximum flow computation would only replace at most arrival(t) path searches, which in the extreme case might be a single path. Furthermore, the finer the discretization is, the fewer flow units arrive per time step and the less there is to gain from such a maximum flow computation. Thus, the SSP is more in line with our goal of handling fine discretizations better than standard algorithms.

2.8.6 Shelters

One aspect that came up in the course of the Adaptive Verkehrssteuerung project was sinks with limited demands. In the case of the city of Padang, there was the idea to identify buildings able to withstand an earthquake and the subsequent tsunami. Such a building has to be high enough so that people are above the effects of the tsunami while still in the threatened area. This is called vertical evacuation in disaster management. Suitable buildings, which we call shelters, could be, for example, hotels or hospitals. The downside is that such buildings can only hold a limited number of evacuees. Besides, entrances to reach higher floors can be narrow, restricting the sustainable flow rate into such a shelter.

To measure the effect that shelters can have on the evacuation time, the MATSim model as well as our flow algorithms had to be adapted to handle limited sinks. The earliest arrival property is generally not achievable with limited sinks, but a flow with minimum travel time still exists (see Section 2.4).

The most fundamental change is that the shortest path search now truly has to search for shortest paths; that is, not all paths from the supersource to a time-expanded copy of a vertex have the same length. (Lemma 2.4 cannot be applied for limited sinks.) Hence, the vertex labels need to store a distance label and not just a boolean reachability flag. This is easily possible, but raises the question whether intervals are still a suitable representation for such a general distance label. The answer is yes, with slight modifications.

We will state the following idea in the cost model that puts the costs for travel times on the actual time-expanded arcs and no costs on the arcs to the sinks. The reachability labels are then equivalent to cost labels such that label(v_t) = t for all vertices v and times t. This can be generalized to an offset plus a linear increase with slope 1. That is, for an interval [t1, t2) we only store the distance of the vertex at t1 and assume that the label then increases with slope 1. Of course, if slope 1 is possible, we could allow arbitrary slope values. But the only other slope we need to consider is slope 0, that is, the distance label stays constant throughout the interval.

To see this, we have to reexamine the various propagation steps under this assumption, which we will do for the forward search. A time-expanded arc a_t simply translates the offset of such an interval-based distance, independent of its slope. The arcs out of the supersource s++ create intervals with slope 1. The residual arcs out of a sink s, which were not needed for unlimited sinks, create intervals with slope 0. Using such a residual arc corresponds to undoing flow into a shelter s, while the same path also sends flow into s at another time. Note that such a path has a higher cost than just the time at which
it arrives at the supersink. The opposite operations, entering a source or a sink, can simply choose the label from the lowest time in the interval, as both slopes are non-negative. Holdover arcs, if modeled, have a cost of one and thus create labels with slope 1 when propagated. Propagating along residual holdover arcs starting within an interval with slope 0 may require an update of earlier parts of the same interval. Of course, all the propagation steps need to take into account that the target vertex might already be reachable and that the propagation must improve its label. In essence, we perform an interval-based Moore-Bellman-Ford algorithm to find the shortest paths.

For implementation purposes, consider that this can break one label interval into 3 smaller parts that need scanning, even though only two tasks were involved, which makes a single pointer to the label per task insufficient. We therefore chose to store the interval bounds with each task instead and accessed the data structure of the originating vertex based on these. The vertex label clean-up also has to account for the two types of cost labels on vertices when joining them. Intervals of the form [t, t + 1) can be interpreted as having slope 0 or 1, though. In particular, there is some freedom if the original task interval to propagate had length 1.

We only implemented the forward and backward search with shelters, but this functionality was no longer maintained when the code was refactored at various stages. We still want to quickly summarize our experiences with these searches. The limited sinks had only a small influence on the search most of the time, because the paths to the supersink that actually used backward arcs from S++ back into the time-expanded network only need to be examined when the current distance is about to increase. Therefore, our algorithm gave the lowest priority to these steps. Only when the regular search could find no path for the current arrival time were tasks from satisfied sinks propagated. With the relatively small overall capacity of the shelters in our instances, however, only few additional paths were used that way. So for most of the time, the vertex labels were simply reachability labels and the search behaved as without shelters.
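A minimal sketch of these two-slope interval labels: the representation (offset at the interval start, slope 0 or 1) and the translation along a time-expanded arc follow the description above, while the concrete layout is only illustrative:

    from dataclasses import dataclass

    @dataclass
    class DistanceInterval:
        """Distance labels of one vertex on the time steps [start, end).

        The label at time t is offset + slope * (t - start); slope 1 arises
        from labels created at the supersource (cost grows with time),
        slope 0 from residual arcs out of a satisfied shelter.
        """
        start: int
        end: int     # exclusive upper bound
        offset: int
        slope: int   # 0 or 1

        def label(self, t):
            assert self.start <= t < self.end
            return self.offset + self.slope * (t - self.start)

    def propagate(interval, transit_time):
        """Propagation along a time-expanded arc with cost = transit time:
        the interval shifts in time, the offset grows by the cost, and the
        slope is preserved."""
        return DistanceInterval(interval.start + transit_time,
                                interval.end + transit_time,
                                interval.offset + transit_time,
                                interval.slope)

    d = DistanceInterval(start=0, end=5, offset=0, slope=1)
    print(propagate(d, 3).label(4))  # 4: offset 3 plus slope over one step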
Chapter 3

Flows with Aggregate Arc Capacities

Flows with aggregate arc capacities, less formally bridge flows, stem from a relatively new model for arc capacities in flows over time. In such a flow, capacities bound the amount of flow that is allowed to enter an arc within a sliding time window of some given length. This is opposed to traditional capacities that limit the flow rate into an arc. See Figure 3.1 for a comparison. This allows modeling capacities that limit flow rates on average. In our work, the lengths of the capacity windows are independent of the transit times of the arcs. We will see that choosing a small enough time window additionally lets us model traditional flow rate capacities. At the other extreme, aggregate arc capacities with infinite time windows can bound the amount of flow that ever enters an arc within the time horizon. Traditional flow rate capacities lack this ability.

A natural motivation leads to flows with aggregate arc capacities and originally coined the term bridge capacities. In the context of a street network, flow rate capacities can model the number of lanes of a road. But to model a load limit on a bridge that supports fewer trucks than the number of lanes might suggest, one can use aggregate arc capacities with a measuring window as long as the transit time of the arc. Thus, the aggregate arc capacity exactly bounds the amount of flow on the arc at any given point in time, that is, the number of vehicles on the bridge. The same reasoning can be used to model long tunnels, which usually have a load limit as well. These are regulated by traffic lights just before the entrance to avoid dangerous traffic jams inside the tunnel. In the context of evacuations, the ability to model this aspect of a bridge or a tunnel has apparent uses.

We use the term bridge capacity to denote this special case where the length of the sliding window is exactly the same as the transit time. Note, though, that we use the term bridge flow as an equivalent shortcut for flow with aggregate arc capacities; that is, a bridge flow does not need to have only bridge capacities.

A second motivation for bridge flows arises from a more theoretical problem. As we have seen in Section 2.6.1, choosing the right discretization for an evacuation model with fractional parameters can be difficult.
Small discretization steps model the transit times accurately, while large time steps model capacities accurately. Thus, unless fractional flows are allowed, fractional capacities pose a problem. This necessary trade-off leads to the question of what is actually meant if an arc has flow rate capacity 1.25 in the chosen discretization, but the flow should be integral. The fractional part of the capacity can never be used by an integral flow, so we could equivalently round down the capacity to 1. However, MATSim (see Section 2.1.2) and likely other simulation tools allow capacities to be slightly exceeded, to let one flow unit pass where otherwise only a fractional flow unit would have been possible. This is then evened out by reducing the capacity in the following time steps until the correct average capacity has been achieved again. Only then is a new flow unit allowed to surpass the capacity. Thus, the effective capacity alternates.

Figure 3.1: (a) An arc with traditional flow rate capacity u(a) = 4. Higher flow rates violate this constraint. (b) An arc with aggregate capacity u(a) = 12 and window l(a) = 3, which has the same average capacity u(a)/l(a) = 4 as in the left figure. Higher flow rates are possible, as seen in the first spike. The second time the average capacity is exceeded, there is a whole range of windows for which the aggregate capacity u(a) is not obeyed; the latest is (7 1/2 − ε, 10 1/2 − ε). As Lemma 3.3 will show, this 1-constant flow also violates windows with integer endpoints, so (6, 9) and (7, 10) would be the canonical windows to consider, but they are certainly not the only ones.

There are multiple ways to achieve the same average capacity with such alternating integral capacities. A fixed schedule in which the capacity just depends on the time step is the easiest to implement and reduces the problem to a traditional flow over time. However, this is a somewhat arbitrary solution, which can lead to undesired effects if the flow units arriving at the tail of the
arc and the rhythm of the capacity changes are aligned. A more elaborate flow model (and one closer to the simulations) would instead always allow the first flow unit to enter the arc and, assuming sufficient incoming flow, from there on keep a certain pattern of capacities. After a break in the flow, the pattern could restart whenever needed without exceeding the average capacity in the long run. Aggregate arc capacities can be used to model this effect by using two capacity conditions on an arc (easily achieved with an extra arc with zero transit time): one capacity for the flow rate, rounded up, and the aggregate capacity to bound the average capacity over multiple time steps. Thus, this new model brings network flows closer to the capabilities of simulations.

In the general setting of aggregate arc capacities, we study the MinCostTransOverTime problem, i. e., whether there is a flow over time that balances the supplies of the sources with the demands of the sinks under some cost bound. We will call this variant the BridgeTrans problem.

3.1 Related Work

Melkonian [70] introduced aggregate arc capacities in the context of flows over time, but only considers time windows as long as the transit times, i. e., bridge capacities. Melkonian proves that the BridgeTrans problem with a single source and sink and no costs is already weakly NP-complete, but can be decided by solving a linear program of pseudo-polynomial size. He mentions networks with a mix of traditional and bridge capacities as an interesting research direction, which we also pursue with our model.

A related capacity model was proposed earlier by Klinz and Woeginger [59]: They study dedicated arcs that are entirely blocked as long as even a small amount of flow is traveling along them. (As for bridge capacities, the duration of the block and the transit time are identical.) All their flows are discrete, meaning that flow travels in whole packets sent once per time step, as opposed to continuous flow rates. They also restrict themselves to integral flow functions, which often prohibits the use of linear programming techniques. They derive interesting complexity results for their setting: For instance, even for a fixed time horizon of 4, one of their variants of the MinCostTransOverTime problem is NP-hard. They also translate a complexity result of Papadimitriou et al. [76] into the language of flows over time. This implies that the MinCostTransOverTime problem for integral flows and dedicated arcs with unit capacities is strongly NP-complete. This carries over to bridge flows, because for unit capacities and integral flows, dedicated arcs and bridge capacities are equivalent.

Köhler and Skutella [61] study flows over time with load-dependent transit times. In this model, the speed at which flow travels along an arc always depends on the amount of flow currently on that arc. The model of Melkonian can be considered a special case by letting the transit time of an arc be constant up to its capacity and infinite if the load exceeds the capacity. Fleischer and Skutella [27] present a simple 2-approximation algorithm for a general class of problems including the QuickestTrans problem. An analogous 3-approximation for the setting with aggregate arc capacities can be obtained
by suitably bounding the total capacity of an arc. Furthermore, their fully polynomial time approximation scheme for the MinCostTransOverTime problem from the same article serves as the basis for our FPTAS in this chapter.

3.2 Our Contribution

We generalize bridge capacities to aggregate arc capacities, where the lengths of the windows are independent of the transit times of the arcs. In particular, this allows mixing bridge capacities (time windows as long as transit times) with flow rate capacities (very short/infinitesimal windows) and arcs that can only be used by a certain amount of flow in total (infinite windows) in the same network. If necessary, one can even combine several such capacity constraints on a single arc. We consider the analogue of the MinCostTransOverTime problem in this model, which we call the BridgeTrans problem.

For this model, which generalizes flows over time, we discuss important properties, namely integrality and whether weak flow conservation can improve the flow value. We show that forbidding holdover or forcing integrality qualitatively restricts the set of solutions. We also prove that it is already NP-hard to compute optimal integral flows with aggregate arc capacities for time horizon 2.

Our main contribution is a fully polynomial-time approximation scheme with resource augmentation for the BridgeTrans problem. If there is a feasible flow for the original instance, then for any ε > 0 we can compute a flow over time that satisfies the same supply/demand function and the same cost bound (if applicable), but violates the capacities and the time horizon by a factor of at most 1 + ε. Our approximate solution requires holdover. We also show that the slightly stronger approach of Fleischer and Skutella [27] for the classical flow over time model, which does not require a violation of capacities, cannot be generalized to the setting of aggregate capacities. We remark that the presented FPTAS can be generalized in a straightforward way to the setting with multiple commodities.

3.3 Preliminaries

3.3.1 Aggregate Arc Capacities

The BridgeTrans problem, which we consider throughout this chapter, needs a parameter to describe the lengths of the sliding windows. We denote this length by l(a) ∈ R≥0 for each arc a ∈ A. The total amount of flow entering arc a within every time window of length l(a) must be bounded by u(a). Note that these capacities u(a) are not allowed to be time-dependent. The capacity requirement then becomes

    ∫_t^{t+l(a)} f(a, θ) dθ ≤ u(a)    for all t ∈ R≥0 and all a ∈ A.
In a flow with aggregate arc capacities, or shorter a bridge flow, this condition replaces the usual capacity constraints.

This new capacity condition is trivially satisfied if l(a) = 0, and such arcs effectively have no capacity. Traditional constant flow rate capacities, however, are of the form f(a, t) ≤ u(a), and this cannot be expressed precisely with aggregate arc capacities. This is no serious drawback, as we will soon see that a small enough l(a) can essentially model traditional capacities. We now have everything in place to state our main problem precisely. We will assume that all parameters are integral.

Definition 3.1. A bridge flow instance consists of a digraph G = (V, A), transit times τ : A → Z≥0, capacities u : A → Z≥0, time windows l : A → Z≥0, a supply/demand vector d : V → Z and matching sources S+ and sinks S−, a time horizon T ∈ Z>0, a cost function c : A → Z≥0 and a bound on the cost C ∈ Z≥0.

Problem 3.2 (BridgeTrans). Input: A bridge flow instance. Question: Is there a bridge flow with time horizon T satisfying supplies/demands d with cost at most C?

Note that we can replace the multiple terminals of a bridge flow instance by a single source and a single sink. While this is not possible for standard flows over time, we can limit the flow through each terminal using time windows that span the entire time horizon.

3.3.2 Discrete Bridge Flows and Time-Expansion

To be able to meaningfully use Δ-constant flow over time functions, we require that τ as well as T are multiples of Δ > 0. Recall from Section 1.5.1 that we then say that Δ is admissible. One might expect that l, being a parameter related to time as well, must also be a multiple of Δ. But this is not a requirement for defining and considering Δ-constant bridge flows. The caveat is that, to discretize an instance without loss of precision (which is the usual assumption), l has to consist of multiples of Δ as well. For now, though, the parameter l, the supply/demand function d, the costs c and the cost bound C may be arbitrary integers.

Since a bridge flow is a flow over time, albeit with different capacity constraints, we can understand Δ-constant bridge flows as static flows on the time-expanded network. The aggregate arc capacities, however, cannot be expressed within the usual conditions for static flows. But at least they can be expressed as finitely many conditions on the static flow:

Lemma 3.3. Assume Δ is admissible. Let f be a Δ-constant flow over time. Then for a ∈ A, the flow f obeys the aggregate arc capacity on arc a if and only if the aggregate capacity constraints are obeyed for t1 = iΔ and for t2 = iΔ + ⌈l(a)/Δ⌉Δ − l(a) for all i ∈ Z≥0, iΔ < T. That is, it suffices to look at the integrals over intervals which start or end at a multiple of Δ.
Proof. This follows from the linearity of the terms when we replace integration by summation for Δ-constant flows, as discussed in Section 1.5.1: The maximum used capacity is attained when one end of the sliding window falls upon a multiple of Δ.

As mentioned before, an odd advantage of this formulation is that it does not impose requirements on l. This implies a straightforward finite linear program based on time-expansion to find the optimum Δ-constant bridge flow for a given BridgeTrans instance. What it does not imply is a linear program to find the best bridge flow with arbitrary flow functions. In other words, we can solve discrete problems with arbitrary l, but not all of the admissible discretizations solve the original problem. To discretize a given instance without restricting the quality of the solution, we have to require that the lengths of the sliding time windows l are integral multiples of Δ as well.

A quick example like the one in Figure 3.2 shows that the averaged Δ-constant flow f^Δ might not be feasible if l is not a multiple of Δ, even though Δ is admissible. Consider the flow f into a single arc a, with some aggregate arc capacity u(a) and a time window l(a) = 2. The transit time τ(a) = 0 is irrelevant for this example. This flow f sends a maximum impulse of flow every l time units, that is, a very large spike of length ε > 0 with flow rate u(a)/ε. This is feasible for the given capacities. If we now choose Δ = 3, the first two impulses at [0, ε) and [2, 2 + ε) influence f^Δ(a, ·) on the interval [0, Δ). Thus, the flow rate f^Δ(a, t) is 2u(a)/Δ for all t ∈ [0, 3). This violates the capacities for all t ∈ [0, 1], because ∫_t^{t+2} f^Δ(a, θ) dθ = 2 · 2u(a)/3 > u(a). On the other hand, the next interval [Δ, 2Δ) would not fully use the available capacities. Thus, such rounding errors can lead to fluctuations in the averaged flow.

Therefore, the next lemma additionally restricts l to obtain the usual discretization for flows over time. Note, though, that the approximate Δ-constant solution that we will construct in our main theorem does indeed not restrict or round l. So there is a real advantage to deciding whether an admissible Δ suffices.

Lemma 3.4. If f solves BridgeTrans, Δ is admissible, and the values of l are multiples of Δ, then the averaged flow f^Δ solves BridgeTrans as well.

Proof. Clearly, f^Δ satisfies the same supplies/demands at the same cost. To show that it also satisfies the aggregate arc capacities, we only need to show that they hold at the points specified by the lemma above. For i ∈ Z, we have t2 = iΔ + ⌈l(a)/Δ⌉Δ − l(a) = iΔ, because l(a) is a multiple of Δ. So the capacity constraints only need to be checked at t = iΔ for i ∈ Z. But then, ∫_{iΔ}^{iΔ+l(a)} f^Δ(a, θ) dθ can be decomposed into the sum of integrals on the intervals of length Δ starting at iΔ. On these intervals, f and f^Δ produce the same value.

A subtle consequence of this lemma is that for an arc a with time window l(a) = Δ, the flow rate will always be bounded by u(a) in a Δ-constant flow. Assuming that Δ is small enough (e. g., Δ = 1 for integral parameters), we can
therefore interpret these aggregate capacities as regular flow rate capacities of the form f(a, ·) ≤ u(a). While there may still be flows that exceed the traditional capacity momentarily, we can always construct the equivalent Δ-constant flow that obeys them.

Figure 3.2: (a) A feasible bridge flow with many impulses on an arc with l(a) = 2 and u(a) = 2. (b) After averaging to Δ = 3, the aggregate arc capacities are violated, because Δ does not divide l(a) and the averaging distributes the impulses unevenly.

Corollary 3.5. An arc a with l(a) = 1 effectively models the traditional capacity constraints f(a, t) ≤ u(a).

Thus, flows with aggregate arc capacities are a true generalization of flows over time with traditional capacities (under the usual algorithmic assumption that the input parameters can be scaled to be integral). Conversely, bridge flows are feasible flows over time for larger traditional capacities: The flow rates of a 1-constant bridge flow never exceed u(a). But this is a weak bound on the average capacity of an arc, which approaches u(a)/l(a) from above as the time horizon increases.

3.3.3 Complexity and Integrality

We can now reduce the BridgeTrans problem to a linear program, because working with Δ-constant flows has no drawbacks, as shown in Lemma 3.4, and Δ = 1 is always fine enough for integral data. The associated linear program can be solved in time polynomial in its size, which is pseudo-polynomial in the input.

Corollary 3.6. One can decide BridgeTrans in pseudo-polynomial time.
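As an illustration of Lemma 3.3 and of the capacity checks behind this linear program, the following sketch tests the aggregate capacity of a single arc of a Δ-constant flow at exactly the canonical windows; the flow representation is an assumption made for the example:

    import math

    def obeys_aggregate_capacity(flow_steps, u, l, delta):
        """Check one arc's aggregate capacity for a Delta-constant flow.

        flow_steps[i] is the constant inflow rate on [i*delta, (i+1)*delta);
        u is the aggregate capacity and l the window length. By Lemma 3.3
        it suffices to test windows starting at a multiple of delta (t1)
        and windows ending at a multiple of delta (t2).
        """
        def window_inflow(t):
            # integral of the inflow rate over the sliding window [t, t + l)
            total = 0.0
            for i, rate in enumerate(flow_steps):
                overlap = min((i + 1) * delta, t + l) - max(i * delta, t)
                total += rate * max(0.0, overlap)
            return total

        shift = math.ceil(l / delta) * delta - l  # t2 + l hits a multiple of delta
        return all(window_inflow(i * delta) <= u + 1e-9
                   and window_inflow(i * delta + shift) <= u + 1e-9
                   for i in range(len(flow_steps)))

    # The situation of Figure 3.2: u = 2, l = 2, delta = 3. Averaging the two
    # impulses into the first step yields rate 4/3 there and violates u.
    print(obeys_aggregate_capacity([4 / 3, 0.0], u=2, l=2, delta=3))  # False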
Since flows with aggregate arc capacities are true generalizations of flows over time, they inherit some lower complexity bounds. In particular, flows over time with a bound on the total cost are weakly NP-hard [60]. Even without costs, bridge flows with τ ≡ l (true bridge capacities) are weakly NP-hard, as Melkonian [70] showed. Hence, additional complexity can stem from the aggregate arc capacities themselves. The model considered here generalizes both, so it must be at least as hard. On the other hand, it can always be solved as a linear program in pseudo-polynomial time. This immediately leads to the following:

Corollary 3.7. The BridgeTrans decision problem is weakly NP-hard.

Note that there is a difference between bridge flows and Δ-constant bridge flows that makes it inadvisable to pose a QuickestTrans problem, as no discretization could be fine enough. Consider a network consisting of a single arc a with transit time 0. A bridge flow can send u(a) flow units across this arc in an infinitesimally short time. A Δ-constant bridge flow would need Δ time units. So the solution of a bridge flow to a QuickestTrans problem could be almost Δ quicker than that of a Δ-constant bridge flow. (Such a problem does not occur with traditional capacities.) Therefore, we do not delve deeper into the QuickestTrans problem and non-discretized bridge flows.

One major difference between traditional capacities and aggregate arc capacities is how flow conservation and requiring integrality affect the flow value, which we will see in the following. These examples and constructions all use step size Δ = 1. It is still useful to think of bridge flows as flows that tend to send impulses, as opposed to the more uniform flow rates that traditional flow capacities necessitate. One might even wonder what reason there could be not to use the full capacity of an arc within a single interval (in one impulse) and then pause until the full capacity is available again l(a) time units later.

We now discuss the small example in Figure 3.3. In the case of strong flow conservation, this instance already exhibits somewhat unexpected solutions that follow no such simple rule. Note that we discuss the maximum flow value for a given time horizon T, rather than asking for feasibility. By choosing supplies/demands equal to the maximum value, this can easily be turned into a BridgeTrans instance whose feasibility depends on the chosen setting.

First note that the arcs (v, x) and (w, x) together have an average capacity of 1/3 + 1/6 = 1/2, which equals the average capacity of (x, y). Thus, continuously sending flow with a rate of 1/3 on (v, x) and 1/6 on (w, x), and then 1/2 on (x, y), yields a flow of value T/2, and this is optimal for traditional capacities (but need not be optimal for aggregate capacities). This solution also satisfies strong flow conservation.

Let us explore the possibilities of a flow with aggregate arc capacities in this network. We will consider all combinations of weak/strong flow conservation as well as fractional/integral flow functions on this example. In contrast to the flow obeying traditional capacities u(a)/l(a), a pulsed flow sends flow at a rate of u(a) for one time unit and then waits for at least
Figure 3.3: The sources are v and w, the sink is y. All capacities are 1 and all transit times are 0. The lengths of the capacity windows are given on the arcs ((v, x): l = 3, (w, x): l = 6, (x, y): l = 2), and some of the windows are also drawn. With weak flow conservation, the pictured (single-commodity) integral bridge flow can send up to 4 units of flow within the time horizon 7 by sending 1 flow unit on each colored arc. A fractional solution cannot improve upon this.

l(a) − 1 time units. Since u equals 1 for all arcs in our example, pulsed flows are exactly the integral flows.

A bridge flow that satisfies only weak flow conservation can send a flow of value ⌈T/2⌉. For this, one simply pulses flow into (v, x) and (w, x) as often as possible. These pulses start arriving at x at times {0, 3, 6, 9, 12, ...} and {0, 6, 12, ...}. They can be forwarded to y at times {0, 2, 4, 6, 8, 10, 12, ...}, and this repeats with a periodicity of 12. This fully uses the capacity constraint on (x, y). Hence, this is the optimum flow value for every integral time horizon, and the fractional and integral optimum solutions coincide.

But a bridge flow that satisfies strong flow conservation and uses only integral values cannot send 4 units of flow within time horizon T = 7. For this, the first flow particles from each pulse would have to arrive at x exactly at times {0, 2, 4, 6}. At most two of these could be contributed by the more restricted arc (w, x), but the remaining time steps always contain a pair less than 3 apart. Therefore, they cannot all be supplied by (v, x).

Finally, the flow sending the average capacities on each arc achieves a flow value of 7/2, but this is not the optimum value of a fractional flow without storage. A flow of value 11/3 is possible, as shown in Figure 3.4. Linear programming shows that this is optimal.

From this we can see that weak flow conservation really increases the set of feasible BridgeTrans instances, and the proof of our approximation scheme for BridgeTrans also depends on weak flow conservation. Furthermore, there are instances where the fractional and integral solutions differ. What we cannot decide from this small example is whether weak flow
Figure 3.4: The same instance as in Figure 3.3. This time we require strong flow conservation. The colored arcs represent a flow rate of 1/3 each. This is an optimum fractional solution with flow value 11/3.

conservation always closes the integrality gap, that is, whether weak flow conservation always ensures that the integral solution is also the best fractional solution. Indeed, this does not hold in general. We will show this in two different ways, both telling us more about the algorithmic complexity of the problem.

First, we can consider the work by Papadimitriou et al. [76], which we already mentioned with respect to dedicated arcs in Section 3.1. They study the problem of sending unit-sized messages through a network with transit times on the arcs. While a message travels on an arc, the arc is entirely blocked for all other messages. Because messages always have unit size, this problem can be seen as an integral bridge flow on a network with bridge capacities l ≡ τ and u ≡ 1. These messages then behave exactly like the pulsed flows in the example above. The considered message model also allows unlimited storage at vertices. We can immediately transfer the complexity results from this article to our setting: Maximizing the flow value for integral flows with aggregate arc capacities and weak flow conservation is strongly NP-hard.

Corollary 3.8. The BridgeTrans problem restricted to integral flow functions is strongly NP-hard, even for unit capacities. However, the BridgeTrans problem with or without holdover is only weakly NP-hard.

Because of this, there must be instances where there is no integral solution but only a fractional one, unless P = NP. This conclusion is unsatisfactory, though, because the existence of such an instance does not depend on P = NP. Of course, reducing a sufficiently complex instance of an NP-hard problem to a BridgeTrans instance would most likely yield an instance where the fractional and integral solutions differ. We take a different approach, though, to obtain further insights into the problem and, consequently, smaller instances that show an integrality gap.
We can derive differences between fractional and integral bridge flows from what is known about static multi-commodity flows. To see this, we outline how every static multi-commodity flow has a clear correspondence to a single-commodity flow over time with aggregate arc capacities on a slightly modified network. For an example of the construction, see Figure 3.5.

Theorem 3.9. Deciding the feasibility of static multi-commodity transshipments with K commodities can be polynomially reduced to a BridgeTrans problem with time horizon K. Commodity-independent costs and a bound on the costs can also be reduced.

Proof. We are given a static multi-commodity flow instance with commodities K = {0, 1, ..., K − 1} on a digraph G with capacities u, costs c and cost bound C. The objective is to determine feasibility of a multi-commodity transshipment that satisfies the supplies/demands for all commodities. W. l. o. g., each commodity has its own source and sink, and the supplies equal the respective demands.

The idea of the reduction is that the flow for commodity k corresponds to the flow within time layer [k, k + 1), and windows of length K represent joint capacities for all commodities. The time horizon is T = K. For the reduction, we first attach transit times τ ≡ 0 and time windows ℓ ≡ K to all arcs. The capacities u are the original capacities given in the static instance. We also introduce a supersource and a supersink. For each commodity k ∈ {0, 1, ..., K − 1} we add an arc with transit time k from the supersource to the source of this commodity. The capacities correspond to the supply of each source. The time windows are the entire time horizon K. We also add arcs of transit time K − 1 − k from the sink of commodity k to the supersink. The capacities equal the demands of the sinks. The time windows have length K again. This completes the construction.

We claim that there is a static multi-commodity flow which satisfies the supplies/demands if and only if the BridgeTrans instance is feasible. The aggregate capacities with windows of length K on all arcs already ensure the correct capacity of each original arc across the time layers. The original sources and sinks can only be used by the correct amount of flow, too. Therefore, any multi-commodity flow can be copied onto the time-expanded network by using the k-th time layer for commodity k and setting the flow from and to the superterminals accordingly. This yields a feasible solution to the BridgeTrans instance.

The converse statement hinges on forcing the flow units that should represent commodity k to only use the k-th time layer and the correct arcs for the commodity's source and sink. Note that the following arguments only hold if the instance is feasible, that is, if each commodity satisfies exactly its own supply/demand. In that case, all arcs leaving the supersource and all arcs to the supersink have to be used with full capacity. In particular, the flow traveling through the source of the latest commodity k* = K − 1 can only reach the k*-th (= last) time layer because the arc has
Figure 3.5: (a) A 3-commodity instance where each commodity has to send 1 unit of flow along two arcs of the triangle. (The vertices are drawn twice to show their function as sources and sinks.) (b) The same instance reduced to a BridgeTrans problem by adding superterminals. The transit times of the arcs between the superterminals and the vertices representing the terminals for each commodity always add up to a length of T − 1. This is the desired path the flow units representing this commodity should take. (c) The time-expanded network shows how the different lengths of the arcs force each commodity into its own time layer. All non-expanded arcs to or from the superterminals must be used fully to achieve flow value 3. But the yellow arc from the source can only reach the topmost time layer and must use the only possible arc from there to the sink. So no further flow can enter this time layer. Due to the aggregate arc capacities, this arc to the sink is no longer usable in earlier time layers, either. The only effect remaining from the yellow commodity is the capacity used on the triangle.
transit time k*. From there, the supersink is only reachable for a positive amount of flow through the sink of the k*-th commodity, which has an arc with transit time K − 1 − k* = 0. The capacities on the source link and therefore also the sink link are used entirely if the instance is feasible. Thus, this time layer is no longer usable by flow traveling through sources corresponding to commodities k < k*. In particular, holdover arcs into this time layer cannot be used. But the capacity windows propagate the capacity that commodity k* uses to earlier time layers, restricting the remaining commodities. The induction is evident.

Because the flow rates on each time layer correspond to the multi-commodity flow rates, the costs can also be modeled in the BridgeTrans instances. Newly introduced arcs have no costs. Then the multi-commodity flow and the corresponding bridge flow have identical costs.

Note that this theorem shows that a combinatorial algorithm for the bridge flow problem implies an algorithm for static multi-commodity problems, for which, to the best of our knowledge, no exact combinatorial algorithm is known yet (only approximate techniques). For fractional flows, this reduction does not have any immediate complexity implications, because the static multi-commodity flow could be solved in polynomial time with linear programming, and so can this bridge flow problem with polynomial time horizon. But this construction allows us to determine the time horizon for which integral bridge flows become NP-hard:

Corollary 3.10. The integral BridgeTrans problem is strongly NP-complete even for unit capacities and T = 2 (and without costs).

Proof. The following 2-DisjointPaths problem is strongly NP-complete [31, 36]: Given a directed graph G with two terminal pairs (s_1^+, s_1^−) and (s_2^+, s_2^−), the decision problem is whether there are vertex-disjoint paths that connect the corresponding terminals. This can be interpreted as an integral 2-commodity flow problem sending one unit of flow per commodity. Theorem 3.9 above gives the reduction to integral bridge flows.

Note that for T = 1, a bridge flow is just a static flow problem, for which integral solutions can be found in polynomial time.

To come back to the original question of an integrality gap, we can now use multi-commodity instances that exhibit non-integral solutions for integral capacities. The triangle instance in Figure 3.5a can be scaled down by a factor of 1/2. Then all capacities are 1 and the flow rate of each commodity in the triangle is 1/2. But the supplies/demands are also 1/2 for each commodity, which of course cannot lead to integral flows. To remedy this, one simply creates two copies of this instance and introduces joint supersources and sinks for each commodity. Then the supplies/demands of 1 can be exactly satisfied if and only if each commodity sends 1/2 in both copies of the triangle. This behavior cannot be reproduced by an integral bridge flow, only by a fractional one. We have finally obtained an example where the maximum flow value can only be achieved by a fractional flow even under weak flow conservation. Indeed, for any z ∈ Z_{>0} one can construct multi-commodity instances with 3 commodities such that the optimum flow requires an arc with flow value 1/(2z), as summarized by Schrijver [82]. So even with time horizon 3, optimum bridge flows may be arbitrarily fractional.

Actually, a smaller instance with a fractional maximum bridge flow value (independent of holdover) can be created by using the instance in Figure 3.5b and setting the capacities within the triangle to 1 instead of 2. Then the maximum value is 5/2, even though all parameters are integral. Note that this bridge flow no longer represents a multi-commodity flow because the multi-commodity transshipment is not feasible. (The total supply is still 3.) This gives the bridge flow additional possibilities, and flow units diverge from the desired path through the network, thereby changing which commodity they represent. Losing the automatic integrality that normal flows over time have is also a setback for applying bridge flows in practice.
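Since the construction of Theorem 3.9 was reused several times above, the following sketch (with a tuple encoding of our own choosing, purely for illustration) may help to fix the details:

```python
# Build the BridgeTrans instance of Theorem 3.9 from a static K-commodity
# transshipment. Each arc is encoded as (tail, head, transit time, capacity,
# window length); this encoding is ours, not the thesis'.
def reduce_to_bridge_trans(arcs, sources, sinks, supply, K):
    """arcs: list of (tail, head, capacity); sources[k], sinks[k]: the
    terminals of commodity k; supply[k]: its supply (= demand)."""
    bridge_arcs = [(v, w, 0, u, K) for (v, w, u) in arcs]  # tau = 0, l = K
    s_plus, s_minus = "supersource", "supersink"
    for k in range(K):
        # transit time k into the source of commodity k ...
        bridge_arcs.append((s_plus, sources[k], k, supply[k], K))
        # ... and K - 1 - k out of its sink, so the intended route of
        # commodity k occupies exactly the k-th time layer
        bridge_arcs.append((sinks[k], s_minus, K - 1 - k, supply[k], K))
    return bridge_arcs, s_plus, s_minus, K  # time horizon T = K
```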
3.4 Approximation Scheme with Resource Augmentation

The main result of this chapter states that while it is NP-complete to decide whether an instance of the BridgeTrans problem is feasible, one can find an approximate solution that exceeds the time horizon and the capacities by a factor of (1 + ε), if the instance is feasible. For infeasible instances we might either prove that they are infeasible or find feasible approximate solutions. Our approach is a non-trivial extension of the work of Fleischer and Skutella [27] on standard flows over time.

Throughout this section, our aim is to approximate an instance of BridgeTrans as in Definition 3.1. This consists of a graph G = (V, A) with sources S^+, sinks S^−, supplies/demands d, non-negative transit times τ, capacities u, sliding windows ℓ, and non-negative arc costs c. The time horizon is T and the cost bound is C. All parameters are integral values, so that we only need to consider 1-constant flows by Lemma 3.4. Note that we need to allow holdover for our approximation scheme to work.

The actual algorithm is quite natural. For given ε > 0, as in [27], we choose a suitable discretization Δ ∈ Z_{>0} such that T/Δ is polynomially bounded in the input size and in ε^{−1}. The transit times are rounded up to multiples of Δ, that is, τ′(a) := ⌈τ(a)/Δ⌉·Δ for all arcs a ∈ A. Moreover, the time horizon and capacities are increased slightly, while the lengths ℓ of the sliding windows and the costs c remain the same. With the time-expanded network and using Lemma 3.3 to handle the bridge flow capacities, we can formulate the resulting new instance as a polynomial-sized linear program. If the new instance is infeasible, so is the original one. Otherwise, we obtain a Δ-constant flow that approximately solves the BridgeTrans instance.

We have to prove two directions in order to show the correctness of this algorithm. The easier one is that any solution to the rounded instance is indeed an approximate solution to the original instance. Intuitively, this is
true because weak flow conservation is maintained when the flow is interpreted in the original network with shorter transit times.

Lemma 3.11. Consider transit times τ and τ̂ with τ ≤ τ̂. If f is a feasible bridge flow for the larger transit times τ̂ with time horizon T, then f is a feasible bridge flow with weak flow conservation for transit times τ and time horizon T. For both transit times, the same capacities bound the flow and the same supplies/demands are satisfied at the same cost.

Proof. Any condition on the flow not involving transit times is identical for both networks. When we decrease the transit times from τ̂ to τ without changing f in any way, this affects the balance of the vertices one-sidedly: Flow units leave the arcs potentially earlier and then have to wait at the head vertex for an additional τ̂(a) − τ(a) ≥ 0 time. Since storage is free and unlimited, this is always possible and completely emulates the longer transit times. The excess of the vertices is temporarily increased but unchanged at time T again. The costs and capacities also depend only on the unchanged f, so they are not affected either.

Note that we will later consider an instance where not just the transit times, but also the capacities and time horizon have been increased. The lemma just states that these will then have to be increased as well for the original instance, which is all we can guarantee.

For the other direction we need to show that the existence of a feasible solution to the original instance implies feasibility of the rounded instance. Increasing transit times is much more problematic than decreasing them, because even weak flow conservation is violated if flow units are sent onwards before they arrive at a vertex. We need to rearrange the flow globally to take these longer transit times into account. The main idea is to consider a path decomposition of a given feasible flow, as in [27], before we change the transit times. Then we can reassemble the path flows (now with longer transit times) and are guaranteed weak flow conservation. However, this might violate the capacities by a large factor: If multiple paths for the original transit times enter the same arc one after each other, they can possibly all be delayed to arrive simultaneously according to the rounded transit times. The solution is to make sure that the flow along each path is distributed over a larger time span than flow units can possibly be delayed by the rounding. This new smoothed flow will still be congested, but the collisions are spread out equally in order to keep the violation of the capacities bounded. We now show how to round the transit times without sacrificing too much.

Lemma 3.12. Let f be a feasible 1-constant bridge flow for the given BridgeTrans instance. Let 0 < ε < 1 with ε^{−1} ∈ Z and Δ := ⌊ε²T/|V|⌋. If Δ > 0, then there is a feasible Δ-constant bridge flow f̃ for the rounded transit times τ′(a) = ⌈τ(a)/Δ⌉·Δ and capacities (1 + ε)u with time horizon T′ := (1 + 2ε)T satisfying the same supplies/demands as f under the same cost bound.
Proof. Using the path decomposition from Theorem 1.3 and its extension to flows over time in Chapter 1.5.2, we can view f as static on the time-expanded network (with unit time steps) and decompose it into flows on paths and cycles. All flow on the cycles can be reduced to 0, as this only decreases the total cost. We can also assume that no path uses multiple time-expanded copies of the same original vertex. Such a part of the path can be replaced by holdover, and this can again only reduce the total cost of the flow. Thus, without loss of generality, f is a flow that has a path decomposition consisting only of a set of paths over time P, and each path P = (W, h) ∈ P uses every vertex at most once and thus consists of |W| ≤ |V| − 1 arcs. The flow rates into the paths are of the form d_P(t) = b_P·χ_[0,1)(t) with b_P ∈ R_{≥0}, for P ∈ P, as discussed in Chapter 1.5.2. That is,

    f(a, t) = Σ_{P ∈ P} f_P(a, t) = Σ_{P ∈ P} d_P(t − h_i^P),

where h_i^P is the appropriate offset for arc a in path P.

We now define for each path P = (W, h) ∈ P a path P′ = (W, h′) with the same walk that matches the rounded transit times τ′. Let h′_1 := ⌈h_1/Δ⌉·Δ and h′_i := max{⌈h_i/Δ⌉·Δ, h′_{i−1} + τ′(a_{i−1})}, for i = 2, ..., |W|. This yields a path over time, i.e., the starting times are compatible with the transit times. They are also multiples of Δ and h′_i ≥ h_i, for all i. On the other hand, a simple induction yields h′_i − h_i ≤ i·Δ. Since |W| ≤ |V| − 1, we can generalize this to 0 ≤ h′_i − h_i ≤ |V|·Δ, for all i.

Instead of sending flow according to the original function b_P·χ_[0,1) into path P′, we smooth the flow as follows. (In contrast to the approach in [27], we use a different, somewhat simpler smoothing here.) Let z := |V|·Δ/ε, which is in Z. We distribute the flow over an interval of length z ≤ εT. This can be accomplished by sending flow according to the function d̃_P(t) := (b_P/z)·χ_[0,z)(t) into path P′. The corresponding path flow is f̃_P, and we claim that the flow f̃ defined by

    f̃(a, ·) := Σ_{P ∈ P : a ∈ P} f̃_P(a, ·)    for all a ∈ A

has the desired properties. It certainly satisfies weak flow conservation. Due to the rounding error on each arc and the smoothed flow into the path, the time horizon of each f̃_P increases to at most T + z + |V|·Δ ≤ (1 + ε + ε²)T ≤ (1 + 2ε)T. Therefore, f̃ also has this time horizon. Each path flow f̃_P still satisfies a demand of b_P, so that overall the supplies/demands d are satisfied. Since the inflow rates into the paths are Δ-constant, and τ′ was rounded to multiples of Δ, each path flow and f̃ are Δ-constant. The spreading of the flow into each path does not alter the cost of this flow, and so cost(f̃) = cost(f). The important task left is to show that f̃ is feasible for the aggregate arc capacities (1 + ε)u with time windows ℓ.
For a = a_i ∈ W we have f̃_P(a, t) = (b_P/z)·χ_[0,z)(t − h′_i). We can conveniently relate these path flows with smoothed flow rates to the flow on the original transit times:

    f̃_P(a, t) = (b_P/z)·χ_[0,z)(t − h′_i)
              = (b_P/z)·χ_[h′_i − h_i, z + h′_i − h_i)(t − h_i)
              ≤ (b_P/z)·χ_[0, z + |V|Δ)(t − h_i).

So far this just means that the smoothed flow could be between 0 and |V|·Δ late relative to the original transit times. We decompose the characteristic function into the smaller χ_[0,1), which we used for the flow into f_P:

    f̃_P(a, t) ≤ (b_P/z)·χ_[0, z + |V|Δ)(t − h_i)
              = (b_P/z)·Σ_{θ=0}^{z+|V|Δ−1} χ_[θ, θ+1)(t − h_i)
              = (b_P/z)·Σ_{θ=0}^{z+|V|Δ−1} χ_[0,1)(t − h_i − θ)
              = (1/z)·Σ_{θ=0}^{z+|V|Δ−1} f_P(a, t − θ).

As promised, by smoothing the flow, the new path flow is close to the average of the original flow, and importantly, the delayed flow units corresponding to [z, z + |V|Δ) only weigh in at 1/z of their original rate.

For a ∈ A, we can now determine the capacity needed by the flow f̃ resulting from the path flows:

    ∫_t^{t+ℓ(a)} f̃(a, µ) dµ = ∫_t^{t+ℓ(a)} Σ_{P ∈ P} f̃_P(a, µ) dµ
                            ≤ (1/z)·∫_t^{t+ℓ(a)} Σ_{θ=0}^{z+|V|Δ−1} Σ_{P ∈ P} f_P(a, µ − θ) dµ.

We rearrange the sums and integrals to reconstruct the original flow f:

    = (1/z)·Σ_{θ=0}^{z+|V|Δ−1} ∫_t^{t+ℓ(a)} Σ_{P ∈ P} f_P(a, µ − θ) dµ
    = (1/z)·Σ_{θ=0}^{z+|V|Δ−1} ∫_t^{t+ℓ(a)} f(a, µ − θ) dµ.
The capacities immediately imply

    ∫_t^{t+ℓ(a)} f̃(a, µ) dµ ≤ (1/z)·Σ_{θ=0}^{z+|V|Δ−1} u(a) = u(a)·(z + |V|Δ)/z = (1 + ε)·u(a).

Thus, f̃ is feasible and has all the desired properties.

Our main theorem governing the approximation scheme now falls into place.

Theorem 3.13. Given a feasible instance of the BridgeTrans problem and ε > 0, one can determine, in time polynomial in the input size and ε^{−1}, a bridge flow f (with weak flow conservation) feasible for the capacities (1 + ε)u and time horizon (1 + ε)T which otherwise is ruled by the parameters of the BridgeTrans instance.

Proof. We can assume ε < 1 and use Lemma 3.12 for ε′ chosen such that ε/4 < ε′ ≤ ε/2 and 1/ε′ ∈ Z. This yields Δ = ⌊ε′²T/|V|⌋. If Δ = 0, then T < |V|/ε′², and we can solve the exact problem for Δ = 1. Otherwise, we can still guarantee the existence of a flow f̃ for the capacities (1 + ε′)u ≤ (1 + ε)u and time horizon (1 + 2ε′)T ≤ (1 + ε)T with suitable supplies/demands and costs. Since T/Δ ∈ O(|V|/ε²), a flow with at least the qualities of f̃ can be obtained efficiently by solving a linear program of polynomial size. According to Lemma 3.11, this is also a feasible flow for the original transit times τ.

Fleischer and Skutella obtain a slightly stronger result for the classical flow over time model. This relies on two things: First of all, in the classical flow over time model, holdover is never required to solve a MinCostTransOverTime problem, so they know that an equivalent solution without holdover exists. This is not the case for bridge flows, as seen in Section 3.3. Secondly, they can trade time for capacity. More precisely, increasing the time horizon by another factor of (1 + ε) allows reducing the capacities needed by the same factor to obey the original capacities u instead of (1 + ε)u. This also cannot be generalized to the setting of aggregate capacities, as the following example shows.

Consider a network consisting of a directed path with two arcs. There is a source s^+, an intermediate vertex v, and a sink s^−. The arcs (s^+, v) and (v, s^−) both have transit time n, unit capacities and time windows of size 2n + 1 for some n > 0. Notice that there is a bridge flow that sends one unit of flow from s^+ to s^− within time horizon 2n + 1. On the other hand, for each ε > 0, any bridge flow that sends 1 + ε units of flow from s^+ to s^− needs to send some flow after the first window of length 2n + 1 has expired, and then still has to traverse a path of length 2n. This requires a time horizon greater than 4n + 1. The time horizon has nearly doubled to send ε flow units more. For these reasons, the solutions obtained from our approximation scheme still violate the original aggregate arc capacities by a factor of 1 + ε.
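The parameter choices of Theorem 3.13 are easy to reproduce; the following sketch (with variable names of our own) computes ε′, the step length Δ, and the rounded transit times:

```python
import math

def discretization(T, n, eps):
    """T: time horizon, n = |V|, 0 < eps < 1. Returns (eps', Delta)."""
    inv = math.ceil(2.0 / eps)            # 1/eps' in Z ...
    eps_prime = 1.0 / inv                 # ... with eps/4 < eps' <= eps/2
    delta = int(eps_prime ** 2 * T / n)   # Delta = floor(eps'^2 T / |V|)
    if delta == 0:                        # then T < |V| / eps'^2, so the
        delta = 1                         # exact problem is small enough
    return eps_prime, delta

def round_up(tau, delta):
    # transit times rounded up to the next multiple of Delta
    return {a: -(-t // delta) * delta for a, t in tau.items()}
```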
3.5 Outlook

We have introduced a generalized model of flows over time and presented a fully polynomial-time approximation scheme with resource augmentation for the problem of computing optimal flows for this model. We only mention that the FPTAS can easily be generalized to the setting with multiple commodities, as the changes to the path decomposition apply to individual paths and do not change the source or sink of a path.

However, the approximation scheme still requires solving a large bridge flow problem. In our experience, general linear programming is not suitable for this. Thus, it is natural to search for combinatorial algorithms for bridge flows that work on the time-expanded network, just like in the case of traditional flows over time. Such algorithms, however, would have to be able to solve static multi-commodity flow problems, as shown by the reduction in Section 3.3. The progress in solving multi-commodity flows combinatorially has been somewhat disappointing, though. Only approximation schemes are known to date. The approximation itself is not a serious drawback, as an approximate solution to the already rounded time-expanded network should also suffice for an overall approximation. But this does not bode well for a quick adaptation of bridge flows to practical instances. We are even further from solving integral bridge flow instances, given the hardness results even for tiny time horizons.
Chapter 4

Confluent Flows

We now turn to flows without a time component. The static MaxFlow problem is computationally easy and probably one of the best understood combinatorial problems. However, many applications call for flows with restricted structures, and optimizing these often leads to NP-hard problems. One well-known example is that of unsplittable flows, which have been thoroughly studied by Kleinberg [57]. In that model, the flow from each source must travel along a single path to the sink, while normally the flow from each source could split into many paths. A confluent flow takes this one step further: At every vertex, all flow traveling through the vertex must leave along one arc. This leads to a forest of flow-carrying arcs pointing towards the sinks, which must have no flow-carrying outgoing arcs. In particular, every confluent flow is unsplittable, but the reverse does not hold in general. See Figure 4.1 for an illustrated comparison of normal, unsplittable, and confluent flows. We study the maximum confluent flow (MaxConfFlow) problem in this chapter.

Confluent flows can occur in many contexts, with the prime example being destination-based routing: Whenever a hub decides where to route a packet based solely on the destination of the packet, all packets to the same destination converge in a tree-like pattern. This is a common property of many routing schemes in telecommunications, e.g., on the Internet. A curious example stems from the major German railway operator, the Deutsche Bahn, which requires confluence for goods transported by cargo trains with the same destination. This "Leitwege-Regel" (roughly, "path guiding rule") presumably helps to avoid mistakes in the wagon sorting facilities. Fügenschuh et al. [33] therefore consider confluence as a side constraint in their models for cargo train routing. In general, confluent flows are the simplest imaginable scheme for packet forwarding, and once the rules are set up, they can be applied entirely by local considerations. Managing the protocols that determine the best next hop in a changing network like the Internet is the hard part [8]. Often the restriction to confluence severely impacts the performance of a network (or more abstractly, the flow value), and is therefore avoided if possible. On the other hand, certain virtual private network design problems always admit a flow-carrying tree as the optimum solution [42].
Figure 4.1: Different types of single-commodity flows. The colors are added to distinguish the sources. (a) A flow may branch arbitrarily. (b) An unsplittable flow uses only one path per source. (c) A confluent flow uses only one outgoing arc per vertex.
Figure 4.2: (a) A sample building with occupants and emergency exit signs. (b) The building modeled as a graph. A confluent flow with appropriate sources can model the same exit routes.

More closely related to the overall topic of this work, emergency exit signs imply a confluent flow: If every evacuee looks at the closest exit sign and follows it and the subsequently encountered exit signs, all people will converge towards the exits in a tree-like pattern, as pictured in Figure 4.2. In practice, this is somewhat diluted in published exit plans: For instance, lecture halls often require multiple emergency exits, and more than one emergency exit sign is visible from a single point. Then it is often not explicit what the influence area of a specific emergency exit sign should be. A similar problem is that many buildings have broad exit areas with multiple doors, separated by columns or even behind tiny hallways that act as air locks for the air conditioning. Strictly applying confluence in a model that distinguishes such parallel paths would have everyone leave through a single door.

4.1 Related Work

Confluent flows (under the name of "arboricity flows") are also the topic of Achterberg's Diploma thesis [77], where he discusses an integer programming approach. He gives an improved linear programming relaxation and derives a class of cutting planes. As can be expected, his work considers minimum cost confluent flows and not just maximum value confluent flows, because the costs can be entirely modeled in the objective function of the linear program.

To the best of our knowledge, the first combinatorial approaches to confluent flow problems are presented by Chen et al. [12, 13], which also popularized the name "confluent". They focus on minimizing the maximum inflow into a vertex, which in a confluent flow always occurs at a sink. It is important to
Figure 4.3: A confluent flow with multiple sinks partitions the graph (with the possible exception of non-terminals).

note that they do not consider arc capacities. Therefore, this problem is not a typical flow problem: The task is equivalent to partitioning the graph into connected components that each contain exactly one sink. Once the components have been determined, any contained spanning forest will produce the desired confluent flow, as the supply reaching each sink is already determined by the partition and arc capacities do not have to be considered; compare Figure 4.3. They are aware of this and derive further results on weighted connected graph partitions in their work.

We can restate their premises as follows by introducing a supersink and imposing uniform arc capacities u ≡ U. Then the objective is equivalent to minimizing the uniform capacity U needed such that the transshipment is still feasible. Phrased like this, the connection to maximum confluent flows is clearer, and this problem can be answered using a search for the correct value of U, as sketched below. Yet another interpretation is to impose u ≡ 1 as capacities and allow the flow to exceed these. Then the objective is to minimize the maximum congestion x(a)/u(a), or simply x(a) here.

No matter the interpretation, they prove a significant inapproximability result for the minimum congestion objective, in that it cannot be approximated better than logarithmically in the number of arcs leading to the supersink (i.e., the number of sinks before the introduction of the supersink). They also give a combinatorial approximation algorithm with an almost matching performance guarantee. Furthermore, the authors consider a maximum flow variant, where each source must send exactly its supply or zero flow units (an all-or-nothing flow) while not exceeding a given congestion. They derive an approximation for this from their previous result. The authors list heterogeneous capacities among their open problems, which we will address here, albeit in the maximum flow setting without this all-or-nothing property.
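The search over U can be organized as a plain binary search. A minimal sketch, assuming a hypothetical oracle max_conf_flow(G, U, d) that returns the maximum confluent flow value under uniform capacities u ≡ U, and assuming the instance is feasible for some U:

```python
def min_uniform_capacity(G, d, total_supply, max_conf_flow):
    # Feasibility is monotone in U, so binary search over [1, total_supply].
    lo, hi = 1, total_supply
    while lo < hi:
        mid = (lo + hi) // 2
        if max_conf_flow(G, mid, d) == total_supply:
            hi = mid          # feasible: try a smaller uniform capacity
        else:
            lo = mid + 1      # infeasible: U must be larger
    return lo
```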
Donovan et al. [19] extend the results of Chen et al. to general k-furcated flows, where the outdegree of a vertex is at most k. Notable special cases are confluent flows (k = 1) and bifurcated flows (k = 2), as well as standard flows (k = ∞). For 2 ≤ k, they show that the congestion of a k-furcated flow is only at most 1 + 1/(k − 1) times that of an unconstrained flow, and give an algorithm to compute such a k-furcated flow. They also show that approximating the minimum congestion for bifurcated flows better than a certain constant factor is NP-hard.

In a series of papers culminating in [67], Mamada et al. approach confluent flow problems from a different angle. They consider flows over time on trees and how to treat them algorithmically, first with a single sink and then with multiple sinks. They call a flow over time confluent if each vertex has at most one outgoing arc that ever carries flow, and no flow must leave the sinks. Their later results are stated for finding partitions of trees that minimize a monotone function on the powerset of the vertex set. In particular, they discuss the function defined on the vertex set by the minimum time horizon for sending given supplies towards a single sink. This special case solves the confluent QuickestTrans problem on trees in polynomial time. Their work also enables the efficient evaluation of confluent flows over time on general graphs, which is not a given for flows over time. Of course, one can use a QuickestTrans algorithm for a static flow problem as well by setting all transit times to 0 and using the static arc capacities as the flow rate capacities. If and only if the confluent quickest transshipment finishes within 1 time step, the original static confluent transshipment instance is feasible. We will come back to this when discussing the complexity of the static confluent flow problem on trees.

4.2 Our Contribution

Our main contributions fall into three categories. First, we show hardness results and some polynomial-time brute-force algorithms to set upper and lower bounds on the interesting cases of the MaxConfFlow problem with heterogeneous arc capacities. Second, with this clear understanding of the complexity landscape, we provide a polynomial-time algorithm for the single-sink MaxConfFlow problem on outerplanar graphs in Section 4.4. This can be viewed as the first polynomial-time algorithm for a non-trivial case of MaxConfFlow. Third, we also consider NP-hard MaxConfFlow instances and provide a fully polynomial-time approximation scheme for graphs with bounded treewidth in Section 4.5.

4.3 Preliminaries

We express the structural restrictions of confluent flows using the set of arcs with non-zero flow.

Definition 4.1 (Confluent Flow). Let G = (V, A) be a directed graph and x be a non-negative flow function. Denote the set of flow-carrying arcs by A(x) := {a ∈ A : x(a) > 0}. The flow function x is nearly confluent if A(x) contains at most one outgoing arc for each vertex, while sinks must have no outgoing flow-carrying arcs.
The flow function x is confluent if additionally A(x) contains no directed cycle. In that case, we call A(x) an in-forest pointing towards the sinks.

Problem 4.2 (MaxConfFlow). Input: A digraph G = (V, A), arc capacities u : A → Z_{≥0}, sources S^+ and sinks S^−, and a matching supply/demand function d. Question: What is the maximum flow value over all confluent flows obeying d?

This definition of an in-forest does not explicitly forbid cycles in the induced undirected graph that are not directed cycles. However, if there were such a cycle, it would contain a vertex with two outgoing arcs. So an in-forest really becomes a forest when the direction is removed. It is similar to an arborescence, but in an arborescence the arcs point towards the leaves. For the purpose of confluent flows, in-forests always point towards the sinks, and a confluent flow travels along these arcs. No flow leaves the sinks, so sinks are naturally the roots of the trees.

The difference between nearly confluent and confluent flows is also negligible for most purposes. A nearly confluent flow is a true relaxation of a confluent flow because it may contain directed cycles. But the flow caught in a directed cycle can never contribute to the flow value, because sinks have no outgoing arcs and cannot lie on such a cycle. Thus, nearly confluent flows and confluent flows can satisfy the same supplies/demands and have the same maximum value. The advantage of considering nearly confluent flows is that only a local check of the outdegree condition is needed.

The clear structure of a confluent flow also gives us a different view of MaxConfFlow problems. Essentially, we can break the flow computation into two parts. First, one decides on an in-forest of arcs that may carry flow. Once these arcs are fixed, any maximum (standard) flow algorithm can compute the optimum confluent flow possible with these arcs. This subproblem is even easier than a standard flow computation because it occurs on a tree. Thus, solving a confluent flow problem essentially reduces to picking the optimum in-forest. One important consequence is that maximum confluent flows with integral parameters can always be chosen integral, because they can be computed as maximum flows, which have integral solutions according to Theorem 1.7.

One final consideration is the behavior of sinks in confluent flows. We require that they have no outgoing arcs, but there are three conceivable settings for sinks: May a confluent flow use one outgoing arc at each sink, and if so, can the sink then still act as a sink? The model where each sink either acts as a sink or as a non-terminal with one outgoing arc is actually equivalent to sinks as we define them. By removing outgoing arcs from the sinks, one obtains our model. On the other hand, the introduction of a supersink while keeping the outgoing arcs of the sinks forces such a choice. The most relaxed option is a true generalization of our definition, though. But such a confluent flow does not support the introduction of a supersink, because the sinks would need a choice of one outgoing arc and additionally they could use the arcs to the supersink. We think that sinks diverting part of the flow are counter-intuitive to the idea of confluence. The existing works on confluent flows also use the more restrictive definition for sinks that we use. Note, though, that our algorithm for outerplanar graphs only considers a single sink and is not affected by this decision, while the approximation scheme for graphs with bounded treewidth could be expanded to include the more general definition.
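For concreteness, here is a small checker (our own helper, not part of the thesis) for the local outdegree condition of Definition 4.1 and the additional acyclicity required of confluent flows:

```python
# x maps arcs (v, w) to flow values; sinks is the set S^-.
def is_confluent(x, sinks):
    carrying = [a for a, val in x.items() if val > 0]   # the set A(x)
    out = {}
    for (v, w) in carrying:
        if v in sinks:
            return False            # sinks must have no outgoing flow
        out.setdefault(v, set()).add(w)
    if any(len(ws) > 1 for ws in out.values()):
        return False                # at most one outgoing arc per vertex
    # The flow is nearly confluent; confluence additionally forbids directed
    # cycles. Since every vertex has out-degree <= 1 in A(x), just follow:
    for start in out:
        seen, v = set(), start
        while v in out:
            if v in seen:
                return False        # walked into a directed cycle
            seen.add(v)
            v = next(iter(out[v]))
    return True
```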
4.3.1 Outerplanar Graphs

Figure 4.4: An outerplanar graph. Note that it has 2|V| − 3 edges, which is a property of all edge-maximal outerplanar graphs.

One of the graph classes we are going to study in this chapter are outerplanar graphs, which give us a rich structure to work with.

Definition 4.3 (Outerplanar graph). An undirected graph is called outerplanar if it has an embedding in the plane such that all vertices lie on the outer face. We call a digraph outerplanar if the underlying undirected graph is outerplanar.

The usual drawing of an outerplanar graph places all vertices on a circle, and all other edges are embedded within that circle, leading to the distinctive appearance of a cycle with (non-crossing) chords cutting through it, as in Figure 4.4. Some embeddings also show parts of the graph seemingly on the outside of the prominent cycle, but these external parts could also be folded onto the circle and, if done correctly, one can even add edges to complete the Hamiltonian cycle.

A fascinating result [43] shows how flexibly outerplanar graphs can be embedded: Given any outerplanar graph G = (V, E) and any set of points P in the plane in general position (i.e., no 3 points lie on the same straight line) with |P| = |V|, one can find an embedding of G such that the vertices are mapped to P and all edges are represented by straight lines. Actually, with this result one can see that outerplanar graphs are the largest graph class that admits such an embedding for every given point set. There is an O(|V| log³ |V|) algorithm to find such an embedding, and a simpler
O(|V|²) algorithm exists as well [10]. Outerplanarity can even be recognized in linear time [72]. An alternative characterization is that outerplanar graphs are exactly those graphs that do not have K_4 or K_{2,3} as minors [11].

4.3.2 Tree Decompositions and Treewidth

In contrast to the single class of outerplanar graphs, the concept of treewidth gives rise to an infinite hierarchy of graph classes, and every graph has a place in this hierarchy. In a way, treewidth describes how suitable a graph is for dynamic programming, and this is exactly what we will use it for. To determine the treewidth, one needs an optimum tree decomposition of the graph, which superimposes a structure of vertex sets, called bags, over the graph, and fixes a tree describing in which order they should be processed. An example of this can be seen in Figure 4.5. We will call the vertices of this tree nodes to distinguish them from the vertices of the graph that we construct the tree decomposition for. For defining treewidth, we follow Bodlaender [9], and we consider a version that only depends on the underlying undirected graph. Johnson et al. [53] also give a generalization of treewidth to directed graphs, but it leads to the same results in the case that all reverse arcs are assumed to exist.

Definition 4.4 (Tree decomposition). Let G = (V, E) be an undirected graph. A tree decomposition of G is a tree (I, T) on some new node set I with edges T, and a family of sets B_i ⊆ V for i ∈ I, such that the following three properties hold:

1. The B_i cover V, that is, ∪_{i∈I} B_i = V.
2. For every edge {u, v} ∈ E there is an i ∈ I with {u, v} ⊆ B_i.
3. For every v ∈ V, the set of all i ∈ I such that v ∈ B_i forms a connected subgraph of (I, T).

Each B_i is called a bag. The width of a tree decomposition is max_{i∈I} |B_i| − 1. The treewidth of G is the minimum k such that G has a tree decomposition of width k.

Every graph has a tree decomposition of width |V| − 1, because one can use a single bag that contains the entire vertex set. So the treewidth is always finite. The hierarchy we mentioned above consists of the graph classes with treewidth 0, 1, 2, and so on. Some special cases are worth pointing out: Independent sets of vertices are the only graphs with treewidth 0. Non-trivial forests are the graphs with treewidth 1. Series-parallel graphs have treewidth 2. On the other hand, complete graphs have treewidth |V| − 1, so this hierarchy really is infinite.

In general, it is NP-hard to compute the treewidth of a graph. However, if the treewidth is known to be bounded by a constant k, many related problems can be solved in polynomial or even linear time. In particular, one can check in linear time whether a graph has treewidth at most k. One can also construct a tree decomposition with width k in linear time, if it exists. But the hidden factors are large and grow quickly with k. An overview of achievable running times and algorithms is given in [9].
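The three conditions of Definition 4.4 are easy to verify directly; the following validator (an illustrative helper of our own) checks them for a given decomposition:

```python
# tree: dict node -> set of adjacent nodes; bags: dict node -> set of graph
# vertices; edges: iterable of 2-element sets {u, v}.
def is_tree_decomposition(tree, bags, vertices, edges):
    if set().union(*bags.values()) != set(vertices):
        return False                               # 1: the bags cover V
    if not all(any(set(e) <= bags[i] for i in bags) for e in edges):
        return False                               # 2: every edge in some bag
    for v in vertices:                             # 3: the bags containing v
        nodes = {i for i in bags if v in bags[i]}  #    are connected in (I,T)
        stack, seen = [next(iter(nodes))], set()
        while stack:                               # BFS restricted to `nodes`
            i = stack.pop()
            if i not in seen:
                seen.add(i)
                stack.extend(tree[i] & nodes)
        if seen != nodes:
            return False
    return True

def width(bags):
    return max(len(b) for b in bags.values()) - 1  # max bag size minus one
```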
Figure 4.5: A graph and a possible tree decomposition of minimum width 3. The bags cover all vertices and edges. The round white vertex is an example of the third condition, i.e., the bags containing it form a connected subgraph of the associated tree. The vertex t will be the future sink.

Figure 4.6: A nice tree decomposition derived from the tree decomposition in Figure 4.5 according to Lemma 4.7. The vertex t was used as sink, so the root bag is B_r = {t}. The colored bags correspond to the bags in the original tree decomposition. The letters correspond to the kind of node: Leaf, Introduce, Forget, or Merge.
To deal with tree decompositions algorithmically, it is helpful to strengthen the definition to allow only a few specialized cases. The idea is that the bags should change by at most a single vertex while traversing the superimposed tree towards the designated root, which will be the single sink of the confluent flow problem. This is shown in Figure 4.6. Such an approach is commonly used; the exact definitions differ slightly from author to author, though.

Definition 4.5 (Nice tree decomposition). Let (I, T) with (B_i)_{i∈I} be a tree decomposition of G = (V, E) and denote a vertex t ∈ V. Then this tree decomposition is called nice if (I, T) is a binary tree for some root r, satisfying the following additional properties:

4. If a node i ∈ I has two children j, k ∈ I, then B_i = B_j = B_k. In this case, i is called a Merge node.
5. If a node i ∈ I has one child j, then one of the following must hold:
   (a) |B_i| = |B_j| − 1 and B_i ⊂ B_j. Then i is called a Forget node.
   (b) |B_i| = |B_j| + 1 and B_j ⊂ B_i. Then i is called an Introduce node.
6. |B_i| = 1 holds for all leaves i of (I, T). Such an i is called a Leaf node.
7. B_r = {t} holds.

Because a nice tree decomposition designates a root, we can consider the bags of the subtree rooted at some i ∈ I (which also includes i itself). The vertices contained in these bags induce a subgraph G_i of G, which we call the bagend. Algorithmically, the bagends are the subgraphs on which we want to compute intermediate solutions, with the bagend of the root being the entire graph. The bag B_i is called the interface between G_i and G, because it can be seen that any path from V(G_i) to V(G) \ V(G_i) must pass through B_i.

Definition 4.6 (Bagend). Consider a nice tree decomposition (I, T) with bags (B_i)_{i∈I}. For i ∈ I, the bagend G_i is the subgraph of G induced by the union of all B_j for j in the subtree rooted at i.

On a side note, it would be consistent to require Leaf nodes to be empty sets: This would truly trivialize any operation on them, and the 1-vertex sets would then be constructed like any other Introduce node. However, our notation would then have to deal with empty graphs and empty vectors, which could at least cause some confusion.

We will now see that such a nice tree decomposition can always be constructed efficiently.

Lemma 4.7. Let G = (V, E) be a graph of treewidth k and t ∈ V an arbitrary vertex. Then G has a nice tree decomposition (I, T) with bags (B_i)_{i∈I} with width k
and with root r such that B_r = {t}. Furthermore, the number of nodes in I is polynomially bounded in |V|, and the decomposition can be constructed in linear time, assuming that k is constant.

Proof. Because polynomially sized tree decompositions can be found in linear time as mentioned above, we assume a tree decomposition of polynomial size is given and make it nice. Showing that the result remains a tree decomposition of G mostly revolves around Condition 3 from above. This condition can be easily checked because we will only modify the tree locally.

First, we choose an arbitrary node r with t ∈ B_r and orient the tree towards r. If |B_r| > 1, we introduce a new node r′ above r that becomes the real root with B_{r′} = {t}. If |B_r| = 1, no such additional step is needed.

Next we want to obtain a binary tree. We replace any node i ∈ I with p > 2 children by a binary tree (I′, T′) with p leaves and associate all nodes of I′ with the vertex set B_i. The new binary tree (I′, T′) is then inserted in place of i, and the p children of i are connected to the p leaves. Note that the replacement binary tree can be chosen such that |I′| ≤ 2p − 1. Thus, all such replacements can at most double the size of the entire tree.

It is easy to see that leaves with empty bags may be deleted. Now consider a leaf i with |B_i| > 1. We simply add a new child node to it with a bag containing only one vertex from B_i.

Notice that a single edge {i, k} in the tree (I, T) can be subdivided easily as long as the new node j between i and k has a bag satisfying B_i ∩ B_k ⊆ B_j ⊆ B_i ∪ B_k. So if i is a father of k that does not satisfy the bag size restrictions of a nice tree decomposition, this allows us to first introduce a node j with B_j = B_i ∩ B_k in-between, and then forget all vertices in B_i \ B_j one by one in further subdivisions of the edge {i, j}. Proceeding from j downwards, we can then introduce all vertices in B_k \ B_j in subdivisions of {j, k}. For a Merge node i it might actually be necessary to introduce one more subdivision to first obtain a child with the exact same bag B_i. The number of steps needed for each sequence of subdivisions is clearly bounded in |V|, giving a tree decomposition still with a polynomial number of nodes. Since we never use a bag larger than max{|B_i|, |B_k|}, the treewidth does not increase. It only remains to contract edges between identical associated bags, unless the parent is a Merge node.

Let us establish some well-known properties of nice tree decompositions that we will use in the proofs later on and which might provide further understanding of tree decompositions.

Lemma 4.8. Let (I, T) with bags (B_i)_{i∈I} be a nice tree decomposition of G = (V, E) with root r.

1. For every vertex v ∈ V \ B_r, there is a unique Forget node i ∈ I with child j ∈ I such that B_j = B_i ∪ {v}. There is no such Forget node for v ∈ B_r.
2. For every Merge node i with children j and k, it holds that B_i = V(G_j) ∩ V(G_k).
3. For every Introduce node i with child j, let v be the unique vertex in B_i \ B_j. Then there are no edges between v and V(G_j) \ B_j.

Proof. 1. The unique Forget node for v ∈ V \ B_r is the node i ∈ I that contains v and is closest to the root r of (I, T). If there were a Forget node for v ∈ B_r, that would contradict the connectedness of {i ∈ I : v ∈ B_i}.

2. Since B_i = B_j = B_k, we have B_i ⊆ V(G_j) ∩ V(G_k). Suppose there is a vertex v ∈ (V(G_j) ∩ V(G_k)) \ B_i. Then v is contained in the bags of some nodes j′ and k′ of the subtrees rooted at j and k, respectively, and so in the bags of every node on the path from j′ to k′, for these bags to form a connected subgraph. In particular, v ∈ B_i, a contradiction.

3. Suppose there is an edge {v, w} for some w ∈ V(G_j) \ B_j. This edge must be contained in some bag B_k. Thus, v and w are in B_k; in particular, k ≠ i and k ≠ j. If k is in the subtree of (I, T) rooted in j, then v ∈ V(G_j), and the connectedness condition forces v ∈ B_j, a contradiction. Otherwise the path from j to k passes through i. Then w ∈ B_i must hold, again a contradiction.

4.3.3 Bounding the Complexity

To discuss the hardness of confluent flows, we must first fix a decision version of MaxConfFlow: Given an instance with graph G = (V, A), capacities u, sources S^+ and sinks S^−, and supply/demand function d, is there a confluent flow with value at least F? The limits given by the supplies and demands are generally not essential, because we can model the supply of a source by a new vertex and an arc with a suitable capacity leading into the source, and similarly for sinks. On the other hand, infinite supplies and demands could be replaced with bounds easily obtained from the incident arcs. Just like for static flows, we can also introduce a single supersink. However, we cannot introduce a supersource, because the confluence condition implies that only one of the original sources could be used. We obtain the following hardness results:

Theorem 4.9.
1. Approximating MaxConfFlow on general directed graphs within a factor of 3/2 − ε is NP-hard for any ε > 0.
2. MaxConfFlow is weakly NP-hard if G is a directed tree.
3. MaxConfFlow is weakly NP-hard on planar graphs with treewidth 2 and a single sink.

Proof. 1. One can model the 2-DisjointPaths problem as a confluent flow: Given a directed graph G with two terminal pairs (s_1^+, s_1^−) and (s_2^+, s_2^−), the decision problem is whether there are two vertex-disjoint paths that connect the corresponding terminals.
Figure 4.7: (a) The MaxConfFlow instance that models a Partition problem on a directed tree. With unbounded supplies/demands, the maximum flow value is (3/2)·Σ_{i=1}^n d_i if and only if the instance is a YES-instance. (b) Another reduction of Partition, this time to a planar graph with a single sink.

This problem is strongly NP-complete [31, 36]. We can reduce the 2-DisjointPaths instance to a MaxConfFlow problem on the same graph with arc capacities u ≡ 2. Each source s_i^+, i ∈ {1, 2}, has supply d(s_i^+) := i, and each sink s_i^− has a matching demand, d(s_i^−) := −i. The maximum flow value is 3 if and only if each source sends to the corresponding sink, which can happen if and only if the two disjoint paths exist. If the disjoint paths do not exist, the optimum value is at most 2.

2. We reduce the weakly NP-complete Partition problem [36] to the decision version of MaxConfFlow. An instance of Partition consists of n natural numbers d_1, ..., d_n with D := Σ_{i=1}^n d_i, and the question is whether there is a subset P ⊆ {1, ..., n} such that Σ_{i∈P} d_i = D/2. The input graph for MaxConfFlow is described in Figure 4.7a, with infinite supplies at all sources s_i^+. The sinks S^− = {s_1^−, ..., s_n^−, s^−} also have infinite demands. The arc capacities are given in the figure. We claim that the solution to MaxConfFlow is at least F := (3/2)·D if and only if the Partition instance is a YES-instance. We can identify P ⊆ {1, ..., n} with the sources sending to s^− in an optimum solution. Let z := Σ_{i∈P} d_i. Then the value of the flow is min{2z, D} + (D − z), which attains its unique maximum (3/2)·D at z = D/2. Thus, the optimum value is (3/2)·D if and only if a partition exists. To complete the proof that this problem is weakly NP-complete, we refer
to the upcoming Theorem 4.30, which proves that for graphs with constant treewidth, MaxConfFlow can be solved by a pseudo-polynomial dynamic program.

3. Note that in the last reduction we can identify all sinks with the single sink s^−. This yields a planar graph with only one sink, as shown in Figure 4.7b. The non-trivial tree had treewidth 1, and we can add s^− to all of the existing bags to obtain a tree decomposition of treewidth 2. Since the graph contains a cycle, the treewidth cannot be 1, though. Again, showing that the problem is only weakly NP-complete depends on Theorem 4.30.

It may seem somewhat surprising that the MaxConfFlow problem remains NP-complete even on trees. However, there are three crucial ingredients to this hardness result: Firstly, the problem is only weakly NP-complete, as we will see in Chapter 4.5, so the input numbers must be exponentially large. Secondly, there must be a non-constant number of sources and sinks to prevent efficient enumeration of all possible routings. Finally, the maximum confluent flow value must be less than the sum of what each source could send by itself. (In particular, there cannot be a feasible transshipment.) This is clearly seen in the Partition reduction, where at least a quarter of the effectively available supply is lost. If the second or third of these requirements for the NP-hardness does not hold, the polynomial algorithms from the following theorem can be used to solve MaxConfFlow on trees.

Theorem 4.10. Consider a bidirected tree G = (V, A) with arc capacities u : A → Z_{≥0} and supplies/demands d : V → Z, as well as matching sources S^+ and sinks S^−.

1. There is a linear-time algorithm that decides whether the value of MaxConfFlow equals Σ_{s^+ ∈ S^+} d(s^+).
2. There is a polynomial-time algorithm that determines the value of MaxConfFlow if |S^+| or |S^−| is constant.

Proof. 1. Deciding whether all supplies can be sent can be done with the following depth-first search (DFS) that traverses the tree in postorder (children first, then the parent). The idea is that we can easily resolve the supply of a leaf, because there is only one option if the leaf has positive supply. On the other hand, if a leaf is a sink, it offers itself as a sink to its parent. For each vertex v the algorithm keeps track of the supply d(v), which changes during run time, and another variable holding the usable demand of the currently best known sink among the children of v. We will denote this by b(v) ≥ 0 and initialize it to 0 for all vertices. The algorithm can start at an arbitrary vertex r of the tree and maintains the following invariants:

(a) The current instance is feasible if and only if the original instance is.
(b) The already processed vertices have no positive supply.
(c) For a vertex w, b(w) is the maximum amount of flow that can be sent from w to a sink through one of the already processed children of w. (This is meaningless if w is a sink itself but is still updated.)

These invariants are clearly correct at the beginning of the execution, because there are no processed vertices. When the DFS reaches a vertex v ≠ r that has no unprocessed children, one of the following cases occurs. Denote the parent of v by w.

(a) If d(v) < 0: By the invariant, all vertices in the subtree rooted at v have no supply. Thus, any remaining flow must reach v through w. Therefore, v offers itself to w as a sink and can accept up to min{u(w, v), −d(v)} flow units from w. Thus, it sets b(w) to this value if it exceeds the current b(w).

(b) Otherwise, d(v) ≥ 0: Since v must send all of its supply, it has two meaningful choices: Send up to its parent w or down to the best known sink. Since all children of v are already processed, this best sink is indeed known.

   i. If d(v) ≤ b(v), then v can send down to one of its children. In this case, all other flow through v must continue to this sink as well, and we can update d(v) to d(v) − b(v) ≤ 0. Also, we might have to update b(w) appropriately (as in the first case), if d(v) is now negative.

   ii. Otherwise, d(v) > b(v) and no child of v can accept the supply of v. Then v must send to w. If u(v, w) < d(v), the instance is infeasible. If w is a sink with demand strictly less than d(v), then the instance is also infeasible. Otherwise, d(w) is increased by d(v), and we set d(v) = 0 (whether w is a sink or not).

In its last step, the DFS encounters r and outputs YES if and only if d(r) ≤ b(r) at that time. In particular, d(r) ≤ 0 leads to a YES, while otherwise the supply of r has to be sent to a child. This completes the algorithm. The running time of a depth-first search is linear, and the correctness follows from the invariants.

2. If the number of sources is a constant k, an algorithm can enumerate all possible ways the sources can send to the sinks: The flow from each source must eventually reach one of the sinks. There are only |S^−|^k = O(|V|^k) possibilities for this, which can then be checked for confluence. It is easy to see that one of these solutions gives the maximum flow value. On the other hand, if the number of sinks is a constant k, the optimum solution can be characterized by a forest with a sink in each tree. (There might be additional independent vertices, but they can be connected to an arbitrary sink tree at no loss.) This optimum forest can be found by enumerating all possible ways to delete k − 1 edges from the undirected tree, which can be done in time O(|V|^{k−1}).
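The DFS from the first part of the proof translates almost verbatim into code. The sketch below uses a data layout of our own choosing (an adjacency dictionary, with both directions of every arc present in u):

```python
import sys

# Sketch of the linear-time feasibility test from Theorem 4.10(1).
# tree: dict vertex -> set of neighbours; u: dict (v, w) -> arc capacity;
# d: dict vertex -> supply (> 0) or demand (< 0).
def all_supplies_routable(tree, u, d, root):
    d = dict(d)                          # working copy, modified below
    b = {v: 0 for v in tree}             # best usable sink below each vertex
    sinks = {v for v in tree if d[v] < 0}
    sys.setrecursionlimit(2 * len(tree) + 100)

    def offer(v, w):
        # v offers itself to its parent w as a sink through arc (w, v)
        b[w] = max(b[w], min(u[(w, v)], -d[v]))

    def dfs(v, parent):
        for child in tree[v]:
            if child != parent and not dfs(child, v):
                return False
        if v == root:
            return d[v] <= b[v]
        if d[v] < 0:                     # v still acts as a sink
            offer(v, parent)
        elif d[v] <= b[v]:               # send the supply down to the best
            d[v] -= b[v]                 # sink; leftover demand can serve w
            if d[v] < 0:
                offer(v, parent)
        else:                            # must send everything up on (v, w)
            if u[(v, parent)] < d[v]:
                return False
            if parent in sinks and -d[parent] < d[v]:
                return False             # the parent sink is too small
            d[parent] += d[v]            # whether parent is a sink or not
            d[v] = 0
        return True

    return dfs(root, None)
```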
We note that the result on the QuickestTrans problem on trees by Mamada et al. [67] already implies that the static confluent transshipment problem can be solved in polynomial time, although with a worse bound on the running time than in Theorem 4.10. Chen et al. [13] also consider the confluent transshipment on trees, but for uniform capacities only. They do not specify a running time, but it should at worst be quadratic.

We now have a clearer picture of the hardness of the MaxConfFlow problem. As mentioned before, Chen et al. [12] also study a variant where the objective is to minimize the uniform arc capacity. One can use MaxConfFlow to model the decision version of their problem using a single sink s^−, but their inapproximability within a factor of (1/2)·log₂ δ_in(s^−) − ε does not carry over to the maximum flow value. The crucial difference is that in the setting of Chen et al. all demands are satisfied proportionally. As long as this proportion is 1, i.e., all flow must arrive at the sink, the MaxConfFlow problem behaves the same. For lower values, MaxConfFlow does not enforce any proportion at all, and sources that increase the congestion too much for too little gain (relative to the total flow value) may simply be ignored. Furthermore, their all-or-nothing maximum flow variant cannot be approximated within a factor of 2 − ε. But in the MaxConfFlow problem the supplies at the sources may be satisfied partially and, thus, this result does not carry over either.

4.4 Polynomial Algorithm for Outerplanar Graphs

So far, we have seen complexity results for MaxConfFlow that leave little room for polynomial cases, because the graph structure cannot be more restricted than trees. We bypass this problem by allowing only a single sink. In general, we can always introduce a supersink, so this restriction by itself is meaningless. However, introducing a presumably highly connected supersink changes the graph structure, so a combination of requiring a single sink and a special graph class can lead to new and meaningful results. For example, trees with a single sink can be solved by a greedy algorithm.

The graph class considered here is that of outerplanar graphs, as introduced above. They have some key properties that make them a non-trivial case. First of all, any graph class that has |V| + O(1) edges can be solved by brute force, enumerating all contained trees. But outerplanar graphs can have up to 2|V| − 3 edges (each in both directions). There is also no obvious greedy strategy, as the chords let different parts of the graph interact. The existence of a dynamic program seems likely (and indeed we will develop one), but one has to be careful to avoid a pseudo-polynomial running time.

We start with a high-level description of the algorithm. We can assume that we are given a straight-line embedding of a bidirected outerplanar graph G = (V, A) on the regular n-gon, and that the vertices are labeled clockwise from v_0 to v_{n−1}. We may also assume that the entire outer cycle is part of the graph in both directions: Adding these arcs with capacity 0 cannot lead to any crossings, by simple geometric arguments. The (unlimited) sink is v_0, located at the bottom of the cycle, and we think of vertices with
lower indices as being left of vertices with higher indices, as can be seen in Figure 4.8.

Our dynamic program works on slices cut out of the cycle. More precisely, for $i < j$, we consider the graph induced by $\{v_i, v_{i+1}, \ldots, v_j\}$, which we denote by $G_{i,j}$. This is again an outerplanar graph (by the same embedding as $G$). We will only need to consider such $G_{i,j}$ that are self-contained: that is, no vertex $v_k$ with $i < k < j$ is incident to a vertex outside of $G_{i,j}$ in $G$. We want to capture all the relevant information about the flow inside $G_{i,j}$ in the dynamic program. Once this information is available, we will join consecutive $G_{i,j}$ and $G_{j,k}$ to obtain the values for $G_{i,k}$.

To find the necessary properties for the dynamic program, consider an optimum confluent flow on $G$. The flow-carrying arcs induce an in-tree pointing towards the sink, which we can greedily extend to a spanning in-tree. The path from any vertex in $G_{i,j}$ to $v_0$ must travel through either $v_i$ or $v_j$ (because we assume $G_{i,j}$ to be self-contained). This gives rise to three cases:

The subgraph $G_{i,j}$ is split by the tree into two components, which must then be trees rooted at $v_i$ and $v_j$. We will collect information about this case in a table $\Psi_{i,j}$. To compute the corresponding entries in the dynamic program, we can treat $v_i$ and $v_j$ as sinks in a smaller confluent flow instance consisting only of $G_{i,j}$.

Otherwise, the tree induces only one component in $G_{i,j}$, which is rooted at either $v_i$ or $v_j$. If it is rooted at $v_i$, then we say the flow is routed left and treat $v_i$ as a sink. This case will be handled by a table $L_{i,j}$. But it may be that additional flow enters $G_{i,j}$ at $v_j$ in an optimum solution. Therefore we have to treat $v_j$ as a source with undetermined supply. To represent this in a dynamic program, we have to know the flow value at $v_i$ for all possible supplies of $v_j$. In the other case, denoted by $R_{i,j}$, the flow leaves $G_{i,j}$ through $v_j$ on the right, which acts as sink, while $v_i$ is a variable source.

The dynamic program needs a separate table for each of the three cases. For $\Psi_{i,j}$, all dominating pairs of flow values arriving at $v_i$ and $v_j$ are computed. There can only be $j - i$ many values, as these situations can be characterized by a split point between $v_i$ and $v_j$. For $L_{i,j}$ (and analogously for $R_{i,j}$), the desired table is a function describing the maximum flow that can reach the temporary sink $v_i$ for each possible amount of supply at $v_j$, and not just for $d(v_j)$. Without further effort, these tables would have pseudo-polynomial size. However, this function is simple, that is, for every unit of additional supply at $v_j$, the output at $v_i$ can increase by 1 or remain unchanged. The number of times this behavior changes can be bounded. If we additionally perform the entire computation with such a piecewise description of the tables in mind, we obtain a polynomial running time.

We actually want to split up the single unlimited sink $v_0$ into two sinks, allowing for a more symmetric notation: we introduce a new vertex $v_n$ between $v_0$ and $v_{n-1}$ and make it a second sink. The capacities between $v_{n-1}$ and $v_n$ are inherited from the former arcs $(v_{n-1}, v_0)$ and $(v_0, v_{n-1})$, but there is no arc between $v_0$ and $v_n$. The chords to $v_0$ also remain unchanged. Both sinks have infinite demands. One can easily check that the original instance has the
same MaxConfFlow value as the new instance. These preparations allow us to focus on the following kind of instances, which is again exemplified in Figure 4.8:

Figure 4.8: An outerplanar instance. The two sinks have infinite demands and all other vertices can be sources. The outer cycle is mandatory; the chords are just an example. All edges are bidirected, but possibly with capacity 0 in one or both directions.

Definition 4.11 (Outerplanar instance). An outerplanar instance consists of an outerplanar graph $G = (V, A)$ with the vertices $V = \{v_0, \ldots, v_n\}$ embedded on the regular $(n+1)$-gon and labeled clockwise, arc capacities $u : A \to \mathbb{Z}_{\geq 0}$, arbitrary sources $S^+$ and sinks $S^- = \{v_0, v_n\}$. The supply/demand function $d : V \to \mathbb{Z}$ must match this, and the sinks must have effectively infinite demands. Furthermore, all arcs along the outer cycle except $(v_0, v_n)$ and $(v_n, v_0)$ must be in $A$.

The following definitions formalize the ideas explained above: a confluent flow on a self-contained subgraph $G_{i,j}$ has essentially three options to reach the sinks. All flow is sent to the leftmost vertex, or all flow is sent to the rightmost vertex, or the flow splits, sending to both the leftmost and the rightmost vertex.

Definition 4.12. Consider an outerplanar instance and $i, j \in \mathbb{Z}_{\geq 0}$ with $0 \leq i \leq j \leq n$. Let $G_{i,j} := G[\{v_i, v_{i+1}, \ldots, v_j\}]$.

Let $L_{i,j} : \mathbb{Z}_{\geq 0} \to \mathbb{Z}_{\geq 0}$, where $L_{i,j}(z)$ is the maximum value of a confluent flow in $G_{i,j}$ with capacities $u$, supplies/demands $d$ except $d(v_i) := -\infty$ and $d(v_j) := z$, and $v_i$ the unique sink. Let $L_{i,i}(z) := 0$ for all $z \in \mathbb{Z}_{\geq 0}$.
Let $R_{i,j} : \mathbb{Z}_{\geq 0} \to \mathbb{Z}_{\geq 0}$, where $R_{i,j}(z)$ is the maximum value of a confluent flow in $G_{i,j}$ with capacities $u$, supplies/demands $d$ except $d(v_i) := z$ and $d(v_j) := -\infty$, and $v_j$ the unique sink. Let $R_{i,i}(z) := 0$ for all $z \in \mathbb{Z}_{\geq 0}$.

Let $\Psi_{i,j} \subseteq \mathbb{Z}_{\geq 0}^2$ contain exactly those pairs $(l, r)$ with $l = L_{i,k}(d(v_k))$ and $r = R_{k+1,j}(d(v_{k+1}))$ for $k \in \{i, \ldots, j-1\}$.

Figure 4.9: (a) An example of the MaxConfFlow problem on outerplanar graphs. For simplicity, all opposite arcs have capacity 0 and are not shown. The colored numbers at vertices are the supplies, while the labels of the arcs are of the form $x(a)/u(a)$ for the optimum solution. (b) The plot of $R_{3,6}(z)$ and optimum configurations for the different values of supply $z$ at $v_3$ and with $v_6$ as sink. Although $L_{i,j}$ and $R_{i,j}$ are defined on the integers, the interpolation is consistent with the definition and helps in visualizing the functions.

The example in Figure 4.9 shows that the $L$ and $R$ functions for larger subgraphs are not trivial. The complexity arises because flow entering, for example, on the left does not immediately increase the output on the right. But at some point, the optimal solution might switch to an in-tree that allows for more capacity on the chord but routes less of the flow originating inside $G_{i,j}$ to the right, still resulting in a net gain. On the other hand, if the chord is not needed to route anything past $G_{i,j}$, it is usually more effective to route the flow from vertices on the left through the chord instead of along the outer cycle. Finding the right balance between capacity on the chord and sending flow from the inside to the outside is the trick. This leads to optimal solutions that send flow on zigzag paths to the sinks.

There are two properties of outerplanar graphs that make a dynamic program based upon these definitions feasible: any confluent flow splits the graph
into some $G_{0,k}$ and $G_{k+1,n}$ that send entirely to $v_0$ and $v_n$, respectively. Choosing this $k$ in the definition of $\Psi_{0,n}$ almost by definition yields the optimum flow value as $l + r$ for some $(l, r) \in \Psi_{0,n}$. The second property allows us to choose self-contained subgraphs $G_{i,j}$ easily. Indeed, to find a smaller self-contained graph in $G_{i,j}$, it suffices to look at the longest chord incident with $v_i$, say $(v_i, v_m)$ or $(v_m, v_i)$. Then $G_{i,m}$ and $G_{m,j}$ must be self-contained again. Let us quickly formalize these properties. Note that we make use of the straight-line embedding of the outerplanar graph, which implies that if $i < k < j < l$, then the undirected edges $\{v_i, v_j\}$ and $\{v_k, v_l\}$ would cross if they both existed.

Figure 4.10: (a) As described in Lemma 4.13, there always is a vertex $v_m$ that separates the flow going to the left sink from the flow going to the right sink. (b) Choosing the rightmost neighbor $v_m$ of $v_i$ (that is not $v_j$) yields a refinement of $G_{i,j}$ into the self-contained subgraphs $G_{i,m}$ and $G_{m,j}$, as noted in Lemma 4.14.

Lemma 4.13. Consider an outerplanar instance. Let $x$ be a confluent flow for this instance. Then there exists some $m \in \{0, \ldots, n-1\}$ such that all flow-carrying arcs $A(x)$ lie in $A(G_{0,m}) \cup A(G_{m+1,n})$.

Proof. For $i \in \{0, 1, \ldots, n\}$, let $P_i$ be the path from $v_i$ to either $v_0$ or $v_n$ in $A(x)$, or $P_i = \emptyset$ if $v_i$ is not connected to a sink in $A(x)$. Let
$$m := \max\{i : 0 \leq i \leq n-1 \text{ and } V(P_i) \subseteq \{v_0, v_1, \ldots, v_i\}\}.$$
Since $i = 0$ is always admissible, $m$ is well-defined. We claim that $m$ is the desired index. Suppose there exists an arc in $A(x) \setminus (A(G_{0,m}) \cup A(G_{m+1,n}))$. We need to treat the two directions of this arc slightly differently.

If there is an arc $(v_i, v_j) \in A(x)$ from $G_{0,m}$ to $G_{m+1,n}$, then $P_m$ cannot contain $v_i$, which would imply containing $v_j$ as well, because $P_m$ does not contain any vertex with an index larger than $m$. Then there must be a chord
$(v_k, v_l)$ in $P_m$ with $l < i < k \leq m < j$ that enables $P_m$ to bypass $v_i$. However, $(v_k, v_l)$ would then cross the chord $(v_i, v_j)$.

Now that we have established that no chord carries flow from $G_{0,m}$ to $G_{m+1,n}$, we consider the case that there is a chord $(v_j, v_i) \in A(x)$ carrying flow in the other direction. Since $v_n$ has no outgoing flow, $j < n$ must hold. This implies $V(P_j) = V(P_i) \cup \{v_j\}$, and $P_i$ cannot use any chord going back to $G_{m+1,n}$, as just shown. This makes $j > m$ eligible for the choice of $m$, a contradiction. □

Lemma 4.14. Consider an outerplanar instance. For $0 \leq i < j \leq n$ with $j - i \geq 2$, choose $m$ with $i < m < j$ so that $v_m$ is the rightmost neighbor of $v_i$. Then
$$A(G_{i,j}) \subseteq A(G_{i,m}) \cup A(G_{m,j}) \cup \{(v_i, v_j), (v_j, v_i)\}$$
holds. That is, if $G_{i,j}$ is self-contained, then so are $G_{i,m}$ and $G_{m,j}$.

Proof. Let $m := \max\{m' : i < m' < j \text{ and } ((v_i, v_{m'}) \in A \text{ or } (v_{m'}, v_i) \in A)\}$, so that $v_m$ is the rightmost neighbor of $v_i$. An outerplanar instance assumes that $(v_i, v_{i+1})$ exists, so this is well-defined. Suppose there is an arc in $A(G_{i,j})$ that contradicts the claim. It cannot be incident to $v_m$, as all of those arcs are in $A(G_{i,m}) \cup A(G_{m,j})$. Any arc between $v_i$ and $v_k$ with $m < k < j$ would contradict the choice of $m$. Any arc between $v_j$ and $v_k$ with $i < k < m$ would cross $(v_i, v_m)$ or $(v_m, v_i)$. So a contradicting arc has to run between $\{v_{i+1}, \ldots, v_{m-1}\}$ and $\{v_{m+1}, \ldots, v_{j-1}\}$. This would be a chord that bypasses $v_m$, so it would cross the chord between $v_i$ and $v_m$. □

Lemma 4.15. The optimum value of an outerplanar instance is
$$\max_{(l,r) \in \Psi_{0,n}} (l + r).$$

Proof. Choose $(l, r)$ attaining the maximum. By definition of $\Psi_{0,n}$, this implies the existence of two confluent flows on the disjoint subgraphs $G_{0,k}$ and $G_{k+1,n}$ with sinks $v_0$ and $v_n$, respectively. Thus, we can easily combine them to obtain a feasible solution. (Note that for $k = 0$ or $k = n - 1$, one of the subgraphs consists of a single vertex and no arc. Thus, the flow is actually an empty function. This is not a problem, though, as $L_{i,i}$ and $R_{i,i}$ are always 0 and thus have the correct flow value.)

On the other hand, Lemma 4.13 shows that for any confluent flow $x$ there always is an index $\hat{k}$ where the outerplanar graph is split by $A(x)$. Plugging $\hat{k}$ into the definition of $\Psi_{0,n}$, we see that the optimum solution gives an entry $(l', r') \in \Psi_{0,n}$ with $l'$ and $r'$ being (at least) the amounts of flow sent to the two sinks. Then $l + r \geq l' + r'$ must be true, and the latter is the optimum value. □

We will now show how one can compute the functions $L$ and $R$ in a dynamic program. By definition, $L_{i,i} \equiv 0$. The next case is $L_{i,i+1}$, where the single arc $(v_{i+1}, v_i)$ determines how the flow may behave: for incoming flow value $z$, the output at the other end of the arc is $z$ up to the capacity $u(v_{i+1}, v_i)$, from where on the output stays constant.
A larger self-contained subgraph $G_{i,j}$ that can be split into self-contained subgraphs $G_{i,m}$ and $G_{m,j}$ does not follow this easy pattern. But as illustrated in Figure 4.11, we can determine one possible flow by sending $z$ into $G_{m,j}$ at $v_j$, which gives a flow value of $L_{m,j}(z)$ at $v_m$. We add to this the supply $d(v_m)$, so that the flow value at $v_i$ is then $L_{i,m}(L_{m,j}(z) + d(v_m))$. However, this is only the optimum flow value if we assume that the optimum flow travels from $v_j$ through $v_m$ to $v_i$. The other option is that the chord $(v_j, v_i)$ is used. Then the flow must split somewhere within $G_{i,j}$ at a vertex $v_k$, which gives rise to some $(l_k, r_k) \in \Psi_{i,j}$. This allows us to send $l_k$ flow units directly to $v_i$, while $r_k + z$ flow units are available in total at $v_j$ and can use the chord up to its capacity. The best of these options (using the chord for some splitting point, or not at all) determines $L_{i,j}(z)$.

Figure 4.11: (a) Ignoring the chord, the flow value $L_{i,j}(z)$ reaching $v_i$ can be found using the midpoint $v_m$ and the functions $L_{i,m}$ and $L_{m,j}$. (b) Assuming that the chord is used, the flow value results from an entry $(l, r) \in \Psi_{i,j}$ for the flow being split at some optimum $v_k$.

To summarize:

Lemma 4.16. Consider an outerplanar instance. Let $0 \leq i < m < j \leq n$ be as in Lemma 4.14. Assume the chords $(v_i, v_j)$ and $(v_j, v_i)$ exist, but maybe with capacity 0.

1. $L_{i,i+1}(z) = \min\{u(v_{i+1}, v_i), z\}$.

2. $R_{i,i+1}(z) = \min\{u(v_i, v_{i+1}), z\}$.

3. Define
$$L^{\mathrm{chord}}_{i,j}(z) := \max_{(l,r) \in \Psi_{i,j}} \big\{ l + \min\{u(v_j, v_i), z + r\} \big\}.$$
Then
$$L_{i,j}(z) = \max\big\{ L_{i,m}(L_{m,j}(z) + d(v_m)),\; L^{\mathrm{chord}}_{i,j}(z) \big\}$$
holds.
4. Define
$$R^{\mathrm{chord}}_{i,j}(z) := \max_{(l,r) \in \Psi_{i,j}} \big\{ r + \min\{u(v_i, v_j), z + l\} \big\}.$$
Then
$$R_{i,j}(z) = \max\big\{ R_{m,j}(R_{i,m}(z) + d(v_m)),\; R^{\mathrm{chord}}_{i,j}(z) \big\}$$
holds.

Proof. We only prove the statements for $L_{i,j}$, as $R_{i,j}$ is symmetric.

1. The value $L_{i,i+1}(z)$ is the maximum value of a confluent flow over the single arc $(v_{i+1}, v_i)$ with supplies/demands $d(v_i) = -\infty$ and $d(v_{i+1}) = z$. The claim follows immediately.

3. For a fixed $z$, $L_{i,j}(z)$ is defined to be the maximum confluent flow value in $G_{i,j}$ that can leave $v_i$ (ignoring the original $d(v_i)$) if the supply of $v_j$ is $z$ and the intermediate vertices have their usual supply. Let $x$ be such an optimal confluent flow and consider $A(x)$. Since the sink must not use an outgoing arc, $(v_i, v_j) \notin A(x)$ holds.

First consider the case $(v_j, v_i) \notin A(x)$: then any flow originating in $\{v_m, \ldots, v_j\}$ must be routed through $v_m$ to reach $v_i$, by the choice of $m$. Thus, we can see this as additional supply on $v_m$. The maximum supply available at $v_m$ is then $L_{m,j}(z) + d(v_m)$. So the maximum flow reaching $v_i$ if $(v_j, v_i) \notin A(x)$ is $L_{i,m}(L_{m,j}(z) + d(v_m))$, which constitutes the first part of the maximum that defines $L_{i,j}(z)$.

Now suppose that $(v_j, v_i) \in A(x)$. Then $x$ can be interpreted as a flow on $G_{i,j}$, but with sinks $v_i$ and $v_j$ and ignoring the flow on the chord. (It may not be the optimal confluent flow for this instance, though.) We can apply Lemma 4.13 to this smaller instance, yielding a $k \in \{i, \ldots, j-1\}$ such that the two trees pointing towards the sinks in $A(x)$ are contained in $\{v_i, \ldots, v_k\}$ and $\{v_{k+1}, \ldots, v_j\}$, respectively. Then the flow value of $x$ in this new instance is at most $L_{i,k}(d(v_k)) + R_{k+1,j}(d(v_{k+1}))$. Note that if $k = i$ or $k = j - 1$, the special-case definitions $L_{i,i} \equiv 0$ and $R_{j,j} \equiv 0$ ensure that these sinks by themselves do not contribute any flow value. Since $v_j$ is not really a sink, but has supply $z$ and uses the chord $(v_j, v_i)$ to send to $v_i$, the flow value of the contribution from $v_j$ is increased by $z$ and bounded by $u(v_j, v_i)$. So, assuming $x$ is the optimal confluent flow, we obtain $L_{i,j}(z) = L_{i,k}(d(v_k)) + \min\{u(v_j, v_i), R_{k+1,j}(d(v_{k+1})) + z\}$. Since we do not know whether $(v_j, v_i)$ is used (if it is available at all) and which vertex $v_k$ is the splitting point, we have to try all combinations (which indeed represent feasible flows) and take the maximum value. The function $L^{\mathrm{chord}}_{i,j}(z)$ represents all cases where $(v_j, v_i) \in A(x)$. □
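To make the recurrences of Lemma 4.16 concrete, here is a sketch that evaluates $L_{i,j}(z)$ and $R_{i,j}(z)$ for one value of $z$ by direct memoized recursion. This is precisely the per-value evaluation that the next paragraph argues against; the helper `rightmost_neighbor` (returning the splitting point $m$ of Lemma 4.14, at least $i + 1$ since the outer cycle is present) and the dictionaries `u` and `d` are assumptions of this sketch, which also glosses over the self-containedness bookkeeping of the actual algorithm.

```python
# Naive per-z evaluation of the recurrences of Lemma 4.16 (pseudo-polynomial).
# u[(a, b)] is the capacity of arc (v_a, v_b); missing arcs count as capacity 0.

from functools import lru_cache

def make_tables(u, d, rightmost_neighbor):
    @lru_cache(maxsize=None)
    def L(i, j, z):
        if i == j:
            return 0
        if j == i + 1:
            return min(u.get((j, i), 0), z)
        m = rightmost_neighbor(i, j)          # splitting point of Lemma 4.14
        best = L(i, m, L(m, j, z) + d[m])     # chord (v_j, v_i) unused
        for k in range(i, j):                  # chord used: split at v_k
            l, r = L(i, k, d[k]), R(k + 1, j, d[k + 1])
            best = max(best, l + min(u.get((j, i), 0), z + r))
        return best

    @lru_cache(maxsize=None)
    def R(i, j, z):
        if i == j:
            return 0
        if j == i + 1:
            return min(u.get((i, j), 0), z)
        m = rightmost_neighbor(i, j)
        best = R(m, j, R(i, m, z) + d[m])
        for k in range(i, j):
            l, r = L(i, k, d[k]), R(k + 1, j, d[k + 1])
            best = max(best, r + min(u.get((i, j), 0), z + l))
        return best

    return L, R
```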
However, using these formulae to evaluate the functions at particular values of $z$ is not advisable in a polynomial algorithm. Computing $L_{i,j}$ or $R_{i,j}$ in the interesting case where there is a chord $(v_i, v_j)$ or $(v_j, v_i)$ relies on $\Psi_{i,j}$. This again relies on evaluations of $L$ and $R$ for $j - i$ subgraphs, and so on. In particular, the same pair of indices might occur in different subtrees of the recursion. So we want to use a dynamic programming approach instead, which computes the required values in a bottom-up way. There is one big problem, though, compared to the recursion: we do not know for which values of $z$ we need to evaluate the $L$ and $R$ functions, so we want to compute the entire functions. Done naïvely, this only yields a pseudo-polynomial algorithm, because $z$ may take pseudo-polynomially many values. However, we can describe these functions efficiently, making use of their staircase form as seen in Figure 4.9b. We call such a function simple. Even better, we can carry out the computations in Lemma 4.16 on these efficient descriptions much faster than for all values of $z$ by themselves.

Definition 4.17 (Simple function). Let $f : \mathbb{Z}_{\geq 0} \to \mathbb{Z}_{\geq 0}$ be a function such that $f(z+1) \in \{f(z), f(z)+1\}$ for all $z \in \mathbb{Z}_{\geq 0}$. Then we call $f$ a simple function. Such a function can be described by $f(0)$ and the intervals on which $f$ increases. We denote the size of this set of intervals by $|f|$.

Note that it is easy to store the $|f|$ intervals describing $f$ in a linked list. This trivially allows the evaluation of $f(z)$ in $O(|f|)$.

Lemma 4.18. $L_{i,j}$ and $R_{i,j}$ are simple.

Proof. As $L_{i,j}$ describes a maximum flow value, $L_{i,j}(z+1) \geq L_{i,j}(z)$, because the flow cannot become worse with higher supply when using the same in-forest $A(x)$ of flow-carrying arcs. It also holds that $L_{i,j}(z+1) \leq L_{i,j}(z) + 1$, because any in-forest used for $L_{i,j}(z+1)$ could have been used for $L_{i,j}(z)$, losing at most one unit of flow. $R_{i,j}$ is simple by the same arguments. □

We can operate on simple functions as follows:

Lemma 4.19. Let $f_1$ and $f_2$ be two simple functions.

1. $g(z) := f_1(f_2(z))$ is simple and $|g| \leq |f_1| + |f_2|$.

2. $g(z) := \max\{f_1(z), f_2(z)\}$ is simple and $|g| \leq |f_1| + |f_2|$.

3. For $c \in \mathbb{Z}_{\geq 0}$, $g(z) := f_1(z) + c$ is simple and $|g| = |f_1|$.

4. For $c \in \mathbb{Z}_{\geq 0}$, $g(z) := f_1(z + c)$ is simple and $|g| \leq |f_1|$.

In all cases, $g$ can be computed in time linear in $|f_1|$ and $|f_2|$, where applicable.

Proof. 1. The definitions immediately imply $f_1(f_2(z+1)) \leq f_1(f_2(z) + 1) \leq f_1(f_2(z)) + 1$ and $f_1(f_2(z+1)) \geq f_1(f_2(z))$. So $g(z)$ is simple. Whenever $f_1(f_2(z))$ changes between increasing and constant behavior, there has to be a change in the behavior of $f_1$ or $f_2$. Therefore $f_1(f_2(z))$ can have at most $|f_1| + |f_2|$ intervals on which it increases. The computation can be carried out by a scan through the interval list of $f_2$, pushing another pointer in $f_1$ forward whenever $f_2(z)$ has increased enough.
2. $g(z+1) \geq f_1(z+1) \geq f_1(z)$ and $g(z+1) \geq f_2(z+1) \geq f_2(z)$, so $g(z+1) \geq \max\{f_1(z), f_2(z)\} = g(z)$. On the other hand, if $g(z+1) = f_1(z+1)$, then $g(z+1) \leq f_1(z) + 1 \leq g(z) + 1$, and analogously for the case $g(z+1) = f_2(z+1)$. So $g(z)$ is simple. Any increasing interval of $g(z)$ must be contained in an increasing interval of $f_1(z)$ or $f_2(z)$. However, the same increasing interval of $f_1(z)$ or $f_2(z)$ cannot cause more than one increasing interval in $g(z)$: if it did, the endpoints of the subintervals would still have the same values and distances as in the original function, so the slope in between must always be 1 for $g(z)$, meaning that the increasing interval in $g(z)$ was uninterrupted. The computation is a scan along both functions simultaneously.

3. This only shifts the constant $f_1(0)$.

4. This shifts the entire function to the left by $c$. All intervals ending before $c$ have to be deleted, and $f_1(0)$ adjusted accordingly. □

We can now analyze the formulae from Lemma 4.16 with respect to the size of the data:

Lemma 4.20. For all $0 \leq i \leq j \leq n$ the following holds:

1. $|\Psi_{i,j}| \leq j - i$ (assuming $i < j$).

2. $|L_{i,j}| \leq (j-i)^2$.

3. $|R_{i,j}| \leq (j-i)^2$.

Proof. 1. All the entries in $\Psi_{i,j}$ are determined by one of the values $k \in \{i, \ldots, j-1\}$.

2. We use induction on $j - i$. For $j - i = 0$, the claim holds because $L_{i,i} \equiv 0$ and $|L_{i,i}| = 0$. Similarly elementary, for $j - i = 1$ we note that $|\min\{u(v_j, v_i), z\}| \leq 1$. For $j - i \geq 2$, the calculation depends on the existence of the chord. If there is no chord, $L_{i,j}$ can be calculated as $L_{i,m}(L_{m,j}(z) + d(v_m))$. Induction and Lemma 4.19 show $|L_{i,m}(L_{m,j}(z) + d(v_m))| \leq (j-m)^2 + (m-i)^2 \leq (j-i)^2$. If there is a chord, first consider $L^{\mathrm{chord}}_{i,j}$. It is the maximum over $|\Psi_{i,j}| \leq j - i$ simple functions of size at most 1 (each of the form $l + \min\{u, z + r\}$). So $|L^{\mathrm{chord}}_{i,j}| \leq j - i$. Since $L_{i,j}$ is the maximum of $L^{\mathrm{chord}}_{i,j}$ and $L_{i,m}(L_{m,j}(z) + d(v_m))$, Lemma 4.19 shows that the sum of their sizes bounds the size of $L_{i,j}$. But $(j-m)^2 + (m-i)^2 + j - i \leq (j-m)^2 + (m-i)^2 + 2(j-m)(m-i)$ because $i < m < j$, and the latter is simply $(j-i)^2$.

3. As for $L_{i,j}$. □

We now have all the pieces for a polynomial algorithm for MaxConfFlow on outerplanar graphs with one sink; a sketch of the simple-function toolkit follows below.
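As an illustration of the machinery just described, the following sketch encodes a simple function by $f(0)$ plus its list of increasing intervals, together with three of the operations from Lemma 4.19 (composition, constant addition, and shifting). The class name and the half-open interval convention are ours, and `compose` uses a quadratic double loop for brevity where the thesis describes a linear scan; the pointwise maximum would follow the analogous merge scan from the proof and is omitted.

```python
# Interval-list encoding of a simple function (Definition 4.17) and some of
# the operations of Lemma 4.19; illustrative sketch, not the thesis's code.

class SimpleFunction:
    def __init__(self, f0, intervals):
        self.f0 = f0                      # f(0)
        self.intervals = intervals        # sorted disjoint half-open [a, b)

    def __call__(self, z):                # evaluate f(z) in O(|f|)
        return self.f0 + sum(min(b, z) - a
                             for a, b in self.intervals if a < z)

def _merge(ivs):
    ivs = sorted(ivs)
    out = []
    for a, b in ivs:
        if out and a <= out[-1][1]:
            out[-1][1] = max(out[-1][1], b)
        else:
            out.append([a, b])
    return [(a, b) for a, b in out]

def compose(f1, f2):
    # g(z) = f1(f2(z)) (part 1): g increases at z iff f2 increases at z and
    # f1 increases at f2(z); pull f1's intervals back through f2's pieces.
    out = []
    for a, b in f2.intervals:
        fa = f2(a)                        # f2 has slope 1 on [a, b)
        for c, e in f1.intervals:
            lo, hi = max(c, fa), min(e, fa + (b - a))
            if lo < hi:
                out.append((lo - fa + a, hi - fa + a))
    return SimpleFunction(f1(f2(0)), _merge(out))

def add_constant(f, c):
    # g(z) = f(z) + c (part 3): only the constant term changes.
    return SimpleFunction(f.f0 + c, list(f.intervals))

def shift(f, c):
    # g(z) = f(z + c) (part 4): drop what lies left of c, adjust g(0) = f(c).
    ivs = [(max(a - c, 0), b - c) for a, b in f.intervals if b > c]
    return SimpleFunction(f(c), ivs)
```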
Theorem 4.21. MaxConfFlow on an outerplanar graph $G$ with a single sink can be solved in polynomial time.

Proof. First, find an embedding of $G$ and transform the input into an outerplanar instance. Then, starting with $j - i = 0$ and proceeding towards $j - i = n$, one can compute $\Psi_{i,j}$, $L_{i,j}$ and $R_{i,j}$ inductively: for each of these pairs $(i, j)$, $\Psi_{i,j}$ can be computed from various $L_{i,k}$ and $R_{k+1,j}$, with $k - i < j - i$ and $j - (k+1) < j - i$. Having done so, one can compute $L_{i,j}$ from $\Psi_{i,j}$, $L_{i,m}$ and $L_{m,j}$ with $i < m < j$. Proceeding similarly for $R_{i,j}$, we see that all $O(n^2)$ functions/relations can be computed in a dynamic program. Furthermore, Lemma 4.20 shows that at no time $\Psi_{i,j}$ can be larger than $j - i$, while $L_{i,j}$ and $R_{i,j}$ need at most $(j-i)^2$ entries to describe them.

Computing each $\Psi_{i,j}$ requires $O(n)$ evaluations of $L$ and $R$ functions, each of which can be done in $O(|L|) = O(n^2)$. So for each $\Psi_{i,j}$ we need at most time $O(n^3)$. Computing $L_{i,j}$ requires two evaluations of $L$ and the maximum over $O(n)$ entries in $\Psi_{i,j}$, so this takes $O(n^2)$ time. Note that we can find the splitting point $m$ with a simple greedy strategy by checking the existence of $O(n)$ arcs, as shown in the proof of Lemma 4.14. Clearly, this is dominated by $O(n^3)$. So for each pair $(i, j)$ we can find all necessary functions in $O(n^3)$, and the entire table is computed in $O(n^5)$. Finally, the maximum value can be found in $O(n)$ from $\Psi_{0,n}$, as noted in Lemma 4.15. □

We do not claim that outerplanar graphs are the largest graph class for which MaxConfFlow problems can be solved in polynomial time. Indeed, some trivial extensions can be imagined, e.g., for graphs that differ from outerplanar graphs only by a constant number of arcs. However, more interesting extensions are possible. As mentioned in the preliminaries, we can characterize outerplanar graphs as the graphs containing neither $K_4$ nor $K_{2,3}$ as a minor. Since any subdivision of an edge of a $K_4$, or a single edge added to a $K_4$, leads to a $K_{2,3}$-minor, it suffices to require that a graph is $K_{2,3}$-minor-free and does not contain $K_4$ as an induced subgraph to prove its outerplanarity. We can then extend our algorithm to the $K_{2,3}$-minor-free graphs with a single sink, as we will show in the following corollary.

Corollary 4.22. If the underlying undirected graph is $K_{2,3}$-minor-free, single-sink MaxConfFlow problems can be solved in polynomial time.

Proof. Let $G$ be a bidirected graph that is $K_{2,3}$-minor-free. Consider all induced copies $\{H_i\}_i$ of $K_4$ in $G$. Apart from the direct arc between them, no path can connect two vertices of the same copy of $K_4$, as this would lead to a $K_{2,3}$. This has far-reaching consequences, as essentially $G$ decomposes into outerplanar components (not containing a $K_4$) and the cliques $H_i$ joining these together, such that the blocks are arranged in a tree as shown in Figure 4.12. Technically speaking, any vertex $v \in V(H_i)$ is an articulation vertex unless all its neighbors lie within $H_i$.

Since we assume a single sink, we can equivalently delete the vertices behind an articulation vertex $v$ and increase the supply of $v$ by the MaxConfFlow value of the deleted vertices for an instance with $v$ as a sink. This
MaxConfFlow value can be determined if the split-off components are outerplanar or consist only of a $K_4$. We can reliably find an articulation vertex that exhibits this property by looking at the articulation vertex that is farthest from the sink. Thus, a polynomial-time algorithm can repeatedly search for a suitable articulation vertex and resolve the split-off components using Theorem 4.21, or with complete enumeration if only a $K_4$ remains. This continues until an outerplanar graph or a single $K_4$ containing the sink is left, which can also be dealt with. Due to the efficient decomposition, the overall running time is of the same magnitude as for the original algorithm, namely $O(n^5)$, which also dominates the search for the articulation vertices. □

Figure 4.12: A $K_{2,3}$-minor-free graph can be regarded as blocks arranged in a tree through articulation vertices, where each block is either an outerplanar graph or a $K_4$. Note that outerplanar graphs overlapping in a single vertex again form an outerplanar graph, so this case does not need to be shown.

As a note on what may or may not be possible, we should not expect to find polynomial-time algorithms for 2-outerplanar graphs (roughly, an outerplanar graph surrounded by and connected to another cycle), because these can arise from a tree with an unlimited number of sinks combined into a supersink, which enables NP-hard instances as seen in Theorem 4.9.

4.5 Approximation Scheme for Bounded Treewidth

We now want to present an approximation scheme for MaxConfFlow that works for graphs with bounded treewidth. As the hardness result in Theorem 4.9 shows, we cannot approximate MaxConfFlow with arbitrary precision on general graphs (unless P = NP), so such a restriction to certain graph classes is necessary. On the positive side, the treewidth hierarchy eventually covers all graphs, so the algorithm at least works on all graphs, but the running time grows exponentially with the treewidth.
The first step towards the approximation scheme is a dynamic program not unlike the one presented for outerplanar graphs: for certain self-contained subgraphs, namely the bagends, we want to determine all feasible combinations of incoming and outgoing flow values. However, we immediately lose the ability to efficiently encode the tables when we turn to graphs with treewidth 2. This follows from the Partition instances in Theorem 4.9. If we introduce additional arcs that measure the flow into the supersink from the two constructed Partition sets, then these arcs can have pseudo-polynomially many possible flow values. A dynamic program must account for all of these if the capacity of the following arcs is not yet known. The second and final step towards an approximation scheme is the appropriate rounding of the pseudo-polynomial dynamic program to polynomially many entries.

Note that all combinatorial problems that can be described by extended monadic second-order logic (EMS) can be solved in polynomial time on graphs with bounded treewidth. See Arnborg et al. [4] for these results and sample lists of combinatorial problems that this covers. This does not immediately solve our problem (which remains NP-hard on graphs with treewidth 2), but a reduction to an instance of pseudo-polynomial size could be possible. This instance could then be solved in effectively pseudo-polynomial time. This would remove the theoretical need for the first step above, that is, developing a pseudo-polynomial dynamic program. But using the results from EMS typically yields weak bounds on the running times and reveals little about the structure needed to actually implement the dynamic program efficiently. In particular, we think understanding the structure of the dynamic program is important for those wishing to prove Conjecture 4.32.

4.5.1 Dynamic Program

We first restrict ourselves to the single-sink case of MaxConfFlow. We will generalize this when we describe the approximation algorithm. We are given a graph $G = (V, A)$ with the sink $t$ and can assume that we also have a nice tree decomposition with a bag containing only $t$ as root, as shown in Lemma 4.7. We work bottom-up from the Leaf nodes of the tree decomposition to the root containing $t$. The bags and bagends of a nice tree decomposition change towards the sink in an orderly fashion, introducing and forgetting one vertex at a time. So this approach is fairly straightforward if we have suitable representations of confluent flows on a bagend $G_i$ that we can express just by information on the interface $G[B_i]$. When we have done this for all bags, it just remains to read off the maximum flow value at the root.

The arcs chosen in $B_i$ and the flow sent on them seem like natural candidates for the dynamic program, but this is not the choice we will make, because finding a tree-like flow on the tree-like structure does not have as many synergies as one might think. Sources may send flow from near the sink $t$ through the interface $B_i$ into $G_i$, then out of $B_i$ and to the sink near where the flow started; compare Figure 4.13. So when we consider the restriction of a confluent flow to $G_i$, we again need to account for sources we have not yet encountered, just like in the outerplanar case. This time, we do this by ignoring the balance
condition of the vertices in $B_i$ and tracking the outgoing and incoming flow at each vertex separately.

Figure 4.13: A confluent flow through the current bag $B_i$ which uses vertices that have not been considered yet. Despite the structural information available from the tree decomposition, it is not feasible to determine locally what interconnections exist between the flow on local arcs.

We try to make the technicalities a bit simpler by fixing the outflow of a vertex once and for all when an outgoing arc is chosen for it. We will denote this target outflow value by $o(v)$. Any imbalance must be made up for by the supply of the vertex and the incoming flow. As we collect incoming flow during the run of the algorithm, the sources of this flow might no longer be visible, and therefore we think of this quantity as additional supply $d^+(v)$ of $v$.

The main advantage of fixing the target outflow is that we can locally manipulate the set of flow-carrying arcs. For example, if a vertex $v$ uses no outgoing arc so far and we want to send flow along an arc $(v, w)$, we can simply do so without triggering a chain reaction of increasing flow along the path that these flow units take. Indeed, such a path could lead into already processed parts of the bagend that are no longer in the interface, which would make tracking the changes even harder. Instead, we simply increase the additional supply $d^+(w)$ of the receiving vertex. Its target outflow $o(w)$ remains unchanged, and therefore no further steps need to be taken.

Regarding arcs, note that we do not need to consider the chosen arcs of $A[B_i]$ in the dynamic program. Instead, we assume that no decision has been made about the arcs in the current bag. We only consider whether a vertex has no outgoing arc yet, in which case its target outflow value is 0, or whether it already has an outgoing arc, in which case $o(v)$ must be positive. When a vertex leaves a bag due to a Forget node in the tree decomposition, all vertices in the bag with no outgoing flow have the opportunity to choose an arc towards the leaving vertex. If the leaving vertex had no outgoing flow itself, it may also choose one arc pointing to a vertex in the bag. So arcs are fixed only at this point, but they are then no longer visible in the arc set of the interface.
Eventually, the considered bags converge to the root bag, which contains only the sink $t$ and no arcs, so all vertices can choose their outgoing arc at some point. One final note is that we cannot avoid directed cycles in such a localized approach, so instead of considering confluent flows, we consider nearly confluent flows. Recall from Definition 4.1 that this is simply a flow where each vertex has at most one outgoing arc and sinks have none, and that considering nearly confluent flows will give us the same maximum flow value as MaxConfFlow. These considerations motivate the following table for each node of the tree decomposition:

Definition 4.23 (Table). Let $G = (V, A)$ be a graph with arc capacities $u : A \to \mathbb{Z}_{\geq 0}$ and supplies $d : V \to \mathbb{Z}$, but with only a single unlimited sink $t$, which in addition must satisfy $\delta^{\mathrm{out}}(t) = \emptyset$. Let $U$ be an upper bound on MaxConfFlow for this instance and assume $u(a) \leq U$ for all arcs $a \in A$. Let $(I, T)$ and $(B_i)_{i \in I}$ be a nice tree decomposition of $G$.

For $i \in I$, let $\mathrm{Table}(i)$ be defined as the set containing exactly the elements $(o, d^+)$ with $o \in \{0, 1, \ldots, U\}^{B_i}$ and $d^+ \in \{0, 1, \ldots, U\}^{B_i}$ for which there is an integral feasible nearly confluent flow function $x$ in $G_i$ with the following properties:

1. $x(a) = 0$ for all $a \in A[B_i]$.

2. $o(v) = \mathrm{outflow}(v)$ for all $v \in B_i$.

3. $d^+(v) = \mathrm{inflow}(v)$ for all $v \in B_i$.

4. $0 \leq \mathrm{bal}(v) \leq d(v)$ for all $v \in V(G_i) \setminus B_i$.

Please note that given a flow function $x$ on the bagend $G_i$, it is easy to check whether $x$ satisfies all conditions. Also, $x$ determines a unique entry $(o, d^+) \in \mathrm{Table}(i)$.
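For the sketches in the remainder of this section, we fix one concrete (and entirely hypothetical) encoding of such table entries: order the bag once, and store $(o, d^+)$ as a pair of integer tuples over $\{0, \ldots, U\}$; a table is then simply a set of such pairs.

```python
# Our own tuple-pair convention for table entries, used in the sketches below.
# A table is a set of pairs ((o_1, ..., o_k), (d_1, ..., d_k)) following a
# fixed ordering of the current bag B_i.

def entry(bag, o_map, d_plus_map):
    """Flatten per-vertex target outflows and additional supplies into the
    tuple-pair representation; `bag` is the fixed vertex ordering."""
    return (tuple(o_map[v] for v in bag),
            tuple(d_plus_map[v] for v in bag))
```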
We finish the description of the dynamic programming table with a lemma showing that this table contains the solution to MaxConfFlow at the root node of the tree decomposition:

Lemma 4.24. Under the assumptions of Definition 4.23, the optimum value of MaxConfFlow is $\max_{(o, d^+) \in \mathrm{Table}(r)} d^+(t)$, where $r \in I$ with $B_r = \{t\}$ is the root node of the nice tree decomposition.

Proof. Let OPT be the optimum value of MaxConfFlow. An optimum confluent flow for this instance can be chosen integral and implies an entry $((0), (\mathrm{OPT}))$ in $\mathrm{Table}(r)$: Condition 1 on the arcs is trivially satisfied, Conditions 2 and 3 on the only vertex $t$ in $B_r$ just correspond to the entry, and a confluent flow satisfies the balance conditions at all vertices $v$ in $G_r \setminus B_r = V \setminus \{t\}$ as well, which satisfies Condition 4.

On the other hand, any entry $(o, d^+) \in \mathrm{Table}(r)$ implies some integral feasible nearly confluent flow function $x$ on $G_r = G$. Trivially, $o(t) = 0$ must hold because $\delta^{\mathrm{out}}(t) = \emptyset$. Then $x$ is a flow, because the balance conditions are satisfied at all non-sink vertices by Condition 4, and the sink has infinite demand and no outgoing flow. Then $d^+ = (\mathrm{inflow}(t)) = (\mathrm{val}(x))$ is indeed the flow value of a nearly confluent flow, which attains the same maximum as a confluent flow. □

4.5.2 Computing the Tables

For computing the entries of $\mathrm{Table}(i)$ from the tables of the children of $i$, subroutines are needed that address the four classes of nodes (Leaf, Introduce, Merge, and Forget). One has to ensure that the flow values are calculated correctly, and to prove that all entries can be derived in this way. These steps are mostly technical, and the main work is done in the Forget nodes, the only place where the algorithm has to decide which arcs should carry flow. All of the lemmata detailing these constructions operate under the assumptions and notation of Definition 4.23 without explicitly saying so every time.

We begin with the trivial case of a Leaf node. The bagend consists of just a single vertex and has no arc, so all flow functions are trivial, and the only possible entry corresponds to no incoming or outgoing flow.

Lemma 4.25. For a Leaf node $i$, $\mathrm{Table}(i) = \{((0), (0))\}$.

Proof. The bagend $G_i$ consists of a single vertex only and in particular contains no arcs. So any $x$ has to be an empty function, and Conditions 1 and 4 do not apply. The other conditions simply force an entry to be $((0), (0))$. Since these are all the conditions, this is a valid entry. □

In an Introduce node, the new vertex is simply added with target outflow 0 and additional supply 0, while leaving all other entries unchanged; see the sketch below. (As mentioned after the definition of a nice tree decomposition, the Leaf nodes could also be considered as Introduce nodes that add a vertex to an empty bag, but this would be an atypical setup.)
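In the tuple-pair encoding fixed above, the Leaf handler and the Introduce handler of the lemma below become one-liners; appending the new vertex at the end of the bag ordering is our own convention, not the thesis's.

```python
def leaf_table():
    # Lemma 4.25: the single vertex has neither inflow nor outflow.
    return {((0,), (0,))}

def introduce_table(child_table):
    # Lemma 4.26: extend every child entry by o(v) = d+(v) = 0 for the
    # newly introduced vertex; nothing else changes.
    return {(o + (0,), dp + (0,)) for (o, dp) in child_table}
```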
Lemma 4.26. Let $i$ be an Introduce node, and let $j$ be the unique child of $i$. Let $v$ be the new vertex, such that $B_i = B_j \cup \{v\}$. Let $o, d^+ \in \{0, \ldots, U\}^{B_i}$. Then $(o, d^+) \in \mathrm{Table}(i)$ if and only if $o(v) = d^+(v) = 0$ and $(o|_{B_j}, d^+|_{B_j}) \in \mathrm{Table}(j)$.

Proof. Observe that Lemma 4.8 part 3 immediately implies that $v$ has no neighbors in $V(G_i) \setminus B_j$.

Let $(o, d^+) \in \mathrm{Table}(i)$ and let $x$ be a matching flow function. Since $x$ is 0 on $A[B_i]$ and $v$ has no other neighbors in $G_i$, the incoming and outgoing flow of $v$ must be 0, proving $o(v) = d^+(v) = 0$. We claim that restricting $x$ to $A[G_j]$ proves $(o|_{B_j}, d^+|_{B_j}) \in \mathrm{Table}(j)$. Indeed, all conditions of Definition 4.23 are subsets of the conditions for $(o, d^+) \in \mathrm{Table}(i)$.

For the reverse direction, consider $o, d^+ \in \{0, \ldots, U\}^{B_i}$ with $o(v) = d^+(v) = 0$ and $(o|_{B_j}, d^+|_{B_j}) \in \mathrm{Table}(j)$. The latter implies the existence of a flow function $x$ on $G_j$ corresponding to the table entry. We can extend it by 0 to the arcs of $G_i$, noting that this only affects the arcs incident to $v$. Then $x$ is an integral feasible nearly confluent flow function on $G_i$. It is 0 on all arcs in $A[B_i]$ (Condition 1). The outflow and inflow of all vertices of $B_j$ are unchanged, and the new vertex $v$ has no inflow or outflow (Conditions 2 and 3). And, finally, the balance of the vertices in $V(G_i) \setminus B_i = V(G_j) \setminus B_j$ is not affected either (Condition 4). □

When dealing with Merge nodes, all bags contain the same vertices, but the bagends differ: Lemma 4.8 shows that the bagends of the children are disjoint except for the vertices in the Merge bag. In particular, the only arcs that are considered in both bagends are the arcs in the current bag, which do not carry any flow. So the flow functions are essentially disjoint, and we can add them without violating capacity constraints. But we do have to ensure that no vertex in the current bag has outgoing arcs into both bagends.

Lemma 4.27. Let $i$ be a Merge node, and let $j$ and $k$ be the children of $i$, implying $B_i = B_j = B_k$. Let $o, d^+ \in \{0, \ldots, U\}^{B_i}$. Then $(o, d^+) \in \mathrm{Table}(i)$ if and only if there exist $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$ and $(\hat{o}, \hat{d}^+) \in \mathrm{Table}(k)$ such that $o = \bar{o} + \hat{o}$ and $d^+ = \bar{d}^+ + \hat{d}^+$, and for all $v \in B_i$ at least one of $\bar{o}(v)$ and $\hat{o}(v)$ is zero.

Proof. Let $(o, d^+) \in \mathrm{Table}(i)$ and let $x$ be a corresponding flow function on $G_i$. Let $\bar{x}$ and $\hat{x}$ be the restrictions of $x$ to $G_j$ and $G_k$, respectively. We claim that these flow functions yield the desired entries in $\mathrm{Table}(j)$ and $\mathrm{Table}(k)$.

Let us first establish that they induce some entries $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$ and $(\hat{o}, \hat{d}^+) \in \mathrm{Table}(k)$ by checking the appropriate conditions. Clearly, the restricted functions are integral, feasible and nearly confluent. Due to Lemma 4.8 part 2, the balance of any vertex in the bagends $G_j$ or $G_k$ is unchanged, because all of its neighbors in $G_i$ are also part of the respective smaller bagend. Also, no arc in $A[B_i]$ can carry any flow. Thus, if we choose $\bar{o}$ and $\hat{o}$ according to the outflow of $\bar{x}$ and $\hat{x}$, respectively, while $\bar{d}^+$ and $\hat{d}^+$ are derived from the inflow, we obtain valid entries of the tables.

Note that the set of arcs between $V(G_j) \setminus B_i$ and $B_i$ is disjoint from the set of arcs between $V(G_k) \setminus B_i$ and $B_i$, because the endpoints not in $B_i$ can only belong to either $G_j$ or $G_k$, again due to Lemma 4.8. Then $o = \bar{o} + \hat{o}$ and $d^+ = \bar{d}^+ + \hat{d}^+$ follow immediately, because the considered arcs can be split into those belonging to either $G_j$ or $G_k$. For the same reason, every vertex has at most one outgoing arc that carries flow, and that arc can belong to only one of $G_j$ or $G_k$.

On the other hand, suppose $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$ and $(\hat{o}, \hat{d}^+) \in \mathrm{Table}(k)$ satisfy the conditions of the lemma. Let $\bar{x}$ and $\hat{x}$ be the corresponding flow functions on $G_j$ and $G_k$ and extend them to $G_i$ with 0. Then let $x := \bar{x} + \hat{x}$. We claim that $o := \bar{o} + \hat{o}$ and $d^+ := \bar{d}^+ + \hat{d}^+$ form an entry $(o, d^+)$ of $\mathrm{Table}(i)$ if all values stay within the considered range $\{0, \ldots, U\}$. As above, the conditions
for the vertices in $V(G_i) \setminus B_i$ are inherited from $\bar{x}$ and $\hat{x}$, and the outflow and inflow values for $v \in B_i$ correspond to $x$. It just remains to note that we explicitly require that no vertex $v \in B_i$ has two outgoing arcs, one into $G_j$ and one into $G_k$, so that $x$ really is nearly confluent. This shows $(o, d^+) \in \mathrm{Table}(i)$. □

The remaining and most intricate case is that of a Forget node. What makes the discussion of this case so long is that essentially two steps happen at once. First, a set of arcs between the leaving vertex and the remaining vertices of the bag is chosen and flow is pushed along these arcs, for which target outflow values have to be chosen. The additional supplies are adjusted accordingly as well. But then, the vertex is removed and none of these arcs are visible anymore.

Lemma 4.28. Let $i$ be a Forget node, and let $j$ be the unique child of $i$. Let $v$ be the leaving vertex, such that $B_i = B_j \setminus \{v\}$. Let $o, d^+ \in \{0, \ldots, U\}^{B_i}$. Then $(o, d^+) \in \mathrm{Table}(i)$ if and only if there exists $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$ such that:

1. For all $w \in B_i$ with $o(w) \neq \bar{o}(w)$, it holds that $\bar{o}(w) = 0$ and there is an arc $a = (w, v) \in A$ with $o(w) \leq u(a)$.

2. There is at most one vertex $w \in B_i$ with $d^+(w) \neq \bar{d}^+(w)$. If there is such a vertex $w$, then $\bar{o}(v) = 0$ and $d^+(w) > \bar{d}^+(w)$ must hold. Then there also is an arc $a = (v, w) \in A$ with $d^+(w) - \bar{d}^+(w) \leq u(a)$.

3. Let $y_o := \bar{o}(v) + \sum_{w \in B_i} (d^+(w) - \bar{d}^+(w))$ and $y_{d^+} := \bar{d}^+(v) + \sum_{w \in B_i} (o(w) - \bar{o}(w))$. Then $0 \leq y_o - y_{d^+} \leq d(v)$ must hold.

Proof. Let $(o, d^+) \in \mathrm{Table}(i)$ and let $x$ be a corresponding flow function on $G_i$. Note that $G_i = G_j$, because both bagends include $B_j$. Let $\bar{x}$ be the flow function obtained by setting to 0 all flow on arcs between $v$ and $B_i$, that is, on all arcs in $A[B_j] \setminus A[B_i]$. We claim that $\bar{x}$ implies the desired entry $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$. It clearly is integral, feasible and nearly confluent, because we only set certain arcs to 0. Also due to this decrease of flow, neither the inflow nor the outflow can leave the range $\{0, \ldots, U\}$. The balances of vertices in $G_i \setminus B_j$ are unaffected by the changed arcs and satisfy Condition 4 of Definition 4.23 for $j$ as well. (The balance of $v$ is affected, but it is not checked for the bag $j$.) Arcs in $A[B_i]$ already carried zero flow, and arcs in $A[B_j] \setminus A[B_i]$ are set to 0, so Condition 1 is satisfied as well. If we choose $\bar{o}$ and $\bar{d}^+$ according to $\bar{x}$, then we are certain to have an entry in $\mathrm{Table}(j)$. We check the additional requirements of this lemma in the following:

1. Let $w \in B_i$ be a vertex with $o(w) \neq \bar{o}(w)$. Then the outflow must have changed, so there must have been an arc $a$ from $w$ to $v$ on which the flow was reduced to 0. Since this must have been the only outgoing flow-carrying arc of $w$, we can conclude $\bar{o}(w) = \mathrm{outflow}_{\bar{x}}(w) = 0$. Also, $o(w) \leq u(a)$, because the entire outflow of $w$ in $x$ must be through arc $a$.
2. Similarly, all $w \in B_i$ with $d^+(w) \neq \bar{d}^+(w)$ must have changed due to incoming flow-carrying arcs in $x$, which must originate at $v$. Therefore, there is at most one such $w$, and if there is one, then $v$ has no outgoing flow-carrying arc left in $\bar{x}$, so $\bar{o}(v) = 0$ in this case. Furthermore, $d^+(w) - \bar{d}^+(w) = x(a)$, showing $0 < d^+(w) - \bar{d}^+(w) \leq u(a)$ as claimed.

3. Let $y_o := \bar{o}(v) + \sum_{w \in B_i} (d^+(w) - \bar{d}^+(w))$. Note that by the above considerations, there is at most one $w$ in the sum that contributes a nonzero value, namely the vertex that $v$ is sending flow to in $x$, if any. So $y_o = \mathrm{outflow}_x(v)$. Similarly, if we define $y_{d^+} := \bar{d}^+(v) + \sum_{w \in B_i} (o(w) - \bar{o}(w))$, then the sum captures all vertices of $B_i$ sending flow to $v$ in $x$ and exactly how much they send. So $y_{d^+} = \mathrm{inflow}_x(v)$. Since $x$ satisfies the balance condition for $v$, we have $0 \leq y_o - y_{d^+} \leq d(v)$ as claimed.

For the reverse direction, let $o, d^+ \in \{0, \ldots, U\}^{B_i}$ and suppose $(\bar{o}, \bar{d}^+) \in \mathrm{Table}(j)$ is as required. Let $\bar{x}$ be a flow corresponding to $(\bar{o}, \bar{d}^+)$. We need to construct a flow $x$ that shows $(o, d^+) \in \mathrm{Table}(i)$. To achieve this, we start with $\bar{x}$ and send $o(w)$ flow units to $v$ for all $w \in B_i$ for which $o(w)$ differs from $\bar{o}(w)$. In this case, the lemma requires that $\bar{o}(w) = 0$, that is, $w$ has no outgoing flow-carrying arc in $\bar{x}$, and that there is an arc $(w, v)$ with sufficient capacity. (In particular, $w$ is not the sink $t$, because the sink has no outgoing arcs according to Definition 4.23.) Thus, after these changes, the flow function is still integral, feasible and nearly confluent. If there is a vertex $w \in B_i$ for which $d^+(w) \neq \bar{d}^+(w)$, then the existence of an arc $(v, w)$ with sufficient capacity to send the positive amount $d^+(w) - \bar{d}^+(w)$ is guaranteed as well, and $\bar{o}(v) = 0$ implies that $v$ has no outgoing flow-carrying arc in $\bar{x}$. Since there can be at most one such $w$, we obtain an integral feasible nearly confluent flow function $x$ after this last modification.

Since all arcs that were changed run between $v \notin B_i$ and $B_i$, this flow function $x$ is still 0 on $A[B_i]$. The outflow of a vertex $w \in B_i$ was unchanged unless flow was sent from $w$ to $v$; then $\mathrm{outflow}_x(w) = o(w)$ holds by construction. Similarly, the additional supply of any $w \in B_i$ was unchanged unless it receives flow from $v$. In that case, the incoming flow was increased by precisely $d^+(w) - \bar{d}^+(w)$, so that it is now $d^+(w)$. The balance of all vertices in the bagend that are not in $B_j$ is also unchanged. Finally, we just need to show that $v$ satisfies the balance condition. As in the first part of the proof, we consider $y_o := \bar{o}(v) + \sum_{w \in B_i} (d^+(w) - \bar{d}^+(w))$ and $y_{d^+} := \bar{d}^+(v) + \sum_{w \in B_i} (o(w) - \bar{o}(w))$. These are by construction the outflow and inflow of $v$, and the lemma requires that $0 \leq y_o - y_{d^+} \leq d(v)$. Thus, $(o, d^+)$ is an entry in $\mathrm{Table}(i)$. □

This completes the construction of the tables from a mathematical point of view; sketches of the Merge combination rule and of the Forget compatibility check follow below.
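Under the same hypothetical tuple-pair encoding as before, the Merge combination rule of Lemma 4.27 and the per-pair compatibility check of Lemma 4.28 can be sketched as follows; bag orderings, the capacity map `u`, and all names are our own assumptions.

```python
def merge_table(t1, t2, U):
    # Lemma 4.27: add entries componentwise, but no vertex may have
    # outgoing flow in both child bagends, and values must stay <= U.
    out = set()
    for (o1, d1) in t1:
        for (o2, d2) in t2:
            if any(a > 0 and b > 0 for a, b in zip(o1, o2)):
                continue
            o = tuple(a + b for a, b in zip(o1, o2))
            dp = tuple(a + b for a, b in zip(d1, d2))
            if all(x <= U for x in o + dp):
                out.add((o, dp))
    return out

def forget_compatible(entry_i, entry_j, bag_j, v_idx, d_v, u):
    # Lemma 4.28: does the child entry (over B_j) justify the candidate
    # entry (over B_i = B_j without the leaving vertex v)?
    (o, dp), (ob, db) = entry_i, entry_j
    v = bag_j[v_idx]
    y_o, y_d = ob[v_idx], db[v_idx]       # outflow / inflow of v so far
    v_sends = 0
    i = 0
    for j, w in enumerate(bag_j):
        if j == v_idx:
            continue
        if o[i] != ob[j]:                 # condition 1: w newly sends to v
            if ob[j] != 0 or u.get((w, v), 0) < o[i]:
                return False
            y_d += o[i]
        if dp[i] != db[j]:                # condition 2: v newly sends to w
            delta = dp[i] - db[j]
            if delta <= 0 or u.get((v, w), 0) < delta:
                return False
            y_o += delta
            v_sends += 1
        i += 1
    if v_sends > 1 or (v_sends == 1 and ob[v_idx] != 0):
        return False
    return 0 <= y_o - y_d <= d_v          # condition 3: balance of v
```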
We now briefly discuss the running times of these constructions.

Lemma 4.29. There is an algorithm that, given a nice tree decomposition $(I, T)$, computes all tables correctly and runs in time polynomial in $|I|$ and $U$, assuming that the width $k$ of the tree decomposition is constant.

Proof. It is clear that computing the tables bottom-up will result in $|I|$ calls to the subroutines handling the different cases of nodes in the tree decomposition. It just remains to show that the running time is as claimed. We will represent each table as a boolean array of size $L := (U+1)^{2k+2}$, which is polynomial in $U$ if $k$ is constant. We initialize such an array as empty for each bag.

1. For a Leaf node, we just need to set a single entry. This takes $O(1)$ operations.

2. For an Introduce node $i$, we can loop over all entries in the child node $j$, extend them with $(0, 0)$ for the new vertex, and mark this new entry in $\mathrm{Table}(i)$. This takes $O(L)$ operations.

3. For a Merge node $i$, we can loop over all combinations of entries in the children $j_1$ and $j_2$, check each combination with $O(k) = O(1)$ arithmetic operations and, if applicable, mark an entry in $\mathrm{Table}(i)$. This requires $O(L^2)$ operations.

4. For a Forget node $i$ with child $j$, we can consider each pair $(o, d^+)$ and check $\mathrm{Table}(j)$ for a corresponding entry. This takes $O(L^2)$ time.

In summary, the Merge and Forget nodes are expensive enough to justify bounding each subroutine by $O(L^2)$, that is, the number of combinations of entries from two tables. Thus, the algorithm computes each table in time polynomial in $U$, and the overall running time is polynomial in $|I|$ and $U$. □
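Continuing the sketch and reusing `forget_compatible` from above, the naive Forget subroutine and the bottom-up driver from this proof might look as follows; the `node` attributes are invented for illustration, and the candidate enumeration is deliberately the brute-force one bounded by $L = (U+1)^{2k+2}$.

```python
from itertools import product

def forget_table(child_table, bag_j, v_idx, d_v, u, U):
    # Lemma 4.29, case 4: try every candidate entry against every child entry.
    k = len(bag_j) - 1                    # size of the new bag B_i
    rng = range(U + 1)
    table = set()
    for o in product(rng, repeat=k):
        for dp in product(rng, repeat=k):
            if any(forget_compatible((o, dp), e, bag_j, v_idx, d_v, u)
                   for e in child_table):
                table.add((o, dp))
    return table

def compute_table(node):
    # One subroutine call per node of the nice tree decomposition.
    if node.kind == "Leaf":
        return leaf_table()
    tables = [compute_table(c) for c in node.children]
    if node.kind == "Introduce":
        return introduce_table(tables[0])
    if node.kind == "Merge":
        return merge_table(tables[0], tables[1], node.U)
    assert node.kind == "Forget"
    return forget_table(tables[0], node.bag_j, node.v_idx,
                        node.d_v, node.u, node.U)
```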
4.5.3 Deriving the Approximation Scheme

We can now apply Lemma 4.29 to solve the MaxConfFlow problem as follows. This also includes the construction to deal with multiple sinks that have limited demands.

Theorem 4.30. MaxConfFlow can be solved in pseudo-polynomial time if the treewidth of the graph is bounded by some constant.

Proof. Consider an instance with a digraph $G = (V, A)$, capacities $u : A \to \mathbb{Z}_{\geq 0}$, supply/demand function $d : V \to \mathbb{Z}$, with sources $S^+$ and sinks $S^-$. Assume that the treewidth of $G$ is at most some constant $k$. We preprocess the instance to obtain a graph $G'$ with a single supersink $t$. Notice that this preprocessing does not change the value of MaxConfFlow. Also, the treewidth of this new instance is at most $k + 1$, because a tree decomposition of $G$ can be turned into a tree decomposition of $G'$ by adding $t$ to every bag. Let $U := \sum_{v \in S^+} d(v)$. Since $U$ is a bound on the MaxConfFlow value, we can reduce all capacities to $U$ without changing the solution value: if an arc carried higher flow values, this would imply that flow units cycle in the graph because they cannot reach the sink. Now the instance is suitable for Definition 4.23.

Next we compute a nice tree decomposition of $G'$ with $t$ as the sink. This decomposition can be chosen to have width at most $k + 1$ and polynomially many nodes, as described in Lemma 4.7. Finally, we run the dynamic program from Lemma 4.29 on the modified instance to compute $\mathrm{Table}(r)$ for $B_r = \{t\}$. As we have already shown in Lemma 4.24, we can output the maximum value of $d^+(t)$ in $\mathrm{Table}(r)$ as the solution value. Note that the overall running time is polynomial in the size of $G$ and $U$. □

A theoretical advantage of this result is that we can now be certain that MaxConfFlow with multiple sinks on trees is weakly NP-hard, but no harder. The proof of this claim in Theorem 4.9 was still outstanding. More importantly, the result above yields a fully polynomial-time approximation scheme for MaxConfFlow, still under the assumption of bounded treewidth:

Theorem 4.31. MaxConfFlow on digraphs with $|V|$ vertices and treewidth bounded by some constant can be approximated within a factor of $(1 - \varepsilon)$ for any $\varepsilon > 0$ in time polynomial in $|V|$ and $1/\varepsilon$.

Proof. Again the input is a digraph $G = (V, A)$ with capacities $u : A \to \mathbb{Z}_{\geq 0}$, supply/demand function $d : V \to \mathbb{Z}$, sources $S^+$ and sinks $S^-$. W.l.o.g. we can assume that no vertex has a larger supply than it could send to a sink by itself. This can be ensured in polynomial time by solving the Bottleneck Shortest Path problem, e.g., with a modified Dijkstra's algorithm (cf. [54]). Then the sum of supplies is a meaningful bound on the MaxConfFlow value OPT: with $d_{\max} := \max_{v \in S^+} d(v)$, we obtain
$$d_{\max} \leq \mathrm{OPT} \leq \sum_{v \in S^+} d(v) \leq |V| \cdot d_{\max} =: U.$$

We will use a simple rounding scheme to obtain the approximation. Let $\varepsilon' := \varepsilon / |V|^2$. Then round the arc capacities and the supply/demand function to multiples of $\varepsilon' U$, always towards 0. Call these new values $\tilde{u}$ and $\tilde{d}$; for example, $\tilde{u}(a) := \lfloor u(a) / (\varepsilon' U) \rfloor \cdot \varepsilon' U$. If we further scale the instance by $(\varepsilon' U)^{-1}$, we obtain integral values between 0 and $1/\varepsilon'$. We run the algorithm to solve this scaled-down integral instance. Theorem 4.30 shows that the running time is polynomial in the size of the graph and $\tilde{U} := (\varepsilon' U)^{-1} \sum_{v \in S^+} \tilde{d}(v) \leq O(|V| / \varepsilon') = O(|V|^3 / \varepsilon)$. Thus, this is a polynomial-time algorithm with respect to $|V|$ and $1/\varepsilon$.

Let APX denote the optimum value of the rounded instance, scaled back to the original range by a factor of $\varepsilon' U$. (The optimum value scales with the parameters.) We need to show $\mathrm{APX} \geq (1 - \varepsilon)\mathrm{OPT}$ to conclude this proof. Consider a confluent flow $x$ with value OPT for the original instance. The standard flow theory from Theorem 1.3 tells us that we can decompose it
into at most $|A(x)| \leq |V|$ paths $P \in \mathcal{P}$ carrying flow from the sources to the sinks. We denote the amount of flow on each path by $x_P > 0$, and $\sum_{P \in \mathcal{P}} x_P = \mathrm{OPT}$ must hold. Rounding the flow on each path to $\tilde{x}_P := \lfloor x_P / (\varepsilon' U) \rfloor \cdot \varepsilon' U$ yields a path decomposition of a new flow $\tilde{x}$ where all flow values are multiples of $\varepsilon' U$. Rounding down individual paths results in a loss at least as large as the loss due to rounding $d$ and $u$, so $\tilde{x}$ obeys all rounded capacities, supplies and demands. Thus, $\tilde{x}$ is a confluent flow in the rounded instance and has value
$$\sum_{P \in \mathcal{P}} \left\lfloor \frac{x_P}{\varepsilon' U} \right\rfloor \varepsilon' U \;\geq\; \sum_{P \in \mathcal{P}} \left( \frac{x_P}{\varepsilon' U} - 1 \right) \varepsilon' U \;\geq\; \mathrm{OPT} - |V| \, \varepsilon' U.$$
Since APX is the optimum value of a feasible flow in the rounded instance, it follows that $\mathrm{APX} \geq \mathrm{val}(\tilde{x}) \geq \mathrm{OPT} - |V| \, \varepsilon' U$. Recalling the bounds on OPT, we obtain
$$\mathrm{APX} \;\geq\; \mathrm{OPT} - |V| \, \varepsilon' U \;=\; \mathrm{OPT} - |V|^2 \varepsilon' d_{\max} \;=\; \mathrm{OPT} - \varepsilon \, d_{\max} \;\geq\; \mathrm{OPT} - \varepsilon \, \mathrm{OPT} \;=\; (1 - \varepsilon) \, \mathrm{OPT},$$
as claimed. □

As usual, backtracking in the dynamic program provides the actual flow as well, for both algorithms. Note that it is also possible to modify the exact dynamic program in order to handle, e.g., lower capacities on arcs or all-or-nothing flows. However, since the accuracy of the rounded instance is limited, these conditions only carry over in an $O(\varepsilon)$-precise way to the approximation algorithm.

One point to consider is also the practicality of these algorithms. Even the running time of the FPTAS suffers from a large exponent depending on the treewidth. It might just be that the dynamic program as written is only suitable for graphs with treewidth 2. At least this covers approximating MaxConfFlow on trees with multiple sinks.
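The rounding step of Theorem 4.31 is easy to sketch; here `solve_exact` stands in for the pseudo-polynomial algorithm of Theorem 4.30, and all other names and the data layout are our own illustration.

```python
# Sketch of the rounding scheme from the proof of Theorem 4.31.

def approximate_max_conf_flow(V, A, u, d, eps, solve_exact):
    d_max = max(dv for dv in d.values() if dv > 0)
    U = len(V) * d_max                    # d_max <= OPT <= U
    eps_prime = eps / len(V) ** 2
    unit = max(1, int(eps_prime * U))     # rounding unit eps' * U

    def down(x):                          # round towards 0, then rescale
        return x // unit if x >= 0 else -((-x) // unit)

    u_scaled = {a: cap // unit for a, cap in u.items()}
    d_scaled = {v: down(dv) for v, dv in d.items()}
    # The scaled instance has integral values in {0, ..., ~1/eps'}.
    return unit * solve_exact(V, A, u_scaled, d_scaled)
```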
4.6 Outlook

The research into the MaxConfFlow problem still has a few more steps to take, not to mention accounting for costs or variations like bifurcated flows. For example, it would be really interesting to have a constant-factor approximation algorithm for general graphs, or a suitable inapproximability result. Another direction is to strengthen the understanding of the polynomial cases of single-sink MaxConfFlow. Because we believe to have a good idea of what these are, we want to elaborate on this point.

In the section on outerplanar graphs we finished with a polynomial algorithm for $K_{2,3}$-minor-free graphs. We believe that combining these ideas with those from the previous section could lead to polynomial algorithms for $K_{2,m}$-minor-free graphs for every constant $m$.

Conjecture 4.32. For every constant $m \in \mathbb{Z}_{>0}$, the single-sink MaxConfFlow problem on $K_{2,m}$-minor-free graphs can be solved in polynomial time.

We first show why this restriction to $K_{2,m}$-minor-free graphs is a suitable counter to the NP-completeness results from Theorem 4.9. There, we have seen that the weakly NP-complete Partition problem can be modeled using a complete bipartite graph of the form $K_{2,n}$, with $n$ the number of elements in the Partition instance. By bounding the size of this bipartite graph, the reduced problems become at least enumerable. On the other hand, any graph class containing arbitrarily large $K_{2,n}$-minors would immediately make the problem NP-complete again, so this is certainly a necessary condition.

Recall from the same theorem the reduction of the strongly NP-complete 2-DisjointPaths problem to MaxConfFlow. Forbidding $K_{2,m}$ as a minor also prevents this class of instances, because we can apply the algorithm developed for graphs with bounded treewidth. To see this, we use one of the essential results from the work of Robertson and Seymour on treewidth:

Theorem 4.33 (Robertson, Seymour [78]). The class of $H$-minor-free graphs has bounded treewidth if and only if $H$ is planar.

So by forbidding the planar $K_{2,m}$ to avoid the Partition instances, we also limit the treewidth. Thus, we can apply our pseudo-polynomial algorithm to solve the instance exactly. In particular, this algorithm can then solve the instances we used to model the 2-DisjointPaths problem, with their constant-sized supplies and capacities. The difficult routing decisions were removed by the bounded treewidth. Of course, without further modifications this algorithm cannot solve instances of the 2-DisjointPaths problem that are rescaled to exponential supplies, because the dynamic programming tables would then contain exponentially many entries.

The common technique of only storing the Pareto front, that is, the dominating entries, in the dynamic program will not work with our dynamic program as it is. This would only be doable for the additional supplies $d^+$, where higher is always better and excess flow can be dropped. But for the target outflow values $o$, the exact values are important, and we do not see any way around this currently. Instead, we believe it is necessary to find efficient descriptions of how the flow values on the vertices in the bag affect each other, or what sends where. The dominating solutions that one could store are the possible configurations of the confluent flow in the bagend. These would have to be described by a multi-dimensional analogue of a simple function as in our algorithm for outerplanar graphs. Then one could determine the additional supplies depending on the variable target outflows of the other vertices.

A more intuitive description would be the (contracted) forest of flow-carrying arcs. This works for bagends without sources. In this case, out
of the at most $k + 1$ vertices in a bag $B_i$, $k$ vertices have roughly $O(k! \cdot |A|^k)$ options in which the flow can converge to one remaining vertex $v$ in the bag. To see this, the paths from the $k$ vertices acting as sources can be joined in at most $k!$ ways, and after each joint the flow can be subjected to one out of $|A|$ possible arc capacities. Thus, the $k$ sources can give rise to $O(k! \cdot |A|^k)$ maximum flow values on $v$. Because $k$ is a constant, this results in polynomially many configurations of the flow. Note that Hagerup et al. [44] consider a very similar question for the standard MaxFlow problem. They show that all possible configurations result in a convex set of flow values on the terminals in the bag, which can be described independently of the size of the bagend by $2^k$ linear inequalities. The confluence condition does not allow such a convex set, though, as can easily be seen in the case of a single source and multiple sinks.

However, the bagend can still contain arbitrarily many sources as well, which then give rise to many more possible configurations of the flow, as the Partition instances with treewidth 2 already show. This is where we have to use the additional restriction to $K_{2,m}$-minor-free graphs and not just the bounded treewidth, because we need to show that there are only polynomially many dominating configurations.

Say there are $\ell$ sources in the bagend of $B_i$, and consider a vertex $v$ for which we want to determine the possible values of additional supply depending just on the contributions of the sources. Again, the paths from $\ell$ sources can result in at most $O(\ell! \cdot |A|^\ell)$ maximum flow values on $v$. But the bagend could be the entire graph, so $\ell$ is not bounded by $m$. Instead, we need to look at the number of dominating configurations that determine the additional supply $d^+(v)$. (We ignore the contribution of the target outflows of the vertices in the bag for now.)

The idea to obtain such a bound is to consider any flow $x$ where some subset $S$ of the sources in the bagend sends to $v$, but the flow from each source never passes through another source. This then gives rise to a $K_{1,|S|}$ by contracting the flow-carrying arcs $A(x)$ until only a star rooted in $v$ remains. But if each of these values for $v$ appears on the Pareto front of dominating configurations, the flow values on the rest of $B_i$ must be non-dominated entries. Therefore, there must be some interaction between the vertices in $B_i \setminus \{v\}$ and the sources that could contribute to $v$ but do not. In particular, if we trace a suitable flow from these sources through $B_i \setminus \{v\}$ to the supersink, we obtain another tree that we can contract to a star rooted in the supersink.

What we claim but cannot prove yet is the following: if there are sufficiently many dominating configurations of the sources, then for some subset $S$ there is a pair of flows $x_1$ and $x_2$ among these configurations such that the flow from $S$ is sent to $v$ in flow $x_1$, but in the other flow $x_2$ the same sources send the flow through the remainder of the bag to the supersink. We believe we can suitably contract these flows to obtain two internally disjoint stars $K_{1,|S|}$ with the same leaves (not necessarily $S$), but one with root $v$ and the other with the supersink as root. Combined, these would give rise to a $K_{2,|S|}$. Because $m$ bounds $|S|$, we have a bound on the possible dominating configurations affecting $v$. This is not to say that there are only polynomially
Note that Hagerup et al. [44] consider a very similar question for the standard MaxFlow problem. They show that all possible configurations result in a convex set of flow values on the terminals in the bag, which can be described independently of the size of the bagend by 2^k linear inequalities. The confluence condition does not allow such a convex set, though, as can easily be seen in the case of a single source and multiple sinks. Moreover, the bagend can still contain arbitrarily many sources, which give rise to many more possible configurations of the flow, as the Partition instances with treewidth 2 already show. This is where we have to use the additional restriction to K_{2,m}-minor-free graphs and not just the bounded treewidth, because we need to show that there are only polynomially many dominating configurations.

Say there are l sources in the bagend of B_i, and consider a vertex v for which we want to determine the possible values of additional supply depending only on the contributions of the sources. Again, the paths from the l sources can result in at most O(l! · |A|^l) maximum flow values on v. But the bagend could be the entire graph, so l is not bounded by m. Instead, we need to look at the number of dominating configurations that determine the additional supply d^+(v). (We ignore the contribution of the target outflows of the vertices in the bag for now.) The idea to obtain such a bound is to consider any flow x where some subset S of the sources in the bagend sends to v, but the flow from each source never passes through another source. This gives rise to a K_{1,|S|} by contracting the flow-carrying arcs A(x) until only a star rooted in v remains. But if each of these values for v appears in the Pareto front of dominating configurations, the flow values on the rest of B_i must be non-dominated entries. Therefore, there must be some interaction between the vertices in B_i \ {v} and the sources that could contribute to v but do not. In particular, if we trace a suitable flow from these sources through B_i \ {v} to the supersink, we obtain another tree that we can contract to a star rooted in the supersink.

What we claim but cannot prove yet is the following: if there are sufficiently many dominating configurations of the sources, then for some subset S there is a pair of flows x_1 and x_2 among these configurations such that the flow from S is sent to v in flow x_1, while in the other flow x_2 the same sources send their flow through the remainder of the bag to the supersink. We believe we can suitably contract these flows to obtain two internally disjoint stars K_{1,|S|} with the same leaves (not necessarily S), one rooted in v and the other rooted in the supersink. Combined, these would give rise to a K_{2,|S|}. Because m bounds |S|, we have a bound on the possible dominating configurations affecting v. This is not to say that there are only polynomially many configurations; that is certainly not true. But we believe that the number of dominating configurations is bounded by a polynomial, because not all sources may interact with each other. Thus, small subsets can be resolved independently. One still needs to combine the configurations for the target outflows of the vertices in the bag (which act as constantly many sources, but with unknown supplies) with the configurations of the real sources in the bagend (which are many, but with known supplies). But at least we can claim a polynomial bound for each of these problems individually. While far from a proof yet, we think there is potential in this idea to arrive at the conjecture or a close variant. Using minors as a criterion seems like an appropriate choice of tools, because the contractions performed for a minor resemble the confluence conditions.

For a first step, though, it seems reasonable to consider only K_{2,4}-minor-free graphs. Recent results by Dieng and Gavoille [37] show that any 2-connected K_{2,4}-minor-free graph contains two vertices whose removal leaves an outerplanar graph. The restriction to 2-connected graphs is no hindrance, as one can consider the 2-connected components one after another as in the corresponding corollary. Sadly, the existence of these two vertices by itself is not sufficient to reduce the problem to outerplanar instances by brute force. These are only the high-level results, though, which rely on a much more detailed analysis of the structure of K_{2,4}-minor-free graphs. As a side note, Dieng and Gavoille also show that K_{2,4}-minor-free graphs have treewidth at most 4, which is tight. In general, there is some recent interest in the structure of K_{2,r}-minor-free graphs; see, for example, Chudnovsky et al. [14], who show that such graphs contain at most (1/2)(r + 1)(|V| − 1) edges. These results might help to better understand the possible configurations of a confluent flow.

Finally, as another research topic, it is worth noting that little is known about confluent flows over time besides the already mentioned work of Mamada et al. (see Section 4.1). Even what a suitable model would look like is not as clear-cut as in the static case. Should the outgoing arc of a vertex be fixed through time, like an emergency exit sign? Or is a confluent flow over time only required to be a confluent flow in the time-expanded network, which can choose a new direction at each time step? The latter is probably too permissive. But exploring both models opens the way for intermediate models with limits on changing the outgoing arcs over time. These might be applicable for network routing tables, advanced dynamic emergency exit signs that can change their display, or emergency response forces that coordinate traffic at intersections.
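To make the difference between these models concrete, the following small Python sketch uses our own illustrative encoding (none of these names appear in this thesis): a routing plan is a map from (vertex, time step) to the chosen outgoing arc. The first function checks the stricter fixed-sign model; the second counts switches per vertex, which an intermediate model would bound.

    # Sketch (our own encoding, not notation from this thesis): a routing
    # plan for a flow over time, given as succ[(v, t)] = the outgoing arc
    # that vertex v uses at time step t.

    def is_fixed_exit_sign_routing(succ):
        """True iff every vertex uses one outgoing arc at all times, i.e.
        the stricter 'emergency exit sign' model."""
        chosen = {}
        for (v, t), arc in succ.items():
            if chosen.setdefault(v, arc) != arc:
                return False
        return True

    def switches_per_vertex(succ):
        """How often each vertex changes its outgoing arc over time; an
        intermediate model would bound these counts."""
        by_vertex = {}
        for (v, t), arc in sorted(succ.items()):
            by_vertex.setdefault(v, []).append(arc)
        return {v: sum(a != b for a, b in zip(arcs, arcs[1:]))
                for v, arcs in by_vertex.items()}

    # A routing that is confluent in each time step but switches once at u:
    succ = {('u', 0): 'u->w', ('u', 1): 'u->s',
            ('w', 0): 'w->s', ('w', 1): 'w->s'}
    print(is_fixed_exit_sign_routing(succ))  # False
    print(switches_per_vertex(succ))         # {'u': 1, 'w': 0}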
Bibliography

[1] Adaptive Verkehrssteuerung (Adaptive Traffic Control). math.tu-berlin.de/coga/projects/traffic/advest/. Supported by the German Federal Ministry for Education and Research (BMBF) as grant 03SKPAI6 and others.
[2] Ahuja, R. K., Magnanti, T. L., and Orlin, J. B. Network flows: theory, algorithms and applications. Prentice Hall.
[3] Algorithmic Solutions Software GmbH. LEDA product page. Accessed March 2012.
[4] Arnborg, S., Lagergren, J., and Seese, D. Easy problems for tree-decomposable graphs. Journal of Algorithms 12, 2 (1991).
[5] Baumann, N., and Skutella, M. Earliest arrival flows with multiple sources. Mathematics of Operations Research 34, 2 (2009).
[6] Bein, W., Brucker, P., and Tamir, A. Minimum cost flow algorithms for series-parallel networks. Discrete Applied Mathematics 10 (1985).
[7] Bellman, R. On a routing problem. Quarterly of Applied Mathematics 16, 1 (1958).
[8] Bley, A. Routing and Capacity Optimization for IP Networks. PhD thesis, Technische Universität Berlin, Berlin, Germany.
[9] Bodlaender, H. Treewidth: Algorithmic techniques and results. In 22nd International Symposium on Mathematical Foundations of Computer Science (1997), I. Privara and P. Ruzicka, Eds., vol. of Lecture Notes in Computer Science, Springer, pp.
[10] Bose, P. On embedding an outer-planar graph in a point set. Computational Geometry 23, 3 (2002).
[11] Chartrand, G., and Harary, F. Planar permutation graphs. Annales de l'Institut Henri Poincaré (section B) 3, 4 (1967).
[12] Chen, J., Kleinberg, R. D., Lovász, L., Rajaraman, R., Sundaram, R., and Vetta, A. (Almost) tight bounds and existence theorems for single-commodity confluent flows. Journal of the ACM 54, 4 (2007).
[13] Chen, J., Rajaraman, R., and Sundaram, R. Meet and merge: approximation algorithms for confluent flows. Journal of Computer and System Sciences 72, 3 (2006).
[14] Chudnovsky, M., Reed, B., and Seymour, P. D. The edge-density for K_{2,t} minors. Journal of Combinatorial Theory, Series B 101, 1 (2011).
[15] Dantzig, G. B., and Fulkerson, D. R. On the max-flow min-cut theorem of networks. In Linear Inequalities and Related Systems, H. W. Kuhn and A. W. Tucker, Eds. Princeton University Press, 1956, pp.
[16] Delling, D., Sanders, P., Schultes, D., and Wagner, D. Engineering Route Planning Algorithms. In Algorithmics of Large and Complex Networks (2009), J. Lerner, D. Wagner, and K. Zweig, Eds., vol. of Lecture Notes in Computer Science, Springer, pp.
[17] Dijkstra, E. W. A note on two problems in connexion with graphs. Numerische Mathematik 1 (1959).
[18] Dinic, E. A. Algorithm for Solution of a Problem of Maximum Flow in a Network with Power Estimation. Soviet Mathematics Doklady 11 (1970).
[19] Donovan, P., Shepherd, F. B., Vetta, A., and Wilfong, G. Degree-constrained network flows. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing (2007), STOC '07, ACM, pp.
[20] Dressler, D., Flötteröd, G., Lämmel, G., Nagel, K., and Skutella, M. Optimal evacuation solutions for large-scale scenarios. In Operations Research Proceedings: Selected Papers of the Annual International Conference of the German Operations Research Society. Springer, 2010, pp.
[21] Dressler, D., Groß, M., Kappmeier, J.-P., Kelter, T., Plümpe, D., Schmidt, M., Skutella, M., and Temme, S. On the use of network flow techniques for assigning evacuees to exits. In Proceedings of The First International Conference on Evacuation Modeling (ICEM '09) (2010), vol. 3 of Procedia Engineering, pp.
[22] Dressler, D., and Skutella, M. An FPTAS for Flows over Time with Aggregated Arc Capacities. In Approximation and Online Algorithms, 8th International Workshop, WAOA 2010, Liverpool, United Kingdom, September 2010. Revised Papers (2011), K. Jansen and R. Solis-Oba, Eds., vol. of Lecture Notes in Computer Science, Springer, pp.
[23] Dressler, D., and Strehler, M. Polynomial-time algorithms for special cases of the maximum confluent flow problem. Submitted.
[24] Dressler, D., and Strehler, M. Capacitated confluent flows: complexity and algorithms. In Proceedings of the 7th International Conference on Algorithms and Complexity (2010), CIAC '10, Springer, pp.
[25] Edmonds, J., and Karp, R. M. Theoretical improvements in algorithmic efficiency for network flow problems. Journal of the ACM 19 (1972).
[26] Fleischer, L. Universally maximum flow with piecewise-constant capacities. Networks 38 (1998).
[27] Fleischer, L., and Skutella, M. Quickest flows over time. SIAM Journal on Computing 36 (2007).
[28] Ford, Jr., L. R. Network flow theory. Tech. Rep. P-923, RAND Corp., Santa Monica, CA, USA.
[29] Ford, Jr., L. R., and Fulkerson, D. R. Constructing maximal dynamic flows from static flows. Operations Research 6 (1958).
[30] Ford, Jr., L. R., and Fulkerson, D. R. Flows in Networks. Princeton University Press.
[31] Fortune, S., Hopcroft, J., and Wyllie, J. The directed subgraph homeomorphism problem. Theoretical Computer Science 10 (1980).
[32] Fredman, M. L., and Tarjan, R. E. Fibonacci heaps and their uses in improved network optimization algorithms. In 25th Annual Symposium on Foundations of Computer Science (1984), IEEE, pp.
[33] Fügenschuh, A., Homfeld, H., and Schuelldorf, H. Routing cars in rail freight service. No. in Dagstuhl Seminar Proceedings.
[34] Gale, D. Transient flows in networks. Michigan Mathematical Journal 6 (1959).
[35] Garbage Collection FAQ. Common%20questions. Accessed March 2012.
[36] Garey, M. R., and Johnson, D. S. Computers and Intractability. W. H. Freeman and Co., New York, NY, USA.
[37] Gavoille, C., and Dieng, Y. La structure des graphes sans mineur K_{2,4}. In 10ièmes Journées Graphes et Algorithmes (2008). In French.
[38] Gawron, C. An Iterative Algorithm to Determine the Dynamic User Equilibrium in a Traffic Simulation Model. International Journal of Modern Physics C 9, 3 (1998).
[39] Goldberg, A. V. An efficient implementation of a scaling minimum-cost flow algorithm. Journal of Algorithms 22 (1997), 1–29.
[40] Goldberg, A. V., and Tarjan, R. E. Finding minimum-cost circulations by canceling negative cycles. In Proceedings of the 20th Annual ACM Symposium on the Theory of Computing (May 1988), R. Cole, Ed., ACM Press, pp.
[41] Goldberg, A. V., and Tarjan, R. E. Finding minimum-cost circulations by successive approximations. Mathematics of Operations Research 15, 3 (1990).
[42] Goyal, N., Olver, N., and Shepherd, F. B. The VPN conjecture is true. In STOC '08: Proceedings of the 40th Annual ACM Symposium on Theory of Computing (2008), ACM, pp.
[43] Gritzmann, P., Mohar, B., Pach, J., and Pollack, R. Embedding a planar triangulation with vertices at specified points (solution to problem E3341). The American Mathematical Monthly 98, 2 (1991).
[44] Hagerup, T., Katajainen, J., Nishimura, N., and Ragde, P. Characterizing multiterminal flow networks and computing flows in networks of small treewidth. Journal of Computer and System Sciences 57, 3 (1998).
[45] Hajek, B., and Ogier, R. G. Optimal dynamic routing in communication networks with continuous traffic. Networks 14 (1984).
[46] Hamacher, H. W., and Tjandra, S. Earliest Arrival Flow with Time Dependent Capacity for Solving Evacuation Problems. In Pedestrian and Evacuation Dynamics, M. Schreckenberger and S. Sharma, Eds. Springer, 2002, pp.
[47] Hamacher, H. W., and Tjandra, S. Mathematical Modeling of Evacuation Problems: State of the Art. In Pedestrian and Evacuation Dynamics, M. Schreckenberger and S. Sharma, Eds. Springer, 2002, pp.
[48] Hart, P. E., Nilsson, N. J., and Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4, 2 (1968).
[49] Hoppe, B. E. Efficient dynamic network flow algorithms. PhD thesis, Cornell University, Ithaca, NY, USA.
[50] Hoppe, B. E., and Tardos, E. The quickest transshipment problem. Mathematics of Operations Research 25 (2000).
[51] Jarvis, J., and Ratliff, H. Some equivalent objectives for dynamic network flow problems. Management Science 28 (1982).
[52] Jensen, B., and Berthelsen, L. net simplex. Available as binaries provided by H. Mittelmann at software/net_simplex_binaries/. Accessed March 2012.
[53] Johnson, T., Robertson, N., Seymour, P. D., and Thomas, R. Directed tree-width. Journal of Combinatorial Theory, Series B 82, 1 (2001).
[54] Kaibel, V., and Peinhardt, M. A. F. On the bottleneck shortest path problem. Tech. Rep., Konrad-Zuse-Zentrum für Informationstechnik Berlin.
[55] Khachiyan, L. G. A polynomial algorithm for linear programming. Soviet Mathematics Doklady 20 (1979). (Russian original in Doklady Akademiia Nauk SSSR, 244.)
[56] King, V., Rao, S., and Tarjan, R. E. A faster deterministic maximum flow algorithm. In Proceedings of the Third Annual Symposium on Discrete Algorithms (1992), vol. 3 of Symposium on Discrete Algorithms, ACM/SIAM, pp.
[57] Kleinberg, J. M. Single-source unsplittable flow. In 37th Annual Symposium on Foundations of Computer Science (1996), pp.
[58] Klingman, D., Napier, A., and Stutz, J. NETGEN: A program for generating large scale capacitated assignment, transportation, and minimum cost network flow problems. Management Science 20 (1974).
[59] Klinz, B., and Woeginger, G. J. One, two, three, many, or: complexity aspects of dynamic network flows with dedicated arcs. Operations Research Letters 22 (1998).
[60] Klinz, B., and Woeginger, G. J. Minimum-cost dynamic flows: The series-parallel case. Networks 43 (2004).
[61] Köhler, E., and Skutella, M. Flows over time with load-dependent transit times. SIAM Journal on Optimization 15 (2005).
[62] Korte, B., and Vygen, J. Combinatorial Optimization: Theory and Algorithms, 4th ed. Springer.
[63] Lämmel, G., Grether, D., and Nagel, K. The representation and implementation of time-dependent inundation in large-scale microscopic evacuation simulations. Transportation Research Part C: Emerging Technologies, 1 (2010).
[64] Löbel, A. MCF version 1.3: a network simplex implementation. Available for academic use free of charge via WWW at.
[65] Löbel, A. Solving large-scale real-world minimum-cost flow problems by a network simplex method. Tech. Rep. SC-96-7, Konrad-Zuse-Zentrum für Informationstechnik Berlin.
[66] LoDyFA (Library of Dynamic Flow Algorithms). mathematik.uni-kl.de/old/evacuation/description.html. Accessed March 2012.
[67] Mamada, S., Uno, T., Makino, K., and Fujishige, S. A tree partitioning problem arising from an evacuation problem in tree dynamic networks. Journal of the Operations Research Society of Japan 48, 3 (2005).
[68] MATSim: Multi-Agent Transport Simulation Toolkit. matsim.org/. Accessed March 2012.
[69] McCormick, S. T. Submodular function minimization. In Handbook on Discrete Optimization, K. Aardal, G. Nemhauser, and R. Weismantel, Eds. Elsevier, 2006, pp. An updated version is available at.
[70] Melkonian, V. Flows in dynamic networks with aggregate arc capacities. Information Processing Letters 101 (2007).
[71] Minieka, E. Maximal, lexicographic, and dynamic network flows. Operations Research 21 (1973).
[72] Mitchell, S. L. Linear algorithms to recognize outerplanar and maximal outerplanar graphs. Information Processing Letters 9 (1979).
[73] Möhring, R. H., Köhler, E., Gawrilow, E., and Stenzel, B. Conflict-free Real-time AGV Routing. H. Fleuren, D. Hertog, and P. Kort, Eds., vol. of Operations Research Proceedings, Springer, pp.
[74] Moore, E. F. The shortest path through a maze. In Proceedings of the International Symposium on the Theory of Switching (1959), Harvard University Press, pp.
[75] OpenStreetMap.
[76] Papadimitriou, C. H., Serafini, P., and Yannakakis, M. Computing the throughput of a network with dedicated lines. Discrete Applied Mathematics 42 (1993).
[77] Pfender, T. Arboreszenz-Flüsse in Graphen: polyedrische Untersuchungen (Arborescence flows in graphs: polyhedral investigations). Diploma thesis, Technische Universität Berlin, Berlin, Germany. In German. Author's name is now T. Achterberg.
[78] Robertson, N., and Seymour, P. D. Graph minors. V. Excluding a planar graph. Journal of Combinatorial Theory, Series B 41, 1 (1986).
[79] Rost, M. Modellierung und Lösung von Evakuierungsproblemen mittels Netzwerkflüssen (Modeling and solving evacuation problems with network flows). Bachelor thesis, Fachbereich Mathematik und Informatik, Freie Universität Berlin, Berlin, Germany. In German.
[80] Roughgarden, T., and Tardos, E. Introduction to the Inefficiency of Equilibria. In Algorithmic Game Theory, N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani, Eds. Cambridge University Press, 2007.
[81] Ruzika, S., Sperber, H., and Steiner, M. Earliest arrival flows on series-parallel graphs. Networks 57, 2 (2011).
[82] Schrijver, A. Combinatorial Optimization: Polyhedra and Efficiency, vol. 24 of Algorithms and Combinatorics. Springer.
[83] Schulz, I. Evakuierung und Netzwerkflüsse (Evacuation and network flows). Diploma thesis, Lehrstuhl für diskrete Optimierung, Universität Dortmund, Dortmund, Germany. In German.
[84] Seidel, R., and Aragon, C. R. Randomized search trees. Vol. 16 of Algorithmica, Springer, pp.
[85] Simon, P. M., Esser, J., and Nagel, K. Simple queuing model applied to the city of Portland. International Journal of Modern Physics C 10, 5 (1999).
[86] Skutella, M. An introduction to network flows over time. In Research Trends in Combinatorial Optimization, W. Cook, L. Lovász, and J. Vygen, Eds. Springer, 2009, pp.
[87] Sleator, D. D., and Tarjan, R. E. A data structure for dynamic trees. Journal of Computer and System Sciences 26, 3 (June 1983).
[88] Strehler, M. Signalized Flows: Optimizing Traffic Signals and Guideposts and Related Network Flow Problems. PhD thesis, Brandenburgische Technische Universität Cottbus, Cottbus, Germany.
[89] Tjandra, S. Dynamic Network Flow Models for Evacuation Problems. PhD thesis, Technische Universität Kaiserslautern, Kaiserslautern, Germany.
[90] Wilkinson, W. L. An algorithm for universal maximal dynamic flows in a network. Operations Research 19 (1971).
[91] Zadeh, N. A bad network problem for the simplex method and other minimum cost flow algorithms. Mathematical Programming 5 (1973).
[92] ZET Evakuierungs Tool. Current developers: M. Groß and J.-P. Kappmeier. Accessed March 2012.
Index

, 46, 174
A*-search, 64
A^{++}, see time-expanded network
A^{--}, see time-expanded network
Adaptive Traffic Control, see Adaptive Verkehrssteuerung
Adaptive Verkehrssteuerung, 33, 35, 120, 128
additional supply, 179
admissible, 25
agent, 35
Airbus, 85, 102
airplane, 85, 102
table, 92
approximation algorithms, 13
arboricity flow, see confluent flow
arc, 13
available, 45, 49, 126
backward, 19
cost, 19
capacity, 15, 22
congestion, 154
cost, 14
backward, 19
forward, 19
length, 14, 22
parallel, 13
transit time, 22, see also transit time
arrival(t), 42
A_T, see time-expanded network
Audimax, 85, 102
augment, 19, 59, 72
availability, see arc, available
average travel time, 37, 42
backward arc, 19
bag, 158
bagend, 160, 179
bal(v), see flow, balance
bal(v, t), see flow over time, balance
Berlin, 83, 101, 121
map, 84
network, 84
table, 92
B_i, see bag
bidirected, 14
bidirectional search, see mixed search
bifurcated, 155
Boeing, 85, 102
bottleneck, see capacity, bottleneck
breadcrumb, 51, 72
breadth first search, 51, 54, 60, 62, 65, 112
bridge flow, see flow with aggregate arc capacities
BridgeTrans, see flow with aggregate arc capacities, minimum cost
building, 85, 102, 118
table, 92
capacity, 15, 22
aggregate, 131, 134
bottleneck, 19, 46
bridge, 131
constant, 45
residual, 19
time-dependent, 37, 42, 126
time-expanded network, 28
uniform, 154, 166
capacity scaling, 21
commodity, see flow, multi-commodity
complexity theory, 13
confluent flow, 151, 155
algorithm, 164, 175, 185
approximation, 177, 186
complexity, 162
instance,
K_{2,3}-minor-free graph, 176, 188
maximum, 153, 156, 162
minimum cost, 153
outerplanar graph, 166
over time, 155, 190
sink, 157
cost function, 14
conservative, 14
non-negative, 14
cost on sinks, 40, 48, 77, 96
cost(x), see flow, cost
cs2, 18, 39, 94, 96, 107, 116, 118
cycle, 14
d, see supply/demand function
∆, see discretization, step size
d^+(v), see additional supply
∆-constant, 25
δ^in(v), 14
δ^out(v), 14
data structure, 46, 62, 72, 113, 125
dedicated arc, 133
demand, see supply/demand function
destination-based routing, 151
Dijkstra's algorithm, 14, 63, 107, 126, 186
directed graph, 13
discretization, 25, 27
admissible, 25, 135
group size, 76
step size, 25, 38, 76, 78, 127, 131, 144
disjoint paths, 143, 163, 188
dist, 14
dynamic flow, see flow over time
dynamic guidance, 64
EAF, see earliest arrival flow
earliest arrival flow, 33, 37, 42
algorithm, 37, 38
approximation, 37
arrival pattern, 38, 42, 43
existence, 37, 42, 44
earliest arrival pattern, see earliest arrival flow, arrival pattern
edge, 13
Edmonds-Karp algorithm, 20, 60, 112
egress time, 34, 42
emergency exit sign, 121, 153
EMS, see monadic second order logic
equilibrium, 37, 120
estimated arrival time, 54, 61, 66
evacuation, 33, 102, 128, 131
evacuee, 45
instance, 45
model, 33, 45, 118, 120
exact label, 15
excess, see flow over time, excess
exit assignment, 120
exponential, 13
f, see flow over time, averaged
feasible, 15, 22
FIFO, 36
fill-in effect, 55
flow, 15, see also flow over time
algorithm, 18
augment, 19
balance, 18
complexity, 17
conservation, 15
cost, 16, 19
cycle, 17
feasible, 15
function, 15
integrality, 21
maximum, 17, 19
minimum cost, 17, 94
multi-commodity, 21
optimality, 19
path, 17
running time, 18
time-expanded network, 30
value, 15
flow over time, 22, 23, 36, 37, 45
algorithm, 25
approximation, 37, 38
averaged, 26
balance, 22, 26
capacity, 22, 34, 131
complexity, 25
conservation, 23, 28, 36, 40, 52, 125, 148
cost, 24
∆-constant, 25
discrete, 26
excess, 23, 27
feasible, 22
function, 22
maximum, 24, 127
minimum cost, 24, 39
multi-commodity, 25
path, 30, 145
quickest, 24, 38, 44
rate, 22
scale, 27
time-expanded network, 27
transshipment, see transshipment over time
value, 24
flow problem, 17
flow rate, see flow over time, rate
flow with aggregate arc capacities, 131, 134
approximation, 134, 144, 148
complexity, 133, 137, 143
flow conservation, 138
flow rate capacity, 136
impulse, 136, 138
integrality, 138, 143
maximum, 138
minimum cost, 135
multi-commodity, 149
quickest, 138
Ford-Fulkerson algorithm, 20
forget node, 160, 183
forward arc, 19
Forward BFS, 94, 113
forward search, 48, 112
algorithm, 53
propagation, 48, 49, 62
reachability, 48
settings, 94
Forward Seeking, 94, 96, 114
FPTAS, see fully polynomial-time approximation scheme
fully polynomial-time approximation scheme, 13, 148, 186
game theory, 37
G_i, see bagend
greedy guidance, 64
grid, 87
table, 93
Grid-fine, 89, 107, 110
Grid-long, 88, 106, 110, 112, 116, 121
Grid-longer, 89, 107, 110
Grid-short, 88, 105, 110, 116, 121
group size, 76
G_T, see time-expanded network
guided search, 63, 65, 98, 101, 107, 113
G_x, see residual network
head, 13
holdover, see flow over time, conservation
holdover arc, 28, 30, 40, 125
I, see tree decomposition
impulse, see flow with aggregate arc capacities, impulse
in-forest, 155
increasingly fine grid, see Grid-fine
increasingly long grid, see Grid-longer
induced subgraph, 14
inflow, 15
instance, 77
artificial, 85, 87, 90, 105, 110
grid, 87, 90
real-world, 77, 83, 85, 97
interface, 160
interval, 46, 52, 72
introduce node, 160, 181
Java virtual machine, 73, 94, 95
JVM, see Java virtual machine
k-furcated, 155
L, 167, 168, 172
l(a), see capacity, aggregate
label, 15, 59, 125, 128
exact, 15
label(v), 48, 55
leaf node, 160, 181
LEDA, 95, 111
linear programming, 13, 39, 133
lodyfa, 95, 111
long and narrow grid, see Grid-long
loop, 13
MATSim, 35, 120, 132
MaxConfFlow, see confluent flow, maximum
MaxFlow, see flow, maximum
MaxFlowOverTime, see flow over time, maximum
mcf, 18, 94, 96, 107, 116
memory, 47, 73, 75, 94
merge node, 160, 182
meta-heuristics, 125
MinCostFlow, see flow, minimum cost
MinCostFlowOverTime, see flow over time, minimum cost
MinCostTransOverTime, see flow over time, minimum cost
minimum travel time, 33, 34, 37, 38, 40, 95
cost function, 40
MinTravelTime, see minimum travel time
Mixed BFS, 94
mixed search, 59, 60, 64
propagation, 62
reachability, 59
settings, 94
Mixed Seeking, 94, 96
model, 33, 45, 102, 118, 120, 131
deterministic queuing, 35
macroscopic, 34, 77, 83
microscopic, 34, 85
monadic second order logic, 178
Moore-Bellman-Ford algorithm, 14
multi-commodity, 21, 25, 140, 149
multiple paths, 59, 62, 65, 112
nearly confluent, 155, 180
netgen, 85, 105, 110
network, 15
network loading, 35
network simplex, 39, 94
node, 158
non-terminal, 15
NP, 13
o(v), see target outflow
OH14, 85, 102
outerplanar, 157, 166
embedding, 157, 170
instance, 168
outflow, 15
target, see target outflow
P, 13
Padang, 34, 77, 97, 112, 116, 120, 121, 128
flooding, 81
map, 79
network, 80
table, 91
partition problem, 163, 188
path, 14
over time, 30, 61
path decomposition, 17, 186
flow over time, 30, 145
path flow, see flow, path and flow over time, path
plan, 36
planar, 164
polynomial, 13
pre-processing, 63
preflow-push, 39
price of anarchy, 120
propagation, 127, 128
running time, 52
pseudo-code, 67
pseudo-polynomial, 13
Ψ, 167, 168, 172
push-relabel, 39
queue, 51, 59, 61, 64, 73
quick cutoff, 61, 63, 94, 112
QuickestTrans, see flow over time, quickest
R, 167, 168, 172
Random Access Machine, 13
reachability, 48, 72
reverse search, 55
repeated paths, 60, 112
replanning, 36
residual network, 18
Reverse BFS, 94, 114
reverse search, 54, 66, 101, 112, 114
algorithm, 58
arrival time, 54
propagation, 56, 62
reachability, 55
settings, 94
road network, 34, 45, 77, 83, 131
rounding, 76, 78, 132
running time, 51, 52, 64, 72, 76, 164, 175, 184
S^+, see source
S^{++}, see time-expanded network
S^-, see sink
S^{--}, see time-expanded network
scale, 45, 76, 78, 131
scanned-flag, 51, 62, 72
search-tree, 46, 72
seeking guidance, 64, 98, 101, 107, 113
self-contained, 167, 169
series-parallel, 38, 158
shelter, 128, see also sink, limited
short and wide grid, see Grid-short
shortest path, 14, see also successive shortest path algorithm
all-pair, 14
simple function, 167, 174, 188
single-commodity, see flow
sink, 15
confluent flow, 157
limited, 16, 34, 44, 128, 164, 185
multiple, 45, 59
single, 17, 42, 44, 45, 47, 61, 131, 135, 162
sort paths, 60, 112
source, 15
limited, 16
multiple, 45, 60, 164
single, 17, 44, 61, 131, 135, 162
sparse graph, 45, 72
spillback, 36
SSP, see successive shortest path algorithm
starting position, 34
static flow, see flow
storage, see flow over time, conservation
strong flow conservation, see flow over time, conservation
strongly polynomial, 13
submodular, 37
successive shortest path algorithm, 20, 48
basic version, 21
correctness, 20
earliest arrival flow, 38, 42
interval-based, 44, 48, 67
other implementation, 95, 111
properties, 20, 66
pseudo-code, 67
running time, 21, 76, 106, 112, 116, 118
single-source single-sink EAF, 37, 44, 61
supersink, 17, see also sink, single
supersource, 17, see also source, single
supply
additional, see additional supply
supply/demand function, 15, 23, 34, 45
infinite, 16
obey, 16, 23, 24
satisfy, 15, 23, 24
time-expanded network, 28
total supply, 45, 116
T, see tree decomposition
t?, see estimated arrival time
Table(i), 180
tail, 13
target outflow, 179
task, 48, 59, 64, 73
telecommunication, 151
Telefunken-Hochhaus, 85, 102, 112
temporally repeated, 37
TEN, see time-expanded network
terminal, 15, see also source and sink
time horizon, 23, 27, 42, 46, 51, 62, 77, 89, 107
time layer, 25
time step, 25
time-expanded network, 25, 27, 135
capacity, 28
cost, 28
flow conservation, 28
supply/demand function, 28
terminal, 28
track unreachable, 66, 114
transit time, 22, 40, 145
constant, 45
load-dependent, 133
time-dependent, 37, 126
zero, 37
transshipment, 16
minimum cost, 17
transshipment over time, 24
minimum cost, 24
quickest, 24
travel time, 40, 47, 126
tree decomposition, 158, 188
complexity, 158
nice, 160, 178
properties, 161
treewidth, see tree decomposition, 164
triple-optimization result, 37
u_min(P), see capacity, bottleneck
undirected, 14, 158
universal maximal flow, see earliest arrival flow
unsplittable flow, 151
vertex clean-up, see vertex label clean-up
vertex label clean-up, 62, 65, 72, 94, 113, 126, 129
V_T, see time-expanded network
W(a), see arc, available
walk, 14
closed, 14
warm-start, 127
weak flow conservation, see flow over time, conservation
width, see tree decomposition
χ, 31
ZET, 34, 73, 85, 95, 102, 111, 125