Graph-based Simplex Method for Pairwise Energy Minimization with Binary Variables

Graph-based Simplex Method for Pairwise Energy Minimization with Binary Variables Daniel Průša Center for Machine Perception, Faclty of Electrical Engineering, Czech Technical Uniersity Karloo náměstí 13, 121 35 Prage, Czech Repblic prsapa1@fel.ct.cz Abstract We show how the simplex algorithm can be tailored to the linear programming relaxation of pairwise energy minimization with binary ariables. A special strctre formed by basic and nonbasic ariables in each stage of the algorithm is identified and tilized to perform the whole iteratie process combinatorially oer the inpt energy minimization graph rather than algebraically oer the simplex tablea. This leads to a new efficient soler. We demonstrate that for some compter ision instances it performs een better than methods redcing binary energy minimization to finding maximm flow in a network. 1. Introdction Energy minimization is a well known NP-hard combinatorial problem which arises in MAP inference in graphical models [16]. It is of great importance in low-leel compter ision. A lot of effort has been spent by researchers to inent methods finding exact or approximate soltions. Pairwise energy minimization with binary ariables has a prominent role. It is easily expressible as qadratic psedoboolean optimization (QPBO) [1, 8]. Max-flow/min-ct algorithms can be applied to find a partial optimal soltion, with some ariables ndecided. This shrinks the size of the inpt problem. The ndecided ariables can be frther resoled by applying e.g. the probing techniqe [13] or, in general, by branch and bond. A complete optimal soltion is always retrned by QPBO for sbmodlar instances. This is frther tilized in some cases of energy minimization with general ariables. Eery mlti-label energy minimization redces to the binary one [6, 14], presering the sbmodlarity property. Frther, soling sbmodlar binary instances is a crcial part of approximate minimizations [3]. Since binary energy minimization is cast as a maxflow/min-ct problem, the key prereqisite for soling it efficiently has been so far to come p with a max-flow algorithm working well on ision instances. The poplar algorithm by Boyko and Kolmogoro [2] flfills this role. An empirical comparison of max-flow algorithms recently done by Verma and Batra [15] reeals that some other contemporary implementations are more sitable for instances with dense graphs. In this paper we introdce a different principle to sole the binary case. We do not perform any translation to maxflow. Or starting point is the linear programming (LP) relaxation of the problem [17]. For binary ariables, it is known to be half-integral, i.e., all components of each optimal soltion are in {0, 1 2, 1}. Moreoer, the soltion coincides with the reslt of QPBO ale 1 2 indicates ndecided ariables [1]. We show that the simplex method soling the LP relaxation can be trned into a ery efficient algorithm, performed prely oer the inpt energy minimization graph. Special ersions of the simplex method with similar properties hae already been proposed for transportation, assignment and minimm cost-flow problems [4]. They are known as the network simplex algorithms. There is een a cstomization for the maximm flow problem [5], thogh it does not figre among the leading implementations. The proposed algorithm has practical benefit. Or experiments demonstrate there are ision instances, where the algorithm performs better than max-flow based solers. Besides that, it allows to stdy the behaior of the simplex method on large-scale data and gies a hope for a generalization to mlti-label problems. And finally, its impact may extend beyond the scope of energy minimization as it can be applied to min-ct which is expressible as sbmodlar binary energy minimization. Since the formlation and the nderlying strctre is different than in the case of the network simplex algorithm for max-flow, new ideas for soling min-ct/max-flow may emerge. We assme the reader is familiar with the simplex method, as presented e.g. in [4]. Knowledge of notions like basis, basic ariable, basic feasible soltion, pioting rle or minimm ratio test is essential for nderstanding the text.

2. Energy minimization and its LP relaxation The task of pairwise energy minimization is to compte [ θ (k ) + ] θ (k, k ) (1) min k K V V {,} E where V is a finite set of objects, E ( V 2) is a set of object pairs (i.e., (V, E) is an ndirected graph), K is a finite set of labels, and the fnctions θ : K R and θ : K K R are nary and pairwise interactions. We adopt that θ (k, l) = θ (l, k) and refer to the ales of θ and θ as potentials. We shortly write θ (k) as θ ;k and θ (k, l) as θ ;kl. The potentials together form a ector θ R I with I = { (; k) V, k K } { (; kl) {, } E; k, l K }. (2) Throghot the paper, we consider energy minimization with two labels, so K is fixed as {0, 1}. An instance of problem (1) is ths flly defined by a tple (V, E, θ). Its linear programming relaxation reads as argmin θ, x. (3) x Λ Here we optimize oer polytope Λ which consists of ectors x R I that satisfy the constraints x ;kl = x ;k, V, N, k K (4a) l K x ;k = 1, V (4b) k K x 0, (4c) where N = { {, } E } is the set of neighbors of object. We again adopt x ;kl = x ;lk. 3. Applying simplex method preparation We work with the ariant of the simplex method which assmes the inpt linear program to be in the standard form: min{ θ, x Ax = b, x 0} (5) where A R m n, m n, rank(a) = m, b R m, and θ R n. The LP relaxation of an energy minimization gien by (V, E, θ) indces oerall n = 2 V + 4 E ariables and m = V + 3 E linearly independent eqations [9]. For a gien basis B I, Figre 1 shows how we isalize basic and nonbasic LP relaxation ariables. The scheme incldes the inpt graph (V, E). The ariables form an nderlying microstrctre. Nonbasic ariables are colored p p r 11 1 r 1 01 p 10 r 0 r 0 p 00 Figre 1. An LP relaxation diagram. The energy minimization graph consists of objects, and object pair {, }. LP relaxation ariables are denoted as r k = x ;k, r k = x ;k, and p kl = x ;kl. Nonbasic ariables are red, basic ariables are black or ble. The selected basis is B = {(; 0), ( ; 00), ( ; 0), ( ; 01), ( ; 10)}. in red. They always attain zero ale in the indced feasible soltion. Basic ariables attaining nonzero or zero ale are black or ble, respectiely. Each sch scheme is called the LP relaxation diagram. In accordance with the diagram appearance, eery nary or pairwise LP relaxation ariable is simply called a node (ariable) or edge (ariable), respectiely. The set of basic ariables represented in Figre 1 reslts in the following simplex tablea. r 0 r 1 r 0 r 1 p 00 p 01 p 10 p 11 θ 0 θ 1 θ 0 θ 1 θ 00 θ 01 θ 10 θ 11 0 r 0 1 1 0 0 0 0 0 0 1 r 0 0 0 1 1 0 0 0 0 1 p 00 0 1 0 1 1 0 0-1 1 p 01 0 0 0-1 0 1 0 1 0 p 10 0-1 0 0 0 0 1 1 0 Constraints (4a), (4b) gie 6 linear eqations, bt only 5 of them are linearly independent. Each row of the tablea is a linear combination of the constraints expressing a basic ariable. The indced basic soltion is feasible since all elements in the rightmost colmn are nonnegatie. The simplex algorithm cold be lanched if the row with potentials (objectie costs) is adjsted to contain zeros in all basic colmns. Performance of the standard, tablea-based simplex algorithm is inflenced by two factors. The nmber of performed iterations. It tends to be steadily O(n) in practical applications [4]. Some pathological LP instances reslt in an exponential amont of iterations. On the other hand, polynomial pper bonds were proed e.g. for the assignment problem mentioned in the introdction. Memory and rnning time reqired to represent and pdate the simplex tablea. This is what makes the method comptationally infeasible for large-scale instances, qadratic time and space is reqired. A sage of a sparse matrix might be helpfl, howeer, the nmber of nonzero elements keeps growing and the

θ <0 θ A b a ij Figre 2. Haing chosen piot a ij, only nonzero elements in the i-th row and j-th colmn are needed to pdate θ, θ and b properly. pdate is still a time-consming operation. The problems are partially addressed by the so called reised simplex method [4], bt still, the general algorithm is not fast enogh to be able to compete with max-flow based QPBO solers. The key concept leading to an efficient algorithm is explained in Figre 2. Each iteration of the simplex algorithm pdates cost ector θ and the objectie fnction ale θ based on the piotal row, while the right-hand sides ector b is pdated based on the piotal colmn. Moreoer, only nonzero ales hae an impact. We show that the LP relaxation diagram enables a cheap retrieal of all sch nonzero elements. It is ths enogh to maintain only θ, θ and b. 4. Strctre of basis It is a well known fact that each basic ariable is expressible as a linear fnction of nonbasic ariables. Moreoer, it is expressed in a niqe way. Theorem 1. Let B I be a basis of (3) and let N = I B be the set of nonbasic indices. There are niqe coefficients b i, a ij R (i B, j N ) sch that each feasible ector x Λ satisfies x i = b i j N a ij x j, i B. (6) Proof. Consider a linear program with constraint eqations Cx = c where C R m n, c R m and rank(c) = m. It can be written as Bx B + Nx N = c where B is an inertible matrix and x B, x N are ectors of basic and nonbasic ariables, respectiely [4]. This implies that x B = B 1 c B 1 Nx N. Assme that also x B = d Dx N for some D R m (n m), d R m. Setting nonbasic ariables to zeros gies the (only) basic soltion corresponding to basis B, hence d = B 1 c. This frther implies that Dx N = B 1 Nx N for all x N R n m, which holds only if D = B 1 N. Note that coefficients a ij and b i in Theorem 1 correspond to elements in the simplex tablea composed for basis B. In what follows, we examine how they relate to the strctre of the LP relaxation diagram. We say that a basic ariable x i depends on a nonbasic ariable x j if a ij 0. Two sefl corollaries can be obtained from Theorem 1. Corollary 2. Let B I be a basis of (3), N = I B and y, z Λ. If y i = z i for all i N, then y = z. Corollary 3. Let B I be a basis of (3), N = I B and y, z Λ. Let there be k N sch that y k z k and y i = z i for all i N {k}. For j B, if x j is a basic ariable in (6) which depends on nonbasic ariable x k, then y j z j. The first corollary states that a feasible ector is flly determined by its nonbasic components. The second corollary states that if two feasible ectors hae the same nonbasic components except one, then they differ in all basic components which depend on the distinctie nonbasic component. Gien a basis B, two special elements in (V, E) are important for expressing basic node ariables in the form (6). An object V is called a dependency root if each basic node ariable x ;k (k {0, 1}) depends only on the other node x ;1 k and/or on edge ariables in object pairs adjacent to. Two possible configrations forming a dependency root are depicted in Figre 3. An object pair {, } E is called a dependency object pair if there are exactly two nonbasic edges x ;kl which are either parallel or intersecting as shown in Figre 4. Note that Figre 3 and Figre 4 show only snippets of an LP relaxation diagram, not standalone diagrams with a alid basis. Define a dependency graph D(V, E, B) = (V, E ) as the sbgraph of (V, E) where E consists of all dependency object pairs in E. Moreoer, for V, define D (V, E, B) as the connected component of D(V, E, B) containing. An example of a LP relaxation diagram and the indced dependency graph is depicted in Figre 6. The following theorem gies its characterization. Theorem 4. Let (V, E, θ) be an instance of binary energy minimization and let B be a basis of its LP relaxation. Then, D(V, E, B) has the following strctre. Each component has at most one cycle, if a component is a tree, it contains exactly one dependency root, and if a component has a cycle, it does not contain any dependency root. Moreoer, each basic node in an object V depends only on nonbasic ariables located in the following elements:

a b d e c Figre 3. Objects and are dependency roots since a = 1 b, d = e + f and c = 1 e f. a b e f c d Figre 4. Two parallel or intersecting nonbasic edges allow to delegate expressing basic nodes in to expressing basic nodes in (and ice ersa). For example, a = e f +d and b = f e+c. Figre 5. Objects and do not depend on a single nonbasic edge in {, }. Three nonbasic edges in {, } indce dependency roots,. Figre 6. An example of an LP relaxation diagram and its dependency graph. Dependency object pairs are displayed as directed edges oriented towards the root or cycle of the component. Edges in a cycle follow one of the orientations which closes a loop. Dependency roots as well as objects in cycles are highlighted. objects and object pairs forming the path from to the dependency root or the cycle of D (V, E, B), objects and object pairs of the cycle of D (V, E, B), and the object pair containing nonbasic edges establishing the root of D (V, E, B) (refers to Figre 3). f w Proof. We list all locally admissible configrations of basic and nonbasic ariables (p to permtations). After that, we derie how a basic node is expressed in the form (6). As the first step, obsere there is at most one nonbasic node within one object. Since nonbasic nodes attain zero in feasible soltions, two nonbasic nodes do not flfill constraint (4b). In addition, each object pair {, } contains one, two or three nonbasic edges. This is proed by contradiction. Let all for edges x ;ij, i, j {0, 1} be nonbasic (and ths of zero ale). Constraint (4a) forces x ;0 = x ;1 = 0 which again does not flfill (4b). On the other side, let all for x ;ij be basic. Take a feasible soltion y where y ;00 = y ;11 = 1/2 and y ;01 = y ;10 = 0. Create ector z from y as follows: set z ;00 = z ;11 = z ;01 = z ;10 = 1/4 and copy ales of all other components. Since z is again a feasible soltion, pair y, z contradicts Corollary 2. We frther inspect, how the configrations participate in expressing nodes in the form (6). If there is only one nonbasic edge (see Figre 5), we can proe that none of the basic nodes in the adjacent objects depends on it. Consider the same ectors y and z as specified before. W.l.o.g., let x ;10 be the only nonbasic edge within {, } and, w.l.o.g., let x ;0 depend on x ;10. Obsere that y ;10 differs from z ;10, bt y ;0 eqals z ;0. This contradicts Corollary 3. If there are two nonbasic edges with a shared end-node in object (Figre 3), then, by definition, is a dependency root. Three nonbasic edges indce one pair of edges with a shared end-node in each of the objects (Figre 5), two dependency roots are ths formed. The only configrations of nonbasic edges allowing dependance of basic nodes on more distant nonbasic ariables are those ones defining dependency object pairs (Figre 4). In this case, nodes in one object can be locally expressed sing nodes in the opposite object (and ice ersa). For a parallel dependency object pair (Figre 4) we derie x ;0 = x ;00 x ;11 + x ;1, x ;1 = x ;11 x ;00 + x ;0. Analogosly, for an intersecting object pair (Figre 4) we get x ;0 = x ;01 x ;10 + x ;0, x ;1 = x ;10 x ;01 + x ;1. Since x ;00 and x ;11 are nonbasic ariables, to complete the expression of x ;0 and x ;1 as (6) wold reqire to express x ;1 and x ;0. This again cold delegate the process to a neighboring object. Sch a procedre prodces a complete expression of a basic ariable when a dependency root is reached, as demonstrated in Figre 7. Following the path from to w yields x ;0 = x ;00 x ;11 + x w;10 x w;01 + x w;1.

If a seqence of objects ( i ) k forms a path in (V, E), each { i, i+1 }, 1 i k 1 is w.l.o.g. an intersecting dependency object pair and x k ;0 is the only nonbasic node among all nodes in i s, then basic nodes in 1 are expressed by nonbasic ariables as follows. w x 1;0 = x i i+1;01 x i i+1;10 + x k ;0, (7) x 1;1 =1 x i i+1;01 + x i i+1;10 x k ;0. (8) The second possibility how to terminate the described process of expressing a basic node is to close a loop. This is demonstrated in Figre 7 where x ;0 = 1 2 (x ;00 x ;11 + x s;10 x s;01 + x st;11 x st;00 + x t;00 x t;11 + 1). In general, let ( i ) k be a seqence of objects forming a cycle in (V, E). W.l.o.g., let { i, i+1 } be intersecting for 1 i k 1 and let { 1, k } be parallel. Assme than all nodes in i s are basic. Node x 1,0 can be again expressed as (7), bt this is not yet the form (6) since x k ;0 is basic node. Howeer, if we sbstitte x k ;0 = x k 1;00 x k 1;11 + 1 x 1;0, we obtain an eqation which gies x 1;0 = 1 ( 1 + x i 2 i+1;01 x i i+1;10 + x k 1;00 x k 1;11 ). (9) Node x 1;1 can be now expressed as (6) sing the eqality x 1;1 = 1 x 1;0. Note it was essential that { 1, k } is parallel. Changing it to intersecting wold reslt in an eqation where both, x 1;0 and x 1;1 are missing. To express these ariables, it is necessary to hae a dependency cycle where all basic edges and nodes form a connected component in the nderlying microstrctre. Finally, the most general sitation when there is a path starting at object and leading to object which is part of a cycle (and it is the first sch an object in the path) is handled in two steps. First, nodes in are expressed along to the path to as (7), (8), and second, nodes of are expressed within the cycle as (9). To finish the proof it sffices to realize it is not possible to hae more choices of following dependency object pairs to a root or cycle since (6) is niqe for each basic ariable. Figre 7. Expressing a nonbasic node ariable in along a path formed of dependency object pairs terminates only if a dependency root is reached or a loop is closed. Theorem 5. Gien a binary energy minimization (V, E, θ) and a basis B of its LP relaxation, each basic edge of an object pair e = {, } E depends only on a sbset of nonbasic ariables in e, C = D (V, E, B), C = D (V, E, B) and on nonbasic edges establishing dependency roots of C and C. Proof. If there are at least two nonbasic edges in e = {, }, then each basic edges in e is adjacent to a nonbasic edge in some node (inspect all possible configrations in Figres 3, 4, 4, 5). It is ths possible to express eery basic edge in e as a linear fnction of one nonbasic edge and one (basic or nonbasic) node. For example, we derie in Figre 4 s x ;01 = x ;1 x ;11. (10) If there is only one nonbasic edge in e (Figre 5), then the basic edge which is not adjacent to it in any node is expressed sing two nodes. In Figre 5, it holds x ;01 = x ;0 x ;0 + x ;10. (11) Formla (6) for any basic edge in e is obtained if we sbstitte the basic nodes with their expressions (6). Theorem 6. Gien a binary energy minimization (V, E, θ), a feasible basis B of its LP relaxation, i B and x i expressed in the form (6). It holds b i {0, 1 2, 1} and, for all j I B, a ij {0, ± 1 2, ±1, ±2}. Proof. Possible ales of b i are prescribed by the halfintegrality of basic feasible soltions [1]. As obserable in Figre 6, ales 1 2 are assigned to basic ariables in dependency components with a cycle and to basic edge ariables in object pairs adjacent to sch components. By checking expressions (7), (8) and (9) deried in the proof of Theorem 5, we find that a ij is in {0, ± 1 2, ±1} for each basic node. The same conclsion holds for basic edges which can be expressed as (10). Consider a basic edge expressed as (11). Sbstitte for x ;0 and x ;0 their expressions (6). If D (V, E, B) and D (V, E, B) differ, then the t

w + + c s b e + f a + e + t l Figre 8. The ble signs show dependency of a on nonbasic ariables (each dependency coefficient is either +1 or 1), while the black signs show dependency of b. Since f = e + a b, edge f does not depend on node c, neither on the nonbasic edges in {w, }. If the length of the dependency path from w to was an een nmber, the dependency coefficient of f on c wold be 2. Figre 9. The highlighted elements contain all the basic ariables that depend on a nonbasic edge ariable in object pair e. A basic ariable in object pair l = {, } is expressed along paths from and to the root or cycle. sbstitted expressions do not share ariables with nonzero coefficients. Howeer, if D (V, E, B) = D (V, E, B) then a nonbasic ariable x q with a nonzero coefficient can be present in both expressions, with coefficients of the same absolte ale, hence the dependency of x ;01 on x q either anishes or the dependency coefficient is dobled. This is demonstrated in Figre 8. 5. Algorithm Section 4 has established theoretical basis for a special ersion of the simplex algorithm exected for the LP relaxation of a binary energy minimization (V, E, θ). Here we pt together all needed ingredients and gie a description of the algorithm itself. The algorithm performs steps oer the LP relaxation diagram indced by the actal feasible basis B. The diagram enables an efficient retrieal of dependency coefficients which corresponds to elements in the simplex tablea. For a faster traersal of the diagram, the direction of dependency object pairs towards the dependency root or cycle, as gien in Figre 6, is maintained. Each node and edge ariable is assigned by a boolean flag determining its basic/nonbasic stats. Black/ble color of basic ariables is recorded only for nodes. Color of basic edges can be deried from a local context. For a nonbasic ariable, it is possible to locate all basic ariables that depend on it by traersing the dependency component in the direction opposite to the orientation of dependency object pairs in Figre 9. Similarly, for a basic ariable, it is possible to traerse all nonbasic ariables on which it depends. In this case a directed path to the dependency root or cycle is followed in one or two components, as can be seen in Figre 9. The dependency coefficient can also be determined by traersing components. We apply a modified Dantzig s pioting rle. The original ariant chooses a nonbasic ariable with the lowest negatie cost (sch a ariable enters the basis), howeer, this is time-consming. We rather se r dobly linked lists L 1,..., L r and negatie threshold ales τ 1 < τ 2 <... < τ r = 0. An objectie cost θ is stored in L s if τ s 1 < θ τ s (where τ 0 = ). The list can be fond by a binary search in log 2 r steps. From performance reasons, we also do not implement any anti-cycling strategy. This is a normal approach in general LP solers. No cycling has been obsered dring or nmeros experiments. The experiments frther reealed that it is sfficient to choose r = 8. Setting r > 8 has not reslted in any considerable improement for the tested instances. It has also trned ot that the nmber of iterations performed by the algorithm is nearly identical as in the case of the reglar Dantzig rle. The initial feasible basis follows the niform pattern in Figre 1. A description of the algorithm follows. Initialization. Create the initial feasible basis. For eery object V, mark x ;1 as a nonbasic node. For eery object pair {, } E, mark x ;11 as a nonbasic edge. All the other nodes and edges are basic ariables. Apply a reparametrization (see section 2.1 of [8]) to θ which changes costs of all basic ariables to zero. Insert negatie costs into lists L i. Entering ariable selection. Find the first nonempty list L i and take its first element. The nonbasic ariable x e referenced by it is the entering ariable. Assme it is located in an object or object pair e. Leaing ariable selection. Mark each object that passes throgh e when traersing from to the root or cycle, respectiely. Search throgh the dependent ariables and find a leaing ariable (x l in an object or object pair l) flfilling the minimm ratio test. Finish the search prematrely if a zero ratio is fond. Unmark all the marked objects. Iteration. Update costs of all ariables on which x l depends. Remoe positie costs from lists, append costs that became negatie. If a negatie cost in L i is changed and

belongs now to L j, remoe it form L i and append it to L j. Update basic/nonbasic flag of x e and x l. Update the representation of dependency components. Update colors of nodes. Termination. Finish when all lists L i are empty. Time of one iteration is proportional to size of the inoled dependency components. A good performance can be expected for those energy minimization instances which indce dependency components whose aerage size is either constant or slowly growing in size of the energy minimization graph. The experimental ealation showed that sch small components sally emerge for nonsbmodlar instances and sbmodlar instances where pairwise potentials are less dominant. On the other hand, sbmodlar instances with more dominant pairwise potentials (e.g. segmentation instances), which reslt in soltions containing large regions assigned by the same label, tend to form large dependency components dring the late iterations of the algorithm. In the next section, we report details on the faorable type of problems. 6. Experiments We hae implemented the specialized simplex algorithm (SA) in C++. It spports the creation of general graphs and comptes with doble precision floating point nmbers. The max-flow based QPBO implementation by Kolmogoro [18] (BK) is sed for comparison. All the sorces were compiled in Microsoft Visal Stdio 2012 and rn on a notebook with Intel Core i5-4300m 2.6 GHz, 12 GB RAM and 64-bit Windows 7. The ealation is done for ision problems and random data. 6.1. Vision problems Test data are taken mainly from the empirical comparison of max-flow algorithms [15] which targets a wide range of max-flow algorithms based on agmenting-path, pshrelabel or psedoflow principles. Since their performance relatie to BK algorithm is known, it is possible to compare SA with them. The data are aailable at [20] in the form of QPBO graphs. We trned them into an energy minimization format. Decision Tree Field (DTF) is a recently introdced model by Nowozin et al. [11] that combines random forests and conditional random fields. 99 instances are aailable, giing a sfficiently representatie sample. The problem is nonsbmodlar and inoles dense graphs. We measre performance of SA relatie to BK. Ratios of rnning times (SA/BK) sorted in ascending order are plotted in Figre 10. SA is at least two times faster than BK for 80 instances. It is slower only for 4 instances. The aerage time of the best performing max-flow algorithm oer DTFs reported in [15] SA / BK rel. time 2 1.5 1 0.5 0 0 20 40 60 80 100 instance Figre 10. SA/BK relatie time for Decision Tree Fields. is abot 46% of the aerage time of BK (proided that the initialization time is conted). SA achiees 31% of BK. The amont of samples aailable for the next problems is considerably lower. We hae collected two instances of Sper Resoltion (a sparse nonsbmodlar problem) and six instances of Deconoltion with 3x3 or 5x5 blr-kernel (a dense nonsbmodlar problem, graph connectiity is 24 or 80, respectiely). These instances were considered by Rother et al. in [13]. Some of them (#3, #7, #9) are proided at [20]. Shape Fitting [10] is a representatie of a sparse sbmodlar problem. An instance (LB07-bnny-sml) has been taken at [19]. Measred absolte rnning times are in Table 1. Two selected DTFs are inclded there for comparison. To demonstrate a hge gain oer a general LP soler, rnning times of IBM ILOG CPLEX 12.6 for the instances are also inclded. We tested the primal as well as the dal simplex method. Time limit of 10 mintes was applied for each comptation. We can see that BK performs better than SA for instances #3 8. Howeer, the responsieness of SA is within reasonable limits. The sitation is different for deconoltion with a 5x5 blr-kernel. BK otperforms SA for #9. Conersely, SA otperforms BK for #10. Note that there are max-flow algorithms better than BK for this denser ariant of deconoltion (approx. 1.5 times faster [15]). 6.2. Scalable random data Here we stdy how the performance of SA scales in the size of the inpt graph. For this prpose we need a dataset of instances with eqally growing nmber of objects. We generate a sbset of sqare grids from 10 10 to 500 500 with 8-neighborhood system. Unary potentials are generated as independent gassians θ ;0, θ ;1 N (0, 1). Pairwise potentials are set to zero for θ ;00 and θ ;11. Vales of θ ;01 and θ ;10 are generated as N (0, 2). This setp is an instance of the Ising model with mixed potentials. Oerall, we obtain sparse nonsbmodlar inpts containing abot 66% ndecided ariables in the optimal LP relaxation soltion. The ealation is also done for the ariant of SA which strictly follows Danzig s pioting rle (a binary heap is sed to store negatie objectie costs in this case). This ariant

# description objects obj. pairs SA [ms] BK [ms] CPLEX primal [ms] CPLEX dal [ms] 1 dtf-78 7776 217414 149 1337 time limit exceeded 96315 2 dtf-94 8384 234237 276 1523 time limit exceeded 129574 3 sp. res. (4-con) 5246 10345 3.7 1.5 218 421 4 sp. res. (8-con) 5246 20545 10.4 3.2 483 1388 5 shape fit. (6-con) 805800 2391242 628 135 ot of memory ot of memory 6 decon. 3x3 1024 11346 5.5 2.2 1451 296 7 decon. 3x3 1000 10968 4.9 3.1 1185 358 8 decon. 3x3 1000 10968 4.6 2.5 858 405 9 decon. 5x5 1000 33900 71.7 8.3 17924 1872 10 decon. 5x5 1024 35400 7.5 23.9 22730 2496 11 decon. 5x5 880 29820 32.1 46.4 1841 1342 Table 1. Rnning time of SA, BK and CPLEX 12.6 for ision instances. time [ms] 3500 3000 2500 2000 1500 1000 500 SA SA D BK 0 0 0.5 1 1.5 2 2.5 nmber of objects x 10 5 Figre 11. Rnning time of SA, SA-D and BK for random grids. iterations 14 x 105 12 10 8 6 4 2 SA D SA 0 0 0.5 1 1.5 2 2.5 nmber of objects x 10 5 objects traersed per iteration 9 8.5 8 7.5 7 6.5 6 5.5 SA D SA 5 0 0.5 1 1.5 2 2.5 nmber of objects x 10 5 Figre 12. SA and SA-D: the nmber of performed iterations and the aerage nmber of objects traersed per iteration. is denoted as SA-D. The dependency of rnning time on the nmber of objects is plotted in Figre 11. SA performs better than BK and the measred time is linear. In contrast, a non-linearity cased by the binary heap sage is obserable for SA-D. It is also interesting that the fnction for SA is the smoothest one. Figre 12 reeals two dependencies. The nmber of iterations performed by the simplex algorithm is linear. It is almost identical for both, SA and SA-D, the displayed fnctions coalesce. Moreoer, the aerage nmber of objects traersed within one iteration is almost constant. Srprisingly, the constant is greater for SA-D. In conclsion, SA demonstrated a ery good performance and stability in this test. 7. Conclsion We hae presented a graph-based ersion of the simplex method for pairwise energy minimization with binary ariables. The experiments confirmed that the proposed algorithm is efficient for certain types of ision problems. Otperforming other solers on DTF instances represents an immediate practical benefit. We beliee the method has a ery good potential for frther research. We obtained an algorithm competitie with best solers based on finding maximm flow in a network, which has been intensiely stdied by many researches for a long time. Or algorithm has not ndergone sch eoltion. Promising opportnities for improements are ths expectable. For example, a different pioting rle might reslt in a smaller size of dependency components, a more adanced representation of components might redce the nmber of traersed objects when searching for the leaing ariable, etc. The algorithm may also be sitable for parallelization. An interesting qestion is whether the approach can be efficiently generalized to the LP relaxation of mlti-label energy minimization problems. The sitation is srely more difficlt. More labels indce richer relations among basic and nonbasic ariables. Except that, the LP relaxation becomes as hard as general LP [12] and, as a conseqence, components of optimal soltions are, roghly speaking, arbitrary fractions. On the other hand, known methods for soling the LP relaxation of mlti-label problems like message passing [7] are considerably slower than max-flow algorithms. Een a slower retrieal of simplex tablea elements (e.g. by soling sbsystems of linear eqations within dependency components) cold still reslt in a method with better performance. The presented applicability of the simplex algorithm in the binary setting shold encorage sch considerations.

Acknowledgment The athor was spported by the Czech Science Fondation project P202/12/2071. References [1] E. Boros and P. L. Hammer. Psedo-Boolean optimization. Discrete Applied Mathematics, 123(1-3):155 225, 2002. 1, 5 [2] Y. Boyko and V. Kolmogoro. An experimental comparison of min-ct/max-flow algorithms for energy minimization in ision. IEEE Trans. on Pattern Analysis and Machine Intelligence, 26(9):1124 1137, Sept. 2004. 1 [3] Y. Boyko, O. Veksler, and R. Zabih. Fast approximate energy minimization ia graph cts. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(11):1222 1239, No. 2001. 1 [4] G. Dantzig and M. Thapa. Linear Programming 1: Introdction. Springer, 1997. 1, 2, 3 [5] D. Goldfarb and J. Hao. A primal simplex algorithm that soles the maximm flow problem in at most nm piots and O(n 2 m) time. Mathematical Programming, 47(1-3):353 365, 1990. 1 [6] H. Ishikawa. Exact optimization for Marko random fields with conex priors. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(10):1333 1336, Oct. 2003. 1 [7] V. Kolmogoro. Conergent tree-reweighted message passing for energy minimization. IEEE Trans. on Pattern Analysis and Machine Intelligence, 28(10):1568 1583, 2006. 8 [8] V. Kolmogoro and C. Rother. Minimizing nonsbmodlar fnctions with graph cts-a reiew. IEEE Trans. on Pattern Analysis and Machine Intelligence, 29(7):1274 1279, 2007. 1, 6 [9] A. Koster, C. P. M. an Hoesel, and A. W. J. Kolen. The partial constraint satisfaction problem: Facets and lifting theorems. Operations Research Letters, 23(3 5):89 97, 1998. 2 [10] V. Lempitsky and Y. Boyko. Global optimization for shape fitting. In IEEE Conf. on Compter Vision and Pattern Recognition (CVPR), pages 1 8, Jne 2007. 7 [11] S. Nowozin, C. Rother, S. Bagon, T. Sharp, B. Yao, and P. Kohli. Decision tree fields. In D. N. Metaxas, L. Qan, A. Sanfeli, and L. J. V. Gool, editors, IEEE International Conference on Compter Vision, pages 1668 1675. IEEE, 2011. 7 [12] D. Průša and T. Werner. Uniersality of the local marginal polytope. IEEE Trans. on Pattern Analysis and Machine Intelligence, 37(4):898 904, April 2015. 8 [13] C. Rother, V. Kolmogoro, V. S. Lempitsky, and M. Szmmer. Optimizing binary MRFs ia extended roof dality. In IEEE Conf. on Compter Vision and Pattern Recognition (CVPR), 2007. 1, 7 [14] D. Schlesinger and B. Flach. Transforming an arbitrary Min- Sm problem into a binary one. Technical Report TUD- FI06-01, Dresden Uniersity of Technology, Germany, April 2006. 1 [15] T. Verma and D. Batra. Maxflow reisited: An empirical comparison of maxflow algorithms for dense ision problems. In Proceedings of the British Machine Vision Conference, pages 61.1 61.12. BMVA Press, 2012. 1, 7 [16] M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and ariational inference. Fondations and Trends in Machine Learning, 1(1-2):1 305, 2008. 1 [17] T. Werner. A linear programming approach to max-sm problem: A reiew. IEEE Trans. on Pattern Analysis and Machine Intelligence, 29(7):1165 1179, Jly 2007. 1 [18] V. Kolmogoro web pages. http://pb.ist.ac.at/ nk/. 7 [19] Uniersity of Western Ontario Max-flow problem instances in ision. http://ision.csd.wo.ca/data/maxflow/. 7 [20] T. Verma, D. Batra MaxFlow Reisited. http://ttic.chicago.ed/ dbatra/ research/mfcomp/. 7