IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 147 What Energy Functons Can Be Mnmzed va Graph Cuts? Vladmr Kolmogorov, Member, IEEE, and Ramn Zabh, Member, IEEE Abstract In the last few years, several new algorthms based on graph cuts have been developed to solve energy mnmzaton problems n computer vson. Each of these technques constructs a graph such that the mnmum cut on the graph also mnmzes the energy. Yet, because these graph constructons are complex and hghly specfc to a partcular energy functon, graph cuts have seen lmted applcaton to date. In ths paper, we gve a characterzaton of the energy functons that can be mnmzed by graph cuts. Our results are restrcted to functons of bnary varables. However, our work generalzes many prevous constructons and s easly applcable to vson problems that nvolve large numbers of labels, such as stereo, moton, mage restoraton, and scene reconstructon. We gve a precse characterzaton of what energy functons can be mnmzed usng graph cuts, among the energy functons that can be wrtten as a sum of terms contanng three or fewer bnary varables. We also provde a general-purpose constructon to mnmze such an energy functon. Fnally, we gve a necessary condton for any energy functon of bnary varables to be mnmzed by graph cuts. Researchers who are consderng the use of graph cuts to optmze a partcular energy functon can use our results to determne f ths s possble and then follow our constructon to create the approprate graph. A software mplementaton s freely avalable. Index Terms Energy mnmzaton, optmzaton, graph algorthms, mnmum cut, maxmum flow, Markov Random Felds. æ 1 INTRODUCTION AND OVERVIEW MANY of the problems that arse n early vson can be naturally expressed n terms of energy mnmzaton. The computatonal task of mnmzng the energy s usually qute dffcult as t generally requres mnmzng a nonconvex functon n a space wth thousands of dmensons. If the functons have a very restrcted form, they can be solved effcently usng dynamc programmng [2]. However, researchers typcally have needed to rely on general purpose optmzaton technques such as smulated annealng [3], [16], whch requres exponental tme n theory and s extremely slow n practce. In the last few years, however, a new approach has been developed based on graph cuts. The basc technque s to construct a specalzed graph for the energy functon to be mnmzed such that the mnmum cut on the graph also mnmzes the energy (ether globally or locally). The mnmum cut, n turn, can be computed very effcently by max flow algorthms. These methods have been successfully used for a wde varety of vson problems, ncludng mage restoraton [9], [10], [18], [21], stereo and moton [4], [9], [10], [20], [24], [27], [32], [35], [36], mage synthess [29], mage segmentaton [8], voxel occupancy [39], multcamera scene reconstructon [28], and medcal magng [5], [6], [25], [26]. The output of these algorthms s generally a soluton wth some nterestng theoretcal qualty guarantee. In some cases [9], [18], [20], [21], [35], t s the global mnmum, n other cases, a local mnmum n a strong sense [10] that s wthn a known factor of the global. The authors are wth the Computer Scence Department, Cornell Unversty, Ithaca, NY 14853. E-mal: {vnk, rdz}@cs.cornell.edu. Manuscrpt receved 2 May 2002; revsed 17 Mar. 2003; accepted 16 June 2003. Recommended for acceptance by M.A.T. Fgueredo, E.R. Hancock, M. Pelllo, and J. Zeruba. For nformaton on obtanng reprnts of ths artcle, please send e-mal to: tpam@computer.org, and reference IEEECS Log Number 118731. mnmum. The expermental results produced by these algorthms are also qute good. For example, two recent evaluatons of stereo algorthms usng real magery wth dense ground truth [37], [41] found that the best overall performance was due to algorthms based on graph cuts. Mnmzng an energy functon va graph cuts, however, remans a techncally dffcult problem. Each paper constructs ts own graph specfcally for ts ndvdual energy functon and, n some of these cases (especally [10], [27], [28]), the constructon s farly complex. One consequence s that researchers sometmes use heurstc methods for optmzaton, even n stuatons where the exact global mnmum can be computed va graph cuts. The goal of ths paper s to precsely characterze the class of energy functons that can be mnmzed va graph cuts and to gve a general-purpose graph constructon that mnmzes any energy functon n ths class. Our results play a key role n [28], provde a sgnfcant generalzaton of the energy mnmzaton methods used n [4], [5], [6], [10], [18], [26], [32], [39], and show how to mnmze an nterestng new class of energy functons. In ths paper, we only consder energy functons nvolvng bnary-valued varables. At frst glance, ths restrcton seems severe snce most work wth graph cuts consders energy functons wth varables that have many possble values. For example, the algorthms presented n [10] use graph cuts to address the standard pxel labelng problem that arses n early vson, ncludng stereo, moton, and mage restoraton. In the pxel-labelng problem, the varables represent ndvdual pxels and the possble values for an ndvdual varable represent, e.g., ts possble dsplacements or ntenstes. However, as we dscuss n Secton 2, many of the graph cut methods that handle multple possble values work by repeatedly mnmzng an energy functon nvolvng only bnary varables. As we wll see, our results generalze many graph cut algorthms [4], [5], [6], [10], [18], [26], [39] and can be easly appled to 0162-8828/04/$20.00 ß 2004 IEEE Publshed by the IEEE Computer Socety
148 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 problems lke pxel labelng, even though the pxels have many possble labels. 1.1 Summary of Our Results In ths paper, we focus on two classes of energy functons. Let fx 1 ;...;x n g, x 2f0; 1g be a set of bnary-valued varables. We defne the class F 2 to be functons that can be wrtten as a sum of functons of up to two bnary varables at a tme, Eðx 1 ;...;x n Þ¼ E ðx Þþ E ;j ðx ;x j Þ: ð1þ <j We defne the class F 3 to be functons that can be wrtten as a sum of functons of up to three bnary varables at a tme, Eðx 1 ;...;x n Þ¼ E ðx Þþ E ;j ðx ;x j Þ <j þ E ;j;k ðx ;x j ;x k Þ: Obvously, the class F 2 s a strct subset of the class F 3. However, to smplfy the dscusson, we wll begn by focusng on F 2. Note that there s no restrcton on the sgns of the energy functons or of the ndvdual terms. The man results n ths paper are:. A precse characterzaton of the class of functons n F 3 that can be mnmzed usng graph cuts.. A general-purpose graph constructon for mnmzng any functon n ths class.. A necessary condton for any functon of bnary varables to be mnmzed va graph cuts. Therestofthepapersorganzedasfollows:InSecton2,we descrbe how graph cuts can be used to mnmze energy functons and dscuss the mportance of energy functons wth bnary varables n the context of two example vson problems, namely, stereo and multcamera scene reconstructon. In Secton 3, we formalze the relatonshp between graphs and energy functons and provde a precse defnton of the problem that we wsh to solve. Secton 4 contans our man theorem for the class of functons F 2 and shows how ths result can be used for both stereo and multcamera scene reconstructon. Secton 5 gves our results for the broader class F 3 and shows how ths result can be used for multcamera scene reconstructon. A necessary condton for an arbtrary functon of bnary varables to be mnmzed va graph cuts s presented n Secton 6. We dscuss some related work n the theory of submodular functons n Secton 7 and gve a summary of our graph constructons n Secton 8. Software can be downloaded from http://www.cs.cornell.edu/~rdz/ graphcuts.html that automatcally constructs the approprate graph and mnmzes the energy. An NP-hardness result and a proof of one of our theorems are deferred to the Appendx. 2 MINIMIZING ENERGY FUNCTIONS VIA GRAPH CUTS Many vson problems, especally n early vson, can naturally be formulated n terms of energy mnmzaton. Energy mnmzaton has a long hstory n computer vson (see [34] for several examples). The classcal use of energy mnmzaton s to solve the pxel-labelng problem, whch s a generalzaton of such problems as stereo, moton, and mage restoraton. The nput s a set of pxels P and a set of ð2þ labels L. The goal s to fnd a labelng f (.e., a mappng from P to L) whch mnmzes some energy functon. A standard form of the energy functon s EðfÞ ¼ D p ðf p Þþ V p;q ðf p ;f q Þ; ð3þ p2p p;q2n where NPP s a neghborhood system on pxels. 1 D p ðf p Þ s a functon derved from the observed data that measures the cost of assgnng the label f p to the pxel p. V p;q ðf p ;f q Þ measures the cost of assgnng the labels f p ;f q to the adjacent pxels p; q and s used to mpose spatal smoothness. At the borders of objects, adjacent pxels should often have very dfferent labels and t s mportant that E not overpenalze such labelngs. Ths requres that V be a nonconvex functon of jf p f q j. Such an energy functon s called dscontnuty-preservng. Energy functons of the form (3) can be justfed on Bayesan grounds usng the wellknown Markov Random Felds (MRF) formulaton [16], [31]. Energy functons lke E are extremely dffcult to mnmze, however, as they are nonconvex functons n a space wth many thousands of dmensons. They have tradtonally been mnmzed wth general-purpose optmzaton technques (such as smulated annealng) that can mnmze an arbtrary energy functon. As a consequence of ther generalty, however, such technques requre exponental tme and are extremely slow n practce. In the last few years, however, effcent algorthms have been developed for these problems based on graph cuts. 2 2.1 Graph Cuts Suppose G¼ðV; EÞ s a drected graph wth nonnegatve edge weghts that has two specal vertces (termnals), namely, the source s and the snk t. Ans-t-cut (whch we wll refer to nformally as a cut) C ¼ S;T s a partton of the vertces n V nto two dsjont sets S and T such that s 2 S and t 2 T. The cost of the cut s the sum of costs of all edges that go from S to T: cðs;tþ ¼ cðu; vþ: u2s;v2t;ðu;vþ2e The mnmum s-t-cut problem s to fnd a cut C wth the smallest cost. Due to the theorem of Ford and Fulkerson [14], ths s equvalent to computng the maxmum flow from the source to snk. There are many algorthms that solve ths problem n polynomal tme wth small constants [1], [7], [17]. It s convenent to note a cut C ¼ S;T by a labelng f mappng from the set of the vertces V fs; tg to f0; 1g, where fðvþ ¼0 means that v 2 S and fðvþ ¼1 means that v 2 T. We wll use ths notaton later. Note that a cut s a bnary partton of a graph vewed as a labelng; t s a bnary-valued labelng. Whle there are generalzatons of the mnmum s-t-cut problem that nvolve more than two termnals (such as the multway cut problem [12]), such generalzatons are NP-hard. 1. Wthout loss of generalty, we can assume that N contans only ordered pars p; q for whch p<qsnce we can combne two terms V p;q and V q;p nto one term. 2. It s nterestng to note that some smlar technques have been developed by algorthms researchers workng on the task assgnment problem [30], [33], [40].
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 149 -expanson move from the current labelng. If ths expanson move has lower energy than the current labelng, then t becomes the current labelng. The algorthm termnates wth a labelng that s a local mnmum of the energy wth respect to expanson moves; more precsely, there s no -expanson move, for any label, wth lower energy. It s also possble to prove that such a local mnmum les wthn a multplcatve factor of the global mnmum [10] (the factor s at least 2 and depends only on V ). Fg. 1. An example of an expanson move. The labelng on the rght s a whte-expanson move from the labelng on the left. 2.2 Energy Mnmzaton Usng Graph Cuts In order to mnmze E usng graph cuts, a specalzed graph s created such that the mnmum cut on the graph also mnmzes E (ether globally or locally). The form of the graph depends on the exact form of V and on the number of labels. In certan restrcted stuatons, t s possble to effcently compute the global mnmum. If there are only two labels, [18] showed how to compute the global mnmum of E. Ths s also possble for an arbtrary number of labels as long as the labels are consecutve ntegers and V s the L 1 dstance. The constructon s due to [20] and s a modfed verson of [36]. Ths constructon has been further generalzed to handle an arbtrary convex V [22]. However, a convex V s not dscontnuty preservng and optmzng an energy functon wth such a V leads to oversmoothng at the borders of objects. The ablty to fnd the global mnmum effcently, whle theoretcally of great value, does not overcome ths drawback. Ths s clearly vsble n the relatvely poor performance of such algorthms on the stereo benchmarks descrbed n [37]. Moreover, effcent global energy mnmzaton algorthms for even the smplest class of dscontnuty-preservng energy functons almost certanly do not exst. Consder V ðf p ;f q Þ¼T½f p 6¼ f q Š, where the ndcator functon T½Š s 1 f ts argument s true and otherwse 0. Ths smoothness term, sometmes called the Potts model, s clearly dscontnutypreservng. Yet, t s known to be NP-hard to mnmze [10]. However, graph cut algorthms have been developed that compute a local mnmum n a strong sense [10]. As we shall see, these methods mnmze an energy functon wth nonbnary varables by repeatedly mnmzng an energy functon wth bnary varables. 3 2.3 The Expanson Move Algorthm One of the most effectve algorthms for mnmzng dscontnuty-preservng energy functons s the expanson move algorthm, whch was ntroduced n [10]. Ths algorthm can be used whenever V s a metrc on the space of labels; ths ncludes several mportant dscontnutypreservng energy functons (see [10] for more detals). Consder a labelng f and a partcular label. Another labelng f 0 s defned to be an -expanson move from f f f 0 p 6¼ mples f0 p ¼ f p. Ths means that the set of pxels assgned the label has ncreased when gong from f to f 0. An example of an expanson move s shown n Fg. 1. The expanson move algorthm cycles through the labels n some order (fxed or random) and fnds the lowest energy 3. Ths s somewhat analogous to the ZM algorthm [13], although that method produces a global mnmum of a functon wth nonbnary varables by repeatedly mnmzng a functon wth bnary varables. 2.3.1 Applcatons of the Expanson Move Algorthm Energy functons lke that of (3) arse commonly n early vson, so t s straghtforward to apply the expanson move algorthm to many vson problems. For example, the expanson move algorthm can be used for the standard 2-camera stereo problem, as well as for multcamera scene reconstructon. Stereo. For the stereo problem, the labels are dspartes and the data term D p ðf p Þ s some functon of the ntensty dfference between the pxel p n the prmary mage and the pxelp þ f p n thesecondarymage.onthestereoproblem,the expanson move algorthm and other closely related energy mnmzaton algorthms gve very strong emprcal performance. They have been used as a crtcal component of several algorthms [4], [27], [28], [32], ncludng the top two algorthms (and fve of the top sx) accordng to a recent publshed survey [37] that used real data wth dense ground truth. Multcamera scene reconstructon. It s also possble to use the expanson move algorthm for multcamera scene reconstructon, as reported n [28]. In ther problem formulaton, the energy functon conssts entrely of terms wth two arguments (n other words, D p does not exst). The set of pxels P ncludes all pxels from all cameras and the labels correspond to planes. Thus, a par hp; f p specfes a 3D pont and a labelng f gves the nearest scene element vsble n each pxel n each camera. The expanson move algorthm can be straghtforwardly appled to ths problem as long as the energy s of the rght form. Note that an -expanson move ncreases the set of pxels labeled n all cameras at once, rather than n a sngle preferred mage. The actual energy functon conssts of three terms. All the terms are n the same form as V n that they depend on pars of labels f p ;f q. One term enforces the vsblty constrant. Whle ths constrant s very mportant, t has no analogue n the smple pxel labelng problem formulaton and we wll not dscuss t here. The other terms are analogous to the standard stereo data term and smoothness term. In the multcamera scene reconstructon problem, the data term enforces photoconsstency. Let I be a set of pars of nearby 3D ponts. These ponts wll come from dfferent cameras, but they wll share the same depth (.e., the ponts are of the form hp; f p ; hq; f q where f p ¼ f q and p; q are pxels from dfferent cameras). Then, the data term s D p;q;l ðf p ;f q Þ; ð4þ where fhðp;lþ;hðq;lþ2ig D p;q;l ðf p ;f q Þ¼Dðp; qþt½f p ¼ f q ¼ lš: We have rewrtten each term n the sum as a functon of f p ;f q (snce p; q do not depend on f). The functon D wll be
150 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 some functon of the ntensty dfference between p and q, typcally, a monotoncally ncreasng functon. The smoothness term, on the other hand, nvolves a sngle camera at a tme. It s defned to be V p;q ðf p ;f q Þ; ð5þ fp;qg2n where N s a neghborhood system on pxels n a sngle camera. 2.4 Energy Functons of Bnary Varables The key subproblem n the expanson move algorthm s to compute the lowest energy labelng wthn a sngle -expanson of f. Ths subproblem s solved effcently wth a sngle graph cut [10], usng a somewhat ntrcate graph constructon. It s mportant to note that ths subproblem s an energy mnmzaton problem over bnary varables, even though the overall problem that the expanson move algorthm s solvng nvolves nonbnary varables. Ths s because each pxel wll ether keep ts old value under f or acqure the new label. Formally, any labelng f 0 wthn a sngle -expanson of the ntal labelng f can be encoded by a bnary vector x ¼fx p jp 2Pg, where f 0 ðpþ ¼fðpÞ f x p ¼ 0 and f 0 ðpþ ¼ f x p ¼ 1. Let us denote the labelng defned by a bnary vector x as f x. Snce the energy functon E s defned over all labelngs, t s obvously also defned over labelngs specfed by bnary vectors. The key step n the expanson move algorthm s therefore to fnd the mnmum of Eðf x Þ over all bnary vectors x. The mportance of energy functons of bnary varables does not arse smply from the expanson move algorthm. Instead, t results from the fact that a graph cut effectvely assgns one of two possble values to each vertex of the graph. So, n a certan sense, any energy mnmzaton constructon based on graph cuts reles on ntermedate bnary varables. It s, of course, possble to use graph cuts n such a manner that varables n the orgnal problem do not correspond n a one-to-one manner wth vertces n the graph. Ths knd of complex transformaton les at the heart of the graph cut algorthms that compute a global mnmum [20], [22]. However, the NP-hardness result gven n [10] shows that (unless P = NP) such result cannot be acheved for even the smplest dscontnuty-preservng energy functon. 3 REPRESENTING ENERGY FUNCTIONS WITH GRAPHS Let us consder a graph G¼ðV; EÞ wth termnals s and t, thus V¼fv 1 ;...;v n ;s;tg. Each cut on G has some cost; therefore, G represents the energy functon mappng from all cuts on G to the set of nonnegatve real numbers. Any cut can be descrbed by n bnary varables x 1 ;...;x n correspondng to vertces n G (excludng the source and the snk): x ¼ 0 when v 2 S and x ¼ 1 when v 2 T. Therefore, the energy E that G represents can be vewed as a functonof n bnaryvarables: Eðx 1 ;...;x n Þ s equal to the cost of the cut defned by the confguraton x 1 ;...;x n (x 2f0; 1g). Note that the confguraton that mnmzes E wll not change f we add a constant to E. We can effcently mnmze E by computng the mnmum s-t-cut on G. Ths naturally leads to the queston: What s the class of energy functons E for whch we can construct a graph that represents E? We can also generalze our constructon. Above, we used each vertex (except the source and the snk) for encodng one bnary varable. Instead, we can specfy a subset V 0 ¼fv 1 ;...;v k gv fs; tg and ntroduce varables only for the vertces n ths set. Then, there may be several cuts correspondng to a confguraton x 1 ;...;x k. If we defne the energy Eðx 1 ;...;x k Þ as the mnmum among the costs of all such cuts, then the mnmum s-t-cut on G wll agan yeld the confguraton whch mnmzes E. We wll summarze the graph constructons that we allow n the followng defnton. Defnton 3.1. A functon E of n bnary varables s called graphrepresentable f there exsts a graph G¼ðV; EÞ wth termnals s and t and a subset of vertces V 0 ¼fv 1 ;...;v n gv fs; tg such that, for any confguraton x 1 ;...;x n, the value of the energy Eðx 1 ;...;x n Þ s equal to a constant plus the cost of the mnmum s-t-cut among all cuts C ¼ S; T n whch v 2 S, f x ¼ 0, and v 2 T, fx ¼ 1 (1 n). We say that E s exactly represented by G, V 0 f ths constant s zero. The followng lemma s an obvous consequence of ths defnton. Lemma 3.2. Suppose the energy functon E s graph-representable by a graph G and a subset V 0. Then, t s possble to fnd the exact mnmum of E n polynomal tme by computng the mnmum s-t-cut on G. In ths paper, we wll gve a complete characterzaton of the classes F 2 and F 3 n terms of graph representablty and show how to construct graphs for mnmzng graphrepresentable energes wthn these classes. Moreover, we wll gve a necessary condton for all other classes that must be met for a functon to be graph-representable. Obvously, t would be suffcent to consder only the class F 3 snce F 2 F 3. However, the condton for F 2 s smpler, so we wll consder t separately. Note that the energy functons we consder can be negatve, as can the ndvdual terms n the energy functons. However, the graphs that we construct must have nonnegatve edge weghts. All prevous work that used graph cuts for energy mnmzaton dealt wth nonnegatve energy functons and terms. 4 THE CLASS F 2 Our man result for the class F 2 s the followng theorem. Theorem 4.1 (F 2 Theorem). Let E be a functon of n bnary varables from the class F 2,.e., Eðx 1 ;...;x n Þ¼ E ðx Þþ E ;j ðx ;x j Þ: ð6þ <j Then, E s graph-representable f and only f each term E ;j satsfes the nequalty E ;j ð0; 0ÞþE ;j ð1; 1Þ E ;j ð0; 1ÞþE ;j ð1; 0Þ: We wll call functons satsfyng the condton of (7) regular. 4 Ths theorem thus states that regularty s a necessary and suffcent condton for graph-representablty, at least n F 2. 4. The representaton of a functon E as a sum n (6) s not unque. However, we wll show n Secton 5 that the defnton of regularty does not depend on ths representaton. ð7þ
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 151 TABLE 1 TABLE 2 Fg. 2. Graphs that represent some functons n F 2. (a) Graph for E, where E ð0þ >E ð1þ. (b) Graph for E, where E ð0þ 6> E ð0þ. (c) Thrd edge for E ;j. (d) Complete graph for E ;j f C>Aand C>D. In ths secton, we wll gve a constructve proof that regularty s a suffcent condton by descrbng how to buld a graph that represents an arbtrary regular functon n F 2. The other half of the theorem s proven n much more generalty n Secton 6, where we show that a regularty s a necessary condton for any functon to be graph-representable. Regularty s thus an extremely mportant property as t allows energy functons to be mnmzed usng graph cuts. Moreover, wthout the regularty constrant, the problem becomes ntractable. Our next theorem shows that mnmzng an arbtrary nonregular functon s NP-hard, even f we restrct our attenton to F 2. Theorem 4.2 (NP-Hardness). Let E 2 be a nonregular functon of two varables. Then, mnmzng functons of the form Eðx 1 ;...;x n Þ¼ E ðx Þþ E 2 ðx ;x j Þ; ð;jþ2n where E are arbtrary functons of one varable and Nfð; jþj1 <j ng, s NP-hard. The proof of ths theorem s deferred to the Appendx. Note that ths theorem mples the ntractablty of mnmzng the entre class of nonregular functons n F 2. It thus allows for the exstence of nonregular functons n F 2 that can be mnmzed effcently. 5 4.1 Graph Constructon for F 2 In ths secton, we wll prove the constructve part of the F 2 theorem. Let E be a regular functon of the form gven n 5. Gortler et al. have nvestgated a partcularly smple example, namely, functons that can be made regular by nterchangng the senses of 0 and 1 for some set of varables. the theorem. We wll construct a graph for each term separately and then merge all graphs together. Ths s justfed by the addtvty theorem, whch s gven n the Appendx. The graph wll contan n þ 2 vertces: V¼fs; t; v 1 ;...;v n g. Each nontermnal vertex v wll encode the bnary varable x. For each term of E, we wll add one or more edges that we descrbe next. Frst, consder a term E dependng on one varable x.if E ð0þ <E ð1þ, then we add the edge ðs; v Þ wth the weght E ð1þ E ð0þ (Fg. 2a). Otherwse, we add the edge ðv ;tþ wth the weght E ð0þ E ð1þ (Fg. 2b). It easy to see that, n both cases, the constructed graph represents the functon E (but wth dfferent constants: E ð0þ n the former case and E ð1þ n the latter). Now, consder a term E ;j dependng on two varables x, x j. It s convenent to represent t n Table 1. We can rewrte t as t s expressed n Table 2. The frst term s a constant functon, so we don t need to add any edges for t. The second and the thrd terms depend only on one varable x and x j, respectvely. Therefore, we can use the constructon gven above. To represent the last term, we add an edge ðv ;v j Þ wth the weght B þ C A D (Fg. 2c). Note that ths weght s nonnegatve due to the regularty of E. A complete graph for E ;j wll contan three edges. One possble case (C A>0 and D C<0) s llustrated n Fg. 2d. Note that we dd not ntroduce any addtonal vertces for representng bnary nteractons of bnary varables. Ths s n contrast to the constructon n [10] whch added auxlary vertces for representng energes that we just consdered. Our constructon yelds a smaller graph and, thus, the mnmum cut can potentally be computed faster. 4.2 Example: Applyng Our Results for Stereo and Multcamera Scene Reconstructon Our results gve a generalzaton of a number of prevous algorthms [4], [5], [6], [10], [18], [26], [32], [39] n the followng sense. Each of these methods used a graph cut algorthm that was specfcally constructed to mnmze a certan form of energy functon. The class of energy functons that we show how to mnmze s much larger and ncludes the technques used n all of these methods as specal cases. To llustrate the power of our results, we return to the use of the expanson move algorthm for stereo and multcamera scene reconstructon descrbed n Secton 2.3.1. Recall that the key subproblem s to fnd the mnmum energy labelng wthn a sngle -expanson of f, whch s equvalent to mnmzng the energy over bnary varables x.
152 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 TABLE 3 TABLE 5 4.2.1 Stereo The paper that proposed the expanson move algorthm [10] showed how to solve the key subproblem wth graph cuts as long as D p s nonnegatve and V s a metrc on the space of labels. Ths nvolved an elaborate graph constructon and several assocated theorems. Usng our results, we can recreate the proof that such an energy functon can be solved n just a few lnes. All that we need s to prove that E s regular f D p s nonnegatve and V s a metrc. Obvously, the form of D p doesn t matter; we smply have to show that f V s a metrc the ndvdual terms satsfy the nequalty n (7). Table 3 consders two neghborng pxels p; q, wth assocated bnary varables ; j. If V s a metrc, by defnton, for any labels ; ; V ð; Þ ¼ 0 and V ð; ÞþV ð; Þ V ð; Þ. Ths gves the nequalty of (7) and shows that E can be mnmzed usng graph cuts. 4.2.2 Multcamera Scene Reconstructon We can also show that the expanson move algorthm can be used for a dfferent energy functon, namely, the one proposed for multcamera scene reconstructon n [28]. We need to show that the smoothness (5) and the data (4) energy functons are regular. The argument for the smoothness term s dentcal to the one we just gave for stereo because the smoothness term s regular. The data term for multcamera scene reconstructon s perhaps the best demonstraton of the utlty of our results. The functon n the data term s clearly not a metrc on the space of labels (for example, t s entrely possble that D p;q;l ð; Þ 6¼ 0 for some label ). Pror to our results, the only way to use the expanson move algorthm for ths energy functon would be to create an elaborate graph constructon that s specfc to ths partcular energy functon. Worse, t was not a pror obvous that such a constructon exsted. Usng our results, t s smple to show that the data term can be made regular and the expanson move algorthm can be appled. Let us wrte out the sum n (4) and consder a sngle termdðp; qþt½f p ¼ f q ¼ lš,whchmposesapenaltyfpandq both have the label l. Recall that we are seekng the lowest energy labelng wthn a sngle -expanson of f. We wll have two bnary varables whch are 0 f the assocated pxel keeps ts orgnal label n f and 1 f t acqures the new label. There are two cases that must be analyzed separately: l 6¼ and l ¼. Consder the case l 6¼.Iff p 6¼ l or f q 6¼ l, then the term s zero for any values of the bnary varables. If f p ¼ f q ¼ l, then the penalty s only mposed when the bnary varables are both 0. We can graphcally show ths n Table 4. If l ¼, then, f ether f p or f q are, the penalty mposed s ndependent of one of the bnary varables and t s easy to see that ths s regular. If l ¼ and nether f p nor f q are, then the penalty s only mposed when the bnary varables are both 1. Graphcally, ths can be shown n Table 5. Overall, the energy functon s regular f Dðp; qþ s nonpostve. Recall that Dðp; qþ s a photoconsstency term. It s typcally a monotoncally ncreasng functon of the ntensty dfference between p and q. By smply subtractng a large constant, we can ensure that D s nonpostve and, thus, apply our constructon. 5 THE CLASS F 3 Before statng our theorem for the class F 3, we wll extend the defnton of regularty to arbtrary functons of bnary varables. Ths requres a noton of projectons. Suppose we have a functon E of n bnary varables. If we fx m of these varables, then we get a new functon E 0 of n m bnary varables; we wll call ths functon a projecton of E. The notaton for projectons s as follows: Defnton 5.1. Let Eðx 1 ;...;x n Þ be a functon of n bnary varables and let I, J be a dsjont partton of the set of ndces f1;...;ng: I ¼fð1Þ;...;ðmÞg, J ¼fjð1Þ;...;jðn mþg. Let ð1þ ;...; ðmþ be bnary constants. A projecton E 0 ¼ E½x ð1þ ¼ ð1þ ;...;x ðmþ ¼ ðmþ Š s a functon of n m varables defned by E 0 ðx jð1þ ;...;x jðn mþ Þ¼Eðx 1 ;...;x n Þ; where x ¼ for 2 I. We say that we fx the varables x ð1þ,..., x ðmþ. Now, we can gve a generalzed defnton of regularty. Defnton 5.2.. All functons of one varable are regular.. A functon E of two varables s called regular f Eð0; 0ÞþEð1; 1Þ Eð0; 1ÞþEð1; 0Þ.. A functon E of more than two varables s called regular f all projectons of E of two varables are regular. For the class F 2, ths defnton s equvalent to the prevous one. A proof of ths fact s gven n Secton 5.3. Now, we are ready to formulate our man theorem for F 3. Theorem 5.3 (F 3 Theorem). Let E be a functon of n bnary varables from F 3,.e., TABLE 4 Eðx 1 ;...;x n Þ¼ E ðx Þþ E ;j ðx ;x j Þ <j þ E ;j;k ðx ;x j ;x k Þ: ð8þ Then, E s graph-representable f and only f E s regular.
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 153 TABLE 6 5.1 Graph Constructon for F 3 Consder a regular functon E of the form gven n (8). The regularty of E does not necessarly mply that each term n (8) s regular. However, we wll gve an algorthm for convertng any regular functon n F 3 to the form (8) where each term s regular (see the regroupng theorem n the next secton). Note that ths was not necessary for the class F 2 snce, for any representaton of a regular functon n the form (6), each term n the sum s regular. Here, we wll gve graph constructon for a regular functon E of the form as n the F 3 theorem assumng that each term of E s regular. The graph wll contan n þ 2 vertces: V¼fs; t; v 1 ;...;v n g as well as some addtonal vertces descrbed below. Each nontermnal vertex v wll encode the bnary varable x. For each term of E, we wll add one or more edges and, possbly, an addtonal (unque) vertex that we descrbe next. Agan, our constructon s justfed by the addtvty theorem gven n the Appendx. Terms dependng on one and two varables were consdered n the prevous secton, so we wll concentrate on a term E ;j;k dependng on three varables x, x j, x k.it s convenent to represent t n Table 6. Let us denote P ¼ðA þ D þ F þ GÞ ðb þ C þ E þ HÞ. We consder two cases: Case 1. P>¼ 0. E ;j;k can be rewrtten as shown n Table 7, where P 1 ¼ F B, P 2 ¼ G E, P 3 ¼ D C, P 23 ¼ B þ C A D, P 31 ¼ B þ E A F, P 12 ¼ C þ E A G. E ;j;k s regular, therefore, by consderng projectons E ;j;k ½x 1 ¼ 0Š, E ;j;k ½x 2 ¼ 0Š, E ;j;k ½x 3 ¼ 0Š, we conclude that P 23, P 31, P 12 are nonnegatve. Thefrsttermsaconstant functon, sowedon tneed toadd any edges for t. The next three terms depend only on one varable (x, x j, x k, respectvely), so we can use the constructon gven n the prevous secton. The next three terms depend on two varables; the nonnegatvty of P 23, P 31, P 12 mples that they are all regular. Therefore, we can agan use the constructon of the prevous secton. (Three edges wll be added: ðv j ;v k Þ, ðv k ;v Þ, ðv ;v j Þ wth weghts P 23, P 31, P 12, respectvely). To represent the last term, we wll add an auxlary vertex u jk andfouredgesðv ;u jk Þ,ðv j ;u jk Þ,ðv k ;u jk Þ,ðu jk ;tþwth the Fg. 3. Graphs for functons n F 3. (a) Edge weghts are all P. (b) Edge weghts are all P. TABLE 8 weght P. Ths s shown n Fg. 3a. Let us prove these four edges exactly represent the functon shown n Table 8. If x ¼ x j ¼ x k ¼ 1, then the cost of the mnmum cut s 0 (the mnmum cut s S ¼fsg, T ¼fv ;v j ;v k ;u jk ;tg. Suppose at least one of the varables x, x j, x k s 0; wthout loss of generalty, we can assume that x ¼ 0,.e., v 2S.Ifu jk 2 S, then the edge ðu jk ;tþ s cut; f u jk 2 T, the edge ðv ;u jk Þ s cut yeldng the cost P. Hence, the cost of the mnmum cut s at least P. However, f u jk 2 S, the cost of the cut s exactly P,no matter wherethe vertces v, v j, v k are. Therefore, the cost of the mnmum cut wll be 0 f x ¼ x j ¼ x k ¼ 1 and P otherwse, as desred. Case 2. P<0. Ths case s smlar to the Case 1. E ;j;k can be rewrtten as shown n Table 9, where P 1 ¼ C G, P 2 ¼ B D, P 3 ¼ E F, P 32 ¼ F þ G E H, P 13 ¼ D þ G C H, P 21 ¼ D þ F B H. By consderng projectons E ;j;k ½x 1 ¼ 1Š, E ;j;k ½x 2 ¼ 1Š, E ;j;k ½x 3 ¼ 1Š, we conclude that P 32, P 13, P 21 are nonnegatve. As n the prevous case, all terms except the last are regular and depend on at most two varables, so we can use the constructon of the prevous secton. We wll have, for example, edges ðv k ;v j Þ, ðv ;v k Þ, ðv j ;v Þ wth weghts P 32, P 13, P 21, respectvely. To represent the last term, we wll add an auxlary vertex u jk and four edges ðu jk ;v Þ, ðu jk ;v j Þ, ðu jk ;v k Þ, ðs; u jk Þ wth the weght P. Ths s shown n Fg. 3b. TABLE 7
154 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 TABLE 9 Note that, n both cases, we added an auxlary vertex u jk. It s easy to see that ths s necessary snce graphs wthout auxlary vertces can only represent functons n F 2. Each edge represents a functon of at most two varables, so the whole graph represents a functon that s a sum of terms of at most two varables. 5.2 Constructve Proof of the Regroupng Theorem Now, we wll show how to convert a regular functon n F 3 to the form gven n (8) where each term s regular. Theorem 5.4 (regroupng). Any regular functon E from the class F 3 can be rewrtten as a sum of terms such that each term s regular and depends on at most three varables. We wll assume E gven as Eðx 1 ;...;x n Þ¼ E ;j;k ðx ;x j ;x k Þ; where, j, k are ndces from the set f1;...;ng (we omtted terms nvolvng functons of one and two varables snce they can be vewed as functons of three varables). We begn by gvng a defnton. Defnton 5.5. The functonal wll be a mappng from the set of all functons (of bnary varables) to the set of real numbers whch s defned as follows. For a functon Eðx 1 ;...;x n Þ ðeþ ¼ n ¼1 ð 1Þx Eðx1 ;...;x n Þ: x 12f0;1g;...;x n2f0;1g For example, for a functon E of two varables ðeþ ¼ Eð0; 0Þ Eð0; 1Þ Eð1; 0ÞþEð1; 1Þ. Note that a functon E of two varables s regular f and only f ðeþ 0. It s trval to check the followng lemma. Lemma 5.6. The functonal has the followng propertes.. s lnear,.e., for a scalar c and two functons E 0, E 00 of n varables ðe 0 þ E 00 Þ¼ðE 0 ÞþðE 00 Þ and ðc E 0 Þ¼c ðe 0 Þ.. If E s a functon of n varables that does not depend on at least one of the varables, then ðeþ ¼0. We wll prove the theorem usng the followng lemma and a trval nducton argument. Defnton 5.7. Let E ;j;k be a functon of three varables. The functonal NðE ;j;k Þ s defned as the number of projectons of two varables of E ;j;k wth postve values of the functonal. Note that NðE ;j;k Þ¼0 exactly when E ;j;k s regular. Lemma 5.8. Suppose the functon E of n varables can be wrtten as Eðx 1 ;...;x n Þ¼ E ;j;k ðx ;x j ;x k Þ; where some of the terms are not regular. Then, t can be wrtten as Eðx 1 ;...;x n Þ¼ ~E ;j;k ðx ;x j ;x k Þ; where N ~E ;j;k < N E ;j;k : Proof. For smplcty of notaton, let us assume that the term E 1;2;3 s not regular and ðe 1;2;3 ½x 3 ¼ 0ŠÞ > 0 or ðe 1;2;3 ½x 3 ¼ 1ŠÞ > 0 (we can ensure ths by renamng ndces). Let C k ¼ max ðe 1;2;k ½x k ¼ k ŠÞ k 2f0;1g C 3 ¼ n k¼4 C k : k 2f4;...;ng Now, we wll modfy the terms E 1;2;3,..., E 1;2;n as follows: ~E 1;2;k E 1;2;k R½C k Š k 2f3;...;ng; where R½CŠ s the functon of two varables x 1 and x 2 defned by Table 10 (other terms are unchanged: ~E ;j;k E ;j;k, ð; jþ 6¼ ð1; 2Þ). We have Eðx 1 ;...;x n Þ¼ ~E ;j;k ðx ;x j ;x k Þ snce P n k¼3 C k ¼ 0 and P n k¼3 R½C kš0. If we consder R½CŠ as a functon of n varables and take a projecton of two varables where the two varables that are not fxed are x and x j (<j), then the functonal wll be C, fð; jþ ¼ð1; 2Þ, and 0 otherwse (snce, n the latter case, a projecton actually depends on, at most, one varable). Hence, the only projectons of two varables that could have changed ther value of the functonal are ~E 1;2;k ½x 3 ¼ 3 ;...;x n ¼ n Š, k 2f3;...;ng, f we treat ~E 1;2;k as functons of n varables, or ~E 1;2;k ½x k ¼ k Š, f we treat ~E 1;2;k as functons of three varables. TABLE 10
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 155 TABLE 11 TABLE 12 Frst, let us consder terms wth k 2f4;...; ng. We have ðe 1;2;k ½x k ¼ k ŠÞ C k, thus ð ~E 1;2;k ½x k ¼ k ŠÞ ¼ðE 1;2;k ½x k ¼ k ŠÞ ðr½c k нx k ¼ k ŠÞ C k C k ¼ 0: Therefore, we dd not ntroduce any nonregular projectons for these terms. Now, let us consder the term ð ~E 1;2;3 ½x 3 ¼ 3 ŠÞ. We can wrte ð ~E 1;2;3 ½x 3 ¼ 3 ŠÞ ¼ ðe 1;2;3 ½x 3 ¼ 3 ŠÞ ðr½c 3 нx 3 ¼ 3 ŠÞ ¼ ðe 1;2;3 ½x 3 ¼ 3 ŠÞ ð n C k Þ¼ n ðe 1;2;k ½x k ¼ k ŠÞ; k¼4 where k ¼ arg max 2f0;1g ðe 1;2;k ½x k ¼ ŠÞ, k 2f4;...;ng. The last expresson s just ðe½x 3 ¼ 3 ;...;x n ¼ n ŠÞ and s nonpostve snce E s regular by assumpton. Hence, values ð ~E 1;2;3 ½x 3 ¼ 0ŠÞ and ð ~E 1;2;3 ½x 3 ¼ 1ŠÞ are both nonpostve and, therefore, the number of nonregular projectons has decreased. tu The complexty of a sngle step s OðnÞ snce t nvolves modfyng at most n terms. Therefore, the complexty of the whole algorthm s OðmnÞ, where m s the number of terms n (8) snce, n the begnnng, P NðE;j;k Þ s at most 6m and each step decreases t by at least one. 5.3 The Two Defntons of Regularty Are Equvalent We now prove that the defnton of regularty 5.2 s equvalent to the prevous defnton of regularty for the class F 2. Note that the latter one s formulated not only as a property of the functon E but also as a property of ts representaton as a sum n (6). We wll show equvalence for an arbtrary representaton, whch wll mply that the defnton of regularty for the class F 2 depends only on E but not on the representaton. Let us consder a graph-representable functon E from the class F 2 : Eðx 1 ;...;x n Þ¼ E ðx Þþ E ;j ðx ;x j Þ: <j Consder a projecton where the two varables that are not fxed are x and x j. By Lemma 5.6, the value of the functonal of ths projecton s equal to ðe ;j Þ (all other terms yeld zero). Therefore, ths projecton s regular f and only f E ;j ð0; 0ÞþE ;j ð1; 1Þ E ;j ð0; 1ÞþE ;j ð1; 0Þ, whch means that the two defntons are equvalent. k¼3 5.4 Example: Multcamera Scene Reconstructon We now show an example of functon from the class F 3 that s potentally useful n vson. Consder the multcamera scene reconstructon problem descrbed earler. Smlar to the data term dependng on pars of pxels, we can defne a term dependng on trples of pxels. 6 Let I 3 be a set of trples of nearby 3D ponts. These ponts wll come from three dfferent cameras, but they wll share the same depth (.e., the ponts are of the form hp; f q ; hq; f q ; hr; f r, where f p ¼ f q ¼ f r and p; q; r are pxels from dfferent cameras). The new data term wll be D p;q;r;l ðf p ;f q ;f r Þ; ð9þ where fhp;l;hq;l;hr;l2i 3 g D p;q;r;l ðf p ;f q ;f r Þ¼Dðp; q; rþt½f p ¼ f q ¼ f r ¼ lš: Ths term can be motvated as follows: If three pxels have smlar ntenstes, then t s more lkely that they see the same scene element than f only two pxels have smlar ntenstes. We now show the expanson move algorthm can be used for mnmzng ths new term,.e., that the resultng energy functon s regular. The proof proceeds smlarly to the proof n Secton 4.2.2. Consder a sngle term Dðp; q; rþt½f p ¼ f q ¼ f r ¼ lš, whch mposes a penalty f p, q, and r have the label l. We wll have three bnary varables whch are 0 f the assocated pxel keeps ts orgnal label n f and 1 f t acqures the new label. Agan, there are two cases that must be analyzed separately: l 6¼ and l ¼. Consder the case l 6¼. If at least one of labels f p ;f q ;f r s not l, then the term s zero for any values of bnary varables. If f p ¼ f q ¼ f r ¼ l, then the penalty s only mposed when the bnary varables are all 0. Graphcally, we show ths n Table 11. Now, consder the case l ¼. If at least one of the labels f p ;f q ;f r s then the penalty mposed s ndependent of one of the bnary varables; therefore, ths case reduces to a data term dependng on two varables, whch has been analyzed earler. Suppose that all labels f p ;f q ;f r are dfferent from. Then, the penalty s only mposed when the bnary varables are all 1, whch s wrtten graphcally n Table 12. We get the result smlar to the prevous one: The energy functon s regular f Dðp; q; rþ s nonpostve. 6 MORE GENERAL CLASSES OF ENERGY FUNCTIONS Fnally, we gve a necessary condton for graph representablty for arbtrary functons of bnary varables. 6. We thank Rck Szelsk for dscussons that have led to the development of ths term.
156 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 TABLE 13 Theorem 6.1 (regularty). Let E be a functon of bnary varables. If E s not regular, then E s not graph-representable. Ths theorem wll mply the correspondng drectons of the F 2 and F 3 theorems (Theorems 4.1 and 5.3). Defnton 6.2. Let G¼ðV; EÞ be a graph, v 1 ;...;v k be a subset of vertces V, and 1 ;...; k be bnary constants whose values are from f0; 1g. We wll defne the graph G½x 1 ¼ 1 ;...;x k ¼ k Š as follows: Its vertces wll be the same as n G and ts edges wll be all edges of G plus addtonal edges correspondng to vertces v 1 ;...;v k : for a vertex v, we add the edge ðs; v Þ f ¼ 0 or ðv ;tþ f ¼ 1, wth an nfnte capacty. It should be obvous that these edges enforce constrants fðv 1 Þ¼ 1,..., fðv k Þ¼ k n the mnmum cut on G½x 1 ¼ 1 ;...;x k ¼ k Š,.e., f ¼ 0, then v 2 S and f ¼ 1, then v 2 T. (If, for example, ¼ 0 and v 2 T, then the edge ðs; v Þ must be cut yeldng an nfnte cost, so t would not the mnmum cut.) Now, we can gve a defnton of graph representablty whch s equvalent to Defnton 3.1. Ths new defnton wll be more convenent for the proof. Defnton 6.3. We say that the functon E of n bnary varables s exactly represented by the graph G¼ðV; EÞ and the set V 0 V f for any confguraton 1 ;...; n the cost of the mnmum cut on G½x 1 ¼ 1 ;...;x k ¼ k Š s Eð 1 ;...; n Þ. Lemma 6.4. Any projecton of a graph-representable functon s graph-representable. Proof. Let E be a graph-representable functon of n varables and the graph G¼ðV; EÞ and the set V 0 represents E. Suppose that we fx varables x ð1þ ;...;x ðmþ. It s straghtforward to check that the graph G½x ð1þ ¼ ð1þ ;...;x ðmþ ¼ ðmþ Š and the set V 0 0 ¼V 0 fv ð1þ ;...;v ðmþ g represents the functon E 0 ¼ E½x ð1þ ¼ ð1þ ;...;x ðmþ ¼ ðmþ Š. tu Ths lemma mples that t suffces to prove Theorem 3.1 only for energy functons of two varables. Let Eðx 1 ;x 2 Þ be a graph-representable functon of two varables. Let us prove that E s regular,.e., that A 0, where A¼ ð EÞ¼ Eð0; 0Þþ Eð1; 1Þ Eð0; 1Þ Eð1; 0Þ. Suppose ths s not true: A>0. In Table 13, all the functons on the rght sde are graphrepresentable, therefore, by the addtvty theorem (see the Appendx), the functon E s graph-representable as well, where, n Table 14, consder a graph G and a set V 0 ¼fv 1 ;v 2 g representng E. Ths means that there s a constant K such that G, V 0 exactly represent E 0 ðx 1 ;x 2 Þ¼Eðx 1 ;x 2 ÞþK (Table 15). The cost of the mnmum s-t-cut on G s K (snce ths cost s justthemnmumentrynthetablefore 0 );hence,k 0.Thus, thevalueofthemaxmumflowfromstotngsk.letg 0 bethe resdual graph obtaned from G after pushng the flow K. Let E 0 ðx 1 ;x 2 Þ be the functon exactly represented by G 0 ; V 0. By the defnton of graph representablty, E 0 ð 1 ; 2 Þ s equal to the value of the mnmum cut (or maxmum flow) on the graph G½x 1 ¼ 1 ;x 2 ¼ 2 Š. The followng sequence of operatons shows one possble way to push the maxmum flow through ths graph.. Frst, we take the orgnal graph G and push the flow K; then we get the resdual graph G 0. (Ths s equvalent to pushng flow through G½x 1 ¼ 1 ;x 2 ¼ 2 Š, where we do not use edges correspondng to the constrants x 1 ¼ 1 and x 2 ¼ 2 ).. Then, we add edges correspondng to these constrants; we get the graph G 0 ½x 1 ¼ 1 ;x 2 ¼ 2 Š.. Fnally, we push the maxmum flow possble through the graph G 0 ½x 1 ¼ 1 ;x 2 ¼ 2 Š; the amount of ths flow s E 0 ð 1 ; 2 Þ accordng to the defnton of graph representablty. The total amount of flow pushed durng all steps s K þ E 0 ð 1 ; 2 Þ; thus, or E 0 ð 1 ; 2 Þ¼K þ E 0 ð 1 ; 2 Þ Eð 1 ; 2 Þ¼E 0 ð 1 ; 2 Þ: We have proven that E s exactly represented by G 0, V 0. The value of the mnmum cut/maxmum flow on G 0 s 0 (t s the mnmum entry n the table for E); thus, there s no augmentng path from s to t n G 0. However, f we add the edges ðv 1 ;tþ and ðv 2 ;tþ, then there wll be an augmentng path from s to t n G 0 ½x 1 ¼ 1 ;x 2 ¼ 2 Š snce Eð1; 1Þ ¼A>0. Hence, ths augmentng path wll contan at least one of these edges and, therefore, ether v 1 or v 2 wll be n the path. Let P be the part of ths path gong from the source untl v 1 or v 2 s frst encountered. Wthout loss of generalty, we can assume that t wll be v 1. Thus, P s an augmentng path from s to v 1 whch does not contan edges that we added, namely, ðv 1 ;tþ and ðv 2 ;tþ. Fnally, let us consder the graph G 0 ½x 1 ¼ 1;x 2 ¼ 0Š whch s obtaned from G 0 by addng edges ðv 1 ;tþ and ðs; v 2 Þ wth nfnte capactes. There s an augmentng path fp;ðv 1 ;tþg from the source to the snk n ths graph; hence, the mnmum cut/maxmum flow on t s greater than zero or Eð1; 0Þ > 0. We get a contradcton. TABLE 14 TABLE 15
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 157 7 RELATED WORK There s an nterestng relatonshp between regular functons and submodular functons. 7 Let S be a fnte set and g :2 S!Rbe a real-valued functon defned on the set of all subsets of S. g s called submodular f for any ; Y S gðþþgðy Þgð [ Y Þþgð \ Y Þ: See [15], for example, for a dscusson of submodular functons. An equvalent defnton of submodular functons s that g s called submodular f, for any S and ; j 2S, gð [fjgþ gðþ gð [f; jgþ gð [fgþ: Obvously, functons of subsets of S¼f1;...;ng can be vewed as functons of n bnary varables ðx 1 ;...;x n Þ; the ndcator varable x s 1 f s ncluded n and 0 otherwse. Then, t s easy to see that the second defnton of submodularty reduces to the defnton of regularty. Thus, submodular functons and regular functons are essentally the same. We use dfferent names to emphasze a techncal dstncton between them: Submodular functons are functons of subsets of S whle regular functons are functons of bnary varables. From our experence, the second pont of vew s much more convenent for vson applcatons. Submodular functons have receved sgnfcant attenton n the combnatoral optmzaton lterature. A remarkable fact about submodular functons s that they can be mnmzed n polynomal tme [19], [23], [38]. A functon g s assumed to be gven as a value oracle,.e., a black box whch, for any nput subset returns the value gðþ. Unfortunately, algorthms for mnmzng arbtrary submodular functons are extremely slow. For example, the algorthm of [23] runs n Oðn 5 mnflog nm; n 2 log ngþ tme, where M s an upper bound on jgðþj among 2 2 S. Thus, our work can be vewed as dentfyng an mportant subclass of submodular functons for whch much faster algorthm can be used. In addton, a relaton between submodular functons and certan graph-representable functons called cut functons s already known. Usng our termnology, cut functons can be defned as functons that can be represented by a graph wthout the source and the snk and wthout auxlary vertces. It s well-known that cut functons are submodular. Cunnngham [11] characterzes cut functons and gves a general-purpose graph constructon for them. It can be shown that the set of cut functons s a strct subset of F 2. We allow a more general graph constructon; as a result, we can mnmze a larger class of functons, namely, regular functons n F 3. 8 SUMMARY OF GRAPH CONSTRUCTIONS We now summarze the graph constructons used for regular functons. Software can be downloaded from http:// www.cs.cornell.edu/~rdz/graphcuts.html that takes an energy functon as nput and automatcally constructs the approprate graph, then mnmzes the energy. 7. We thank Yur Boykov and Howard Karloff for pontng ths out. Recall that, for a functon Eðx 1 ;...;x n Þ, we defne ðeþ ¼ n ¼1 ð 1Þx Eðx1 ;...;x n Þ: x 1 2f0;1g;...;x n 2f0;1g The notaton edgeðv; cþ wll mean that we add an edge ðs; vþ wth the weght c f c>0 or an edge ðv; tþ wth the weght c f c<0. 8.1 Regular Functons of One Bnary Varable Recall that all functons of one varable are regular. For a functon Eðx 1 Þ, we construct a graph G wth three vertces V¼fv 1 ;s;tg. There s a sngle edge edgeðv 1 ;Eð1Þ Eð0ÞÞ. 8.2 Regular Functons of Two Bnary Varables We now show how to construct a graph G for a regular functon Eðx 1 ;x 2 Þ of two varables. It wll contan four vertces: V¼fv 1 ;v 2 ;s;tg. The edges E are gven below:. edgeðv 1 ;Eð1; 0Þ Eð0; 0ÞÞ;. edgeðv 2 ;Eð1; 1Þ Eð1; 0ÞÞ;. ðv 1 ;v 2 Þ wth the weght ðeþ. 8.3 Regular Functons of Three Bnary Vvarables We next show how to construct a graph G for a regular functon Eðx 1 ;x 2 ;x 3 Þ of three varables. It wll contan fve vertces: V¼fv 1 ;v 2 ;v 3 ;u;s;tg.ifðeþ 0, then the edges are. edgeðv 1 ;Eð1; 0; 1Þ Eð0; 0; 1ÞÞ;. edgeðv 2 ;Eð1; 1; 0Þ Eð1; 0; 0ÞÞ;. edgeðv 3 ;Eð0; 1; 1Þ Eð0; 1; 0ÞÞ;. ðv 2 ;v 3 Þ wth the weght ðe½x 1 ¼ 0ŠÞ;. ðv 3 ;v 1 Þ wth the weght ðe½x 2 ¼ 0ŠÞ;. ðv 1 ;v 2 Þ wth the weght ðe½x 3 ¼ 0ŠÞ;. ðv 1 ;uþ, ðv 2 ;uþ, ðv 3 ;uþ, ðu; tþ wth the weght ðeþ. If ðeþ < 0, then the edges are. edgeðv 1 ;Eð1; 1; 0Þ Eð0; 1; 0ÞÞ;. edgeðv 2 ;Eð0; 1; 1Þ Eð0; 0; 1ÞÞ;. edgeðv 3 ;Eð1; 0; 1Þ Eð1; 0; 0ÞÞ;. ðv 3 ;v 2 Þ wth the weght ðe½x 1 ¼ 1ŠÞ;. ðv 1 ;v 3 Þ wth the weght ðe½x 2 ¼ 1ŠÞ;. ðv 2 ;v 1 Þ wth the weght ðe½x 3 ¼ 1ŠÞ;. ðu; v 1 Þ, ðu; v 2 Þ, ðu; v 3 Þ, ðs; uþ wth the weght ðeþ. APPENDI A PROOF OF THE NP-HARDNESS THEOREM We now gve a proof of the NP-hardness theorem (Theorem 4.2), whch shows that, wthout the regularty constrant, t s ntractable to mnmze energy functons n F 2. Proof. Addng functons of one varable does not change the class of functons that we are consderng. Thus, we can assume wthout loss of generalty that E 2 has the form that s shown n Table 16. (We can transform an arbtrary functon of two varables to ths form as follows: We subtract a constant from the frst row to make the upper left element zero, then we subtract a constant from the second row to make the bottom left element zero, and, fnally, we subtract a constant from the second column to make the upper rght element zero.) These transformatons preserve the functonal,soe 2 s nonregular, whch means that A>0.
158 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004 We wll prove the theorem by reducng the maxmum ndependent set problem, whch s known to be NP-hard, to our energy mnmzaton problem. Let an undrected graph G¼ðV; EÞ be the nput to the maxmum ndependent set problem. A subset UVs sad to be ndependent f, for any two vertces u; v 2U, the edge ðu; vþ s not n E. The goal s to fnd an ndependent subset U Vof maxmum cardnalty. We construct an nstance of the energy mnmzaton problem as follows: There wll be n ¼jVjbnary varables x 1 ;...;x n correspondng to the vertces v 1 ;...;v n of V. Let us consder the energy Eðx 1 ;...;x n Þ¼ E ðx Þþ E 2 ðx ;x j Þ; ð;jþ2n where E ðx Þ¼ A 2n x and N¼fð; jþjðv ;v j Þ2Eg. There s a one-to-one correspondence between all confguratons ðx 1 ;...;x n Þ and subsets UV: A vertex v s n U f and only f x ¼ 1. Moreover, the frst term of Eðx 1 ;...;x n Þ s A 2n tmes the cardnalty of U (whch cannot be less than A 2 ) and the second term s 0 f U s ndependent and at least A otherwse. Thus, mnmzng the energy yelds the ndependent subset of maxmum cardnalty. tu APPENDI B TABLE 16 THE ADDITIVITY THEOREM We now prove the addtvty theorem, whch plays an mportant role n our constructons. Theorem (addtvty). The sum of two graph-representable functons s graph-representable. It s mportant to note that the proof of ths theorem s constructve. The constructon has a partcularly smple form f the graphs representng the two functons are defned on the same set of vertces (.e., they dffer only n ther edge weghts). In ths case, by smply addng the edge weghts together, we obtan a graph that represents the sum of the two functons. If one of the graphs has no edge between two vertces, we can add an edge wth weght 0. Proof. Let us assume for smplcty of notaton that E 0 and E 00 are functons of all n varables: E 0 ¼ E 0 ðx 1 ;...;x n Þ, E 00 ¼ E 00 ðx 1 ;...;x n Þ. By the defnton of graph representablty, there exst constants K 0, K 00, graphs G 0 ¼ðV 0 ; E 0 Þ, G 00 ¼ðV 00 ; E 00 Þ, and the set V 0 ¼fv 1 ;...;v n g, V 0 V 0 fs; tg, V 0 V 00 fs; tg such that E 0 þ K 0 s exactly represented by G 0, V 0 and E 00 þ K 00 s exactly represented by G 00, V 0. We can assume that the only common vertces of G 0 and G 00 are V 0 [fs; tg. Let us construct the graph G¼ðV; EÞ as the combned graph of G 0 and G 00 : V¼V 0 [V 00, E¼E 0 [E 00. Let ~E be the functon that G, V 0 exactly represents. Let us prove that ~E E þðk 0 þ K 00 Þ (and, therefore, E s graph-representable). Consder a confguraton x 1 ;...;x n. Let C 0 ¼ S 0 ;T 0 be the cut on G 0 wth the smallest cost among all cuts for whch v 2 S 0 f x ¼ 0 and v 2 T 0 f x ¼ 1 (1 n). Accordng to the defnton of graph representablty, E 0 ðx 1 ;...;x n ÞþK 0 ¼ cðu; vþ: u2s 0 ;v2t 0 ;ðu;vþ2e 0 Let C 00 ¼ S 00 ;T 00 be the cut on G 00 wth the smallest cost among all cuts for whch v 2 S 00 f x ¼ 0, and v 2 T 00 f x ¼ 1 (1 n). Smlarly, E 00 ðx 1 ;...;x n ÞþK 00 ¼ cðu; vþ: u2s 00 ;v2t 00 ;ðu;vþ2e 00 Let S ¼ S 0 [ S 00, T ¼ T 0 [ T 00. It s easy to check that C ¼ S;T s a cut on G. Thus, ~Eðx 1 ;...;x n Þ cðu; vþ ¼ u2s 0 ;v2t 0 ;ðu;vþ2e 0 cðu; vþþ u2s;v2t;ðu;vþ2e u2s 00 ;v2t 00 ;ðu;vþ2e 00 cðu; vþ ¼ðE 0 ðx 1 ;...;x n ÞþK 0 ÞþðE 00 ðx 1 ;...;x n ÞþK 00 Þ ¼ Eðx 1 ;...;x n ÞþðK 0 þ K 00 Þ: Now, let C ¼ S; T be the cut on G wth the smallest cost among all cuts for whch v 2 S f x ¼ 0 and v 2 T f x ¼ 1 (1 n) and let S 0 ¼ S \V 0, T 0 ¼ T \V 0, S 00 ¼ S \V 00, T 00 ¼ T \V 0. It s easy to see that C 0 ¼ S 0 ;T 0, and C 00 ¼ S 00 ;T 00 are cuts on G 0 and G 00, respectvely. Accordng to the defnton of graph representablty, Eðx 1 ;...;x n ÞþðK 0 þ K 00 Þ¼ðE 0 ðx 1 ;...;x n ÞþK 0 Þ þðe 00 ðx 1 ;...;x n ÞþK 00 Þ cðu; vþ u2s 0 ;v2t 0 ;ðu;vþ2e 0 þ cðu; vþ u2s 00 ;v2t 00 ;ðu;vþ2e 00 ¼ cðu; vþ ¼ ~Eðx 1 ;...;x n Þ: u2s;v2t;ðu;vþ2e 0 ACKNOWLEDGMENTS The authors are grateful to Olga Veksler and Yur Boykov for ther careful readng of ths paper and for valuable comments that greatly mproved ts readablty. They would lke to thank Jon Klenberg and Eva Tardos for helpng them understand the relatonshp between ther work and submodular functon mnmzaton. They would also lke to thank Ian Jermyn and several anonymous revewers for helpng them clarfy the paper s presentaton. Ths research was supported by US Natonal Scence Foundaton grants IIS-9900115 and CCR-0113371 and by a grant from Mcrosoft Research. A prelmnary verson of ths paper appeared n the Proceedngs of the European Conference on Computer Vson, May 2002. ut
KOLMOGOROV AND ZABIH: WHAT ENERGY FUNCTIONS CAN BE MINIMIZED VIA GRAPH CUTS? 159 REFERENCES [1] R.K. Ahuja, T.L. Magnant, and J.B. Orln, Network Flows: Theory, Algorthms, and Applcatons. Prentce Hall, 1993. [2] A. Amn, T. Weymouth, and R. Jan, Usng Dynamc Programmng for Solvng Varatonal Problems n Vson, IEEE Trans. Pattern Analyss and Machne Intellgence, vol. 12, no. 9, pp. 855-867, Sept. 1990. [3] S. Barnard, Stochastc Stereo Matchng Over Scale, Int l J. Computer Vson, vol. 3, no. 1, pp. 17-32, 1989. [4] S. Brchfeld and C. Tomas, Multway Cut for Stereo and Moton wth Slanted Surfaces, Proc. Int l Conf. Computer Vson, pp. 489-495, 1999. [5] Y. Boykov and M.-P. Jolly, Interactve Organ Segmentaton Usng Graph Cuts, Proc. Medcal Image Computng and Computer-Asssted Interventon, pp. 276-286, 2000. [6] Y. Boykov and M.-P. Jolly, Interactve Graph Cuts for Optmal Boundary and Regon Segmentaton of Objects n N-D Images, Proc. Int l Conf. Computer Vson, pp. 105-112, 2001. [7] Y. Boykov and V. Kolmogorov, An Expermental Comparson of Mn-Cut/Max-Flow Algorthms for Energy Mnmzaton n Computer Vson, Proc. Int l Workshop Energy Mnmzaton Methods n Computer Vson and Pattern Recognton, Lecture Notes n Computer Scence, pp. 359-374, Sprnger-Verlag, Sept. 2001. [8] Y. Boykov and V. Kolmogorov, Computng Geodescs and Mnmal Surfaces va Graph Cuts, Proc. Int l Conf. Computer Vson, pp. 26-33, 2003. [9] Y. Boykov, O. Veksler, and R. Zabh, Markov Random Felds wth Effcent Approxmatons, Proc. IEEE Conf. Computer Vson and Pattern Recognton, pp. 648-655, 1998. [10] Y. Boykov, O. Veksler, and R. Zabh, Fast Approxmate Energy Mnmzaton va Graph Cuts, Proc. IEEE Trans. Pattern Analyss and Machne Intellgence, vol. 23, no. 11, pp. 1222-1239, Nov. 2001. [11] W.H. Cunnngham, Mnmum Cuts, Modular Functons, and Matrod Polyhedra, Networks, vol. 15, pp. 205-215, 1985. [12] E. Dahlhaus, D.S. Johnson, C.H. Papadmtrou, P.D. Seymour, and M. Yannakaks, The Complexty of Multway Cuts, Proc. ACM Symp. Theory of Computng, pp. 241-251, 1992. [13] J. Das and J. Letao, The ZM Algorthm: A Method for Interferometrc Image Reconstructon n SAR/SAS IEEE Trans. Image Processng, vol. 11, no. 4, pp. 408-422, Apr. 2002. [14] L. Ford and D. Fulkerson, Flows n Networks. Prnceton Unv. Press, 1962. [15] S. Fujshge, Submodular Functons and Optmzaton, vol. 47, Annals of Dscrete Math., North Holland, 1990. [16] S. Geman and D. Geman, Stochastc Relaxaton, Gbbs Dstrbutons, and the Bayesan Restoraton of Images, IEEE Trans. Pattern Analyss and Machne Intellgence, vol. 6, pp. 721-741, 1984. [17] A. Goldberg and R. Tarjan, A New Approach to the Maxmum Flow Problem, J. ACM, vol. 35, no. 4, pp. 921-940, Oct. 1988. [18] D. Greg, B. Porteous, and A. Seheult, Exact Maxmum A Posteror Estmaton for Bnary Images, J. Royal Statstcal Soc., Seres B, vol. 51, no. 2, pp. 271-279, 1989. [19] M. Grötschel, L. Lovasz, and A. Schrjver, Geometrc Algorthms and Combnatoral Optmzaton. Sprnger-Verlag, 1988. [20] H. Ishkawa and D. Geger, Occlusons, Dscontnutes, and Eppolar Lnes n Stereo, Proc. European Conf. Computer Vson, pp. 232-248, 1998. [21] H. Ishkawa and D. Geger, Segmentaton by Groupng Junctons, Proc. IEEE Conf. Computer Vson and Pattern Recognton, pp. 125-131, 1998. [22] H. Ishkawa, Exact Optmzaton for Markov Random Felds wth Convex Prors, IEEE Trans. Pattern Analyss and Machne Intellgence, vol. 25, no. 10, pp. 1333-1336, Oct. 2003. [23] S. Iwata, L. Flescher, and S. Fujshge, A Combnatoral, Strongly Polynomal Algorthm for Mnmzng Submodular Functons, J. ACM, vol. 48, no. 4, pp. 761-777, July 2001. [24] J. Km, V. Kolmogorov, and R. Zabh, Vsual Correspondence Usng Energy Mnmzaton and Mutual Informaton, Proc. Int l Conf. Computer Vson, pp. 1033-1040, 2003. [25] J. Km and R. Zabh, Automatc Segmentaton of Contrast- Enhanced Image Sequences, Proc. Int l Conf. Computer Vson, pp. 502-509, 2003. [26] J. Km, J. Fsher, A. Tsa, C. Wble, A. Wllsky, and W. Wells, Incorporatng Spatal Prors nto an Informaton Theoretc Approach for FMRI Data Analyss, Proc. Medcal Image Computng and Computer-Asssted Interventon, pp. 62-71, 2000. [27] V. Kolmogorov and R. Zabh, Vsual Correspondence wth Occlusons Usng Graph Cuts, Proc. Int l Conf. Computer Vson, pp. 508-515, 2001. [28] V. Kolmogorov and R. Zabh, Mult-Camera Scene Reconstructon va Graph Cuts, Proc. European Conf. Computer Vson, vol. 3, pp. 82-96, 2002. [29] V. Kwatra, A. Schödl, I. Essa, G. Turk, and A. Bobck, Graphcut Textures: Image and Vdeo Synthess Usng Graph Cuts, ACM Trans. Graphcs, Proc. SIGGRAPH 2003, July 2003. [30] C.-H. Lee, D. Lee, and M. Km, Optmal Task Assgnment n Lnear Array Networks, IEEE Trans. Computers, vol. 41, no. 7, pp. 877-880, July 1992. [31] S. L, Markov Random Feld Modelng n Computer Vson. Sprnger- Verlag, 1995. [32] M.H. Ln, Surfaces wth Occlusons from Layered Stereo, PhD thess, Stanford Unv., Dec. 2002. [33] I. Mls, Task Assgnment n Dstrbuted Systems Usng Network Flow Methods, Proc. Combnatorcs and Computer Scence, Lecture Notes n Computer Scence, pp. 396-405, Sprnger-Verlag, 1996. [34] T. Poggo, V. Torre, and C. Koch, Computatonal Vson and Regularzaton Theory, Nature, vol. 317, pp. 314-319, 1985. [35] S. Roy, Stereo wthout Eppolar Lnes: A Maxmum Flow Formulaton, Int l J. Computer Vson, vol. 1, no. 2, pp. 1-15, 1999. [36] S. Roy and I. Cox, A Maxmum-Flow Formulaton of the n-camera Stereo Correspondence Problem, Proc. Int l Conf. Computer Vson, 1998. [37] D. Scharsten and R. Szelsk, A Taxonomy and Evaluaton of Dense Two-Frame Stereo Correspondence Algorthms, Int l J. Computer Vson, vol. 47, pp. 7-42, Apr. 2002. [38] A. Schrjver, A Combnatoral Algorthm Mnmzng Submodular Functons n Strongly Polynomal Tme, J. Combnatoral Theory, vol. B 80, pp. 346-355, 2000. [39] D. Snow, P. Vola, and R. Zabh, Exact Voxel Occupancy wth Graph Cuts, Proc. IEEE Conf. Computer Vson and Pattern Recognton, pp. 345-352, 2000. [40] H.S. Stone, Multprocessor Schedulng wth the Ad of Network Flow Algorthms, IEEE Trans. Software Eng., pp. 85-93, 1977. [41] R. Szelsk and R. Zabh, An Expermental Comparson of Stereo Algorthms, Proc. Vson Algorthms: Theory and Practce, Lecture Notes n Computer Scence, B. Trggs, A. Zsserman, and R. Szelsk, eds.,vol. 1883, pp. 1-19, Sprnger-Verlag, Sept. 1999. Vladmr Kolmogorov receved the MS degree from the Moscow Insttute of Physcs and Technology n appled mathematcs and physcs n 1999 and the PhD degree n computer scence from Cornell Unversty n January 2004. He s currently a postdoctoral researcher at Mcrosoft Research, Cambrdge, Unted Kngdom. Hs research nterests are graph algorthms, stereo correspondence, mage segmentaton, parameter estmaton, and mutual nformaton. Two of hs papers (wrtten wth Ramn Zabh) receved a best paper award at the European Conference on Computer Vson, 2002. He s a member of the IEEE and the IEEE Computer Socety. Ramn Zabh attended the Massachusetts Insttute of Technology as an undergraduate, where he receved BS degrees n computer scence and mathematcs and the MSc degree n computer scence. After earnng the PhD degree n computer scence from Stanford Unversty n 1994, he joned the faculty at Cornell Unversty, where he s currently an assocate professor of computer scence. Snce 2001, he has held a jont appontment as an assocate professor of radology at Cornell Medcal School. Hs research nterests le n early vson and ts applcatons, especally n medcne. He has served on numerous program commttees, ncludng the Internatonal Conference on Computer Vson (ICCV) n 1999, 2001, and 2003, and the IEEE Conference on Computer Vson and Pattern Recognton (CVPR) n 1997, 2000, 2001, and 2003. He coedted a specal ssue of the IEEE Transactons on Pattern Analyss and Machne Intellgence n October 2001. He s a member of the IEEE and the IEEE Computer Socety.