Parallel Total Variation Minimization. Diplomarbeit
Institut für Numerische und Angewandte Mathematik

Parallel Total Variation Minimization

Diploma thesis (Diplomarbeit) submitted by Jahn Philipp Müller, supervised by Prof. Dr. Martin Burger and Prof. Dr. Sergei Gorlatch. Münster.
Abstract

In [ROF92] Rudin, Osher and Fatemi introduced a denoising algorithm using total variation regularization. In this work we provide a parallel algorithm for this variational minimization problem. It is based on the primal-dual formulation and hence leads to solving a saddle point problem for the primal and the dual variable. For that purpose Newton's method with damping is used. The constraint arising for the dual variable is approximated with a penalty method. We apply domain decomposition methods to divide the original problem into several subproblems. The transmission conditions arising at the interfaces of the subdomains are handled via an overlapping decomposition and the well-known Schwarz methods. To make the Message Passing Interface (MPI) available in MATLAB we use MatlabMPI, a set of scripts provided by the MIT. The numerical results show a good convergence behavior and an excellent speedup of the computation.
Contents

1 Introduction
2 Mathematical Preliminaries
  2.1 Derivatives
  2.2 Convexity
  2.3 Duality
  2.4 Optimization
3 Total Variation Regularization and the ROF Model
  3.1 Primal Formulation
  3.2 Dual Formulation
  3.3 Primal-Dual Formulation
4 Solution Methods
  4.1 Primal Methods (Steepest Descent, Fixed Point Iteration, Newton's Method)
  4.2 Dual Methods
  4.3 Primal-dual Methods (Barrier and Penalty Method, Newton Method with Damping)
5 Domain Decomposition
  5.1 Non Overlapping Decomposition
  5.2 Overlapping Decomposition
  5.3 Schwarz Iteration (Multiplicative Schwarz Method, Additive Schwarz Method)
  5.4 Application
6 Basic Parallel Principles
  6.1 MPI (MatlabMPI)
  6.2 Speedup
7 Numerical Realization
  7.1 Sequential Implementation
  7.2 Schur Complement
  7.3 Parallel Implementation (Additive Version, Multiplicative Version, Remarks)
8 Results
  8.1 Convergence Results
  8.2 Computation Time and Speedup
9 Conclusion
List of Figures

1.1 Parallel computing
3.1 Anisotropic vs. isotropic total variation
4.1 Barrier method vs. penalty method
4.2 Barrier method vs. penalty method (smaller ε)
5.1 Non overlapping domain decomposition
5.2 Poisson equation with artifacts
5.3 Overlapping domain decomposition
5.4 Poisson equation solved with the multiplicative Schwarz method
5.5 Coloring of decompositions with two colors
5.6 Coloring of decompositions with four colors
7.1 The degrees of freedom of the primal-dual problem
7.2 Sequential version
7.3 Parallel version with four processors
Further figures: simple decomposition coloring with two colors; original and noisy test image; iterations of the sequential algorithm; iterations of the multiplicative parallel algorithm (2 CPUs); iterations of the additive parallel algorithm (4 CPUs); iterations of the multiplicative parallel algorithm (32 CPUs); iterations of the additive parallel algorithm (32 CPUs); additive and multiplicative algorithms for two image sizes; computation time; speedup
Notation

$\overline{\mathbb{R}}$: $\mathbb{R} \cup \{+\infty\} \cup \{-\infty\}$
$U^*$: dual space of $U$
$L^p$: Lebesgue space, i.e. the space of $p$-power integrable functions
$L^1_{loc}$: space of locally integrable functions
$BV$: space of functions of bounded variation
$W^{1,1}$: Sobolev space of functions with weak derivatives in $L^1$
$C_0^\infty$: space of infinitely differentiable functions with compact support
$H_{div}$: functions in $L^2$ with weak divergence in $L^2$
$dJ(u; v)$: directional derivative of $J$ at $u$ in direction $v$
$\chi_K$: indicator function of the set $K$ in the sense of convex analysis, i.e. $\chi_K(x) = 0$ if $x \in K$ and $+\infty$ otherwise
$1_K$: indicator function of the set $K$, i.e. $1_K(x) = 1$ if $x \in K$ and $0$ otherwise
$\operatorname{sgn}(p)$: signum function
$\operatorname{dom} J$: effective domain of a functional $J$
$J^*$: convex conjugate of a functional $J$
$J^{**}$: biconjugate of a functional $J$
$u_k \rightharpoonup u$: weak convergence
$u_k \rightharpoonup^* u$: weak* convergence
$\|\cdot\|_{TV}$: total variation norm
$\|\cdot\|_2$: $L^2$ norm
$\|\cdot\|_{\ell^p}$: natural norm of the sequence space $\ell^p$
$\Pi_K(x)$: projection of $x$ onto $K$
$\overline{\Omega}$: closure of the set $\Omega$
$\partial\Omega$: boundary of the set $\Omega$
$\oplus$: direct sum
$\langle u, v\rangle$: duality product of $u \in U$ and $v \in U^*$
$\partial J(u)$: subdifferential of $J$ at the point $u$
$\mathcal{O}$: Landau notation
Acknowledgments

First of all I would like to thank Prof. Dr. Martin Burger for giving me the opportunity to work on this challenging and interesting topic, and for taking the time to assist me with my problems and to answer all my questions. Additionally I thank Prof. Dr. Sergei Gorlatch for being my co-advisor, and his doctoral candidate Mareike Schellmann, especially for her help in the final phase of this thesis.
I would further like to thank
- Martin Benning, Oleg Reichmann, Martin Drohmann, Alex Sawatzky and Christoph Brune for many helpful discussions and for proof-reading this thesis,
- all staff members of the Institute for Computational and Applied Mathematics; I had a great time working here,
- all my friends who have supported me during the last years.
Last but not least I would like to thank my whole family for all the support throughout the time of my studies.
Chapter 1 Introduction

The subject of this thesis is the parallelization of nonlinear imaging algorithms, particularly in the case of total variation minimization. We will limit ourselves to the consideration of the ROF (Rudin-Osher-Fatemi) model, introduced in [ROF92], but the concept can easily be adapted to other models based on convex variational problems with gradient energies. Since total variation regularization provides some advantageous properties, like preserving edges, it is one of the most widely used denoising methods today [CS05]. It is, for example, used in combination with the EM (Expectation Maximization) algorithm [SBW+08], [BSW+] for reconstructing measured PET data. (Positron emission tomography, PET, is a nuclear medicine imaging technique in which pairs of gamma photons from annihilations of an injected radioactive tracer isotope are measured.)

Parallelization is becoming more and more important in many applications. Especially in imaging it is desirable to extend the existing 2D algorithms to the three-dimensional case. But due to the enormous computational effort, currently used workstations reach their technical limitations. One expedient is to divide the original problem into several subproblems, solve them independently on several CPUs and merge them into a solution of the complete problem (see Fig. 1.1 for an illustration). This can be done in parallel and promises a speedup of the computation. More importantly, due to the reduction of the problem size, restrictions by technical requirements (e.g. insufficient main memory) become negligible. Unfortunately, data dependencies may arise between the subproblems, necessitating communication between the CPUs. Neglecting these dependencies results in undesirable effects at the interfaces of the divisions. Hence parallel algorithms are needed to handle this issue and to provide a solution coinciding with the original one.

In Chapter 2 we will provide some definitions and results from the theory of convexity, duality and optimization which we will need for the analysis of the ROF model.
Figure 1.1: Instead of computing the whole problem on one CPU (a, sequential), one can divide it into several subproblems and solve each of them on a different CPU (b, parallel). Communication between the CPUs will probably be necessary.

The ROF model will be introduced in Chapter 3, where the different formulations (primal, dual and primal-dual) will also be discussed in detail. In Chapter 4 we will give an overview of existing solution methods for the ROF model and subsequently discuss a primal-dual Newton method, which is the basis of our parallel algorithm. A short introduction to domain decomposition, and especially to the well-known Schwarz methods, will be presented in Chapter 5. Since we use the Message Passing Interface (MPI) for the parallel implementation, we will illustrate the concept of MPI and how it can be made available in MATLAB in Chapter 6. We will also mention some aspects of parallelization and speedup. The numerical realization of the proposed primal-dual algorithm will be explained in Chapter 7, where two similar parallel versions will also be presented. Finally, in Chapter 8 the convergence results as well as the attained speedup will be illustrated.
Chapter 2 Mathematical Preliminaries

In this chapter we will provide some mathematical background needed later in this thesis. Since we are interested in finding (unique) global minima of (strictly) convex functionals, we will state how these minima can be computed if they exist. Therefore we will introduce the concept of derivatives and convexity as well as some important properties of duality. We will mainly follow [Bur03].

2.1 Derivatives

Similar to functions on $\mathbb{R}^n$ we want to introduce a concept of derivatives for functionals defined on Banach spaces.

Definition 2.1. Let $J : U \to V$ be a continuous nonlinear operator, where $U, V$ are Banach spaces. The directional derivative of $J$ at a point $u$ in direction $v$ is defined as
$$dJ(u; v) := \lim_{t \to 0} \frac{J(u + tv) - J(u)}{t},$$
if the limit exists. $J$ is called Gâteaux-differentiable at $u$ if $dJ(u; v)$ exists for all $v \in U$, and $dJ(u, \cdot)$ is called the Gâteaux-derivative. If additionally $dJ(u, \cdot) : U \to V$ is continuous and linear, $J$ is called Fréchet-differentiable with Fréchet-derivative $J'(u)v := dJ(u; v)$ for all $v \in U$. The second Fréchet-derivative is defined by
$$J''(u)(v, w) := \lim_{t \to 0} \frac{J'(u + tw)v - J'(u)v}{t}.$$
11 Chapter 2: Mathematical Preliminaries In an analogous way higher derivatives can be defined inductively. The directional derivation is also called first variation. Remark 2.1. Note that the directional derivative dj(u; v) equals Φ (t) t=0 with Φ(t) := J(u + tv). Example 2.1. Let J : L 2 () R + be defined by J(u) := λ 2 (u f) 2 dx with f L 2 (), λ R and R n being open and bounded. We set Φ(t) := J(u + tv) = λ 2 with an arbitrary v L 2 (). Differentiating leads to (u + tv f) 2 dx Φ (t) = λ d (u + tv f) 2 dx 2 dt = λ d 2 dt (u + tv f)2 dx = λ (u + tv f)v dx and we obtain the first Gâteaux-derivative of J as dj(u; v) = Φ (0) = λ (u f)v dx = J (u)v. Since dj(u;.) is continuous and linear, J is Fréchet-differentiable. With Φ(t) := J (u + tw)v we can compute the second Fréchet-derivative as Φ (t) t=0 = λ wv dx = J (u)(v, w). 2.2 Convexity We will see that convex functionals provide some advantageous properties like the concept of subdifferentials or the uniqueness of global minima in the case of strict convexity. Definition 2.2. A set C U is called convex, if for all α [0, 1] and u, v C: αu + (1 α)v C. 4
12 Chapter 2: Mathematical Preliminaries Let U be a Banach space and C U convex. A functional J : C R is called convex, if for all α [0, 1] and u, v C: J(αu + (1 α)v) αj(u) + (1 α)j(v). (2.1) If the inequality (2.1) holds strictly (except for u = v or α {0, 1}), J is called strictly convex. An optimization problem J(u) min u C is called convex, if J as well as C is convex. Example 2.2. The indicator function of a convex set C is convex. Let 0 if u C J(u) = χ C (u) := + else then and 0 if u, v C αj(u) + (1 α)j(v) = + else 0 if αu + (1 α)v C J(αu + (1 α)v) = + else. Therefore J(αu + (1 α)v) > αj(u) + (1 α)j(v) would only be possible in the case where u, v C. But then αu + (1 α)v C, due to the convexity of C and hence J(αu + (1 α)v) = 0 = αj(u) + (1 α)j(v). Remark 2.2. Due to the fact that J defined by J(u) if u C J(u) := else is convex if and only if J(u) and C are convex, we only need to consider functionals 5
13 Chapter 2: Mathematical Preliminaries defined on the whole space U. In order to generalize differentiability for non Fréchet-differentiable convex functionals, we introduce the concept of subgradients: Definition 2.3. Let U be a Banach space with dual U and J : U R be convex. Then the subdifferential J(u) at a point u is defined as: J(u) := {p U J(w) J(u) + p, w u, w U}. (2.2) J is called subdifferentiable at u if J(u) is not empty. An element p J(u) is called subgradient of J at point u. Example 2.3. As an example we take a look at the Euclidean norm f : R n R, f(x) = x. Although it is not differentiable in x = 0, it is subdifferentiable at every x R n. The subdifferential is given by {ˆx R n w ˆx, w, w R n } if x = 0 f(x) = x else. x Thus in x = 0 the subdifferential consists of the whole Euclidean unit ball, for instance the interval [ 1, 1] in the case n = 1. Remark 2.3. It can easily be seen, that if J : U R is a convex Fréchetdifferentiable functional then J(u) = {J (u)} holds (see [Bur03, Proposition 3.6]). In general it is not true that J(u) is a singleton, as we have seen in Example 2.3. We now want to state a criterion for strict convexity. Theorem 2.1. Let C U be open and convex and let the functional J : C R be twice continuously Fréchet-differentiable. Then, J (u)(v, v) > 0 for all u C and v U\{0} implies strict convexity of J. For a proof see [Bur03, Proposition 3.3]. One tremendous advantage of strictly convex functionals is the uniqueness of a global minimum. Theorem 2.2. Let J : U R and J(u) min u U 6
14 Chapter 2: Mathematical Preliminaries be a strictly convex optimization problem. Then there exists at most one local minimum, which is a global one. Proof. Let u be a local minimum of J and assume that it is no global minimum. Then there exists û U with J(û) < J(u). Let us define u α := αû + (1 α)u U for all α [0, 1]. Due to (strict) convexity of J J(u α ) αj(û) + (1 α)j(u) < J(u). Since u α u as α 0, this is a contradiction to u being a local minimum. Hence u is a global minimum. Now let u, v be two global minima of J. For u v this implies J(αu + (1 α)v) < αj(u) + (1 α)j(v) = inf J, for α ]0, 1[, which is a contradiction to the assumption. 2.3 Duality The concept of duality is very important in the theory of optimization. Instead of considering the given primal problem, one can deduce the complementary dual problem, which may be easier to solve. Let us recall some definitions first. Definition 2.4. A functional J : U R is called proper if u U J(u) and u U J(u) +. The set dom J := {u U J(u) < } is called the effective domain of J. In the following we consider proper functionals only. We are also in the need for a weaker concept of continuity: Definition 2.5. A functional J : U R is called lower semi-continuous if u U lim inf v u J(v) J(u). 7
15 Chapter 2: Mathematical Preliminaries J is called upper semi-continuous if J is lower semi-continuous. Obviously a functional is continuous at a point u if and only if it is upper and lower semi-continuous at u. Definition 2.6. Let J : U R (not necessarily convex), then the convex conjugate (or Legendre-Fenchel transform) J : U R is defined by J (p) := sup { u, p J(u)}. u U Example 2.4. Consider again the indicator function of a convex set C: 0 if u C J(u) = χ C (u) := + else. Then J (v) = sup{ u, v χ C (u)} u U = sup{ u, v }. u C As its name implies, the convex conjugate is always convex. Lemma 2.1. Let U be a Banach space and J : U R. Then J is convex and lower semi-continuous. Proof. The convex conjugate of J is given by ( ) J (p) = sup u, p J(u) = u U sup u dom J ( ) u, p J(u) i.e. J is the point wise supremum of the family of continuous affine functions u, J(u) with u dom J of U into R and hence J is lower semi-continuous and convex [ET76, Definition 4.1, p.17]. One may also build the biconjugate J (as the convex conjugate of the convex conjugate) and achieve the following result: 8
16 Chapter 2: Mathematical Preliminaries Theorem 2.3. Let U be a reflexive Banach space (i.e. U = U ), J : U R and J its biconjugate. Then J = J if and only if J is convex and lower semi-continuous. Proof. Let J be convex and lower semi-continuous and ū U arbitrary. We will show that J (ū) = J(ū). ( J (ū) = sup p, ū J (p) ) p U ( ( ) ) = sup p, ū sup p, v ) J(v) p U v U ( ) = sup inf p, ū v + J(v) p U v U } {{ } p,ū ū +J(ū)=J(ū) p J(ū) Since J is proper their exists ā R with ā < J(ū). Take such an ā arbitrary. Furthermore, due to the convexity and lower semi-continuity, the epigraph of J epi J = {(u, a) U R J(u) a} is a closed convex set (cf. [ET76, Proposition 2.1 and 2.3]) which does not contain the point (ū, ā). Hence, applying the Hahn-Banach Theorem, we can strictly separate the epigraph of J and the point (ū, ā) by a closed affine hyperplane H of U R given by H = {(u, a) U R q, u + αa = β} with α, β R. We thus have: q, ū + αā < β (2.3) q, u + αa > β (u, a) epi J. (2.4) If J(ū) < we can take u = ū and a = J(ū) in (2.4) and achieve together with (2.3): This implies q, ū + αj(ū) > β > q, ū + αā. α ( J(ū) ā ) > 0 } {{ } >0 and hence α > 0. When (2.4) is divided by α we can conclude J(v) + q α, v < β α 9
17 Chapter 2: Mathematical Preliminaries and with p = q α we obtain: J (ū) = sup inf p U v U = sup p U inf v U sup inf p U v U = sup (2.3) = ā. p U sup p U ( p, ū v + J(v) ) ( p, ū + J(v) p, v } {{ } ( p, ū + β α ( 1 α q, ū + β α ) ) > β α ( 1 α (αā β) + β ) α Hence J (ū) ā for all ā < J(ū) which implies ) J (ū) J(ū). If J(ū) = + then, by letting ā tend to + (resp. ā to ) for α > 0 (α < 0), (2.3) yields α = 0. Thus we have (cf. (2.3) and (2.4)): q, ū < β (2.5) q, u > β u dom J. (2.6) Now let β q, u = γ < 0 (2.7) and p = cq, with γ, c R. Then J (ū) = sup inf p U v U = sup inf p U v U = sup p U inf v U sup inf p U v U ( p, ū v + J(v) ) ( p, ū + J(v) p, v ) ( c q, ū +c q, v +J(v) ) } {{ } } {{ } > cβ = β γ ( ) cβ + cβ cγ + J(v) ( (inf = sup J(v) ) cγ p U v U ) 10
18 Chapter 2: Mathematical Preliminaries Since γ < 0, cγ is tending to for c and hence J (ū) = J(ū). In turn assume J not to be convex and lower semi-continuous. Since Lemma 2.1 yields the convexity and lower semi-continuouity of J = (J ), J can not equal to J. Example 2.5. In Example 2.4 we have computed the convex conjugate of J(u) = χ C (u) as J (v) = sup u C { u, v }. With the convexity of J(u) (see Example 2.2) we achieve J (u) = J(u) and hence the convex conjugate of sup u C { u, v } is given by χ C (u). Lemma 2.2. Let J : U R (not necessarily convex) and J : U R its convex conjugate, then p J(u) u J (p). Proof. Let p J(u) then by definition J(w) J(u) + p, w u (2.8) holds for all w U. Let v U be arbitrary, then J (p) + u, v p = ( ) sup p, w J(w) + u, v p = w U ( ) sup p, w u J(w) + u, v w U (2.8) ( ) sup J(u) + u, v w U ( ) J(u) + u, v sup u U = J (v) holds. Since v was chosen arbitrarily we have J (v) J (p) + u, v p, v U which is equivalent to u J (p). Note that if J is convex and lower semi-continuous, then in Lemma 2.2 equivalence holds. This follows from u J (p) p J (u) and J = J. 11
19 Chapter 2: Mathematical Preliminaries 2.4 Optimization To conclude this chapter we want to state how to obtain existence of a global minimum and which optimality conditions have to be fulfilled for convex functionals. The fundamental theorem of optimization provides the existence of a global minimum from lower semi-continuity in combination with compactness. Theorem 2.4. Let J : U R be a proper, lower semi-continuous functional and let there exist a non-empty and compact level set S := {u U J(u) M} for some M R. Then attains a global minimum. min J(u) u U For a proof see [Bur03, Theorem 2.3]. In the case of infinite-dimensional problems compactness is not caused by boundedness. Fortunately a similar property holds for (the dual of) Banach spaces. Therefore let us recall the definition of weak and weak* convergence. Definition 2.7. Let U be an Banach space and U its dual space. Then the weak topology is defined as u k u : v, u k v, u v U and the weak* topology is defined v k v : v k, u v, u u U. The theorem of Banach-Alaoglu provides the compactness of the set {v U v U C}, C R + in the weak*- topology. Let U be a Banach space, C U be convex, and J : U R be a functional. Since we are interested in solutions of a constrained minimization problem J(u) min u C 12
20 Chapter 2: Mathematical Preliminaries we can also set J : U R with J(u) if u C J(u) = if u U \ C and consider the following unconstrained minimization problem: J(u) min u U. It is obvious from Definition 2.5 and Remark 2.2 that J is convex and lower semicontinuous and hence without loss of generality we can assume J as a functional J : U R defined on the whole space U. Another advantage of convex functionals is that one has to consider only the first derivative to characterize a minimum. For general functionals we have the following necessary first-order condition: Lemma 2.3. Let J : U R be Fréchet-differentiable and let u be a local minimum of J. Then J (u) = 0 holds. For a proof see [Bur03, Proposition 2.8] Also for non Fréchet-differentiable convex functionals we can state a necessary and sufficient criterion for a minimum in the sense of subgradients: Lemma 2.4. Let J : U R be a convex functional. Then u U is a minimum of J if and only if 0 J(u). Proof. Let 0 J(u) then we have with (2.2): J(w) J(u) + 0, w u } {{ } = 0 w U So u is a global minimum of J. On the other hand let 0 J(u), then there exists at least one w U with J(w) < J(u) + 0, w u } {{ } = 0 which implies that u cannot be a minimum of J. 13
Chapter 3 Total Variation Regularization and the ROF Model

In this chapter we want to give a brief introduction to total variation regularization, in particular by means of the ROF model. We will specify three different formulations (primal, dual and primal-dual) and their derivation. In the following $\Omega \subset \mathbb{R}^d$ will denote an open bounded set with Lipschitz boundary. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a noisy version of a given image $u_0$ with noise variance given by
$$\int_\Omega (u_0 - f)^2 \, dx = \sigma^2. \tag{3.1}$$
The ROF (Rudin-Osher-Fatemi) model, first introduced in [ROF92], is based on using the total variation as a regularization term to find a denoised image $\hat u$. It is defined by
$$\hat u = \arg\min_{u \in BV(\Omega)} \Big\{ \underbrace{\tfrac{\lambda}{2} \int_\Omega (u - f)^2 \, dx}_{\text{data fitting}} + \underbrace{\|u\|_{TV}}_{\text{regularization}} \Big\}, \tag{3.2}$$
where $\lambda$ is a positive parameter specifying the intensity of the regularization, which should be set depending on the noise variance $\sigma$ (i.e. $\lambda \to \infty$ for $\sigma \to 0$). Here $\|u\|_{TV}$ denotes the so-called total variation of $u$,
$$\|u\|_{TV} := \sup_{\varphi \in C_0^\infty(\Omega)^d, \, \|\varphi\|_\infty \le 1} \int_\Omega u \, \nabla\!\cdot\varphi \, dx. \tag{3.3}$$
In the literature also the notation $\int_\Omega |Du|$ is used for the total variation of $u$, corresponding to the interpretation of $Du$ as a vector measure. For $u \in W^{1,1}(\Omega)$ we have
$$\|u\|_{TV} = \int_\Omega |\nabla u| \, dx. \tag{3.4}$$
22 Chapter 3: Total Variation Regularization and the ROF Model BV denotes the space of functions with bounded total variation BV () = { u L 1 () u TV < } which is a Banach space endowed with the norm u BV := u TV + u L 1. TV is lower semi-continuous with respect to the strong topology in L 1 loc () ([AFP00, Proposition 3.6]) and hence due to the embedding L 2 () L 1 (), for bounded also with respect to L 2 (). Note that (3.3) is not unique for d > 1. Depending on the exact definition of the supremum norm 1 p := ess sup p(x) l r x we obtain a family of equivalent seminorms: Du l s = sup u ϕ dx ϕ C0 ()d ϕ 1 with 1 s and its Hölder conjugate r (i.e. 1 s + 1 r = 1). For example (cf. [Bur08]) we obtain the isotropic total variation (r = 2) which coincides with u TV = (u xi ) 2 u C 1 or a (cubicly) anisotropic total variation (r = ) which coincides with u TV = i u xi u C 1. (3.5) i According to expectations, the different definitions have effects on the nature of minimizers of (3.2). So in the case of the isotropic total variation, corners in the edge set will not be allowed [Mey01], whereas orthogonal corners are favored by the anisotropic variant [EO04]. See Figure 3.1 for an illustration. Overall the aim is to minimize the following functional to obtain a denoised version 1 ess sup x u(x) := inf{n µ(n) = 0} sup x \N u(x) 15
23 Chapter 3: Total Variation Regularization and the ROF Model (a) Original image (b) Noisy image (c) Isotropic TV (d) Anisotropic TV Figure 3.1: Different definitions of the supremum norm have effects on the nature of minimizers of (3.2). Images are taken from [BBD + 06]. of the noisy image f: J(u) := λ (u f) 2 dx+ u TV. (3.6) 2 } {{ } } {{ } :=J r(u) :=J d (u) This is a strictly convex optimization problem and hence provides the advantages stated in Chapter 2. Lemma 3.1. J as defined in (3.6) is strictly convex. Proof. In Example 2.1 we have computed the second Fréchet-derivative of the data fitting term J d (u) as J d (u)(v, w) = λ wv dx. Hence J d (u)(v, v) = λ v 2 dx λ>0 > 0 u BV () and v 0 and with Theorem 2.1 we achieve strict convexity of J d. 16
24 Chapter 3: Total Variation Regularization and the ROF Model Furthermore u TV is convex: αu + (1 α)v TV = sup ϕ 1 α sup ϕ 1 All in all J(u) = J d (u) + u TV is strictly convex. (αu + (1 α)v) ϕ dx u ϕ dx +(1 α) sup v ϕ dx ϕ 1 = α u TV + (1 α) v TV. Due to the convexity of J(u) we can apply Lemma 2.4 and achieve an optimality condition in terms of subgradients for a minimum as: 0 J(u). J d and TV are convex (see Lemma 3.1), lower semi-continuous and do not take the value (actually both functionals are not negative). Hence J is a lower semicontinuous convex functional defined over a Banach space and thus continuous over the interior of its effective domain (see [ET76, Corollary 2.5]). We thus can conclude the existence of ũ dom J d dom TV where J d is continuous and with ([ET76, Proposition 5.6]) we have J(u) = (J d + TV )(u) = J d (u) + u TV. Recall that Remark 2.3 gives us J d (u) = {J d (u)} and hence the optimality condition can be stated as 0 λ(u f) + u TV. (3.7) Remark 3.1. Defining K as the closure of the convex set { p p C 0 (), p 1} and using Example 2.5 we achieve the convex conjugate of J r (u) = u TV as 0 if v K Jr (v) = χ K (v) := + else. 17
3.1 Primal Formulation

The primal formulation of the ROF model is given by
$$J(u) = \frac{\lambda}{2} \int_\Omega (u - f)^2 \, dx + \int_\Omega |\nabla u| \, dx \tag{3.8}$$
for $u$ sufficiently smooth, in particular $u \in W^{1,1}(\Omega)$. To obtain the associated Euler-Lagrange equation, we compute the first Gâteaux-derivative of $J$. For this purpose we set
$$\Phi(t) := J(u + tv) = \frac{\lambda}{2} \int_\Omega (u + tv - f)^2 \, dx + \int_\Omega |\nabla(u + tv)| \, dx$$
with an arbitrary $v \in BV(\Omega)$. Differentiating leads to (assuming $\nabla(u + tv) \neq 0$)
$$\Phi'(t) = \lambda \int_\Omega (u + tv - f)\,v \, dx + \int_\Omega \frac{\nabla(u + tv)}{|\nabla(u + tv)|} \cdot \nabla v \, dx.$$
Hence (assuming $\nabla u \neq 0$)
$$\Phi'(0) = \lambda \int_\Omega (u - f)v \, dx + \int_\Omega \frac{\nabla u}{|\nabla u|} \cdot \nabla v \, dx
= \lambda \int_\Omega (u - f)v \, dx - \int_\Omega \nabla\!\cdot\Big(\frac{\nabla u}{|\nabla u|}\Big) v \, dx + \int_{\partial\Omega} v \, \frac{\nabla u}{|\nabla u|} \cdot n \, ds.$$
Since $v$ was chosen arbitrarily, and assuming homogeneous Neumann boundary conditions for $u$, which is a natural choice for images, we obtain the Euler-Lagrange equation
$$\lambda(u - f) - \nabla\!\cdot\Big(\frac{\nabla u}{|\nabla u|}\Big) = 0. \tag{3.9}$$
To overcome the issue of the singularity at $\nabla u = 0$, the TV norm is often perturbed as follows:
$$|\nabla u|_\beta := \sqrt{|\nabla u|^2 + \beta}, \tag{3.10}$$
or, in the anisotropic case,
$$|\nabla u|_\beta := \sum_i \sqrt{u_{x_i}^2 + \beta}, \tag{3.11}$$
with a small positive parameter $\beta$. The choice of $\beta$ is of great importance: a value chosen too small brings the problem close to degeneracy, whereas an overly large $\beta$ leads to undesirably smoothed edges.
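To make the effect of the perturbation concrete, the following short MATLAB sketch (not part of the thesis code) computes the discrete anisotropic total variation of an image and its perturbed variant (3.11) on a pixel grid with unit spacing, using the forward differences that will also appear in the discretization of Chapter 7. The test image and the value of β are illustrative choices.

```matlab
% Sketch: discrete (an)isotropic TV and the perturbed norm (3.11) of a test image.
u = zeros(64);  u(20:44, 20:44) = 1;           % simple piecewise constant test image
beta = 1e-6;                                   % small positive perturbation parameter

ux = [diff(u, 1, 1); zeros(1, size(u, 2))];    % u_{x_1}: forward differences, zero in last row
uy = [diff(u, 1, 2), zeros(size(u, 1), 1)];    % u_{x_2}: forward differences, zero in last column

tv_aniso = sum(abs(ux(:))) + sum(abs(uy(:)));                  % anisotropic TV, cf. (3.5)
tv_iso   = sum(sum(sqrt(ux.^2 + uy.^2)));                      % isotropic TV, for comparison
tv_beta  = sum(sum(sqrt(ux.^2 + beta) + sqrt(uy.^2 + beta)));  % perturbed anisotropic norm (3.11)

fprintf('TV (aniso) = %.2f, TV (iso) = %.2f, TV_beta = %.2f\n', tv_aniso, tv_iso, tv_beta);
```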
26 Chapter 3: Total Variation Regularization and the ROF Model 3.2 Dual Formulation Under the aspect that the regularization term in the primal formulation is not differentiable we want to deduce another formulation for the TV-Minimization. As we will see, we can achieve differentiability at the cost of getting side conditions. In the following we write sup p 1 for the exact sup p C 0 () d p 1. Let us start with the exact formulation of the TV regularization: inf u BV () [ ] λ (u f) 2 dx + sup u p dx. (3.12) 2 p 1 Bounded sets in BV () are weak* compact [AFP00, Proposition 3.13] and due to [ET76, Corollary 2.2], TV is lower semi-continuous with respect to the weak*- topology. Hence J(u) attains a minimum in BV [Zei85], which is a unique one (cf. Theorem 2.2) and (3.12) can be rewritten as min u BV () [ ] λ (u f) 2 dx + sup u p dx. (3.13) 2 p 1 To allow a consideration in L2 we set J(u) if u BV () J(u) = if u L2() \ BV (). Setting A = L 2 () and B = {p H div () p 1} we obtain min u A [ λ sup p B 2 as an equivalent form of (3.13). (u f) 2 dx + } {{ } :=L(u,p) ] u p dx. (3.14) Lemma 3.2. With A, B and L(u, p) defined as above, the following four conditions hold: A and B are convex, closed and non-empty, (3.15) u A p L(u, p) is concave and upper semi-continuous, (3.16) p B u L(u, p) is convex and lower semi-continuous, (3.17) p 0 B such that lim u A, u 2 L(u, p) = + (coercivity). (3.18) Proof. (3.15) is obviously fulfilled for A as well as the convexity ( is convex) 19
27 Chapter 3: Total Variation Regularization and the ROF Model and the non-emptiness of B. The closedness of B can be seen as follows: For p k B let the sequence p k converge to p in H div. Then p k p in L 2 and hence p k converges pointwise to p almost everywhere (a.e.) and due to p k 1: p(x) = Therefore p 1 and hence p B. lim p k(x) 1 a.e. p k (x) p(x) L(u, p) is linear in p and therefore (3.16) holds. We have shown the convexity of λ (u 2 f)2 in Lemma 3.1 and the lower semi continuity is given by Remark I.2.2 [ET76]. Together with the fact that u p is linear in u this yields (3.17). To obtain (3.18) we choose p 0 = 0. Theorem 3.1. With A, B and L(u, p) defined as in (3.14) and the line before we have min u A sup p B L(u, p) = max p B min L(u, p). u A Proof. Due to Lemma 3.2 all assumptions for [ET76, Proposition 2.3, p. 175] are fulfilled and we achieve: min u A sup p B L(u, p) = sup p B Let us take a look at the righthand side of (3.19): sup p B inf u A inf L(u, p). (3.19) u A [ ] λ (u f) 2 dx + u p dx. 2 The first order optimality condition for the infimum leads to λ(u f) + p = 0 u = f 1 p. (3.20) λ Since f and p are in L 2 we have u L 2 and due to the strict convexity of L(u, p), u defined by (3.20) provides a unique minimum in L 2. 20
28 Chapter 3: Total Variation Regularization and the ROF Model Reinserting into the righthand side of (3.19) yields: [ 1 sup ( p) 2 dx + (f 1 ] p B 2λ λ p) p dx [ 1 = sup ( p) 2 dx + f p dx 1 ] ( p) 2 dx p B 2λ λ = sup [ 1λ p) p B ( 2 dx + 2 ] f p dx. 2 λ By adding the constant term f 2 (which does not affect the supremum) and completing the square, we achieve [ sup ( 1 ] p B λ p f)2. (3.21) Instead of computing the supremum, we can minimize the negative term and obtain [ ] inf ( 1 p p B λ f)2 = inf 1 p f 2. (3.22) p B λ 2 } {{ } =:G(p) Now let us consider the sublevel set S := {p B 1 p λ f 2 2 f 2 2 }. There 1 λ p f 2 f 2 holds and with the triangle inequality we have p 2 2λ f 2. Furthermore p 1 implies p 2 and we obtain p Hdiv = p p 2 2 2λ f +. This gives us the boundedness of the sublevel set S in H div (and obviously in L ) and with the theorem of Banach-Alaoglu, this implies the weak- compactness of the sublevel set in H div and weak*- compactness in L. Now let p k, p k 1, be a sequence with 1 p λ k f 2 inf 1 p f 2. k p λ Then for k sufficiently large, p k lies in the weak*- compact sublevel set S and hence 21
29 Chapter 3: Total Variation Regularization and the ROF Model there exists a subsequence of p k, again denoted by p k, with p k p in H div (3.23) p k p in L. (3.24) From [ET76, Corollary I.2.2] G, as defined in (3.22), is lower semi-continuous on S for the weak topology of H div and hence: G(p) lim inf k i.e. p is a solution of (3.22). Also due to (3.24), is fulfilled (cf. proof of Lemma 3.2). Summarizing we have shown G(p k) = inf p p = lim inf k p k 1 1 λ p f 2, sup p B inf u A L(u, p) = max p B and together with (3.19) this proves the assertion. min L(u, p) u A Due to Theorem 3.1 we may consider the so called dual problem min 1 p B λ p f 2 2 (3.25) and after solving (3.25) for p we obtain the solution for the primal variable u from (3.20). An alternative derivation using the convex conjugate (see Definition 2.6) is presented in [Cha04]. In the following we give a brief summary of this approach, adapted to our problem: We have stated the Euler equation for the TV- Regularization (3.6) in (3.7) as 0 λ(u f) + u TV 22
30 Chapter 3: Total Variation Regularization and the ROF Model which is equivalent to Lem. 2.2 λ(f u) u TV u Jr (λ(f u)) 0 u + Jr (λ(f u)) 0 λu + λf λf + λ J r (λ(f u)) 0 λ(f u) λf + λ Jr (λ(f u)). This implies that w = λ(u f) is a minimum of 1 (w λf) 2 + λ Jr 2 (w). Therefore (recalling the definition of Jr as stated in Remark 3.1) w is the projection of λf to K = { p p C0 ()d, p 1}, i.e w = Π K (λf). Since w = λ(u f) we achieve: u = f w λ = f Π K(λf) λ = f Π 1 λ K (f). Computing this nonlinear projection amounts exactly in solving (3.25). 3.3 Primal-Dual Formulation Another approach aims at solving directly the saddle point problem for u and p, given by the exact formulation of the TV regularization: inf sup u BV () p 1 [ ] λ (u f) 2 dx + u p dx. 2 } {{ } =:L(u,p) For a saddle point we achieve the following optimality conditions L u (3.26) = λ(u f) + p = 0 (3.27) 23
31 Chapter 3: Total Variation Regularization and the ROF Model and L(u, p) L(u, q) q, q 1, (3.28) where (3.28) can be rewritten as u (p q) 0 q, q 1. (3.29) Hereafter we will see that the conditions (3.27) and (3.28) imply the optimality condition (3.7). Lemma 3.3. (3.29) implies that p u TV. Proof. Let w BV () be arbitrary. We have u (p q) 0 q, q 1. Especially this inequality holds for the supremum of q: sup u (p q) 0, (3.30) q 1 which implies sup u (q p) 0. (3.31) q 1 Obviously we have sup q 1 w q w q q, q 1. (3.32) In particular (3.32) is true for q = p and with (3.31) we achieve sup q 1 sup q 1 w q sup q 1 u (q p) + w p } {{ } 0 w q sup u q u p + q 1 w p w TV u TV + p, w u. (3.33) Since w was chosen arbitrarily (3.33) is equivalent to p u TV 2.2). (see Def. 24
32 Chapter 3: Total Variation Regularization and the ROF Model Alternatively (3.29) implies that u χ(p) with 0 if p 1 χ(p) := else. This follows from Lemma 2.2 and the fact that χ(p) = J (p) for J = TV. 25
Chapter 4 Solution Methods

4.1 Primal Methods

For the primal formulation of the TV regularization (3.8) there already exist several numerical solution methods. Most of them aim at solving the associated Euler-Lagrange equation (3.9). In this section we will give a brief overview, without any claim of completeness.

4.1.1 Steepest Descent

Rudin et al. proposed in their original paper [ROF92] an artificial time marching scheme to solve the Euler-Lagrange equation (3.9). Considering the image $u$ as a function of space and time, they seek the steady state of the parabolic equation
$$u_t = \nabla\!\cdot\Big(\frac{\nabla u}{|\nabla u|_\beta}\Big) - \lambda(u - f)$$
with initial condition $u_0 = f$ at time $t = 0$. Here $|\cdot|_\beta$ denotes the perturbed norm as introduced in (3.10). Using an explicit forward Euler scheme for the time discretization, one usually achieves slow convergence due to the Courant-Friedrichs-Lewy (CFL) condition, especially in regions where $|\nabla u|$ is small.

4.1.2 Fixed Point Iteration

In [VO96] Vogel and Oman suggested to use a lagged diffusivity fixed point iteration scheme to solve the Euler-Lagrange equation (3.9) directly,
$$\lambda(u^{k+1} - f) - \nabla\!\cdot\Big(\frac{\nabla u^{k+1}}{|\nabla u^k|_\beta}\Big) = 0,$$
leading to the linear system
$$\Big(\lambda - \nabla\!\cdot\Big(\frac{\nabla(\cdot)}{|\nabla u^k|_\beta}\Big)\Big)\, u^{k+1} = \lambda f$$
in each iteration $k$. In spite of only linear convergence, one obtains good results after a few iterations.

4.1.3 Newton's Method

Vogel and Oman (cf. [VO96]) as well as Chan, Chan and Zhou (cf. [CZC95]) proposed to apply Newton's method,
$$u^{k+1} = u^k - H_\phi^{-1}(u^k)\,\phi(u^k),$$
with
$$\phi(u) := \lambda(u - f) - \nabla\!\cdot\Big(\frac{\nabla u}{|\nabla u|_\beta}\Big)$$
being the gradient of $J$ and $H_\phi(u)$ its Hessian, given by
$$H_\phi(u) = \lambda - \nabla\!\cdot\Big(\frac{1}{|\nabla u|_\beta}\Big(I - \frac{\nabla u\,\nabla u^T}{|\nabla u|_\beta^2}\Big)\nabla(\cdot)\Big).$$
So in each step one has to solve
$$H_\phi(u^k)\,\delta u = -\phi(u^k) \tag{4.1}$$
and perform the update $u^{k+1} = u^k + \delta u$. We have locally quadratic convergence, but especially in the case where $\beta$ is small the domain of convergence turns out to be small. So alternatively one can use a continuation procedure for $\beta$, i.e. start with a large value (for which (4.1) is well defined) and successively decrease it to the desired value (cf. [CZC95]).

4.2 Dual Methods

In [Cha04] Chambolle presents a duality based algorithm. It amounts to solving the problem
$$\min_{|p_{i,j}| \le 1,\ i,j = 1,\dots,n}\ \Big\|\tfrac{1}{\lambda}\,\nabla\!\cdot p - f\Big\|^2, \tag{4.2}$$
which is just the discrete version of (3.25). Here the discrete divergence is given by
$$(\nabla\!\cdot p)_{i,j} = \begin{cases} p^1_{i,j} - p^1_{i-1,j} & \text{if } 1 < i < n \\ p^1_{i,j} & \text{if } i = 1 \\ -p^1_{i-1,j} & \text{if } i = n \end{cases} \;+\; \begin{cases} p^2_{i,j} - p^2_{i,j-1} & \text{if } 1 < j < n \\ p^2_{i,j} & \text{if } j = 1 \\ -p^2_{i,j-1} & \text{if } j = n. \end{cases} \tag{4.3}$$
The Karush-Kuhn-Tucker conditions (cf. [Roc70], Theorem 28.3) yield the existence of Lagrange multipliers $\alpha_{i,j}$ (the index indicates the affinity to each constraint in (4.2)) with
$$-\Big(\nabla\big(\tfrac{1}{\lambda}\nabla\!\cdot p - g\big)\Big)_{i,j} + \alpha_{i,j}\, p_{i,j} = 0, \tag{4.4}$$
$\alpha_{i,j} \ge 0$ and $\alpha_{i,j}(|p_{i,j}|^2 - 1) = 0$. Hence, either $\alpha_{i,j} > 0$ and $|p_{i,j}| = 1$, or $|p_{i,j}| < 1$ and $\alpha_{i,j} = 0$. In both cases this leads to
$$\alpha_{i,j} = \Big|\Big(\nabla\big(\tfrac{1}{\lambda}\nabla\!\cdot p - g\big)\Big)_{i,j}\Big|.$$
Note that $\big(\nabla(\tfrac{1}{\lambda}\nabla\!\cdot p - g)\big)_{i,j} = 0$ for $\alpha_{i,j} = 0$. Thus (4.4) can be solved by a fixed point iteration
$$p^{n+1}_{i,j} = p^n_{i,j} + \tau\Big( \big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j} - \big|\big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j}\big|\, p^{n+1}_{i,j} \Big)$$
with initial value $p^0 = 0$ and $\tau > 0$. Rewriting leads to the following projection algorithm:
$$p^{n+1}_{i,j} = \frac{p^n_{i,j} + \tau \big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j}}{1 + \tau \big|\big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j}\big|}.$$
Convergence is given for $\tau \le \tfrac{1}{8}$, although in practice the optimal choice for $\tau$ appears to be $\tfrac{1}{4}$. Alternatively one can apply a simpler projection:
$$p^{n+1}_{i,j} = \frac{p^n_{i,j} + \tau \big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j}}{\max\big\{1,\ \big|p^n_{i,j} + \tau \big(\nabla(\nabla\!\cdot p^n - \lambda g)\big)_{i,j}\big|\big\}}.$$
This algorithm simply projects $p^n$ back to the unit ball if the constraint $|p^n| \le 1$ is violated. Stability is ensured up to $\tau \le \tfrac{1}{4}$ (cf. [Cha05]) and in practice the algorithm also converges for that choice of $\tau$.
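As an illustration of the projection algorithm described above, the following MATLAB sketch (not the thesis implementation) applies the simpler reprojection variant with the discrete gradient and divergence of (4.3); for simplicity both components of p are stored as n-by-n arrays whose last row resp. column is zero. The choice τ = 1/4 follows the stability remark above; the function and variable names are illustrative.

```matlab
% Sketch of Chambolle's projection iteration for the dual ROF problem.
% usage (illustrative): u = tv_denoise_dual(f_noisy, 10, 0.25, 200);
function u = tv_denoise_dual(g, lambda, tau, iters)
    [n1, n2] = size(g);
    px = zeros(n1, n2);  py = zeros(n1, n2);           % dual variable p = (p^1, p^2)
    for it = 1:iters
        [gx, gy] = grad2d(div2d(px, py) - lambda*g);   % gradient of (div p - lambda*g)
        qx  = px + tau*gx;   qy = py + tau*gy;
        nrm = max(1, sqrt(qx.^2 + qy.^2));             % reproject onto |p_{i,j}| <= 1
        px  = qx ./ nrm;     py = qy ./ nrm;
    end
    u = g - div2d(px, py) / lambda;                    % primal solution, cf. (3.20)
end

function [px, py] = grad2d(u)   % forward differences, zero in the last row/column, cf. (7.2)
    px = [diff(u, 1, 1); zeros(1, size(u, 2))];
    py = [diff(u, 1, 2), zeros(size(u, 1), 1)];
end

function d = div2d(px, py)      % negative adjoint of grad2d, cf. (4.3)
    d = [px(1, :); px(2:end, :) - px(1:end-1, :)] + ...
        [py(:, 1), py(:, 2:end) - py(:, 1:end-1)];
end
```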
36 Chapter 4: Solution Methods 4.3 Primal-dual Methods For solving the primal-dual formulation we will use Newton s method with damping. Before we will go into detail we have to approximate the constraint for p. p 1 in (3.26) can be stated by adding the characteristic function χ, defined by 0 if p 1 χ(p) := else to L(u, p): [ ] λ inf sup (u f) 2 dx + u p dx + χ(p). (4.5) u BV () p 2 Under the aspect that χ is not differentiable (actually it is not a function in the classical sense), we use an approximation instead. For this purpose we will represent two different techniques in the next subsection Barrier and Penalty Method In the following we specify two alternatives for the approximation. Instead of using the exact formulation with χ as in (4.5) we replace L(u, p) by L ε (u, p) = L(u, p) 1 F( p 1) (4.6) ε with ε > 0 small and a term F penalizing if p 1 > 0. A typical example for F is: F(s) = 1 2 max{s, 0}2. (4.7) The so called Penalty approximation still allows violations of the constraint, alternatively barrier methods (also called interior-point methods ) can be used. Their idea is to add a continuous barrier term G(p) to L such that G(p) =, if the constraint is violated. Since the constraint p 1 is equivalent to p 2 1 we may replace p by its square to achieve differentiability: L ε (u, p) = L(u, p) εg( p 2 1). (4.8) For example one can choose G(s) = log( s). The choice of approximation effects the shape of the solution u of (4.5) which can be seen well in the one dimensional case. Therefore we want to solve the saddle point 29
37 Chapter 4: Solution Methods problem (3.26) with either of the two methods on the given domain = [ 1, 1]. Example 4.1. (Penalty approximation) Using Penalty approximation as introduced in (4.6), with F as defined in (4.7), we achieve the saddle point problem inf u BV () [ λ sup p 2 (u f) 2 dx + } {{ } :=L P (u,p) Therefore we have the following optimality conditions up dx 1 ε max{ p 1, 0}2 ]. L P u = λ(u f) + p = 0 L P p = u g = 0, with g defined by This leads to 1( p 1)sign(p) if p 1 ε g := 0 else. 1 (1 p )sign(p) if p 1 u ε = 0 else (4.9) Example 4.2. (Barrier approximation) Using Barrier approximation as introduced in (4.8), with G(s) = log( s), we achieve the saddle point problem inf u BV () ] (u f) 2 dx + up dx + ε log( ( p 2 1)) } {{ } :=L B (u,p) [ λ sup p 2 with optimality conditions L B u = λ(u f) + p = 0 (4.10) L B p = u + ε 2p p 2 1 = 0, (4.11) 30
38 Chapter 4: Solution Methods leading to u 2p = ε p 2 1. (4.12) We can see that in the case of Penalty approximation (Example 4.1) we have u = 0 for p 1. But we cannot really achieve u = ± since 1 ( p 1) sgn(p) ± is ε only fulfilled for p and this was prevented by the penalty term. This leads to the TV regularization typical stair casing effect but smoothed edges of order 1 ε (Fig. 4.1(a)). In the case of the Barrier method (Example 4.2) u = 0 is only possible at p = 0. However, we have p = 0 on an interval [a, b] [ 1, 1] only if f = c on [a, b] with c constant: p = 0 on [a, b] p = 0 on [a, b] (4.10) λ(u f) = 0 on [a, b] f = u on [a, b] u =0 f = c on [a, b] which is very unlikely for the noisy data f. Furthermore, for a minimum the Barrier term could be bounded by a constant c, i.e. ε log( ( p 2 1)) c (4.13) p 2 1 e c ε. (4.14) Hence p 2 1 might behave like e c ε in some points in which u ε e c ε = εe c ε (4.15) holds. As we can see the gradient of u can possibly take very large values already for a large ε (e.g ε = 0.1). Therefore we obtain sharp edges but no homogeneous areas (Fig. 4.1(b)). Remark 4.1. Note that if ε is small enough (i.e. smaller than the step size) this effect does not occur anymore (see Fig. 4.2). 31
Figure 4.1: Example for the barrier and the penalty method in the one-dimensional case, with step size $h = 10^{-3}$: (a) penalty method, (b) barrier method.
Figure 4.2: Example for the barrier and the penalty method in the one-dimensional case, with step size $h = 10^{-3}$ and a smaller $\varepsilon$: (a) penalty method, (b) barrier method.

4.3.2 Newton Method with Damping

Using the penalty approximation as introduced before, we obtain the following optimality conditions for our problem:
$$\frac{\partial L_\varepsilon}{\partial u} = \lambda(u - f) + \nabla\!\cdot p = 0,$$
$$\frac{\partial L_\varepsilon}{\partial p} = \nabla u - \frac{2}{\varepsilon} H(p) = 0,$$
with $H(p)$ being the derivative of $F(|p| - 1)$. We linearize the nonlinear term $H(p)$ via a first-order Taylor approximation, i.e.
$$H(p^{k+1}) \approx H(p^k) + H'(p^k)(p^{k+1} - p^k).$$
Adding a damping term, we have to solve the following linear system in each step:
$$\lambda(u^{k+1} - f) + \nabla\!\cdot p^{k+1} = 0,$$
$$\nabla u^{k+1} - \frac{2}{\varepsilon} H'(p^k)(p^{k+1} - p^k) - \frac{2}{\varepsilon} H(p^k) - \tau_k (p^{k+1} - p^k) = 0. \tag{4.16}$$
Here the parameter $\tau_k$ controls the damping. This linear system can be discretized easily and is the basis of our parallel algorithm. To achieve fast convergence we choose $\varepsilon = \varepsilon_k \to 0$ during the iteration. For better performance of the algorithm it is recommended to start with a small value of $\tau$ and to increase it during the iteration to avoid oscillations. The starting values of $\tau$ and $\varepsilon$ we have used, as well as their adaptation process, were chosen from some experimental runs of the algorithm. Certainly further research is needed to find optimal values for these parameters.
Chapter 5 Domain Decomposition

As mentioned in the introduction, one would like to divide the original problem into several subproblems in order to solve them in parallel. One idea is to split the given domain $\Omega$ of the problem into subdomains $\Omega_i$, $i = 1, \dots, S$. This approach is called domain decomposition, and depending on the choice of subdomains one obtains overlapping or non overlapping decompositions. In the case that all unknowns of the problem are coupled, a straightforward splitting and independent computation on each subdomain results in significant errors across the interfaces.

5.1 Non Overlapping Decomposition

Let $\Omega \subset \mathbb{R}^d$. We split $\Omega$ into $S$ subregions $\Omega_i$ such that
$$\bigcup_{i=1}^S \overline{\Omega}_i = \overline{\Omega} \quad \text{with} \quad \Omega_i \cap \Omega_j = \emptyset \ \text{for } i \neq j.$$
For a better understanding let us restrict ourselves to a decomposition into two subdomains in the two-dimensional case (cf. Fig. 5.1). As an example let us consider the Poisson equation with homogeneous Dirichlet boundary conditions.

Example 5.1. Let $\Omega = [-1, 1]^2 \subset \mathbb{R}^2$ and
$$-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega.$$
Then, with $u_i$ being the restriction of $u$ to $\Omega_i$ and $n_i$ the outward normal to $\Omega_i$
($i = 1, 2$), this is equivalent to (cf. [TW05]):
$$-\Delta u_1 = f \ \text{in } \Omega_1, \qquad u_1 = 0 \ \text{on } \partial\Omega_1 \setminus \Gamma,$$
$$-\Delta u_2 = f \ \text{in } \Omega_2, \qquad u_2 = 0 \ \text{on } \partial\Omega_2 \setminus \Gamma,$$
$$u_1 = u_2 \ \text{on } \Gamma, \qquad \frac{\partial u_1}{\partial n_1} = -\frac{\partial u_2}{\partial n_2} \ \text{on } \Gamma.$$
Figure 5.1: Non overlapping decomposition with $S = 2$ and $d = 2$. Here $\Gamma := \partial\Omega_1 \cap \partial\Omega_2$ denotes the interface between the subdomains.

As one can see, there are conditions on the interface $\Gamma$, so-called transmission conditions. If they are neglected (as is the case for an ad hoc approach), artifacts may arise at the interface (see Fig. 5.2 for an example). There are some algorithms to avoid this issue (e.g. the Dirichlet-Neumann algorithm or the Neumann-Neumann algorithm). We limit ourselves to a more detailed consideration of overlapping domain decompositions; for further information see [TW05].

5.2 Overlapping Decomposition

To avoid the computation of the transmission conditions needed in the case of non-overlapping methods, one can apply overlapping partitions. At the cost of having redundant degrees of freedom, and thus larger systems to solve, the update of the boundary data can be easily obtained from exactly this redundancy. Expanding the $\Omega_i$ from
the previous section to $\Omega_i'$ such that
$$d\big(\partial\Omega_i' \cap \Omega_j',\ \partial\Omega_j' \cap \Omega_i'\big) \ge \delta \quad \text{for } i \neq j \text{ and } \Omega_i' \cap \Omega_j' \neq \emptyset,$$
whereby $\Omega_i'$ is truncated at the boundary of $\Omega$, we obtain an overlapping domain decomposition. In the case of $\Omega$ being a uniform lattice with step size $h$, $\delta$ is given by $\delta = mh$ with $m \in \mathbb{N}$.

Figure 5.2: Poisson equation with $f = 3x^2$ on $\Omega = [-1,1]^2$ and homogeneous Dirichlet boundary conditions: (a) solution without decomposition, (b) solution with a $1 \times 2$ decomposition. Neglecting the transmission conditions results in artifacts at the interface (here $x = 0$).
Figure 5.3: Overlapping decomposition with $S = 2$ and $d = 2$. Here $\Gamma_1 := \partial\Omega_1' \cap \Omega_2'$ and $\Gamma_2 := \partial\Omega_2' \cap \Omega_1'$ denote the interfaces between the subdomains, and $\delta$ is the width of the overlap.
5.3 Schwarz Iteration

One of the first approaches to domain decomposition was the multiplicative Schwarz method, introduced in 1870 by H. A. Schwarz [Sch70]. The proof of convergence can be obtained via a maximum principle (see, e.g., [Lio88]). A similar formulation leads to the additive Schwarz method, and as we will see there exists an affinity to well-known techniques for solving linear equation systems.

Figure 5.4: Poisson equation with $f = 3x^2$ on $\Omega = [-1,1]^2$ and homogeneous Dirichlet boundary conditions, after 1, 2, 6 and 28 iterations of the multiplicative Schwarz method. The step size is $h = 1/32$.

5.3.1 Multiplicative Schwarz Method

The multiplicative Schwarz algorithm consists of two fractional steps. Let $u^{(0)}$ be an initial function; then we subsequently solve
$$L u_1^{(k+1)} = f \ \text{in } \Omega_1, \qquad u_1^{(k+1)} = u^{(k)}\big|_{\Gamma_1} \ \text{on } \Gamma_1, \qquad u_1^{(k+1)} = 0 \ \text{on } \partial\Omega_1 \setminus \Gamma_1$$
and
$$L u_2^{(k+1)} = f \ \text{in } \Omega_2, \qquad u_2^{(k+1)} = u_1^{(k+1)}\big|_{\Gamma_2} \ \text{on } \Gamma_2, \qquad u_2^{(k+1)} = 0 \ \text{on } \partial\Omega_2 \setminus \Gamma_2.$$
The next iterate is then defined by
$$u^{(k+1)}(x) = \begin{cases} u_2^{(k+1)}(x) & \text{if } x \in \Omega_2, \\ u_1^{(k+1)}(x) & \text{if } x \in \Omega \setminus \Omega_2. \end{cases}$$
For an example let us look again at the Poisson equation (cf. Example 5.1).

Example 5.2. Let $\Omega$ be a uniform lattice with step size $h$ on $[-1, 1]^2 \subset \mathbb{R}^2$ and $\Omega_1, \Omega_2$ a decomposition of $\Omega$ as in Figure 5.3. Choosing an overlap of $\delta = 2h$ and applying the multiplicative Schwarz algorithm provides a good approximation of the solution after a few iterations (see Fig. 5.4).

The multiplicative Schwarz method is related to the well-known Gauss-Seidel method, and at first sight this approach does not seem convenient for a parallel implementation, due to the need of $u_1^{(k+1)}$ for the computation of $u_2^{(k+1)}$. But dividing the domain into several subdomains and painting them in two colors (say black and white), such that subdomains of the same color do not overlap, allows a parallel computation. An easy example of such a colored division is shown in Figure 5.5. Solving first on all black domains can be done in parallel and provides the boundary conditions for the white domains, on which the solution can afterwards also be computed in parallel. In the realization of the parallelization one would assign a domain of each color to each processor. Note that in the case of a two-dimensional domain and a splitting in each direction one would need four colors to obtain a decomposition as mentioned above (see Fig. 5.6).

Figure 5.5: Coloring with two colors in the case of $1 \times 4$ subdomains. The shaded areas indicate the overlapping.
Figure 5.6: Coloring with four colors in the case of $4 \times 4$ subdomains (the overlap is not indicated).
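For illustration, the following MATLAB sketch (not from the thesis) carries out the multiplicative Schwarz iteration for a one-dimensional analogue of Example 5.2: the Poisson problem with homogeneous Dirichlet conditions on (-1, 1), discretized by finite differences and split into two overlapping index sets. The subdomain sizes, the right-hand side and the iteration count are illustrative choices.

```matlab
% Multiplicative Schwarz iteration for -u'' = f on (-1,1), u(-1) = u(1) = 0.
m  = 63;  h = 2/(m+1);  x = (-1+h : h : 1-h)';        % interior grid points
f  = 3*x.^2;                                          % right-hand side (1D analogue of Example 5.2)
A  = spdiags([-ones(m,1) 2*ones(m,1) -ones(m,1)], -1:1, m, m) / h^2;
i1 = 1:40;  i2 = 25:m;                                % two overlapping index sets (overlap 16 points)
u  = zeros(m,1);                                      % initial guess u^(0)

for k = 1:30
    % solve on Omega_1, taking Dirichlet data on Gamma_1 from the current iterate
    u(i1) = A(i1,i1) \ (f(i1) - A(i1,:)*u + A(i1,i1)*u(i1));
    % solve on Omega_2, using the freshly computed values on Omega_1 (Gauss-Seidel-like)
    u(i2) = A(i2,i2) \ (f(i2) - A(i2,:)*u + A(i2,i2)*u(i2));
end
plot(x, u);   % converges to the global finite-difference solution A \ f
```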
5.3.2 Additive Schwarz Method

Alternatively one can use the additive Schwarz method, which can be parallelized directly:
$$L u_1^{(k+1)} = f \ \text{in } \Omega_1, \qquad u_1^{(k+1)} = u^{(k)}\big|_{\Gamma_1} \ \text{on } \Gamma_1, \qquad u_1^{(k+1)} = 0 \ \text{on } \partial\Omega_1 \setminus \Gamma_1,$$
$$L u_2^{(k+1)} = f \ \text{in } \Omega_2, \qquad u_2^{(k+1)} = u^{(k)}\big|_{\Gamma_2} \ \text{on } \Gamma_2, \qquad u_2^{(k+1)} = 0 \ \text{on } \partial\Omega_2 \setminus \Gamma_2.$$
The next iterate is computed by
$$u^{(k+1)}(x) = \begin{cases} u_1^{(k+1)}(x) & \text{if } x \in \Omega \setminus \Omega_2, \\ u_2^{(k+1)}(x) & \text{if } x \in \Omega \setminus \Omega_1, \\ \tfrac{1}{2}\big(u_1^{(k+1)}(x) + u_2^{(k+1)}(x)\big) & \text{if } x \in \Omega_1 \cap \Omega_2. \end{cases}$$
As we can see, there are no dependencies between the subdomains within one iteration. Hence all subdomains can be assigned to different processors and computed in parallel without further modification; a coloring as used by the multiplicative Schwarz method is not necessary. Since in the $(k+1)$-th iteration the boundary values are taken from the $k$-th step, this approach is akin to the well-known Jacobi method.

5.4 Application

Using domain decomposition methods for image processing was inspired by an approach of M. Fornasier and C.-B. Schönlieb (cf. [FS07], [For07]). In analogy to the multiplicative Schwarz algorithm they present a subspace correction method to solve the minimization problem (3.6), based on the iteration procedure
$$u_1^{(k+1)} \in \arg\min_{v_1 \in V_1} J(v_1 + u_2^{(k)}), \qquad u_2^{(k+1)} \in \arg\min_{v_2 \in V_2} J(u_1^{(k+1)} + v_2), \qquad u^{(k+1)} := u_1^{(k+1)} + u_2^{(k+1)},$$
with initial condition $u^{(0)} = u_1^{(0)} + u_2^{(0)} \in V_1 \oplus V_2$ and $V_1, V_2$ a splitting of the original space into two orthogonal subspaces. The subspace minimizations are solved by oblique thresholding, where the projection is computed by the algorithm proposed by Chambolle [Cha04] (see also Section 4.2). They also provide a modification for parallel computation. Under appropriate assumptions, convergence of the algorithm to the minimum of $J$ is ensured. But in general these conditions are not fulfilled and one only achieves
47 Chapter 5: Domain Decomposition convergence to a point where J is smaller then the initial choice. However the numerical results are still promising. The algorithm requires the computation of a fixed point η, which can be restricted to a small strip around the interface. Unfortunately the width of the strip is dependent on the parameter λ, in particular for λ increasing (i.e. stronger smoothing) the strip size decreases. Since the primal-dual Newton method as introduced in Section yields in solving a linear system, we can apply the Schwarz approach directly (see Section 7.3) and thus only need a overlap of one pixel, independent of the choice of λ. 40
Chapter 6 Basic Parallel Principles

Traditionally, software has been written for sequential computation, which means that the program runs on a single computer having a single CPU (Central Processing Unit), whereas parallelization allows running programs on multiple CPUs. The computing resource for a parallel computation can be a single computer with multiple CPUs, a network of computers with single CPUs, or a hybrid of both. The aim of parallelization is to save computational time. More importantly, since some computations are limited by their memory requirements, they become computable at all only by dividing the given data among several processors.

Once we have decomposed our domain into several subdomains we want to distribute them to multiple tasks. Our choice of a parallel programming model is the message passing model. It can be used on shared memory machines, distributed memory machines as well as on hybrid architectures. Since modern computers, like our test system ZIVHP (see Section 8.2 for more details), employ both shared and distributed memory architectures, this is of great importance. For the implementation of our algorithms we have used MATLAB (The MathWorks). In order to make the Message Passing Interface (MPI) available in MATLAB, we additionally used MatlabMPI, provided by the Lincoln Laboratory of the Massachusetts Institute of Technology (MIT). In this chapter we want to state what MPI is and why we have used it for our problem. First we want to give a brief introduction to MPI.

6.1 MPI

The Message Passing Interface (MPI) standard in its first version was introduced in 1994 by the MPI Forum. It supplies a set of C, C++ and Fortran functions for writing parallel programs using explicit communication between the tasks. At the time of writing,
MPI is available in version 2. The actual number of processes used is declared at startup. Processes that should communicate with each other are grouped in so-called communicators, where a priori all processes belong to a predefined communicator called MPI_COMM_WORLD, which is the only one we will use. All processes are numbered increasingly starting at 0 (their rank), which can be obtained during runtime with the function MPI_Comm_rank. Analogously, the size of the communicator (i.e. the number of all processes) can be obtained via MPI_Comm_size. Point-to-point communication (e.g. MPI_Recv, MPI_Send) as well as collective operations (e.g. MPI_Bcast) are available. Since we only use MatlabMPI, where just a few of the communication methods are implemented, we restrict ourselves to a detailed consideration of these methods.

6.1.1 MatlabMPI

MatlabMPI provides a set of MATLAB scripts implementing some of the essential MPI routines, in particular MPI_Send, MPI_Recv and MPI_Bcast. One main difference to MPI is the fact that MatlabMPI uses the standard MATLAB file I/O, i.e. the message buffers are saved in .mat files. The MatlabMPI routines are structured as follows:

MPI_Send(dest, tag, comm, var1, var2, ...)
  dest: rank of the destination task
  tag: unique tag for the communication
  comm: an MPI communicator (typically MPI_COMM_WORLD)
  var1, var2, ...: variables to be sent

[var1, var2, ...] = MPI_Recv(source, tag, comm)
  source: rank of the source task
  tag: unique tag for the communication
  comm: an MPI communicator (typically MPI_COMM_WORLD)
  var1, var2, ...: variables to be received

Note that MPI_Send is non-blocking in the sense that the next statement can be executed immediately after the message has been saved in the .mat file, whereas MPI_Recv is blocking, i.e. the program is suspended until the message has been received.
To make MatlabMPI available in MATLAB, the src folder has to be added to the path definitions, as well as the folder the m-file is started from. One has to ensure that the path definitions are set a priori at each start of a MATLAB session. A script using MatlabMPI can be run as follows:

eval(MPI_Run(m-file, processes, machines));

where m-file is the script to be started, processes the number of processes, and machines the machines that should be used. For running on local processors, machines is set to {}; otherwise it contains the list of nodes, which are then connected via SSH (resp. RSH). Before re-running a script, all files created by MatlabMPI have to be removed using the MatMPI_Delete_all command. For more detailed information see also the README files or the introduction given on the MatlabMPI homepage.

6.2 Speedup

To measure how much faster a parallel algorithm on $p$ processors is in comparison to its corresponding sequential version, one can look at the speedup, defined by
$$S_p := \frac{T_1}{T_p}.$$
Here $T_1$ denotes the execution time of the sequential algorithm and $T_p$ the time needed on $p$ processors. One says that the speedup is linear or ideal if $S_p = p$. In most cases the speedup is lower than linear ($S_p < p$), which results from the overhead arising in parallel programming. The overhead occurs, e.g., due to the communication effort between the processors, extra redundant computations, or a changed algorithm. Also, contrary to first impressions, a superlinear speedup ($S_p > p$) is possible. A main reason for superlinear speedup is the cache effect: due to the smaller problem size on each CPU, the cache swapping is reduced and the memory access time decreases sharply.
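As a minimal usage sketch (not from the thesis), the following script lets every process report its rank and sends a matrix from rank 0 to rank 1 using the routines described in Section 6.1.1; MPI_Init and MPI_Finalize are assumed to be the usual MatlabMPI initialization and cleanup scripts. It would be launched, for example, with eval(MPI_Run('pingpong', 2, {})) after a call to MatMPI_Delete_all.

```matlab
% pingpong.m: minimal MatlabMPI example (illustrative sketch).
MPI_Init;                                    % initialize MatlabMPI (assumed standard script)
comm    = MPI_COMM_WORLD;                    % predefined communicator
my_rank = MPI_Comm_rank(comm);               % rank of this process
n_procs = MPI_Comm_size(comm);               % total number of processes
disp(['process ' num2str(my_rank) ' of ' num2str(n_procs)]);

tag = 1;                                     % unique tag for this communication
if my_rank == 0
    A = rand(256);                           % some data to send
    MPI_Send(1, tag, comm, A);               % non-blocking send to rank 1
elseif my_rank == 1
    A = MPI_Recv(0, tag, comm);              % blocking receive from rank 0
    disp(['rank 1 received a ' mat2str(size(A)) ' array']);
end
MPI_Finalize;                                % clean up (assumed standard script)
```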
Chapter 7 Numerical Realization

In this chapter we specify the numerical realization of the primal-dual Newton method with damping as introduced in Section 4.3.2. Since we apply the anisotropic total variation (cf. (3.5)), we make use of a slightly changed penalty term $F$:
$$F(p^1, p^2) = \tfrac{1}{2}\max\{|p^1| - 1, 0\}^2 + \tfrac{1}{2}\max\{|p^2| - 1, 0\}^2,$$
and hence
$$H(p) = \begin{pmatrix} \operatorname{sgn}(p^1)\,(|p^1| - 1)\,1_{\{|p^1| \ge 1\}} \\ \operatorname{sgn}(p^2)\,(|p^2| - 1)\,1_{\{|p^2| \ge 1\}} \end{pmatrix}, \qquad
H'(p) = \begin{pmatrix} 1_{\{|p^1| \ge 1\}} & 0 \\ 0 & 1_{\{|p^2| \ge 1\}} \end{pmatrix}. \tag{7.1}$$
Note that we limit ourselves to the consideration of the two-dimensional case, but an adaptation to higher dimensions can easily be done. Let $f \in [0,1]^{n \times n}$ be a given noisy image. The aim is to find a denoised image $u$ by solving the linear system (4.16). Placing the degrees of freedom of the dual variable $p$ at the centers between the pixels of $u$ allows us to compute the divergence of $p$ (using a one-sided difference quotient) efficiently as a value in each pixel (cf. Fig. 7.1). For our problem this looks as follows. Define a discrete gradient $\nabla u$ by
$$\nabla u := \big((\nabla u)^1, (\nabla u)^2\big) \tag{7.2}$$
with
$$(\nabla u)^1_{i,j} = \begin{cases} u_{i+1,j} - u_{i,j} & \text{if } i < n \\ 0 & \text{if } i = n \end{cases}, \qquad
(\nabla u)^2_{i,j} = \begin{cases} u_{i,j+1} - u_{i,j} & \text{if } j < n \\ 0 & \text{if } j = n \end{cases}$$
for $i, j = 1, \dots, n$.

Figure 7.1: The degrees of freedom of the vector field $p = (p^1, p^2)$ lie between the centers of the pixels of $u$, where $u$ is represented by an $(n \times n)$ matrix. So $p^1$ can be construed as an $((n-1) \times n)$ matrix and $p^2$ as an $(n \times (n-1))$ matrix.

Hence the discrete divergence, as the negative adjoint operator of the gradient, is given by
$$(\nabla\!\cdot p)_{i,j} = \begin{cases} p^1_{i,j} - p^1_{i-1,j} & \text{if } 1 < i < n \\ p^1_{i,j} & \text{if } i = 1 \\ -p^1_{i-1,j} & \text{if } i = n \end{cases} \;+\; \begin{cases} p^2_{i,j} - p^2_{i,j-1} & \text{if } 1 < j < n \\ p^2_{i,j} & \text{if } j = 1 \\ -p^2_{i,j-1} & \text{if } j = n. \end{cases} \tag{7.3}$$

7.1 Sequential Implementation

Having discretized the linear equation system (4.16), we now describe the implementation in detail. We first want to construct a matrix $A$ and a vector $b$ such that, under an appropriate renumbering of $u$ and $p$, the matrix equation
$$Ax = b \tag{7.4}$$
is equivalent to the discretized linear system. For this purpose we write $u$, $p^1$ and $p^2$ column-wise as vectors:
$$u = (u_{11}, \dots, u_{n1}, u_{12}, \dots, u_{n2}, \dots, u_{nn})^T, \quad
p^1 = (p^1_{11}, \dots, p^1_{n-1,1}, p^1_{12}, \dots, p^1_{n-1,n})^T, \quad
p^2 = (p^2_{11}, \dots, p^2_{n1}, p^2_{12}, \dots, p^2_{n,n-1})^T.$$
We define the matrices $D_1$ and $D_2$ which compute the gradient of $u$ resp. the divergence of $p$
We define the matrices D_1 and D_2 to compute the gradient of u resp. the divergence of p by

    D_1 := I_n ⊗ K ∈ R^(N × (N-n))    and    D_2 := K ⊗ I_n ∈ R^(N × (N-n)),

where I_n is the (n × n) identity matrix and K is the (n × (n-1)) bidiagonal matrix with entries +1 on the main diagonal and -1 on the first subdiagonal. Note that the structure of these matrices arises from writing the matrices for u, p^1 and p^2 column-wise as vectors of size N = n^2, resp. N - n. So (D_1 D_2)(p^1; p^2) computes the divergence of p and, recalling that the divergence is the negative adjoint of the gradient, (D_1 D_2)^t u = (D_1^t u; D_2^t u) computes the negative gradient of u.
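In MATLAB® the two matrices can be assembled as sparse Kronecker products. This is a short sketch under the block structure described above; the variable names are ours, and the consistency check in the comments refers to the helper functions from the previous sketch.

    n  = 256;                              % image width, so N = n^2
    e  = ones(n,1);
    K  = spdiags([e -e], [0 -1], n, n-1);  % n x (n-1): +1 on the diagonal, -1 below it
    D1 = kron(speye(n), K);                % N x (N-n), differences in the first index
    D2 = kron(K, speye(n));                % N x (N-n), differences in the second index
    % Consistency check: for column-stacked p1, p2 the vector D1*p1(:) + D2*p2(:)
    % equals the column-stacked divergence (7.3), and -D1'*u(:), -D2'*u(:) equal
    % the column-stacked gradient components (7.2).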
Now we can construct our system matrix A as follows:

    A = [ λ I_N    D_1               D_2             ]
        [ D_1^t    (2/ε) M_1 + τ I   0               ]
        [ D_2^t    0                 (2/ε) M_2 + τ I ],

where the three block rows and columns have sizes N, N-n and N-n, with the ((N-n) × (N-n)) identity matrix I and M_1, M_2 being diagonal matrices with (M_i)_{jj} = (H'(p_j))_{ii} for i = 1,2 and j = 1,...,N-n, i.e. (M_i)_{jj} = 1_{|p^i_j| ≥ 1}. Note that H'(p) is a diagonal matrix (see (7.1)). Now let us write the primal and dual variables in one vector x:

    x = ( u ; p^1 ; p^2 )    with blocks of size N, N-n and N-n.

With the right-hand side

    b = ( λ f_{1,1}, ..., λ f_{n,n} ;
          (2/ε) H'(p^1) p^1 + τ p^1 - (2/ε) H(p^1) ;
          (2/ε) H'(p^2) p^2 + τ p^2 - (2/ε) H(p^2) ),

again with blocks of size N, N-n and N-n, we have to solve the linear equation system A x = b. This is the sequential ad-hoc realization of (4.16) and is implemented in TV_primaldual_sequential.m. The matrix A is sparse and its shape allows some improvements for solving the equation system, e.g. with the Schur complement.
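A minimal sketch of this assembly in MATLAB® could look as follows, assuming D1 and D2 from the previous sketch and the current iterates p1, p2 stacked column-wise; the names eps_pen (for ε) and tau are ours, and the sign conventions follow our reading of the block system above, so this only illustrates the structure of TV_primaldual_sequential.m rather than reproducing it literally.

    N  = n^2;
    H  = @(q) sign(q).*(abs(q)-1).*(abs(q) >= 1);    % penalty gradient, cf. (7.1)
    Hp = @(q) double(abs(q) >= 1);                   % diagonal of the penalty Hessian
    M1 = spdiags(Hp(p1), 0, N-n, N-n);
    M2 = spdiags(Hp(p2), 0, N-n, N-n);
    I  = speye(N-n);  Z = sparse(N-n, N-n);
    A  = [lambda*speye(N), D1,                     D2;
          D1',             (2/eps_pen)*M1 + tau*I, Z;
          D2',             Z,                      (2/eps_pen)*M2 + tau*I];
    b  = [lambda*f(:);
          (2/eps_pen)*Hp(p1).*p1 + tau*p1 - (2/eps_pen)*H(p1);
          (2/eps_pen)*Hp(p2).*p2 + tau*p2 - (2/eps_pen)*H(p2)];
    x  = A\b;                                        % one damped Newton step via mldivide
    u  = reshape(x(1:N), n, n);
    p1 = x(N+1 : N+(N-n));
    p2 = x(N+(N-n)+1 : end);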
7.2 Schur Complement

We have to solve a linear equation system of the form

    [ C     D ] [ u ]   [ f ]
    [ D^t   B ] [ p ] = [ g ]        (7.5)

with diagonal matrices B and C. This can be written as

    C u + D p = f,          (7.6)
    D^t u + B p = g.        (7.7)

Solving (7.7) for p gives us

    p = B^{-1} (g - D^t u).        (7.8)

Reinserting this in (7.6) leads to

    (C - D B^{-1} D^t) u = f - D B^{-1} g.        (7.9)

The matrix S := C - D B^{-1} D^t is called the Schur complement matrix. Computing the inverse B^{-1} of B is quite easy since B is diagonal. After solving (7.9) one obtains p from (7.8). So we have reduced our equation system from size roughly 3N to size N; however, several matrix products have to be computed. Note that B is a diagonal matrix only in the case of the anisotropic norm we have used in all of our considerations. If one wanted to use other norms, one could alternatively eliminate u instead of p in (7.5). In this case the size of the equation system is only reduced to roughly 2N. Analogously to before we obtain

    (B - D^t C^{-1} D) p = g - D^t C^{-1} f.

We then obtain u from u = C^{-1}(f - D p). Again C is easy to invert due to its diagonal shape.

7.3 Parallel Implementation

In the style of the additive and the multiplicative Schwarz method we present two parallel implementations of the primal-dual Newton method with damping. For this purpose we have
to distribute the discrete domain to several processors. We split the domain at the primal variable u and hence need boundary values of p from the neighboring processors. Since these values (so-called ghost cells) are not available a priori, they have to be communicated in each iteration step. Such a distribution is shown in Figure 7.3 for an image of size 5 × 5 and a total number of four processors. The sequential version is shown in Figure 7.2. In the figures the primal variable u and the two components of the dual variable p = (p^1, p^2) are marked by different symbols.

Figure 7.2: Sequential version: all values reside on one processor. The interfaces at which the image will be split in a parallel version with four processors are marked in green.

Figure 7.3: Parallel version with four processors: every processor gets its assigned values. The ghost cells (here marked in red) have to be communicated after each iteration.
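The communication of these ghost cells can be sketched with the MatlabMPI point-to-point routines used in its example scripts (MPI_Send and MPI_Recv). The strip layout, variable names and message tags below are our own and only illustrate the pattern, not the actual code of the parallel implementations.

    % Once at program start:
    MPI_Init;
    comm = MPI_COMM_WORLD;
    np   = MPI_Comm_size(comm);
    rank = MPI_Comm_rank(comm);          % ranks 0 ... np-1, subdomain strips ordered left to right

    % In every outer (Schwarz) iteration, exchange two boundary columns of the
    % local dual variable with the neighbors (iter is the outer iteration counter):
    tag = 2*iter;
    if rank > 0,    MPI_Send(rank-1, tag,   comm, p_local(:, 1:2));        end
    if rank < np-1, MPI_Send(rank+1, tag+1, comm, p_local(:, end-1:end));  end
    if rank < np-1, ghost_right = MPI_Recv(rank+1, tag,   comm);           end
    if rank > 0,    ghost_left  = MPI_Recv(rank-1, tag+1, comm);           end
    % ghost_left / ghost_right then serve as Dirichlet data for p on the subdomain boundary.

    % Once at the very end:
    MPI_Finalize;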
Additive Version

For the mathematical background we limit ourselves to the consideration of a decomposition of Ω into two subdomains Ω_1 and Ω_2. The adaptation to the general case is straightforward. Assuming homogeneous Neumann boundary conditions, the linear system (7.4) is given by

    A (u; p) = b   in Ω,
    p = 0          on ∂Ω.

Note that the Neumann boundary conditions for u turn into Dirichlet boundary conditions for p. Similar to the additive Schwarz algorithm introduced earlier, we iteratively solve

    A_1 (u_1^(k+1); p_1^(k+1)) = b_1      in Ω_1,
    p_1^(k+1) = p_2^(k)                   on Γ_1,
    p_1^(k+1) = 0                         on ∂Ω_1 \ Γ_1,

and

    A_2 (u_2^(k+1); p_2^(k+1)) = b_2      in Ω_2,
    p_2^(k+1) = p_1^(k)                   on Γ_2,
    p_2^(k+1) = 0                         on ∂Ω_2 \ Γ_2.

Here A_i, for i = 1,2, denotes the restriction of the system matrix A to the subdomain Ω_i. Analogously, b_i denotes the restriction of the right-hand side b to Ω_i. Note that the notation p_1 (resp. p_2) here indicates the affiliation with Ω_1 (resp. Ω_2) and not the components p^1, p^2 of the vector field p. After m iterations we obtain the solution u as

    u = u^(m) = { u_1^(m)                  in Ω_1 \ Ω_2,
                  u_2^(m)                  in Ω_2 \ Ω_1,
                  (u_1^(m) + u_2^(m)) / 2  in Ω_1 ∩ Ω_2.

The additive algorithm is implemented in TV_primaldual_schur.m.
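The structure of this iteration, in particular the exchange of interface values and the averaging in the overlap, can be seen in the following self-contained toy example, which applies the same additive Schwarz idea to a one-dimensional Poisson problem instead of the TV system; it therefore illustrates only the coupling between the subdomains, not the damped Newton solver.

    n  = 64;                                  % toy problem: -u'' = 1 on (0,1), u(0) = u(1) = 0
    h  = 1/(n+1);
    A  = spdiags(ones(n,1)*[-1 2 -1], -1:1, n, n)/h^2;
    b  = ones(n,1);
    ov = 4;                                   % overlap width in grid points
    i1 = 1:(n/2+ov);  i2 = (n/2-ov+1):n;      % two overlapping subdomains
    u  = zeros(n,1);
    for k = 1:50                              % additive: both solves use the same old iterate
        g1 = u(i1(end)+1);                    % interface value seen by subdomain 1
        g2 = u(i2(1)-1);                      % interface value seen by subdomain 2
        rhs1 = b(i1); rhs1(end) = rhs1(end) + g1/h^2;   % impose u = g1 right of subdomain 1
        rhs2 = b(i2); rhs2(1)   = rhs2(1)   + g2/h^2;   % impose u = g2 left of subdomain 2
        u1 = zeros(n,1);  u1(i1) = A(i1,i1)\rhs1;
        u2 = zeros(n,1);  u2(i2) = A(i2,i2)\rhs2;
        u(setdiff(i1,i2)) = u1(setdiff(i1,i2));
        u(setdiff(i2,i1)) = u2(setdiff(i2,i1));
        ovl = intersect(i1,i2);
        u(ovl) = (u1(ovl) + u2(ovl))/2;       % average in the overlap, as in the formula for u above
    end
    % The residual norm(b - A*u) decreases with every Schwarz iteration k.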
Multiplicative Version

Alternatively we can take the multiplicative Schwarz algorithm as a basis, with a coloring as mentioned earlier. Recall that for a division in each direction we would need four colors. Providing a subdomain of each color on each processor, the number of subdivisions would be four times higher than the number of processors, which is undesirable because the subdomains may become very small and the communication effort would be too high. We have found that below a certain subdomain size the communication slows down the algorithm (for details see Chapter 8). Thus, for an overlap of δ = 2, we suggest using only two colors and obtain a division similar to a checkerboard (Fig. 7.4). Although there is then an overlap between subdomains of the same color, this overlap is only one pixel wide and hence nearly negligible. An additional advantage of this light version is the easier implementation.

Figure 7.4: Division of a domain into colored subdomains, with colors black and white. The overlapping pixels of subdomains of the same color are marked in red.

The multiplicative algorithm is implemented in TV_primaldual_checkerboard.m.

Remarks

Since the sequential algorithm is already iterative, the parallel versions contain two nested iteration procedures. The inner iteration solves the TV minimization problem on each subdomain, and the outer iteration corresponds to the Schwarz iteration. Setting the number of inner iterations to 1 leads to a frequent exchange of the boundary conditions but a high communication effort. Conversely, letting the TV minimization converge on each subdomain before communicating (i.e. using only the convergence criterion as a limit for the inner iteration) results in fewer outer iterations but a higher computation time. The best solution may be a trade-off between both, but this would require a better understanding of the choice of the parameters τ and ε. In our algorithm only images of size 2^n × 2^m, with m, n ∈ N, are allowed. Particularly with regard to suitable load balancing we want to obtain subdomains of the same size (which is also easier to implement). Hence we duplicate the right and lower boundary, and with an overlap of δ = 2 we can divide the image into d = d_x · d_y subdomains of size (2^n/d_x + δ) × (2^m/d_y + δ).
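A hypothetical index computation for such a splitting is sketched below; dx, dy and the padded image f are assumed to be given, and the exact index arithmetic of the thesis code may differ.

    delta = 2;                               % overlap width
    [ny, nx] = size(f);                      % padded image, ny and nx are powers of two
    wy = ny/dy;  wx = nx/dx;                 % size of the non-overlapping core blocks
    sub = cell(dy, dx);
    for q = 1:dy
        for r = 1:dx
            rows = max(1, (q-1)*wy+1-delta/2) : min(ny, q*wy+delta/2);
            cols = max(1, (r-1)*wx+1-delta/2) : min(nx, r*wx+delta/2);
            sub{q,r} = f(rows, cols);        % local block handed to processor (q,r)
        end
    end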
Chapter 8
Results

Finally we present some results of the implemented algorithms. As we will see, both parallel algorithms seem to converge well. They also provide a superlinear speedup if the image size is big enough. For all our test runs we have used the following stopping criterion:

    ||u^(k+1) - u^k||_F^2 ≤ ǫ,

where ||.||_F denotes the Frobenius matrix norm and ǫ the error limit.

8.1 Convergence Results

First we show some iteration plots for a test image perturbed with Gaussian noise (Fig. 8.1), computed on different numbers of CPUs, and compare them with the results obtained by the sequential algorithm (Fig. 8.2).

Figure 8.1: Original test image (a) and noisy version (b).
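In code the stopping test above amounts to a single check per outer iteration (tol plays the role of the error limit ǫ):

    if norm(u_new - u_old, 'fro')^2 <= tol
        break;                               % stop the outer iteration
    end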
Figure 8.2: Sequential algorithm, λ = 2; panels (a)-(d) show the result after 1, 2, 3 and 13 iterations.

We can see that the multiplicative algorithm (Fig. 8.3 and 8.5) as well as the additive version (Fig. 8.4 and 8.6) seem to converge well, also when applied to a large number of subdomains. The artifacts and displacements at the interfaces which can be observed in the first iterations disappear completely after a sufficient number of iterations. The algorithm preserves discontinuities crossing the interfaces and also performs correctly in areas where the solution is continuous. This is independent of the image size, and even when a small image is split into many subdomains the results are still very good (see Figures 8.7 to 8.10).
Figure 8.3: Multiplicative version: λ = 2, ǫ = 10^-2, running on two processors (i.e. four domains); panels (a)-(d) show the result after 1, 2, 3 and 13 iterations.

Figure 8.4: Additive version: λ = 2, ǫ = 10^-2, running on four processors; panels (a)-(d) show the result after 1, 2, 3 and 13 iterations.
Figure 8.5: Multiplicative version: λ = 2, ǫ = 10^-2, running on 32 processors (i.e. 64 domains); panels (a)-(d) show the result after 1, 2, 3 and 14 iterations.

Figure 8.6: Additive version: λ = 2, ǫ = 10^-2, running on 32 processors; panels (a)-(d) show the result after 1, 2, 3 and 16 iterations.
Figure 8.7: Additive algorithm, image size 16 × 16, noisy image with Gaussian noise of variance 0.001, λ = 2, ε = 1000, τ = 0.001; panels: (a) original, (b) noisy, (c)-(h) results on 1, 2, 4, 8, 16 and 32 processors.
Figure 8.8: Multiplicative algorithm, image size 16 × 16, noisy image with Gaussian noise of variance 0.001, λ = 2, ε = 1000, τ = 0.001; panels: (a) original, (b) noisy, (c)-(h) results on 1, 2, 4, 8, 16 and 32 processors.
Figure 8.9: Additive algorithm, image size 64 × 64, noisy image with Gaussian noise of variance 0.005, λ = 2, ε = 1000, τ = 0.001; panels: (a) original, (b) noisy, (c)-(h) results on 1, 2, 4, 8, 16 and 32 processors.
Figure 8.10: Multiplicative algorithm, image size 64 × 64, noisy image with Gaussian noise of variance 0.005, λ = 2, ε = 1000, τ = 0.001; panels: (a) original, (b) noisy, (c)-(h) results on 1, 2, 4, 8, 16 and 32 processors.
8.2 Computation Time and Speedup

Very important for a parallel algorithm is its scalability, i.e. its efficiency when applied to an increasing number of nodes. Hence we will measure the speedup (see Section 6.2) depending on the image size and the number of CPUs. For this purpose we first specify the theoretical speedup on p processors. To simplify the computation we assume the image to be square and p to be a square number. Let n × n be the image size of the original problem. Then the dimension of the system matrix using the Schur complement is n^2 × n^2 (cf. Section 7.2). Assuming that the equation system is solved with Gaussian elimination leads to a complexity of O(n^6). Under the above assumptions the image size of the subproblems on each processor is n/√p × n/√p, hence the matrix dimensions are n^2/p × n^2/p, leading to a complexity of O(n^6/p^3). The theoretical speedup can thus be computed as

    S_p = T_1 / T_p = O(n^6) / O(n^6/p^3) = O(p^3).        (8.1)

Note that in this computation the overhead arising in parallelization (e.g. due to communication between the CPUs) was neglected. It should also be mentioned that we have actually used the mldivide command in MATLAB®, which uses different solvers depending on the structure of the matrix. Hence the computed speedup cannot be compared directly with the real speedup. However, it indicates that we can expect a superlinear speedup. Clearly, the sequential algorithm is not optimal: one might decompose the image analogously to the parallel case and then solve the arising smaller subproblems one after the other, but this concept cannot be realized without all the mathematical theory developed before. In that case one obviously achieves at most linear speedup.

    Size n × n | 1 CPU | 2 CPUs | 4 CPUs | 8 CPUs | 16 CPUs | 32 CPUs
    Table 8.1: Computation times for the additive algorithm.

All computations are done on the test system ZIVHPC, consisting of four IBM blades (HS21 XM blades, model no. 7995G4G, in an IBM BladeCenter H 88524XG), each equipped with two Intel Xeon quad-core processors E5430 with 2.66 GHz, 12 MB L2 cache and 1333 MHz FSB. The nodes are connected via 10 Gigabit Ethernet, and each
node provides a main memory of 32 GB. We can see (cf. Tables 8.1 and 8.2) that below a certain image size a parallel computation is useless. This is also true for a decomposition of larger images into subimages smaller than this size, i.e. when a larger image is distributed over so many CPUs that the subimages fall below the size mentioned above.

    Size n × n | 1 CPU | 2 CPUs | 4 CPUs | 8 CPUs | 16 CPUs | 32 CPUs
    Table 8.2: Computation times for the additive algorithm.

The computation times are also visualized in Figure 8.11.

Figure 8.11: Computation time in seconds versus the number of processes for the additive and the multiplicative algorithm at four image sizes.

The speedup of both algorithms is superlinear (in the sense mentioned above), although it should be noted that the discrepancy would not be as large when applying other solution methods for the linear system (e.g. preconditioned conjugate gradient methods). Nevertheless, we have presented a parallel algorithm for total variation minimization which needs only about 15 minutes for the largest test image on 32 CPUs (or ca. 23 minutes on
16 CPUs). The image size is comparable with a 3D image of good resolution and promises similarly good results when the algorithm is adapted to the three dimensional case.

Figure 8.12: Speedup of the parallel algorithms compared to the sequential algorithm (additive, multiplicative and linear speedup versus the number of processes) at four image sizes.
Chapter 9
Conclusion

We have given a brief overview of total variation regularization and its solution methods. With Theorem 3.1 we established the possibility of considering the dual problem, which leads to a constrained but differentiable minimization problem. We introduced a primal-dual Newton method with damping using a penalty approximation and illustrated its numerical realization. More importantly, two parallel algorithms based on the additive and the multiplicative Schwarz method, respectively, were provided. Both seem to converge well and provide a superlinear speedup. The implementation was done in MATLAB® using MatlabMPI, which makes the Message Passing Interface available in MATLAB®.

For an improvement of the algorithm, further research on the optimal choice of the parameters ε and τ, as well as on their adaptation, is needed. Related to this is the optimal number of inner iterations, which in our implementation is fixed to one. The next step would be to extend the algorithm to the three dimensional case and to rewrite it in C, where the whole MPI routine library is available. Moreover, it should be integrated into the EM-TV algorithm, for which a parallel implementation of the EM part already exists (cf. [SKG06]).
Bibliography

[AFP00] L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York, 2000.

[BBD+06] B. Berkels, M. Burger, M. Droske, O. Nemitz, and M. Rumpf. Cartoon extraction based on anisotropic image classification. In Vision, Modeling, and Visualization Proceedings, 2006.

[BSW+] C. Brune, A. Sawatzky, F. Wübbeling, T. Kösters, and M. Burger. EM-TV methods in inverse problems with Poisson noise. In preparation.

[Bur03] M. Burger. Lecture notes, 285j Infinite-dimensional Optimization and Optimal Design, Fall 2003, UCLA.

[Bur08] M. Burger. Mixed discretization of total variation minimization problems. In preparation, 2008.

[Cha04] A. Chambolle. An algorithm for total variation minimization and applications. J. Math. Imaging Vision, 20(1-2):89-97, 2004. Special issue on mathematics and image analysis.

[Cha05] A. Chambolle. Total variation minimization and a class of binary MRF models. In 5th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), volume 3757 of Lecture Notes in Computer Science, Berlin, 2005. Springer-Verlag.

[CS05] T. F. Chan and J. Shen. Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods. SIAM, Philadelphia, 2005.

[CZC95] T. F. Chan, H. M. Zhou, and R. H. Chan. Continuation method for total variation denoising problems. Technical report, 1995.

[EO04] S. Esedoḡlu and S. J. Osher. Decomposition of images by the anisotropic Rudin-Osher-Fatemi model. Comm. Pure Appl. Math., 57(12), 2004.
[ET76] I. Ekeland and R. Temam. Convex Analysis and Variational Problems, volume 1 of Studies in Mathematics and its Applications. North-Holland Publishing Co., Amsterdam, 1976. Translated from the French.

[For07] M. Fornasier. Domain decomposition methods for linear inverse problems with sparsity constraints. Inverse Problems, 23, 2007.

[FS07] M. Fornasier and C.B. Schönlieb. Subspace correction methods for total variation and l1-minimization. Submitted to SIAM J. Numer. Anal., December 2007.

[Lio88] P. Lions. On the Schwarz alternating method. I. In R. Glowinski, G. H. Golub, G. A. Meurant, and J. Périaux, editors, First International Symposium on Domain Decomposition Methods for Partial Differential Equations, Philadelphia, 1988. SIAM.

[Mey01] Y. Meyer. Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, volume 22 of University Lecture Series. American Mathematical Society, Providence, RI, 2001. The fifteenth Dean Jacqueline B. Lewis memorial lectures.

[Roc70] R.T. Rockafellar. Convex Analysis. Princeton Landmarks in Mathematics. Princeton University Press, Princeton, 1970.

[ROF92] L.I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. In Proceedings of the eleventh annual international conference of the Center for Nonlinear Studies on Experimental mathematics: computational issues in nonlinear science, Amsterdam, The Netherlands, 1992. Elsevier North-Holland, Inc.

[SBW+08] A. Sawatzky, C. Brune, F. Wübbeling, T. Kösters, K. Schäfers, and M. Burger. Accurate EM-TV algorithm in PET with low SNR. IEEE Medical Imaging Conference (MIC), Dresden, October 2008.

[Sch70] H. A. Schwarz. Über einen Grenzübergang durch alternierendes Verfahren. Vierteljahrsschrift der Naturforschenden Gesellschaft in Zürich, 15, 1870.

[SKG06] M. Schellmann, T. Kösters, and S. Gorlatch. Parallelization and runtime prediction of the listmode OSEM algorithm for 3D PET reconstruction. In IEEE Nuclear Science Symposium and Medical Imaging Conference Record, San Diego, October 2006. IEEE.
[TW05] A. Toselli and O. Widlund. Domain Decomposition Methods - Algorithms and Theory, volume 34 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 2005.

[VO96] C. R. Vogel and M. E. Oman. Iterative methods for total variation denoising. SIAM J. Sci. Comput., 17(1), 1996.

[Zei85] E. Zeidler. Nonlinear Functional Analysis and Its Applications III - Variational Methods and Optimization. Springer-Verlag, Berlin, 1985.
Eidesstattliche Erklärung (Statutory Declaration)

I hereby declare that I have written the thesis submitted today independently and have used no sources and aids other than those indicated. All programs included on the enclosed CD, with the exception of the MatlabMPI toolbox, were written by myself.

Münster,                                      Signature: