Modélisation de problèmes spécifiques avec le logiciel libre FreeFem++ F. Hecht LJLL, UPMC Alpines, INRIA Rocquencourt with S. Auliac, I. Danaila P. Jolivet, A Le Hyaric, F. Nataf, O. Pironneau, P-H. Tournier http://www.freefem.org mailto:hecht@ann.jussieu.fr 4 juin 2013. F. Hecht and al. 1 / 29
Outline 1 Introduction 2 Academic Examples 3 elasticity equation 4 HPC 5 Optimization Tools 6 Simulation 4 juin 2013. F. Hecht and al. 2 / 29
Outline 1 Introduction History The main characteristics A example on industrial numerical development 2 Academic Examples 3 elasticity equation 4 HPC 5 Optimization Tools 6 Simulation 4 juin 2013. F. Hecht and al. 3 / 29
Introduction FreeFem++ is a software to solve numerically partial differential equations (PDE) in IR 2 ) and in IR 3 ) with finite elements methods. We used a user language to set and control the problem. The FreeFem++ language allows for a quick specification of linear PDE s, with the variational formulation of a linear steady state problem and the user can write they own script to solve no linear problem and time depend problem. You can solve coupled problem or problem with moving domain or eigenvalue problem, do mesh adaptation, compute error indicator, etc... By the way, FreeFem++ is build to play with abstract linear, bilinear form on Finite Element Space and interpolation operator. FreeFem++ is a freeware and this run on Mac, Unix and Window architecture, in parallel with MPI. The Next FreeFem++ days, the December 5th and 6th 2013 UPMC, Jussieu, Paris, France 4 juin 2013. F. Hecht and al. 4 / 29
History 1987 MacFem/PCFem les ancêtres (O. Pironneau en Pascal) payant. 1992 FreeFem réécriture de C++ (P1,P0 un maillage) O. Pironneau, D. Bernardi, F. Hecht, C. Prudhomme (adaptation Maillage, bamg). 1996 FreeFem+ réécriture de C++ (P1,P0 plusieurs maillages) O. Pironneau, D. Bernardi, F. Hecht (algèbre de fonction). 1998 FreeFem++ réécriture avec autre noyau élément fini, et un autre langage utilisateur ; F. Hecht, O. Pironneau, K.Ohtsuka. 1999 FreeFem 3d (S. Del Pino), Une première version de freefem en 3d avec des méthodes de domaine fictif. 2008 FreeFem++ v3 réécriture du noyau élément fini pour prendre en compte les cas multidimensionnels : 1d,2d,3d... 4 juin 2013. F. Hecht and al. 5 / 29
For who, for what! For what 1 R&D 2 Academic Research, 3 Teaching of FEM, PDE, Weak form and variational form 4 Algorithmes prototyping 5 Numerical experimentation 6 Scientific computing and Parallel computing For who : the researcher, engineer, professor, student... The mailing list mailto:freefempp@ljll.math.upmc.fr with 416 members with a flux of 5-20 messages per day. More than 2000 true Users ( more than 200 download / month). Some industrial like : Michelin, EADS, EDF, Orange, CEA, EDF, + des PMEs,... 4 juin 2013. F. Hecht and al. 6 / 29
The main characteristics (2d)(3d) Wide range of finite elements : continuous P1,P2 elements, discontinuous P0, P1, RT0,RT1,BDM1, elements,edge element, vectorial element, mini-element,... Automatic interpolation of data from a mesh to an other one ( with matrix construction if need), so a finite element function is view as a function of (x, y, z) or as an array. LU, Cholesky, Crout, CG, GMRES, UMFPack, SuperLU, MUMPS, HIPS, SUPERLU_DIST, PASTIX.... ; eigenvalue and eigenvector computation with ARPACK. Online graphics, C++ like syntax. An integrated development environment FreeFem++-cs (see A. Le Hyaric web page) Automatic mesh generator, based on the Delaunay-Voronoï algorithm. (2d,3d (tetgen) ) Mesh adaptation based on metric, possibly anisotropic (only in 2d), with optional automatic computation of the metric. (2d,3d). Full MPI interface Nonlinear Optimisation tools : CG, Ipopt, NLOpt, stochastic Plugin, and lot of of examples 4 juin 2013. F. Hecht and al. 7 / 29
Periodic Surface Acoustic Waves Transducer Analysis A true industrial numerical problem (2d) : 1 3 EDP : Dielectric, Elasticity, Piezoelectric, Linear, harmonic approximation 2 α-periodic B.C. (New simple idea) 3 semi-infinite Dielectric domain (classical Fourier/Floquet transforme) 4 semi-infinite Piezoelectric domain (Hard) In 9 month, we build with P. Ventura (100%, me 10%) a numerical simulator form scratch (8 months for the validation), The only thing to add to freefem++ is a interface with lapack to compute eigenvector of full 8 8 matrix. Bilan : la société Phonon Corporation a un simulateur. 4 juin 2013. F. Hecht and al. 8 / 29
For Orange, Freezing of pure water Configuration at (physical time) τ = 2340[s] : (a) experimental image from [Kowalewski-1999] ; the thick red line represents the solid-liquid interface computed with the present method (b) computed streamlines showing the two recirculating zones in the fluid phase. Movie:For orange : Frizing of water,flows Region (Boussinesq, enthalpy source) 4 juin 2013. F. Hecht and al. 9 / 29
Outline 1 Introduction 2 Academic Examples Poisson Equation mesh adaptation process 3 elasticity equation 4 HPC 5 Optimization Tools 6 Simulation 4 juin 2013. F. Hecht and al. 10 / 29
Le problème de Poisson dans le Poisson d avril mesh3 Th("fish-3d.msh");// read a mesh 3d fespace Vh(Th,P1); // define the P1 EF space Vh u,v;//set test and unknown function in Vh. macro Grad(u) [dx(u),dy(u),dz(u)] //EOM Grad def solve laplace(u,v,solver=cg) = int3d(th)( Grad(u) *Grad(v) ) - int3d(th) ( 1*v) + on(2,u=2); // int on γ 2 plot(u,fill=1,wait=1,value=0,wait=1); Run:fish.edp Run:fish3d.edp 4 juin 2013. F. Hecht and al. 11 / 29
Example of adaptation process Enter? for help Enter? for help Enter? for help Find optimal mesh in norm L to represent : u = (10 x 3 + y 3 ) + atan2(0.001, (sin(5 y) 2 x)) v = (10 y 3 + x 3 ) + atan2(0.01, (sin(5 x) 2 y)). Run:Adapt-uv.edp Run:CornerLap.edp Run:Laplace-Adapt-3d.edp 4 juin 2013. F. Hecht and al. 12 / 29
Outline 1 Introduction 2 Academic Examples 3 elasticity equation Linear elasticity equation Newton s Method 4 HPC 5 Optimization Tools 6 Simulation 4 juin 2013. F. Hecht and al. 13 / 29
Lame equation / figure Run:beam-3d.edp Run:beam-EV-3d.edp Run:beam-3d-Adapt.edp 4 juin 2013. F. Hecht and al. 14 / 29
Newton s Method To solve F (u) = 0 the Newton s algorithm is 1 u 0 a initial guest 2 do find w n solution of DF (u n )w n = F (u n ) u n+1 = un w n if( w n < ε) break ; For Navier Stokes problem the algorithm is : v, q, F (u, p) = (u. )u.v + u.v + ν u : v q.u p.v + BC Ω DF (u, p)(w, w p ) = Run:cavityNewtow.edp (w. )u.v + (u. )w.v Ω + ν w : v q.w p w.v + BC0 Ω Run:Hyper-Elasticity-2d.edp 4 juin 2013. F. Hecht and al. 15 / 29
Outline 1 Introduction 2 Academic Examples 3 elasticity equation 4 HPC Poisson equation with Schwarz method 5 Optimization Tools 6 Simulation 4 juin 2013. F. Hecht and al. 16 / 29
Poisson equation with Schwarz method To solve the following Poisson problem on domain Ω with boundary Γ in L 2 (Ω) : u = f, in Ω, and u = g on Γ, where f L 2 (Ω) and g H 1 2 (Γ) are two given functions. Let introduce (π i ) i=1,..,np a positive regular partition of the unity of Ω, q-e-d : π i C 0 (Ω) : π i 0 and N p π i = 1. i=1 Denote Ω i the sub domain which is the support of π i function and also denote Γ i the boundary of Ω i, and Ω i = {x/0 < π i < 1} The parallel Schwarz method is Let l = 0 the iterator and a initial guest u 0 respecting the boundary condition (i.e. u 0 Γ = g). Run:DDM-Schwarz-Lap-2dd.edp i = 1.., N p : u l i = f, in Ω i, and u l i = u l on Γ i (1) u l+1 = N p i=1 π iu l i (2) Run:DDM-Schwarz-Lame-3d.edp 4 juin 2013. F. Hecht and al. 17 / 29
Strong scalability in two dimensions heterogeneous elasticity Elasticity problem with heterogeneous coefficients Timing relative to 1 024 processes 8 4 2 1 25 20 15 Linear speedup 10 #iterations 1 024 2 048 4 096 8 192 #processes Speed-up for a 1.2 billion unknowns 2D problem. Direct solvers in the subdomains. Peak performance wall-clock time : 26s. 4 juin 2013. F. Hecht and al. 18 / 29
Strong scalability in three dimensions heterogeneous elasticity Elasticity problem with heterogeneous coefficients Timing relative to 1 024 processes 6 4 2 1 25 20 15 Linear speedup 10 #iterations 1 024 2 048 4 096 6 144 #processes Speed-up for a 160 million unknowns 3D problem. Direct solvers in subdomains. Peak performance wall-clock time : 36s. 4 juin 2013. F. Hecht and al. 19 / 29
Weak scalability in two dimensions Darcy problems with heterogeneous coefficients Weak efficiency relative to 1 024 processes 100 % 80 % 60 % 40 % 20 % 0 % 1 024 2 048 4 096 12 288 8 192 5 088 422 #d.o.f. (in millions) #processes Efficiency for a 2D problem. Direct solvers in the subdomains. Final size : 22 billion unknowns. Wall-clock time : 200s. 4 juin 2013. F. Hecht and al. 20 / 29
Weak scalability in three dimensions Darcy problems with heterogeneous coefficients Weak e ciency relative to 1 024 processes 100 % 80 % 60 % 40 % 20 % 0% 1 024 2 048 4 096 8 192 6 144 105 13 #d.o.f. (in millions) #processes Efficiency for a 3D problem. Direct solvers in the subdomains. Final size : 2 billion unknowns. Wall-clock time : 200s. 4 juin 2013. F. Hecht and al. 21 / 29
Outline 1 Introduction 2 Academic Examples 3 elasticity equation 4 HPC 5 Optimization Tools Ipopt interface Bose Einstein Condensate 6 Simulation 4 juin 2013. F. Hecht and al. 22 / 29
Ipopt The IPOPT optimizer in a FreeFem++ script is done with the IPOPT function included in the ff-ipopt dynamic library. IPOPT is designed to solve constrained minimization problem in the form : find s.t. x 0 = argminf(x) { x R n i n, x lb i x i x ub i i m, c lb i c i (x) c ub i (simple bounds) (constraints functions) Where ub and lb stand for "upper bound" and "lower bound". If for some i, 1 i m we have c lb i = c ub i, it means that c i is an equality constraint, and an inequality one if c lb i < c ub i. 4 juin 2013. F. Hecht and al. 23 / 29
Bose Einstein Condensate Just a direct use of Ipopt interface (2 day of works) The problem is find a complex field u on domain D such that : 1 u = argmin u =1 D 2 u 2 + V trap u 2 + g 2 u 4 Ωiu (( ) ) y x. u to code that in FreeFem++ use Ipopt interface ( https://projects.coin-or.org/ipopt) Adaptation de maillage Run:BEC.edp 4 juin 2013. F. Hecht and al. 24 / 29
Outline 1 Introduction 2 Academic Examples 3 elasticity equation 4 HPC 5 Optimization Tools 6 Simulation Simulation of water and nutrient uptake by plant roots Analysis of pulsatile blood flow in saccular aneurysm P.-H. Tournier 4 juin 2013. F. Hecht and al. 25 / 29
Simulation of water and nutrient uptake by plant roots Water movement in the soil is represented by Richards equation : θ(h) =. q + S t q = K(h) (z + h). Nutrient movement in the soil is represented by the following convection-diffusion equation : t (φ(c) + θc) =.(D c qc) + S c(c). The sink terms S and S c represent water and nutrient uptake by plant roots and are defined in the domain through a characteristic function representative of the geometry of the root system. A 1D model of upward water flows in the root system on a tree-like representation of the root system is coupled with Richards equation through the sink term S. 4 juin 2013. F. Hecht and al. 26 / 29
Simulation of water and nutrient uptake by plant roots Discretization schemes : θ(κ 1 (u ti+1 )) θ(κ 1 ((u ti )) =.( u ti+1 + K(κ 1 (u ti+1 )) z) + S. t φ(c ti+1 ) + θ ti c ti+1 φ(c ti ) θ ti c ti =.(D c ti+1 ) q. c ti+1 Sc ti+1 + S c (c ti+1 ). t Kirchhoff transformation of Richards equation Backward Euler time discretization Newton s method to handle nonlinearities Method of characteristics for the convective term Finite element method Adaptive mesh refinement relative to the characteristic function of the root system Schwartz domain decomposition method animation 4 juin 2013. F. Hecht and al. 27 / 29
Analysis of pulsatile blood flow in saccular aneurysm K. P. Gostaf Finite element simulations of arterial blood flow serve to provide pressure, velocity pattern and wall shear stress in regions of arterial dilatations and bifurcations. Even moderate size models require solving linear systems with millions of unknowns ; 15-20 hours per time step is an average sequential run time, therefore, parallel computing is a necessity. We present a hybrid parallelization frame- work to solve large scale problems of hemodynamics using OpenMP in conjunction with MPI environment. We demonstrate that computational performance of a finite element code could be dramatically improved, when a proper use of a multi-core architecture is carried out. We use an open source FreeFem++ platform. A speed-up of x56 is measured for the IBM idataplex UPMC cluster with 96 cores. MPI/OpenMP programming mode FreeFem++ code 3.4 million unknowns model animation 4 juin 2013. F. Hecht and al. 28 / 29
Conclusion/Future Freefem++ v3 is very good tool to solve non standard PDE in 2D/3D to try new domain decomposition domain algorithm The the future we try to do : Build more graphic with VTK, paraview,... (in progress) Add Finite volume facility for hyperbolic PDE (just begin C.F. FreeVol Projet) 3d anisotrope mesh adaptation automate the parallel tool add multi thread interface Thank for you attention. Run:NSCaraCyl-100-mpi.edp 4 juin 2013. F. Hecht and al. 29 / 29