INF5620: Numerical Methods for Partial Differential Equations
1 INF56: Numerical Methods for Partial Differential Equations Hans Petter Langtangen About the course Simula Research Laboratory, and Dept. of Informatics, Univ. of Oslo January 6 INF56: Numerical Methods for Partial Differential Equations p. About the course p. Course data Basic features of the course p Lectures: Wednesdays -4 in B7, math building Sometimes 4 h lectures, sometimes less, sometimes 4 h exercises Course web page: reachable from the official central UiO web page of the course Look for messages at the web page! Teachers: Xing Cai: [email protected] Hans Petter Langtangen: [email protected], Goal: produce solutions of PDEs Integrated approach: mechanics, numerics, algorithms, software Generic approach methods applicable to a wide range of PDE problems Modern numerical methods Modern implementation techniques Non-trivial applications with nonlinear systems of PDEs Analysis of simplified problems Discovery of numerical properties by computer experiments Carry out your own -week PDE project About the course p. 3 About the course p. 4 Contents How to learn it Numerical methods: Finite difference methods Finite element methods (main emphasis) Application areas: Heat transfer Diffusion Wave phenomena Thermo-elasticity Viscous fluid flow Overview from lectures Exercises with hand calculations (get the details!) Compulsory exercises: computer implementations of D finite difference methods D finite element hand calculations -week PDE project (comprehensive implementation) About the course p. 5 About the course p. 6 The exam Acronyms min talk Additional questions 6 topics given two weeks on beforehand Focus on overview and understanding Some focus on mathematical details, derivations, intricate steps in algorithms etc. No focus on details regarding software tools (but some topics will involve overview and principle workings of software tools) PDE = partial differential equation (plural: PDEs) ODE = ordinary differential equation (plural: ODEs) OOP = object-oriented programming About the course p. 7 About the course p. 8
2 Scientific software trends Diffpack Dramatic increase in the interest of problem solving environments: Maple, Matlab, Mathematica, S-Plus,... PDE solvers are often huge & expensive It s difficult to build a flexible Matlab for PDEs, but modern programming techniques and languages (e.g. C++) simplify the task Diffpack is one attempt (used in this course) Practical problem solving in industry makes use of large program packages that is one reason why we use a package in this course New numerical projects in industry make increasing use of C++ instead of Fortran therefore we expose students to C++ and more modern implementation techniques We also see the potential of high-level languages like Python, in combination with C++ or Fortran, for solving PDEs INF566 may be a companion course Numerical library for PDE solution (Almost) a full problem-solving environment for PDEs A tool for programmers Implementated in C++ and requires you to program in C++ Relies on object-oriented programming Reduced implementation efforts for finite elements and PDEs Enables real-world problem solving in a course About the course p. 9 About the course p. Some features of Diffpack The Diffpack philosophy Free version (though with array-size limition) Free version at UiO and for students Over commercial installations: (Siemens, Xerox, DaimlerCrysler, Mitsubishi, NASA, Intel, Stanford, Cornell, Cambridge, Harvard,...) Some application areas: basic model equations in applied math. (Laplace, heat and wave equations) viscous fluid flow (Navier-Stokes equations) many types of water wave equations heat transfer, incl. phase changes thermo-elasticity stochastic PDEs and ODEs computational engineering, medicine, geology, finance D, D, 3D within the same code lines Diffpack relies on programming and scripting Diffpack is a set of libraries, consisting of C++ classes in hierarchies (OO design), applications (examples), and (Perl/Python) scripts A simulator mainly contains problem-dependent code; generic methods and data structures are already programmed in the libraries Diffpack acts as a computational engine with a layered design: primitive layers: arrays, input/output,... intermediate layers: linear systems/solvers, grids, fields,... higher-level layers: simulators, parallel toolbox,... About the course p. About the course p. How to learn Diffpack Literature Required: good general programming skills some familiarity with the class concept thorough knowledge of the numerics the right attitude: don t reinvent the wheel learn to use others code don t try to understand all details utilize black boxes Principles: learn on demand rely on program examples stay cool! Have access to a C++ textbook, e.g., Barton and Nackman s Scientific and Engineering C++ H. P. Langtangen: Computational Partial Differential Equations, Springer, nd ed., 3 About the course p. 3 About the course p. 4 Warnings Numerical solution of PDEs is a huge field in rapid growth; it takes years to master the field Many other fields (computer science, physics, mathematics) are wired into PDE numerics C++ takes time to master OOP takes time to understand Diffpack requires you to have a thorough and generic understanding of the numerics Difficulties with this course are usually not due to C++/OOP/Diffpack details but lack of the proper overview of mathematics and numerics D heat conduction About the course p. 5 D heat conduction p. 6
3 Heat conduction in the continental crust Basic assumptions x= earth surface T s x= T s x=b x -Q mantle Knowing the temperature at the earth s surface and the heat flow from the mantle, what is the temperature distribution through the continental crust? Interesting question in geology and geophysics and for those nations exploring oil resources... D heat conduction p. 7 x x=b -Q Physical assumptions: Crust of infinite area Steady state heat flow Heat generated by radioactive elements Physical quantities: u(x) : temperature q(x) : heat flux (velocity of heat) s(x) : heat release per unit time and mass D heat conduction p. 8 Summary of the model Derivation of the model () Differential equations and boundary conditions: u (x) = f(x), x (, ), u() =, u () = x= inflow s(x)=r exp(-x/l) (f(x) is a scaled version of s(x)) Finite difference method (h = cell size): u = u i+ u i + u i = h f i u n u n = h h f n which can be written as a linear system Au = b where u = (u,..., u n) and A is tridiagonal What to do: Fill A and b, solve for u by Gaussian elimination Physical principles: x=b First law of thermodynamics: x outflow net outflow of heat = total generated heat Fourier s law: heat flows from hot to cold regions (i.e. heat velocity is poportional with changes in temperature) q(x) = λu (x) λ reflects the material s ability to conduct heat D heat conduction p. 9 D heat conduction p. Derivation of the model () Derivation of the model (3) The first law of thermodynamics: outflow = heat generation x= x=b x s(x) q(x-h/) q(x+h/) q(x + h/) q(x h/) = s(x)h Here: heat generation s(x) due to radioactive decay, s(x) = R exp( x/l) Divide left-hand side by h and make h small, h We have more information (boundary conditions): u() = T s (at the surface of the earth) q(b) = Q (at the bottom of the crust) We need to get u into the model; combining the st law of thermodynamics q (x) = s(x) with Fourier s law q(x) = λu (x) we can eliminate q and get a differential equation for u: d ( λ du ) = s(x) dx dx q(x + h/) q(x h/) h = s(x) q (x) = s(x) D heat conduction p. D heat conduction p. Mathematical model Scaling d ( dx or if λ is constant: λ du dx ) = Re x/l, u() = T s, λ(b)u (b) = Q u (x) = λ Re x/l, Observe: u = u(x; λ, R, L, b, T s, Q) u varies with 7 parameters! u() = T s, λ(b)u (b) = Q Suppose that we want to investigate the influence of the different parameters. Assume (modestly) three values of each parameter: Number of possible combinations: 3 6 = 79. Using scaling we can reduce the six physical parameters λ, R, L, b, T s, Q to only two! We introduce dimensionless quantities (see HPL A. and assume that λ is constant): x = xb, u = T s + Qbū/λ, s(b x) = R s( x) d ū d x = γe x/β, ū =, where we have two dimensionless quantities β = b/l, γ = br/q Dropping the bars, we get a problem on the form u (x) = f(x), x (, ) u() = u () = dū d x () = D heat conduction p. 3 D heat conduction p. 4
4 Discretization of our equation Finite difference approximations (). Divide the domain [, ] into n cells, the cell edges x i are called nodes (i =,..., n). Let u i = u(x i), our goal is to let the computer calculate u,u,u 3,... u u u 3 u 4 u 5 Recall the definition of the derivative from introductory calculus: u(x + h) u(x) lim = u (x) h h Idea: use this formula with a finite h this is a finite difference approximation to the derivative What is the error in this approximation? Expand u(x + h) in a Taylor series and compute x x= x= 3. The differential equation is to be fulfilled at the nodes only: u (x i) = f(x i), i =,..., n (u(x + h) u(x)) = h ( u(x) + u (x)h + ) h u (x)h + u(x) = u (x) + u (x)h + 4. Derivatives are approximated by finite differences D heat conduction p. 5 The largest error term is u h/, proportional to h D heat conduction p. 6 Finite difference approximations () Finite difference approximations (3) An alternative finite difference approximation: Approximation to u (x): u (x) u(x + h) u(x h) h Compute the error by Taylor series expansion of u(x + h) and u(x h) around x: u(x + h) u(x h) h Leading error term proportional to h = u (x) + 6 u (x)h +... or u (x) u (x i) u(x + h) u(x) + u(x h) h u(xi + h) u(x) + u(xi h) h Alternative notation, noting that u i u(x i), u i+ = u(x i + h), and u i = u(x i h): [u ] i ui+ ui + ui h Show that the error is O(h ) (Hint: expand u i+, u i, and u i in Taylor series around x i and insert the series in the finite difference formula) D heat conduction p. 7 D heat conduction p. 8 The discrete differential equation Discretizing boundary conditions The equation at the nodes: u (x i) = f(x i), Replace u by a centered finite difference: u (x i) i =,..., n ui+ ui + ui h The differential equation is transformed to a system of algebraic equations: ui+ ui + ui h = f i, i =,..., n u() = simply becomes u = u () = can be approximated as u n+ u n h Problem: u n+ is not in the mesh! = Solution: Use the discrete differential equation for i = n: un un + un+ = f h n and the discrete boundary condition to eliminate u n+ The result is u n u n = h h f n D heat conduction p. 9 D heat conduction p. 3 System of equations Tridiagonal coefficient matrix The complete set of finite difference equations, u = u i+ u i + u i = h f i u n u n = h h f n can be written as a linear system on matrix form Au = b where u = (u,..., u n) and A is a tridiagonal matrix A, A,. A, A, A , A = Ai,i A i,i A i,i An,n A n,n A n,n D heat conduction p. 3 A, =, A, =, A n,n = A i,i =, A i,i+ =, i =,..., n A i,i =, i =,..., n D heat conduction p. 3
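The assembly and solution just described can be sketched in plain C++ (no Diffpack) for the scaled problem -u''(x) = f(x), u(0) = 0, u'(1) = 1, with f(x) = gamma*exp(-beta*x). Gaussian elimination specialized to a tridiagonal matrix (the Thomas algorithm) stands in for the LU factorization discussed on the next slides. Note that the grid is 0-based here, whereas the slides and Diffpack count nodes from 1; n, beta and gamma are illustrative values.

#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    const int n = 11;                     // number of grid points (illustrative)
    const double beta = 1.0, gamma = 1.0; // illustrative parameter values
    const double h = 1.0/(n - 1);         // cell size

    // sub-, main- and super-diagonal of the tridiagonal A, plus b and u
    std::vector<double> a(n, 0.0), d(n, 0.0), c(n, 0.0), b(n, 0.0), u(n, 0.0);

    d[0] = 1.0;  b[0] = 0.0;              // boundary condition u(0) = 0
    for (int i = 1; i < n - 1; ++i) {     // inner points: u_{i-1} - 2u_i + u_{i+1} = -h^2 f_i
        const double x = i*h;
        a[i] = 1.0;  d[i] = -2.0;  c[i] = 1.0;
        b[i] = -h*h*gamma*std::exp(-beta*x);
    }
    a[n-1] = 2.0;  d[n-1] = -2.0;         // last node: u'(1) = 1 eliminated the ghost value
    b[n-1] = -2.0*h - h*h*gamma*std::exp(-beta*1.0);

    // Thomas algorithm: forward elimination ...
    for (int i = 1; i < n; ++i) {
        const double m = a[i]/d[i-1];
        d[i] -= m*c[i-1];
        b[i] -= m*b[i-1];
    }
    // ... and back substitution
    u[n-1] = b[n-1]/d[n-1];
    for (int i = n - 2; i >= 0; --i)
        u[i] = (b[i] - c[i]*u[i+1])/d[i];

    for (int i = 0; i < n; ++i)
        std::printf("x = %5.3f   u = %9.5f\n", i*h, u[i]);
    return 0;
}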
5 Solution of linear systems Solution of linear systems; general case The system is solved by Gaussian elimination: Compute the LU factorization: A = LU L: lower triangular matrix U: upper triangular matrix Solve Ly = b (easy) Solve Ux = y (easy) Computational work: A is dense: O(n 3 ) A is tridiagonal: O(n) LU factorization (Gaussian elimination) is the optimal solution method when A is tridiagonal However, in D and 3D problems, LU factorization is a very slow process (A is no longer tridiagonal) the structure of A favors iterative methods, which are very much faster than LU factorization Iterative methods are discussed at the end of the course D heat conduction p. 33 D heat conduction p. 34 Implementation We want the computer to solve our linear system (for arbitrary n) This task can easily be accomplished using any computer language and any computer The program fills A and b with numbers according to the derived formulas and then calls a Gaussian elimination procedure to find u In numerical simulation in general, computer codes are large and complicated and using effective tools is fundamental We shall use a comprehensive tool, Diffpack, even for this very simple problem The Diffpack code will be close similar codes in Python, Fortran 77, Matlab, C, C++, Java,... There is no particular advantage of using Diffpack (except that Diffpack has a solver for tridiagonal linear system), but it is a simple problem for the first Diffpack encounter Diffpack intro D heat conduction p. 35 Diffpack intro p. 36 The standard intro to a new language Compiling and linking A scientific Hello World code: #include <iostream> // make input/output functionality available #include <cmath> // make math functions available: e.g. sin(x) int main () // function "main" is always the main program std::cout << "Hello, World! Give a number: "; double r; std::cin >> r; // read number into double precision r double s = sin(r); // declare s and initialize with sin(r) std::cout << "\nthe value of sin(" << r << ") is " << s << "\n"; This is pure C++ - no Diffpack! Compile: g++ -c hw.cpp Link hw.o to the C/C++ standard and math library g++ -o app hw.o -lm # -lm (link to math lib.) can often be left out: g++ -o app hw.o Run the program:./app Compiling and linking in one step: g++ -o app hw.cpp -lm C++ compilers can have other names: CC, xlc Diffpack intro p. 37 Diffpack intro p. 38 The corresponding Diffpack program Compiling and linking Make special directory: Mkdir myfirstdp cd myfirstdp Make a file hw.cpp with the following contents: #include <IsOs.h> // Diffpack tools for input/output #include <cmath> // make math functions available: sin(x) int main (int argc, const char* argv[]) initdiffpack (argc, argv); // should always be performed s_o << "Hello world! Give a number: "; real r; s_i >> r; // read real number into r real s = sin(r); s_o << "\nthe value of sin(" << r << ") is " << s << "\n"; /* Explanation: IsOs.h : input/output in Diffpack, much like iostream real : real variables in Diffpack, equals double by default s_i : standard input in Diffpack, corresponds to std::cin s_o : standard input in Diffpack, corresponds to std::cout */ Diffpack is compiled using makefiles (which are automatically generated by the Mkdir command) Compilation and linking is just a matter of Make (safe, but results in slow code) Make MODE=opt (fast code, but less safety checks) Always start with Make; use only optimized mode (MODE=opt) when the program is thoroughly tested! Diffpack intro p. 39 Diffpack intro p. 4
6 Arrays in Diffpack Code example with arrays Conventions as in Fortran: first index is subscript syntax: a(i) Different from C, where arrays start at and brackets are used: a[], a[],... Diffpack arrays are not a built-in feature of C++, but they are defined by a programmer (and can in principle be extended by anybody to meet the demands in a particular application) #include <Arrays_real.h> int main (int argc, const char* argv[]) initdiffpack (argc, argv); int i,j,k,n,m,p; real r; n = m = 4; p = 3; Vec(real) w(n); w.redim (m); i = w.size(); w = -3.4; Vec(real) z; z = w; z(n-) = w() - 4.3; z.print (s_o, "z"); z.printascii(s_o,"z"); z.print (s_o); Diffpack intro p. 4 Diffpack intro p. 4 Heat conduction problem in Diffpack Test problem for debugging Find a suitable test problem with known analytical solution Read β, γ and n Initialize A and b u (x) = γ exp ( βx), u() =, u () = u(x) = γ ( ) e βx β + ( γβ ) e β x, β ( u(x) = x + γ ( )) x, β = Call a Gaussian elimination procedure in Diffpack to solve for u Choose e.g. n = and solve the discrete equations by hand, Solution u : u =, u u = h h γe β u = + γ e β When β =, the numerical solution is exact for all n (!), i.e., the analytical solution ( u(x i) = u i = (i )h + γ ( )) (i )h fulfills the discrete equations In general, u = const is solved exactly by finite difference methods on uniform grids Diffpack intro p. 43 Diffpack intro p. 44 Diffpack/C++ program in F77/C style Fill matrix and right-hand side Declaration and initialization of variables: #include <Arrays_real.h> // for array functionality (and I/O) #include <cmath> // for the exponential function int main(int argc, const char* argv[]) initdiffpack(argc, argv); s_o << "Give number of solution points: "; // write to the screen int n; // declare an integer n (no of grid points) s_i >> n; // read n from s_i, i.e. the keyboard real h=./(n-); // note: /(n-) gives integer division (=) Mat(real) A(n,n); // create an nxn matrix ArrayGen(real) b(n); // create a vector of length n. ArrayGen(real) u(n); // the grid point values s_o << "Give beta: "; real beta; s_i >> beta; s_o << "Give gamma: "; real gamma; s_i >> gamma; A.fill(.); // set all entries in A equal to. b.fill(.); // set all entries in b equal to. real x; int i; i = ; A(i,i) = ; b(i) = ; // inner grid points: for (i = ; i <= n-; i++) x = (i-)*h; A(i,i-) = ; A(i,i) = -; A(i,i+) = ; b(i) = - h*h*gamma*exp(-beta*x); // i = n: i = n; x = (i-)*h; A(i,i-) = ; A(i,i) = -; b(i) = - *h - h*h*gamma*exp(-beta*x); if (n <= ) A.print (s_o,"a matrix"); b.print (s_o,"right-hand side"); // i++ means i=i+ // print matrix to the screen // print vector to the screen Diffpack intro p. 45 Diffpack intro p. 46 Solve for u and write out solution Tridiagonal matrices A.factLU(); A.forwBack(b,u); // Gaussian elimination s_o << "\n\n x numerical error:\n"; real u_exact; for (i = ; i <= n; i++) // \n is newline x = (i-)*h; if (beta <.E-9) // is beta zero? u_exact = x*( + gamma*( -.5*x)); else u_exact = gamma/(beta*beta)*( - exp(-beta*x)) + ( - gamma/beta*exp(-beta))*x; s_o << oform("%4.3f %8.5f %.5e\n", x,u(i),u_exact-u(i)); // test for the case of only one cell: if (n == ) s_o << "u()=" << +.5*gamma*exp(-beta) << "\n"; // write results to the file "SIMULATION.res" Os file ("SIMULATION.res", NEWFILE); // open file for (i = ; i <= n; i++) file << (i-)*h << " " << u(i) << "\n"; file->close(); A is tridiagonal Mat(real) A(n,n) is a dense matrix Save memory and CPU-time: use tridiagonal matrix (this can be quite dramatic savings!) 
MatTri(real) A(n) declares a tridiagonal matrix; use A(i,-1), A(i,0), A(i,1) for the matrix entries A_{i,i-1}, A_{i,i}, A_{i,i+1}. Otherwise the program remains the same.
7 Exercises The heat conduction coefficient. Perform the steps to be a Diffpack user. Type in the Diffpack version of our numerical Hello World! program, compile and run the program 3. Introduce MatTri instead of Mat in the D heat conduction program (Exercise.4 in HPL) The derivation of the D model ends in d ( λ du ) = s(x) dx dx and allows a variable λ λ: heat conduction coefficient The continental crust is typically not homogeneous! λ varies in space! Model simplification: λ = λ(x) (λ = λ(x, y, z) would require a 3D model) Need to discretize the operator ( d λ du ) dx dx Diffpack intro p. 49 Diffpack intro p. 5 Discretization of variable coefficients Finite difference equations Mathematical problem d ( λ(x) du ) = f(x), < x < dx dx u() =, u () =. NEVER expand (λu ) (by the rule of product differentiation) Two-step discretization, first outer operator: ( d λ(x) du dx dx) λ du h dx λ du x=xi x=xi+ dx x=xi Then inner operator: λ du dx x=xi+ u i+ u i λ i+ h Diffpack intro p. 5 Left point, inner points, right point: u = λ i+ (ui+ ui) λ i (ui ui ) = h f i, i =,..., n Arithmetic mean: Harmonic mean: Geometric mean: λ n(u n u n) = hλ n+ f n λ i+ = (λi + λi+) λ i+ = ( + ) λ i λ i+ λ i+ = (λiλi+)/ Diffpack intro p. 5 Nonlinear heat conduction A nonlinear problem Heat conduction typically depends upon the temperature d ( λ(u) du ) = f(x), < x <, u() =, u () = dx dx This is a nonlinear differential equation Using the same discretization reasoning as when λ = λ(x), u = λ i+ (ui+ ui) λ i (ui ui ) = h f i λ n(u n u n) = hλ n+ h f n where λ i+ λ(u i+ ) A nonlinear problem p. 53 A nonlinear problem p. 54 The new problem Solution method Our discrete equations contains λ(u i+ ), i.e., the coefficients that we previously put in the matrix A, now depend on the solution u i and u i+ The linear system can be written as A(u)u = b This is a set of nonlinear algebraic equations The nonlinearity arises from the λ(u)u product in the underlying differential equation We cannot use LU decomposition because A depends on u What can we do? If we only had a linear equation, we would get a linear system Au = b, which know how to solve... Idea: Guess a solution u and use this in λ: d ) (λ(u ) du = f(x) dx dx u is hopefully a better approximation than u This approach suggest an iteration procedure: use solution from last iteration in λ the equation is now linear use the solution technology for (λ(x)u (x)) = f(x) A nonlinear problem p. 55 A nonlinear problem p. 56
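Whichever iteration is used, the midpoint values lambda_{i+1/2} in the schemes above have to be formed from the nodal values; a minimal plain-C++ sketch of the three averages listed earlier follows (the harmonic mean is a common choice when lambda jumps between material layers, since it keeps the flux lambda*u' continuous across the jump).

#include <cmath>

// Three ways to approximate lambda at the midpoint x_{i+1/2} from the
// nodal values lambda_i and lambda_{i+1} (cf. the variable-coefficient
// discretization of (lambda u')' above):
inline double arithmetic_mean(double la, double lb) { return 0.5*(la + lb); }
inline double harmonic_mean  (double la, double lb) { return 2.0/(1.0/la + 1.0/lb); }
inline double geometric_mean (double la, double lb) { return std::sqrt(la*lb); }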
8 Algorithm The complete scheme () Guess a solution u (need not be correct) Solve the recursive equations ( d λ(u k ) du dx dx k ) = f(x), u k () =, k du () = dx For i =,,..., n : ( ) (u λ(u k i ) + λ(u k i+ ) ) k i+ u k i ( (u λ(u k i ) + λ(uk k i )) i ui ) k = h f(x i) until difference between u k and u k is small Small can mean n u k j uk j ɛ j= Pros: may reuse previous code by inserting an evaluation of λ(u i+ ) Cons: slow convergence (faster methods exist) A nonlinear problem p. 57 A nonlinear problem p. 58 The complete scheme () Implementation For i = n Now, λ(u k n )(u k n u k n) = hλ n+ h f(x n) λ n+ = Using the boundary condition gives λ n+ = ( λ(u k n ) + λ(u k n+ )) u k n+ uk n =, k > h ( λ(u k n ) ) + λ(u k n + h) Reuse old program (HeatD) with: Loop around system generation and solution Two arrays uk and ukm Initial guess in ukm New auxiliary variables (for iteration etc.) Function lambda to evaluate λ(u) Update A and b for each step Call A.resetFact() to enable new LU decomposition prior to call to A.factLU() Check for termination upon convergence Set ukm equal uk before new iteration A nonlinear problem p. 59 A nonlinear problem p. 6 The central code segment () The central code segment () int k = ; // iteration counter const int k_max = ; // max no of iterations real udiff = INFINITY; // udiff = uk - ukm const real epsilon =.; // tolerance in termination crit. while (udiff > epsilon && k <= k_max) k++; // increase k by A.fill(.); b.fill(.); // initialize A and b for (i = ; i <= n; i++) if (i == ) A(,) = ; else if (i > && i < n) lambda = lambda(ukm(i-), m); lambda = lambda(ukm(i), m); lambda3 = lambda(ukm(i+), m); A(i,-) =.5*(lambda + lambda); A(i, ) = -.5*(lambda + *lambda + lambda3); A(i, ) =.5*(lambda + lambda3); else if (i == n) A(i,-) = *lambda(ukm(i), m); A(i, ) = - A(i,-); b(i) = -(h*lambda(ukm(i-)+*h,m)+lambda(ukm(i),m)); A.resetFact(); // ready for new factlu A.factLU(); A.forwBack(b,uk); // Gaussian elimination // check termination criterion: udiff = ; for (i = ; i <= n; i++) udiff += sqr(uk(i) - ukm(i)); udiff = sqrt(udiff); s_o << "iteration " << k << ": udiff = " << udiff << "\n"; ukm = uk; // ready for next iteration A nonlinear problem p. 6 A nonlinear problem p. 6 No of iterations Numerical error λ(u) = u m λ(u) = u m 6 5 number of iterations n= -5 numerical error n= no of iterations 4 3 log(error) m m A nonlinear problem p. 63 A nonlinear problem p. 64
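The convergence plots on the next slides fit the error model E = Ch^r, i.e. log E = log C + r log h, to measured errors by linear least squares. A minimal plain-C++ sketch of that fit; the (h, E) pairs below are hypothetical numbers that behave like E ~ h^2.

#include <cmath>
#include <cstdio>
#include <vector>

// Fit E = C*h^r by linear least squares on log E = log C + r*log h.
void fit_convergence_rate(const std::vector<double>& h,
                          const std::vector<double>& E,
                          double& C, double& r)
{
    const int m = static_cast<int>(h.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < m; ++i) {
        const double x = std::log(h[i]), y = std::log(E[i]);
        sx += x; sy += y; sxx += x*x; sxy += x*y;
    }
    r = (m*sxy - sx*sy)/(m*sxx - sx*sx);   // slope = convergence rate
    C = std::exp((sy - r*sx)/m);           // intercept gives log C
}

int main()
{
    // hypothetical measured errors from runs on successively refined grids:
    std::vector<double> h = {0.1, 0.05, 0.025, 0.0125};
    std::vector<double> E = {4.1e-3, 1.0e-3, 2.6e-4, 6.4e-5};
    double C, r;
    fit_convergence_rate(h, E, C, r);
    std::printf("E ~ %.3g * h^%.2f\n", C, r);   // expect r close to 2
    return 0;
}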
9 Convergence Convergence plot () Define the error from iterations: λ(u) = u m where is some norm, e.g., Define the discretization error: E I = u k u k u = n u i n i= log(error) numerical error m=. m=. m=3. E = u u k -8 Basic issue in discretization: how does E vary with the cell size h? Investigation: make E I negligible (E I E ), compute E for different choices of h Common model for relating discretization error to grid size: A nonlinear problem p log(h) A nonlinear problem p. 66 E = Ch r Convergence plot () fit C and r to data (linear least squares): Summary of results log E = log C + r log h λ(u) = ( + u) m y = aξ + b, y = log E, b = log C, ξ = log h, r = a numerical error Does E -4 as h? And how fast? Second-order -6 m=. finite difference approximationsm=. suggest r = -8 m=3. - log(error) log(h) Number of iterations increase with m m = : numerical solution is exact (!) λ(u) = u m : O(h r ) for r (despite our use of O(h ) accurate finite differences!) λ(u) = ( + u) m : O(h ) as expected Explanation: u m gives u (), need a very fine grid around x = to get accurate results Note: theory does not extend well to nonlinear problems; systematic experiments may be an important additional tool A nonlinear problem p. 67 A nonlinear problem p. 68 Vibration of a string Mathematical model: the wave equation Simulation of waves u t = γ u, x (a, b) x - Time- and space-dependent problem - This is a partial differential equation (PDE) - Boundary conditions at x = a, b (u or u/ x) - Initial conditions: known u(x, ) and u t(x, ) Explicit finite difference method: u l+ i = f(u l i, u l i, u l i+, u l i ) Implementation: run through a space-time grid and compute u l+ i for each grid point Simulation of waves p. 69 Simulation of waves p. 7 Derivation of the model () Derivation of the model () y y ρ s T(x+h/) Physical assumptions: the string = a line in D space no gravity forces up-down movement (i.e., only in y-direction) Physical quantities: r = xi + u(x, t)j : position T (x) : tension force (along the string) θ(x) : angle with horizontal direction ϱ(x) : density x T(x-h/) u(x,t) h Physical principle: Newton s second law applied to a small (infinitesimal) part of the string sum of forces = total mass acceleration x Simulation of waves p. 7 Simulation of waves p. 7
10 Derivation of the model (3) Derivation of the model (4) Great mathematicians had great problems with understanding how to set up the mathematical model for a vibrating string Euler, D Alambert and Taylor all made various attempts (which look stupid by today s standards...) Lagrange was the first one to derive the right partial differential equation This happened about years after Newton had presented the mathematics and physics we need to derive this PDE The derivation to be presented here is typical: simple principles, but lots of mathematical details; it s easy to get lost in the details Acceleration: y T(x-h/) ρ s h u(x,t) a = r t = u t j Newton s law applied to a string element: T(x+h/) ( T x + h ) ( T x h ) = ϱ(x) s u (x, t) j t A vector equation with two components x Simulation of waves p. 73 Simulation of waves p. 74 Derivation of the model (5) Derivation of the model (6) y T(x-h/) The tension reads ρ s h u(x,t) T(x+h/) x Divide the first component by h and let h ( ) T cos θ = x Similarly for the second component ( ) ( s ) u T sin θ = ϱ lim x h h t T (x) = T (x) cos θ(x) i + T (x) sin θ(x) j Newton s law in component form T (x + h ) cos θ(x + h ) T (x h ) cos θ(x h ) = T (x + h ) sin θ(x + h ) T (x h ) sin θ(x h ) = ϱ(x) s u t Simulation of waves p. 75 Simulation of waves p. 76 Derivation of the model (7) Derivation of the model (8) We need to determine the limit lim h s/h Assume linear segment, then by Pythagoras: Furthermore, s = h + u, which means that sin θ = i.e., s h tan θ = u x, tan θ + tan θ = = lim h u x + ( u x + ) ( ) u x Altogether this gives [ ( ) ] [ u u ϱ + x t = ( ) ] u u T + x x x which is a nonlinear partial differential equation. Assume small vibrations, i.e., ( u/ x). For small vibrations, θ(x), such that = ( ) T cos θ = (T ( θ x x! +... )) x T This means that T is approximately a constant and that the square roots are Simulation of waves p. 77 Simulation of waves p. 78 Summing up The scaled wave equation problem The governing PDE: String fixed at the ends: String initially at rest: u t = c u x c = T/ϱ u(a, t) = u(b, t) = u(x, ) = I(x), u (x, ) = t We scale the equations (γ, but kept as a label) and arrive at the following initial-boundary value problem: u t = γ u, x x (, ), t > u(x, ) = I(x), x (, ) u(x, ) =, t x (, ) u(, t) =, t >, u(, t) =, t > Simulation of waves p. 79 Simulation of waves p. 8
11 Finite difference approximation () Finite difference approximation () Introduce a grid in space-time Central difference approximations: t t x i = (i )h, i =,..., n t l = l t, l =,,... x u(xi, tl) = ul i ul i + ul i+ + O(h ) h ul i u l i u(xi, tl) = + ul+ i t t + O( t ) Insert these in the PDE u t = γ u x h x Simulation of waves p. 8 Simulation of waves p. 8 Finite difference approximation (3) The computational procedure The PDE has been transformed to a difference equation u l i u l i + ul+ i t = γ ul i ul i + ul i+ h All u values at time levels l and l are assumed known Only one unknown term: u l+ i Can solve for u l+ i explicitly: u l+ i = u l i u l i + γ t ( ) u l h i u l i + u l i+ t t h Can find u l+ i for one i at a time if u at t l and t l is known Need u i and u i for all i to start the algorithm x This scheme is classified as a an explicit finite difference method; no need to solve coupled systems of linear equations ( easier programming!) Simulation of waves p. 83 Simulation of waves p. 84 Initial conditions () Initial conditions () u i = I(xi): evaluate directly A bit more challenging: u t = u i u i t= t = u i = u i It is awkward to have a special first step. Instead we introduce u i = u i + γ t h (u i+ u i + u i ) and use the standard difference equation also at the first step but u i is outside the legal time grid... Idea: eliminate u i by using the discrete PDE at t =, which gives a special formula for the first step: u i = u i + γ t ( ) u h i u i + u i+ Simulation of waves p. 85 Simulation of waves p. 86 Algorithm Diffpack code in F77/C style Define storage u + i, u i, u i for u l+ i Set initial conditions: u i = I(x i),, u l i, ul i and set C = γ t/h i =,..., n Define the artificial quantity u i (i =,..., n ) u i = u i + C (u i+ u i + u i ), Set t = ; while t < t stop t = t + t Update all inner points (i =,..., n ) We use functions in C++ void timeloop (ArrayGen(real)& up, ArrayGen(real)& u, ArrayGen(real)& um, real tstop, real C); void setic (real C, ArrayGen(real)& u, ArrayGen(real)& um); The main program: ArrayGen(real) up (n); // u at time level l+ ArrayGen(real) u (n); // u at time level l ArrayGen(real) um (n); // u at time level l- // get n and Courant number C=dt/dx from the user timeloop (up, u, um, tstop, C); // finite difference scheme u + i = u i u i + C (u i+ u i + u i ) Set boundary condition: u + =, u+ n = Initialize for next step u i = u i, u i = u + i, all i Simulation of waves p. 87 Simulation of waves p. 88
12 The timeloop function The setic function void timeloop (ArrayGen(real)& up, ArrayGen(real)& u, ArrayGen(real)& um, real tstop, real C) int n = u.size(); // length of the vector u (no of grid points) real h =./(n-); // length of grid intervals real dt = C*h; // time step, assumes unit wave velocity!! real t = ; // time setic (C, u, um); // set initial conditions int i; // loop counter over grid points int step_no = ; // current step number while (t <= tstop) t += dt; step_no++; // increase time; count no. of steps // update inner points according to finite difference scheme: for (i = ; i <= n-; i++) up(i) = *u(i) - um(i) + sqr(c)*(u(i+) - *u(i) + u(i-)); up() = ; up(n) = ; // update boundary points: um = u; u = up; // update data struct. for next step void setic (real C, ArrayGen(real)& u, ArrayGen(real)& um) int n = u.size(); // length of the vector u real x; // coordinate of a grid point real h =./(n-); // length of grid intervals real umax =.5; // max string displacement int i; // loop counter over grid points for (i = ; i <= n; i++) // set the initial displacement u(x,) x = (i-)*h; if (x <.7) u(i) = (umax/.7) * x; else u(i) = (umax/.3) * ( - x); for (i = ; i <= n-; i++) // set the help variable um: um(i) = u(i) +.5*sqr(C) * (u(i+) - *u(i) + u(i-)); um() = ; um(n) = ; // dummy values, not used in the scheme Simulation of waves p. 89 Simulation of waves p. 9 Dumping solution to file Visualizing the results We dump the solution at each time point to file such that we can make a movie after the simulation is finished Diffpack has tools for managing a large number of curves on files CurvePlotFile: manager for a collection of curves CurvePlot: a variable that holds a curve Code example: int n = u.size(); // the number of unknowns real h =./(n-); // length of grid intervals CurvePlot plot (plotfile); // a single plot plot.initpair ("displacement", // plot title oform("u(x,%g)",t), // name of function "x", // name of indep. var. oform("c=%g, h=%g",c,h)); // comment for (int i = ; i <= n; i++) // add (x,y) data points plot.addpair (h*(i-) /* x-value */, u(i) /* y value */); plot.finish(); The simulation produces a Diffpack case with name SIMULATION Central files generated in the simulation: SIMULATION.dp logfile for simulation, i.e., runtime SIMULATION.map overview of data files SIMULATION.files explanation of what the files are.simulation_,.simulation_,... the (hidden) data files Simulation of waves p. 9 Simulation of waves p. 9 Animation Varying the Courant number C Make the animation using Diffpack features: curveplotmovie gnuplot SIMULATION.map -.. (script) (program) (name of map file) (ymin) (ymax) Can replace gnuplot by matlab curveplotmovie matlab SIMULATION.map -.. C = γ t, t =.5, h = / h (a) C =. (b) C =.5 Simulation of waves p. 93 Simulation of waves p. 94 Varying the Courant number C Another numerical example Time t=., σ= 3 Time t=., σ= Time t=5., C=. Time t=5., C= (c) C =.8 (d) C = Time t=5., C=.8 Time t=5., C= Simulation of waves p. 95 Simulation of waves p. 96
13 Numerical stability and accuracy We have two parameters: t and h How do we choose t and h? Too large t and h give - too large numerical errors - or in the worst case: unstable solutions Too small t and h require too much computing power Simplified problems can be analyzed theoretically, which yields a guide to choosing t and h Basic result for our wave equation: t h/γ (derived later) Peculiar case: exact solution is obtained by t = h/γ, regardless of h (!!!) D wave equation Simulation of waves p. 97 D wave equation p. 98 A more general wave equation Applications of wave the equation General form of a D/D/3D wave equation for u(x, t): Wave travels with velocity γ = λ u t = [λ(x) u] The operator [λ(x) u] is frequently encountered in this course! In D the operator is written out as ( λ(x, y) u ) + ( λ(x, y) u ) x x y y Vibrations of a string (D) Vibrations of a drum (D) Large destructive water waves (D; water elevation) Sound waves (D: organ pile, flute; 3D: room, space) Light and radio waves (3D) Goal for the next slides: Learn how to discretize a D wave equation with variable coefficients We shall do this by putting together elements we have learned so far D wave equation p. 99 D wave equation p. Example: earthquake-generated waves Principles u t = x u t = [λ(x) u] ( λ u ) + ( λ u ) x y y Time discretization as for D wave eq. Space discretization: generalized from (λ(x)u ) Boundary conditions: generalized from u() = and u () = Initial conditions as for D wave eq. Overall algorithm as for D wave eq. Z Y X D wave equation p. D wave equation p. Note: scales are distorted! Earthquake close to seamount Discretization Effect of earthquake: Sudden elevation of the surface, modeled here as a prescribed initial surface at rest Seek approximation u Domain = segment of l i,j on a rectangular grid to u(xi, yj, tl) the ocean x i = (i ) x, y j = (j ) y, t l = l t Approximate derivatives by central differences The scheme The finite difference scheme takes the form u l+ i,j Exercise: derive the expression for [ u] l i,j = u l i,j u l i,j + [ u] l i,j ( A spatial term like y ( λ y i,j+ [ ] l u t i,j λ u y ul+ i,j u l i,j + ul i,j t ) takes the form ( ) u l i,j+ u l i,j y λ i,j ( )) u l i,j u l i,j y D wave equation p. 3 D wave equation p. 4
14 Algorithm (BC: u = ) Algorithm (BC: u = ) DEFINITIONS: Storage u + i,j, ui,j, and u i,j for ul+ i,j, ul i,j, and ul i The whole grid: ( ) = i =,..., n x, j =,..., n y Inner points: ( ) = i =,..., n x, j =,..., n y INITIAL CONDITIONS: u i,j = I(x i, y j), (i, j) ( ) SET ARTIFICIAL QUANTITY u i,j : While t t stop t t + t UPDATE ALL INNER POINTS: u + i,j = ui,j u i,j + [ u]i,j, INITIALIZE FOR NEXT STEP: (i, j) ( ) u i,j = ui,j, ui,j = u+ i,j, (i, j) ( ) u i,j = ui,j + [ u]i,j, Set t = (i, j) ( ) D wave equation p. 5 D wave equation p. 6 A model for water waves Scaling Physical assumption: long waves in shallow water Corresponding mathematical model: u t = [gh(x) u ] Physical quantities: u(x, y, t) : water surface elevation g : acceleration of gravity H(x, y) : still-water depth Boundary condition at coastline u n u n = Let H c be a characteristic value of H(x, y) We introduce new variables x = x/h c, ȳ = y/h c, t = t/ H c/g, λ = H/H c ū = u/u c (u c cancels and can be arbibtary) Inserted in the equation: g (the driving force) is scaled away ū t = [λ ] ū (full reflection of waves) D wave equation p. 7 D wave equation p. 8 Implementing boundary conditions () Implementing boundary conditions () There are two ways of handling u/ n = conditions: Ghost cells at boundary with explicit updating of fictitious values Modify stencil at boundary We choose the second option, as this allows direct output of u i,j to a visualization program, i.e., no need to remove ghost cells. Consider the boundary i = (x=const) Boundary condition: Discrete version: u n u x = u,j u,j x u,j is outside the legal mesh = u,j = u,j Use the discrete PDE for i = and eliminate u,j, that is, just replace u,j by u,j D wave equation p. 9 D wave equation p. Modified difference operator Efficiency issues The boundary condition modifies the finite difference equations, and this can be viewed as modifying the operator [ u] i,j (in this example at i = ) according to [ u],j:i i+ ( ) ( t ) λ x +,j (u,j u,j) λ,j (u,j u,j) + ( ) ( t ) λ y,j+ (u,j+ u,j) λ,j (u,j u,j ) Two things should be considered: Loops should be ordered such that u(i, j) is traversed in the order it is stored. In Diffpack ArrayGen objects are stored columnwise. Therefore the loop should read: for (j = ; j <= ny; j++) for (i = ; i <= nx; i++) u(i,j) =... One should avoid if statements in loops if possible (they prevent many compiler optimization techniques); hence we will have separate loops over inner and boundary points. Remark I: Debug code before optimizing it!! Remark II: Focus on a readable and maintainable code before thinking of efficiency D wave equation p. D wave equation p.
15 Updating of internal points Updating of boundary points () We define a function for updating the solution: WAVE(u +, u, u, a, b, c) This function reads, at inner points, u + i,j = aui,j bu i,j + c[ u]i,j, (i, j) ( ) i =, j =,..., n y ; u + i,j = aui,j bu i,j + c[ u]i,j:i i+, i = n x, j =,..., n y ; u + i,j = aui,j bu i,j + c[ u]i,j:i+ i, j =, i =,..., n x ; u + i,j = aui,j bu i,j + c[ u]i,j:j j+, j = n y, i =,..., n x ; u + i,j = aui,j bu i,j + c[ u]i,j:j j+, D wave equation p. 3 D wave equation p. 4 Updating of boundary points () Modified algorithm (BC: u/ n = ) i =, j = ; u + i,j = aui,j bu i,j + c[ u]i,j:i i+,j j+ i = n x, j = ; u + i,j = aui,j bu i,j + c[ u]i,j:i+ i,j j+ i =, j = n y; u + i,j = aui,j bu i,j + c[ u]i,j:i i+,j+ j i = n x, j = n y; u + i,j = aui,j bu i,j + c[ u]i,j:i+ i,j+ j DEFINITIONS: as above INITIAL CONDITIONS: u i,j = I(x i, y j), (i, j) ( ) SET ARTIFICIAL QUANTITY u i,j : WAVE(u, u, u,.5,,.5) Set t = D wave equation p. 5 D wave equation p. 6 Modified algorithm (BC: u/ n = ) Diffpack/C++ written in F77/C style While t t stop t t + t UPDATE ALL POINTS: WAVE(u +, u, u,,, ) INITIALIZE FOR NEXT STEP: u i,j = ui,j, ui,j = u+ i,j, (i, j) ( ) #include <Arrays_real.h> #include <FieldLattice.h> #include <SimResmtv.h> src/fdm/intro/waved // We define a macro LaplaceU to save typing of long // finite difference formulas. For example, // // #define mac(x) q(i,j-x) // // defines a macro mac(x) and any text mac(i+) will then be // transformed to q(i,j-i+) by the C/C++ preprocessor (cpp). #define LaplaceU(i,j,im,ip,jm,jp) \ sqr(dt/dx)*\ (.5*(lambda(ip,j )+lambda(i,j ))*(u(ip,j )-u(i,j )) \ -.5*(lambda(i,j )+lambda(im,j ))*(u(i,j )-u(im,j )))\ +sqr(dt/dy)*\ (.5*(lambda(i,jp)+lambda(i,j ))*(u(i,jp)-u(i,j )) \ -.5*(lambda(i,j )+lambda(i,jm))*(u(i,j )-u(i,jm))) D wave equation p. 7 D wave equation p. 8 More code... More code... void WAVE (ArrayGenSel(real)& up, const ArrayGen(real)& u, const ArrayGenSel(real)& um, real a, real b, real c, const ArrayGenSel(real)& lambda, real dt, real dx, real dy) int nx, ny; up.getdim (nx, ny); int i,j; // update inner points according to finite difference scheme: for (j = ; j <= ny-; j++) for (i = ; i <= nx-; i++) up(i,j) = a**u(i,j) - b*um(i,j) + c*laplaceu(i,j,i-,i+,j-,j+); // update boundary points (modified finite difference schemes): i=; for (j = ; j <= ny-; j++) up(i,j)=a**u(i,j)-b*um(i,j) +c*laplaceu(i,j,i+,i+,j-,j+); i=nx; for (j = ; j <= ny-; j++) up(i,j)=a**u(i,j)-b*um(i,j) + c*laplaceu(i,j,i-,i-,j-,j+); j=; for (i = ; i <= nx-; i++) up(i,j)=a**u(i,j)-b*um(i,j) + c*laplaceu(i,j,i-,i+,j+,j+);... int main (int argc, const char* argv[]) initdiffpack (argc, argv); s_o << "Give number of intervals in x and y direction: "; int h; s_i >> h; int nx = h+; s_i >> h; int ny = h+; s_o << "Give width of domain in x direction: "; real wx; s_i >> wx; s_o << "Give width of domain in y direction: "; real wy; s_i >> wy; // ArrayGenSel is like ArrayGen, but has increased functionality // for finite difference methods // (we need it for the FieldLattice object for visualization) ArrayGenSel(real) up (nx,ny); // u at time level l+ ArrayGenSel(real) u (nx,ny); // u at time level l ArrayGenSel(real) um (nx,ny); // u at time level l- ArrayGenSel(real) lambda (nx,ny); // variable coefficient const real dx = wx/(nx-); // length of grid intervals in x dir. const real dy = wy/(ny-); // length of grid intervals in y dir. s_o << "Give time step length: ( gives dt=dx): "; real dt; s_i >> dt; D wave equation p. 9 D wave equation p.
16 Time t= Time t= Time t= Time t= Time t= Time t= More code... Visualizing the results // fill lambda with values... (see source file) // fill um with initial values: um.fill(.); // set the help variable um: WAVE (um, u, um,.5,,.5, int step_no = ; while (t <= tstop) lambda, dt, dx, dy); // current step number t += dt; // increase time by the time step step_no++; // increase step number by s_o << "t=" << t << "\n"; WAVE (up, u, um,,,, lambda, dt, dx,dy); um = u; u = up; // update data struct. for next step // dump solution to file... (see source file) unix> plotmtv -colorps W7.tmp.mtv D wave equation p. D wave equation p. Hyperbolic equations () Linear PDEs can be divided into basic categories: elliptic, parabolic, and hyperbolic Nature of some PDEs A typical hyperbolic equation is the wave equation which has the general solution u t = γ u x u(x, t) = f(x γt) + g(x + γt) i.e., two waves propagating to the left and to the right The functions f and g are determined from u(x, ) and u(x, )/ t. Disturbances travel with a finite wave speed. Reference: HPL appendix A.5 Nature of some PDEs p. 3 Nature of some PDEs p. 4 Changing boundary conditions Observations Boundary conditions: u(, t) =, u(, t) = Changing the BC influences the solution at only some (x, t) points (e.g., at the midpoint, after the initial disturbance has left, a change in the BC at x = is not felt before the pulse has been reflected from the boundary) An initial disturbance is transported without change of shape (essential for oral communication!) Boundary conditions: u(, t) =, u x(, t) = Nature of some PDEs p. 5 Nature of some PDEs p. 6 BC: let waves leave the domain Other hyperbolic equations () Boundary conditions: u(, t) = and ( u t + γu x ) x= = The wave propagates out of the domain, exactly as we want for ocean waves! Unfortunately, the condition is hard to generalize successfully to physically relevant D cases Uni-directional wave or transport equation: where I(x) is the initial condition u(x, ) u t + γ u x = u(x, t) = I(x γt) Systems of wave equations (here: long ocean waves) η t u t v t = x (uh) ( y (vh) H ) t = η x, x, t > = η y, x, t > Nature of some PDEs p. 7 Nature of some PDEs p. 8
17 Other hyperbolic equations () Elliptic equations Multi-dimensional standard wave equation: u t = (λ u) Applications: radio waves, light, sound, membranes,... Nonlinear hyperbolic conservation law: Elliptic equations are stationary (equilibrium or steady-state physical conditions) Example: u (x) =, u() =, u () = Let us look at the effect of changing u () = to u() = or a system of such equations: u t + x f(u) = u t + x f(u) = Applications: gas dynamics, oil reservoir flow Nature of some PDEs p. 9 Nature of some PDEs p. 3 Changing boundary conditions Multi-dimensional elliptic equations u () = versus u() = : u()= u ()= The Poisson equation is a typical multi-dimensional elliptic equation: u(x) = f(x) or (λ(x) u(x)) = f(x) Another elliptic equation: Observe: All points in the interior are effected by the boundary condition! And the solution is smooth. (λ(x) u(x)) + αu = f(x) Nature of some PDEs p. 3 Nature of some PDEs p. 3 The Helmholtz equation Parabolic equations () Sometimes one solves the wave equation u t = γ u by assuming periodical waves in time: A D heat equation is a typical parabolic equation: u t = u +, u(x, ) =, u(, t) = x Typically for parabolic equations: u(x, y, z, t) = exp ( iωt)û(x, y, z), This results in the famous Helmholtz equation i = time derivative = elliptic counterpart That is, as t, the eq. above tends to the elliptic equation û + k û =, k = ω/γ The Helmholtz equation is not an elliptic equation (wrong sign!) u =, u(x, ) =, u(, t) = x Nature of some PDEs p. 33 Nature of some PDEs p. 34 Changing boundary conditions Parabolic equations () Let us see the effect of two different conditions at x = : u(, t) = versus u x(, t) =.5.4 Time t=.3 Multi-dimensional parabolic (heat) equation: u t = (λ(x) u) + f Recall: as t, we normally have that u/ t, and the parabolic equation approaches the elliptic counterpart As for the elliptic counterpart, all points in the interior are effected by the boundary condition Nature of some PDEs p. 35 Nature of some PDEs p. 36
18 Numerical methods Why solutions of elliptic eqs. are smooth Hyperbolic equations: explicit schemes, must choose t h Elliptic equations: all grid-point values are coupled in a linear system, time-consuming to solve, requires sophisticated iterative methods Parabolic equations: can use explicit schemes, if t h, but implicit schemes coupling all points (like in elliptic equations) are preferred Solutions of elliptic equations are smooth! u (x) = f(x) with noisy f(x): ( ) u = f(x)dx dx Twice integration of noisy f(x) smooth u(x) (λu ) = with noisy coefficient λ(x): x dτ u(x) = const λ(τ) Rough λ smooth u Nature of some PDEs p. 37 Nature of some PDEs p. 38 More general result from Fourier series Smoothness and variational calculus u (x) = f(x) solved by Fourier series (here: sine series): u(x) = u i sin iπx i= Expanding f(x) also in a Fourier series: u = is equivalent with min v d v(x) u is smoothest among all v over f(x) = f i sin iπx i= Inserting this in the equation gives u i f i/i i.e., the u series converges faster than the f series and from Fourier series theory this means that u is smoother than f Nature of some PDEs p. 39 Nature of some PDEs p. 4 Outline Analysis of difference schemes Try to explain the observed numerical behavior (accuracy, stability) by mathematical means Tool: exact solution of the discrete equations Main focus on the wave and heat equations Intro to classical topics like truncation error and von Neumann stability Analysis of difference schemes p. 4 Analysis of difference schemes p. 4 Operator notation () Operator notation () Finite difference schemes are often long and difficult to read compared to the underlying PDE Operator notation gives condensed expressions much like the PDE Define u l [δ xu] l i+ i,j,k,j,k ul i,j,k x with similar definitions of δ y, δ z, and δ t Another difference: [δ xu] l i,j,k ul i+,j,k ul i,j,k x Compound difference (D now, to save index writing): One-sided forward difference: and the backward difference: [δ xδ xu] l i = ( u l h i u l i + u l ) i+ [δ + x u] l i ul i+ ul i h [δ x u] l i ul i ul i h Operator notation for arithmetic average: [u x ] l i ( ) u l i+ + u l i Analysis of difference schemes p. 43 Analysis of difference schemes p. 44
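Written out in full, the one-dimensional operators defined above are (a summary, with delta_y, delta_z and delta_t defined analogously, using Delta y, Delta z and Delta t in place of h):

[\delta_x^{+} u]_i^l = \frac{u_{i+1}^l - u_i^l}{h}, \qquad
[\delta_x^{-} u]_i^l = \frac{u_i^l - u_{i-1}^l}{h}, \qquad
[\delta_x\delta_x u]_i^l = \frac{u_{i-1}^l - 2u_i^l + u_{i+1}^l}{h^2}

The arithmetic average is read here with half-point arguments, [\bar{u}^{\,x}]_i^l = \tfrac{1}{2}\bigl(u_{i-1/2}^l + u_{i+1/2}^l\bigr), consistent with how \lambda_{i+1/2} is used earlier.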
19 u(x,) u(x,.) Operator notation (3) Typical solution of a wave equation Put the whole equation inside brackets: [δ xδ xu = f] i This is a finite difference scheme for u = f Example: (λu ) = is discretized as [δ xλ x δ xu = ] i Another example, the heat equation: ( u ) t = κ u x + u y Wave equation: Typical solution for arbitrary k u t = γ u x u = Ae i(kx ωt) Only real or imaginary part has physical interpretation, e.g., the real part is A cos(kx ωt) Inserting solution gives the dispersion relation ω = ω(k): ω = ±γk u = Ae ik(x±γt) [δ + t u = κ (δ xδ xu + δ yδ yu)] l i,j Analysis of difference schemes p. 45 Analysis of difference schemes p. 46 General solutions Physical interpretation Can build general solutions as Fourier series...or Fourier integrals u = k A ke i(kx ωt) u = A(k)e i(kx ωt) dk What do A, k, and ω in really mean? u = Ae i(kx ωt) A λ The basic component in these general solutions is - c which we will study in the following e i(kx ωt) λ = π/k is the wave length, c = ω/k = γ is the wave velocity Analysis of difference schemes p. 47 Analysis of difference schemes p. 48 A heat equation The damping in the heat equation The heat (diffusion) equation: u t = u κ x Typical solution u = A exp (i(kx ωt)) for arbitrary k Inserting solution gives ω = iκk (called the dispersion relation) The form of the solution: u = Ae κkt e ikx Can build more complicated solutions through Fourier series or integral A general solution component: u = Ae κkt e ikx damps short waves (big k) significantly (exp ( k )) Example: add two components, k = π and k = π (choose κπ = ), consider the imaginary part: u(x, t) = e t sin πx +.6 e t sin πx This is a sine with period plus a 6 percent perturbation which oscillates times faster The damping factor of the perturbation is exp ( ) that of the first term; after t = / this damping os about 5 5 Recall: no damping in the wave equation Analysis of difference schemes p. 49 Analysis of difference schemes p. 5 Plot of the damping Solution of discrete equations Consider the explicit finite difference for the wave equation, with u l j as unknown A typical solution reads u l j = Ae i(kjh ωl t) = Ae i(kx ωt) same structure as the solution of the PDE! Inserting this form of u l i in the scheme: (numerical dispersion relation) ω = ω(k, h, t) The velocity c = ω/k should be γ (constant), but will now depend on k, h, and t Hopefully, c γ as h, t Analysis of difference schemes p. 5 Analysis of difference schemes p. 5
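The next slides derive the numerical dispersion relation of the explicit wave scheme, sin(omega~ Delta t/2) = C sin(kh/2) with Courant number C = gamma*Delta t/h. Once it is known, the numerical phase velocity c~ = omega~/k is easy to tabulate and compare with gamma; a minimal plain-C++ sketch (h and the Courant numbers are illustrative values):

#include <cmath>
#include <cstdio>

// Numerical dispersion relation of the explicit wave scheme:
//   sin(omega*dt/2) = C*sin(k*h/2),  C = gamma*dt/h,
// i.e. omega = (2/dt)*asin(C*sin(k*h/2)); compare c~ = omega/k with gamma.
int main()
{
    const double gamma = 1.0;                     // wave velocity in the PDE
    const double h = 0.1;                         // spatial cell size (illustrative)
    const double Courant[] = {1.0, 0.8, 0.5, 0.1};

    for (double C : Courant) {
        const double dt = C*h/gamma;
        std::printf("C = %.2f\n", C);
        for (int m = 1; m <= 6; ++m) {
            const double kh = 0.5*m;              // dimensionless wave number k*h
            const double k  = kh/h;
            const double omega = (2.0/dt)*std::asin(C*std::sin(0.5*kh));
            std::printf("  kh = %3.1f   c~/gamma = %.4f\n", kh, omega/(k*gamma));
        }
    }
    return 0;
}

For C = 1 every entry equals 1, matching the peculiar case noted earlier that Delta t = h/gamma reproduces the exact solution; for C < 1 the shortest waves propagate too slowly.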
20 The wave equation scheme Stability () Inserting u l j = exp (i(kjh ωl t)) in the wave equation scheme gives sin ω t = ± γ t h sin kh Can solve for ω and then we have an analytical solution of the discrete equations: ω = ± ( γ t t arcsin h sin kh ) Can assess the accuracy by plotting ω ω Better: plot c = ω/k as a function of kh for different Courant (γ t/h) numbers Can assess the accuracy by investigating ω ω = 4 γk3 (h γ t ) + O(h t, h 4, t 4 ) Analysis of difference schemes p. 53 We know that the exact solution of the wave equation PDE contains no damping or no growth of a wave component, i.e., ω is real A numerical wave component should exhibit the same qualitative behavior...it can be slightly damped, but not amplified (for sufficiently large t the wave becomes arbitrarily large) Should have real ω (or a small negative imaginary part only, i.e., slight damping) The equation for ω: sin ω t = ± γ t h sin kh Analysis of difference schemes p. 54 Stability () The effect of round-off errors The equation for ω, can also have complex ω sin ω t = ± γ t h sin kh Complex ω will occur in conjugate pairs, i.e., one root has positive imaginary part, leading to wave growth, and cannot be allowed Only real ω can be accepted sin = const sin the const must be in [, ], here or γ t h Consider the initial condition u(x, ) = A exp (ikx) for the wave eq. General solution: Perturb initial condition: u(x, t) = Ae i(kx ωt) û(x, ) = Ae ikx + ae ikx, a A This always happens on a computer, because of round-off errors If t h/γ, no numerical wave components are damped or amplified and the solution reads u l j = Ae i(kjh ωl t) + ae i(kjh ωl t) Initial perturbation (round-off errors) is not amplified t h γ which is the stability criterion Analysis of difference schemes p. 55 Analysis of difference schemes p. 56 Consequences of stability Accuracy If t > h/γ, at least one wave component starts to grow in time, i.e., the initial perturbation is amplified After some time, the solution is completely nonphysical When a program gives nonsense solutions, recall that the reason can either be a bug or a too large t! The exact and numerical solutions, exp (i(kx ωt)) and exp (i(kx ωt)) have the same basic structure, only the ω values differ Can assess the numerical accuracy as E ω = ω ω Can plot E ω or make a Taylor series expansion in terms of the grid and physical parameters: E ω = 4 γk3 (h γ t ) + O(h t, h 4, t 4 ) Errors go to zero as h, t It turns out that E ω = if t = h/γ (!) Analysis of difference schemes p. 57 Analysis of difference schemes p. 58 Summary of numerical properties The heat equation; stability The wave equation: Accuracy: O(h, t ), but exact curves can also be produced Special result: C γ t/h = implies that the numerical solution is exact Stability: t h/γ The method of analysis applies to linear, homogeneous, time-dependent equations with constant coefficients The heat equation: can be discretized by u t = u κ x [δ + t u = κδ xδ xu] l j (explicit forward scheme, u l+ i = old values) Inserting discrete wave component u l j = Ae i(kjh ωl t) = Aξ l e ikjh, ξ = e i ω t in the numerical scheme results in ξ = κ 4 t h sin kh Analysis of difference schemes p. 59 Analysis of difference schemes p. 6
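Requiring no growth, |xi| <= 1, for all wave numbers gives the stability criterion stated on the next slide, Delta t <= h^2/(2 kappa). A minimal plain-C++ check of the amplification factor just derived, xi = 1 - 4 kappa Delta t/h^2 sin^2(kh/2); the worst case is the shortest wave the grid can carry, kh = pi (kappa, h and the two time steps are illustrative values):

#include <cmath>
#include <cstdio>

// Amplification factor of the explicit scheme for u_t = kappa*u_xx:
//   xi(k) = 1 - 4*kappa*dt/h^2 * sin^2(k*h/2).
// |xi| <= 1 for all k requires dt <= h^2/(2*kappa).
double amplification(double kappa, double dt, double h, double kh)
{
    const double s = std::sin(0.5*kh);
    return 1.0 - 4.0*kappa*dt/(h*h)*s*s;
}

int main()
{
    const double pi = std::acos(-1.0);
    const double kappa = 1.0, h = 0.05;          // illustrative values
    const double dt_limit = 0.5*h*h/kappa;       // stability limit h^2/(2 kappa)

    std::printf("xi(kh=pi) at the limit:      %6.3f\n",
                amplification(kappa, dt_limit, h, pi));        // -1.000
    std::printf("xi(kh=pi) at 1.2x the limit: %6.3f\n",
                amplification(kappa, 1.2*dt_limit, h, pi));    // -1.400 (unstable)
    return 0;
}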
21 Stability criterion Truncation error Damping, i.e. no growth, implies ξ With this leads to as the stability criterion ξ = κ 4 t h sin kh t h κ PDE: L(u) = f Numerical approximation: u, L (u ) = f The truncation error is defined as τ = L (u) f i.e. τ reflects the residual when the analytical solution is inserted in the numerical scheme Computational technique: expand analytical u(x) at x i± in Taylor series about x i: u(x i±) = u(x i) ± u x h + u xi x h ± 3 u xi 6 x h 3 xi + 4 u 4 x h 4 + xi Analysis of difference schemes p. 6 Analysis of difference schemes p. 6 Truncation error; example () What is the truncation error? Problem: Don t multiply by h! (τ L ) Insert Taylor expansions: L(u) u (x) ui ui + ui+ L (u ) h τ measures the error in the discrete equations when the exact solution of the continuous problem is inserted That is, τ measures the error in an equation (this error is called a residual), not the error in the solution u u Hopefully, τ reflects the true error u u τ = u (x i) + f(x i) + u (x)h + O(h 4 ) PDE is fulfilled pointwise: u (x i) + f(x i) = Second order scheme because: τ = u (x)h + O(h 4 ), τ O(h ) Analysis of difference schemes p. 63 Analysis of difference schemes p. 64 Truncation error; example () von Neumann stability analysis () Model problem: the D wave equation L(u) u t γ u x L (u ) ul i u l i + ul+ i t γ ul i ul i + ul i+ h τ = u t + 4 u t 4 t + O( t 4 ) γ u x 4 u γ x 4 h + O(h 4 ) τ = O( t, h ) Make an equation for the numerical error (see later) Seek discrete solution for the error: e l j = k ξ l exp (ikjh) Insert e l j in error equation, compute for a single component k: e l j = exp (ikjh) Common stability requirement: e l j < ξ Gives a condition on t Very similar to discussing numerical dispersion relations Analysis of difference schemes p. 65 Analysis of difference schemes p. 66 von Neumann stability analysis () Consistency and convergence D heat equation: u t = κu xx Perturbed solution v: v t = κv xx Equation for error e = u v: e t = κe xx Inserting e l j = ξl exp (ikjh): ξ ξ : ξ t = 4κ h sin kh ξ = 4κ t h D wave equation: u tt = c u xx, t h κ u: solution of continuous problem u : solution of discrete problem : mesh parameter (h, t etc) Definition of consistency: τ as Example: τ = u (x i)h + O(h 4 ) as h Interpretation: The analytical solution fulfulls the discrete equations as Convergence: u u as τ does not imply u u t h c Analysis of difference schemes p. 67 Analysis of difference schemes p. 68
22 Lax theorem Convergence consistency and stability easy tool for proving convergence! Intro to finite elements Analysis of difference schemes p. 69 Intro to finite elements p. 7 Features of the method Basic principles Flexibility Straightforward handling of complicated geometries Easy to construct higher-order approximations Broad spectrum of applications Popular method for demanding engineering applications Strong mathematical foundation Finite difference method: Finite element method: u ui+ ui x h M u(x) û(x) = u jn j(x) j= N j(x): prescribed functions u j: unknown parameters Need M equations for determining u j Optimal goal: find u j such that u û is minimized Realistic goal: find u j such that the residual is small Intro to finite elements p. 7 Intro to finite elements p. 7 A least-squares method Least squares: example We look at a general PDE Boundary value problem: L(u) =, x u (x) = f(x), x (, ), u() = u() = Insert û = j ujnj for u, but L(u) is then not zero, L(û) = R, R = R(u,..., u M; x) Idea: adjust u,..., u n such that R (u,..., u M; x)d is minimized with respect to u,..., u M Result: R R u i d =, i =,..., M. N j(x) = sin jπx, û = j uj sin jπx Boundary conditions are satisfied since each term vanishes (sin iπ =, sin iπ = ) Residual: Least-squares equations: M R = u jn j (x) + f(x), j= R u i = N i (x) M u jn j (x) + f(x) N i (x)dx = j= algebraic system of equations for u,..., u M Intro to finite elements p. 73 Intro to finite elements p. 74 Writing up a system of linear equations Weighted residual method (WRM) The least-squares equations M u jn j (x) + f(x) N i (x)dx = j= is a system of linear equations To see this, we write the system on standard form Au = b or n A i,ju j = b i, i =,..., n j= Interchange integration and summation, factor u j out: n ( ) N i N j dx u j = fn i dx j= Ai,j bi Intro to finite elements p. 75 We look at a general PDE again L(u) =, x Insert û = j ujnj for u and obtain a residual R: R L(û), R = R(u,..., u M; x) Require R to be zero in a weighted mean: RW i(x)d =, i =,..., M W i(x): prescribed weighting functions Linear system for u,..., u M Galerkin s method: W i = N i (common choice - often optimal) Observation: least squares = WRM with W i = R/ u i Intro to finite elements p. 76
23 Galerkin: example Writing up the linear system Boundary value problem: N j(x) = sin jπx u (x) = f(x), x (, ), u() = u() = Boundary conditions are satisfied Residual: Galerkin equations: M R = u jn j (x) + f(x) j= M u jn j (x) + f(x) N i(x)dx = j= We interchange summation and integration to write the equations on standard form j Ai,juj = bi such that we can identify the coefficient matrix and the right-hand side (these must be known before we can call up software to solve the linear system) A i,j = N i(x)n j (x)dx, b i = f(x)n i(x)dx Intro to finite elements p. 77 Intro to finite elements p. 78 Linear algebra interpretation () Linear algebra interpretation () Recall from linear algebra: (u, v) = v V u = or u V (u V ) Define function space V spanned by Define inner product B = N, N,..., N M (u, v) = uvd With these definitions we can redefine/interpret the Galerkin method as a geometric or linear algebra approach Galerkin s method: Find M û = u jn j V j= such that the resulting residual is orthogonal to V : (R, v) = v V ( R V or R ) We hope that V contains the "important" functions in the problem R is "small" Hope for convergence: û u as M, when R Intro to finite elements p. 79 Intro to finite elements p. 8 Collocation methods A worked example () Force the residual to vanish at M distinct points Collocation: R(u,..., u M; x [i] ) =, Cf. the finite difference method: the PDE is fulfilled at M points Note: collocation also arise from WRM with W i(x) = δ(x x [i] ) (Recall: f(x)δ(x x)dx = f(x)) Subdomain collocation: Rd =, i i =,..., M i =,..., M, = M i= i Problem: u (x) = f(x), x (, ), u() = u() = Approximation: M u(x) û(x) = u jn j(x) j= Force boundary conditions: N j() = N j() =, i =,..., M Intro to finite elements p. 8 Intro to finite elements p. 8 A worked example () A worked example (3) Choices of N j(x): Least squares: N j(x) = sin jπx N j(x) = x j ( x) M N i (x)n j (x)dx u j = f(x)n i (x)dx j= Galerkin s method: M N i(x)n j (x)dx u j = f(x)n i(x)dx j= Observation: N i = sin iπx A i,j =, i j Therefore (Galerkin and least squares): u j = π j f(x) sin jπxdx No need for solving linear systems! Intro to finite elements p. 83 Intro to finite elements p. 84
24 A worked example (4) A worked example (5) Collocation: The residual is forced to vanish at M points x [],..., x [i],..., x [M] With N j = sin jπx the coefficient matrix A i,j = N j (x[i] ) is in general full (dense), e.g, A i,j = sin[j(i )hπ], i =,..., M, h = /(M ) called the collocation points Equivalent view: use û in PDE and let the PDE be fullfilled at the collocation points M N j (x [i] )u j = f(x [i] ), j= i =,..., M which is a linear system with A i,j = N j (x[i] ) as coefficient matrix and b i = f(x [i] ) as right-hand side Intro to finite elements p. 85 Intro to finite elements p. 86 Ill-conditioning Cure for ill-conditioning u =, u() = u() = u = x( x)/ û = M j= ujxj ( x) contains the exact solution Galerkin gives u = /, u j =, j > On the computer (with 6 digits): M (u,..., u M) (-.5,.) 4 (-.5,.39, -.79,.48 ) 6 (-.57,.96, -.733,.756, -.877,. 73 ) 8 ( , -.5,.485,.669, ,.5698,...) The method does not converge! N j = x j ( x) almost linearly dependent for j > 5 Ill-conditioned coefficient matrix, round-off errors accumulate Choose orthogonal or nearly orthogonal N i Fourier series provide orthogonal N i Generalized Fourier series, using e.g. Legendre polynomials, Bessel functions, Laguerre polynomials, etc., also give orthogonal N i Finite elements provide nearly orthogonal N i Intro to finite elements p. 87 Intro to finite elements p. 88 A new view on Fourier series () A new view on Fourier series () We can interpret Fourier series, e.g., u û = u j sin jπx on (, ) j= u j = u(x) sin jπxdx as a special case of a Galerkin or least-squares method Solve u = f by an approximate method Let = (, ), choose N i = sin iπx Since these N i are orthogonal, NiNjdx = δij Hence, the matrix becomes diagonal, and we can solve the system by hand: u j = f(x) sin jπxdx These are the well-known coefficients in the Fourier sine series of f(x) Fourier series can be viewed as a least-squares or Galerkin method u û = j u jn j Residual: R = û f = j ujnj f Least squares and Galerkin: M ( j= ) N in jd u j = fn id, i =,..., M Treatment of boundary conditions () Intro to finite elements p. 89 Treatment of boundary conditions () Intro to finite elements p. 9 u = on N i = on u = ψ(x) on : M û = ψ(x) + u jn j(x), j= N j = on What about u() = U L and u () = β? M u(x) û(x) = U L + u jn j(x), N j() = j= Integration by parts: ψ = ψ on Example: u() = U L, u() = U R, choose ψ = xu R + ( x)u L Note: ψ is not uniquely determined û N idx = û N idx [û N i] = û N idx = fn idx (Galerkin) fn idx fn idx + βn i() Recall that N i() = due to u() = U L. Must have N i(). Intro to finite elements p. 9 Remark: could add an equation/constraint û = u L, e.g., Intro to finite elements p. 9
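Returning to the ill-conditioning example a few slides back, the effect can be reproduced with a few lines of code. The sketch below (plain C++; the symmetric, integrated-by-parts form A_{ij} = ∫N_i'N_j' dx, the analytical matrix entries and the values of M are my own choices) solves −u'' = 1, u(0) = u(1) = 0, with the basis N_j = x^j(1 − x). The exact answer is u_1 = 1/2 and u_j = 0 for j > 1; because the basis functions are almost linearly dependent, the computed coefficients drift away from these values as M grows. How large the drift is depends on the machine, but the trend with M is robust.

// Sketch (plain C++): Galerkin method for -u'' = 1, u(0)=u(1)=0, with the
// nearly linearly dependent basis N_j(x) = x^j (1-x), j=1..M. The entries
// A_ij = int N_i' N_j' dx and b_i = int N_i dx are evaluated analytically,
// so any drift from the exact answer (u_1 = 1/2, u_j = 0 for j>1) comes from
// round-off in the ill-conditioned linear system.
#include <cmath>
#include <cstdio>
#include <initializer_list>
#include <utility>
#include <vector>

int main()
{
  for (int M : {2, 8, 12}) {
    std::vector<std::vector<double>> A(M, std::vector<double>(M));
    std::vector<double> b(M);
    for (int i = 1; i <= M; ++i) {
      for (int j = 1; j <= M; ++j)
        A[i-1][j-1] = double(i) * j / (i + j - 1)
                      - double(i * (j + 1) + (i + 1) * j) / (i + j)
                      + double(i + 1) * (j + 1) / (i + j + 1);
      b[i-1] = 1.0 / (i + 1) - 1.0 / (i + 2);
    }
    // Gaussian elimination with partial pivoting:
    for (int k = 0; k < M; ++k) {
      int p = k;
      for (int r = k + 1; r < M; ++r)
        if (std::fabs(A[r][k]) > std::fabs(A[p][k])) p = r;
      std::swap(A[k], A[p]);  std::swap(b[k], b[p]);
      for (int r = k + 1; r < M; ++r) {
        const double m = A[r][k] / A[k][k];
        for (int c = k; c < M; ++c) A[r][c] -= m * A[k][c];
        b[r] -= m * b[k];
      }
    }
    std::vector<double> u(M);
    for (int k = M - 1; k >= 0; --k) {
      double s = b[k];
      for (int c = k + 1; c < M; ++c) s -= A[k][c] * u[c];
      u[k] = s / A[k][k];
    }
    double maxrest = 0.0;
    for (int j = 1; j < M; ++j) maxrest = std::fmax(maxrest, std::fabs(u[j]));
    std::printf("M = %2d: u_1 = %10.6f, max_{j>1} |u_j| = %g\n", M, u[0], maxrest);
  }
  return 0;
}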
25 Advantages of integration by parts Essential and natural boundary conditions A way to incorporate derivative BCs The physical derivative BCs naturally arise from the integration by parts (because the BC is closely related to the PDE, e.g., (λu ) = f has derivative condition λu = ) Symmetric matrix: N i N j dx u = U L must be enforced in û essential boundary conditions u = β appears naturally in the formulas natural boundary conditions Observation: forgetting boundary term implies u = Lower continuity requirements on N i Intro to finite elements p. 93 Intro to finite elements p. 94 Multidimensional problems Multidimensional integration by parts Boundary-value problem [k(x) u(x)] = f(x), x, k(x) u n = g(x), x N, u(x) = ψ(x), x E. Expansion: M u(x) û(x) = ψ(x) + u jn j(x) j= with N j = on E Integration by parts lemma: j= [k u]w id = k u W id W ik u n dγ Galerkin s method: M k(x) N i N j d u j = f(x)n id g(x)n idγ k ψ N jd N Common physical flux-conditions appear as natural boundary conditions Intro to finite elements p. 95 Intro to finite elements p. 96 D example Time-dependent problems () = (, ) (, ) Expansion: n n û = u i,jn i,j, N i,j = sin iπx sin jπx i= j= Weighting functions (Galerkin): N k,l Linear system: A i,j,k,lu i,j = b k,l i j For implementation: convert double to indices to single N (j )n+i (x, x ) = sin iπx sin jπx i, j =,..., n, M = n u t = (c u), x, t > u(x, ) = f(x), x u(x, ) t =, x u n =, x Finite differences in time: t u(x, tl) = ul u l + u l+ t + O( t ) Spatial problem at each time level: u l+ (x) = u l (x) u l (x) + (c t) u l (x) Intro to finite elements p. 97 Intro to finite elements p. 98 Time-dependent problems () Summary of the time-discrete equations Initial condition: u = f(x) u t = u u = t (can develop a special formula for u such that the main scheme can be used for l =,,,..., cf. the method for the D wave equation) The spatial variation of u, u l (x), is expanded in the standard way: u = f(x), x u = u + c t u, x u l+ = u l u l + c t u l, x, l =,,... u l n =, x k =,... M u l (x) û l = u l jn j(x), l =,,,,... j= Intro to finite elements p. 99 Intro to finite elements p.
26 Weighted residual methods Treatment of the right-hand sides All the equations are on the form u = g, where g is known and u l û l = j ul j Nj Could, in principle, start with an analytical f(x) and analytically derive u, u and so on, but these expressions become complicated Working with u l û l = j ul jnj instead allows easy update of u l,..., u l M Galerkin method for u = g: insert û l, multiply by N i and integrate: giving a linear system (!): N i u l jn jd = N igd j M ( ) N in jd u l j = gn id, j= i =,..., M Intro to finite elements p. When g contains u l we integrate by parts on the right-hand side Example: u l+ = u l u l + c t u l û l+ N id = (û l û l + c t û l )N id integration of N i ûd by parts: û l+ N id = (û l N i û l N i c t N i û l )d (the surface integral vanishes since u l / n = ) Expand û to a sum and identify coefficient the matrix and the right-hand side Intro to finite elements p. Spatial problems Matrix notation M M i,ju j = j= M j= M j= where M i,ju j = M i,ju l+ j = is the mass matrix f(x)n id, [û N i ] (c t) N i û d + t c f n NidΓ, [ (û l (x) û l (x) ) N i (c t) N i û (x)] l d, M i,j = N in jd Can introduce matrix-vector notation; M = M ij Mu = f Mu = Mu Ku + f n Mu l+ = Mu l Mu l Ku l Intro to finite elements p. 3 Intro to finite elements p. 4 Finite elements A nice feature of finite element N i N i: piecewise polynomials Example: piecewise linear N i gives a piecewise linear û = j ujnj: u Define elements e and nodes x [i] Definition of N i:. polynomial over each element. N i(x [j] ) = δ ij; if i = j and if i j This is the type of N i we use in the finite element method x Property, N i(x [j] ) = δ ij; if i = j and if i j implies that u j is the value of û at node j Proof: û(x [i] ) = j u jn j(x [i] ) = u i This interpretation of u i is very convenient both for practical work, implementation and for comparison with finite difference methods Intro to finite elements p. 5 Intro to finite elements p. 6 Piecewise linear N i Quadratic basis functions Each element has nodes x Each element has tree nodes Intro to finite elements p. 7 Intro to finite elements p. 8
27 Essential boundary conditions A worked example () Boundary-value problem Boundary-value problem u = f, x (, ), u() = u L, u() = u R u = f, x (, ), u() = u L, u() = u R With u i = û(x [i] ) we can construct ψ in a general way: ψ(x) = u LN (x) + u RN n(x) n û(x) = ψ(x) + û jn j(x) In general (D/3D): B = boundary nodes with essential conditions, I = internal nodes û = u jn j + u jn j j B j () j= Galerkin s method: n A i,ju j = b i, j= i =,... n A i,j = N i(x)n j(x)dx, b i = f(x)n i(x)dx Observation: N i(x) and N i (x) vanish over large parts of the domain ("nearly" orthogonal functions) A i,j only for j = i, i, i + Only u j, j I, enter the linear system as unknowns Intro to finite elements p. 9 Intro to finite elements p. A worked example () A worked example (3) Computations: A i,i = N i N idx = h, Ai,i = N in idx = h A i,i+ = N in i+dx = h, A, = A n,n = h, bi = A, = An,n = h f(x)n i(x)dx Numerical integration, trapezoidal rule: f(x)n i(x)dx n f(x[] )N i(x [] )h + f(x [j] )N i(x [j] )h + f(x[n] )N i(x [n] )h j= = n fδih + f jδ ijh + fnδinh j= where f i f(x [i] ) (FDM-inspired notation) For i n the integral becomes f ih (as with finite differences!!) For i =, n we get fh and fnh Intro to finite elements p. Intro to finite elements p. Piecewise constant N i (x) The resulting equations Replace eq. no. and n by boundary conditions u = u L, u n = u R The linear system: x u = u L, h ui + h ui h ui+ = f(x[i] )h, u n = u R Same result as from the finite difference method! i =,..., n, Exact or more accurate numerical integration: different right-hand side term Intro to finite elements p. 3 Intro to finite elements p. 4 Element by element computations () Element by element computations () Split integral into a sum over each element: m A i,j = N in jdx = A (e) i,j, A(e) i,j = N in jdx e= m b i = fn idx = b (e) i, b (e) i = fn idx e= A (e) i,j iff i and j are nodes in element e b (e) i iff i is node in element e Collect nonzero A (e) i,j in a element matrix: e e Similar strategy for b (e) i ; we collect the nonzero entries on element e (e) in b r, with r =, counting local node numbers Algorithm: run through all elements, compute Ã(e) (e) r,s and b r, and combine all element matrices and vectors into a linear system The local nature of N i gives a method where one can compute just a few numbers for an element, independent of the other elements The result is a sparse matrix and possibility for performing the elementwise computations in parallel à (e) r,s, r, s =,, r, s : local node numbers Intro to finite elements p. 5 Intro to finite elements p. 6
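The worked example above fits in a short program. The sketch below (plain C++, independent of Diffpack; the right-hand side f(x) = sin(πx), the homogeneous boundary values and the grid size are choices made here) assembles the global tridiagonal system with entries −1/h, 2/h, −1/h, uses the trapezoidal rule h·f(x_i) for the load vector, replaces the first and last equations by the boundary conditions, and solves with the Thomas algorithm; this is exactly the linear system a finite difference method would give for this problem.

// Sketch (plain C++): 1D linear finite elements for -u'' = f on (0,1),
// u(0)=u_L, u(1)=u_R.  With the trapezoidal rule for the load vector the
// assembled equations coincide with the standard finite difference scheme:
//   -(1/h) u_{i-1} + (2/h) u_i - (1/h) u_{i+1} = h f(x_i).
// f(x) = sin(pi x) and u_L = u_R = 0 are chosen for illustration, so the
// exact solution is sin(pi x)/pi^2.
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
  const int n = 40;                       // number of elements
  const double h = 1.0 / n, pi = 4.0 * std::atan(1.0);
  const double uL = 0.0, uR = 0.0;
  auto f = [pi](double x) { return std::sin(pi * x); };

  // Tridiagonal system: lower a, diagonal d, upper c, right-hand side b.
  std::vector<double> a(n + 1, -1.0 / h), d(n + 1, 2.0 / h), c(n + 1, -1.0 / h), b(n + 1);
  for (int i = 1; i < n; ++i) b[i] = h * f(i * h);
  // Essential boundary conditions: replace the first and last equations.
  d[0] = 1.0; c[0] = 0.0; b[0] = uL;
  d[n] = 1.0; a[n] = 0.0; b[n] = uR;

  // Thomas algorithm (forward elimination + back substitution).
  for (int i = 1; i <= n; ++i) {
    const double m = a[i] / d[i - 1];
    d[i] -= m * c[i - 1];
    b[i] -= m * b[i - 1];
  }
  std::vector<double> u(n + 1);
  u[n] = b[n] / d[n];
  for (int i = n - 1; i >= 0; --i) u[i] = (b[i] - c[i] * u[i + 1]) / d[i];

  double maxerr = 0.0;
  for (int i = 0; i <= n; ++i)
    maxerr = std::fmax(maxerr, std::fabs(u[i] - std::sin(pi * i * h) / (pi * pi)));
  std::printf("n = %d elements: max nodal error = %g\n", n, maxerr);
  return 0;
}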
28 Local coordinates () Local coordinates () Map element e = [x [e], x [e+] ] to [, ] Formula: local ξ [, ] to global x, x (e) (ξ) = Define N i in local ξ coordinates Ñ (ξ) Ñ (ξ) ξ ( x [e] + x [e+]) + ξ (x [e+] x [e]) Perform all computations in local coordinates Local node r (=,) in element e corresponds to global node i = q(e, r) Local linear basis functions: Ñ (ξ) = ( ξ), Ñ (ξ) = ( + ξ) In general: can always compute finite element equation with such local basis functions in a reference element of fixed size We need to change variables: x = ( x [e] + x [e+]) + ξ (x [e+] x [e]) dx dξ = h Intro to finite elements p. 7 Intro to finite elements p. 8 Local coordinates (3) Local coordinates (3) In the integral: x [e+] x [e] Ni(x)N j(x)dx = = dñ r(ξ) dξ ( h dξ dx ) dñr(ξ) dξ dñ s(ξ) dξ dξ dx dx dξ dξ dñs(ξ) h dξ dξ The variable transformation can be expressed in general formulas applicable to general finite element problems in D, D and 3D General (isoparametric) mapping: ne x (e) (ξ) = Ñ r(ξ)x [q(e,r)] r= (specializes to the previous formula for linear Ñr) Change integration variable from x to ξ: x [e+] x [e] Ni(x)N j(x)dx = J dñr(ξ) dñs(ξ) J det Jdξ dξ dξ Intro to finite elements p. 9 We often write e dñr dx dñs det Jdξ dx Intro to finite elements p. u (x) = f(x) Local coordinates (4) Jacobian matrix of mapping: J ( in D) Uniform partition in D: J = h/ Element matrix and vector: Example: r = s =, à (e) r,s = b(e) r = hñ r(ξ) hñ s(ξ) h dξ f(x (e) (ξ))ñ h r(ξ) dξ à (e), = h ( )( )dξ = h as the expression in local coordinates, knowing that dñr dx Boundary-value problem Results u = f, x (, ), Element matrix and vector: Ã(e) r,s b(e) r = J dñ dξ = h ( = h ( = h where numerical integration is used: dñ dξ u() = u L, u() = u R ) f(x (e) ( )) f(x (e) ()) g(ξ)dξ g( ) + g() ) Intro to finite elements p. Intro to finite elements p. Essential boundary conditions Symmetric element/coefficient matrix Incorporate essential boundary conditions at the element level Element level equations: s= à (e) (e) r,sũ s = b r, r =, Example: Essential condition ũ = u L Replace eq. no. by ũ = u L: The element matrix is actually symmetric The essential BC modification makes the symmetric nonsymmetric Symmetrization: subtract column in à (e) times u L from b (e), then replace eq. no. This modification preserves the symmetry property of the element matrix and the resulting coefficient matrix A symmetric coeff. matrix gives less storage and enables application of some efficient iterative solution methods ũ = u L à (e) (e),ũ + Ã(e),ũ = b Modify the element matrix and vector: ( ) ( ) ( ũ = h h ũ u L hf(x() ()) ) Intro to finite elements p. 3 Intro to finite elements p. 4
29 Numerical integration Assembly Integration rules are normally tabulated for integrals on [, ]: ξ k: integration points w k: integration weights ni g(ξ)dξ g(ξ k)w k k= Some rules integrating polynomials of degree p exactly: name n I p weights points Gauss-Legendre () () Gauss-Legendre 3 (, ) ( / 3, / 3) Gauss-Legendre 3 5 (5/9, 8/9, 5/9) ( 3/5,, 3/5) Gauss-Lobatto (, ) (, ) Gauss-Lobatto 3 3 (/3, 4/3, /3) (,, ) Element matrices and vectors must be assembled (added) into the global system of linear equations Essential: local global mapping, q(e, r), local node r in element e has global node number q(e, r) In D, q(e, r) = e + r, but in D/3D the grid is more complicated and q is just a table Algorithm: A q(e,r),q(e,s) := A q(e,r),q(e,s) + Ã(e) r,s, r, s =, (e) b q(e,r) := b q(e,r) + b r, r =, Intro to finite elements p. 5 Intro to finite elements p. 6 Illustration of the assembly process Summing up the procedures element matrices 3 4 q(e,r) global matrix Weighted residual formulation, often Galerkin s choice W i = N i Integration by parts Derivative boundary conditions in boundary terms Compute element matrices and vectors Local coordinates with local numbering Numerical integration Enforce essential boundary conditions Assemble local contributions Solve linear system Intro to finite elements p. 7 Intro to finite elements p. 8 Generality Nonconstant element size u = f, u() = u L, u() = u R is just an example The algorithm works in D, D, 3D Complicated geometries can be handled Element shapes in D: triangles, quadrilaterals Element shapes in 3D: boxes, tetrahedra Time dependency: "time loop outside a stationary solver" Trivial to work with varying element size Just replace h by a h e in the formulas from element e Result in model problem: u = u L ( u i + + ) u i u i+ = h i h i h i h i (hi + hi)f(x[i] ) u n = u R i =,..., n. Varying element shape and size is straightforward in multi-dimensional problems Intro to finite elements p. 9 Intro to finite elements p. 3 The elementwise algorithm () The elementwise algorithm () initialize global linear system: set Ai,j = for i, j =,..., n set bi = for i =,..., n loop over all elements: for e =,..., m set à (e) r,s =, r, s =,..., ne (e) set b r =, r =,..., ne loop over numerical integration points: for k =,..., ni evaluate Ñr(ξk), derivatives of Ñr wrt. ξ and x, J contribution to element matrix and vector from the current integration point for r =,..., ne for s =,..., ne à (e) r,s := Ã(e) r,s + dñr dx Ñs det Jwk dx b(e) r := b (e) + f(x (e) (ξk))nr det Jwk incorporate essential boundary conditions: for r =,..., ne if node r has an essential boundary condition then modify Ã(e) r,s and assemble: for r =,..., ne for s =,..., ne b (e) r due to this condition A q(e,r),q(e,s) := A q(e,r),q(e,s) + Ã(e) r,s (e) b q(e,r) := b q(e,r) + b r Intro to finite elements p. 3 Intro to finite elements p. 3
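Written out for the 1D model problem, the elementwise algorithm above becomes the following sketch (plain C++, not Diffpack; all identifiers, the two-point Gauss-Legendre rule and the choice f(x) = 1 are my own). The element matrix and vector are computed in the local coordinate ξ in [−1,1] with Ñ_1 = (1−ξ)/2, Ñ_2 = (1+ξ)/2 and det J = h/2, assembled with the local-to-global map q(e,r) = e + r (0-based here), and the boundary conditions u(0) = u(1) = 0 are inserted before the system is solved. For f = 1 the nodal values should reproduce x(1−x)/2 essentially to round-off.

// Sketch (plain C++): element-by-element assembly for -u'' = f on (0,1),
// u(0)=u(1)=0, linear elements.  The element contributions are computed in
// the reference coordinate xi in [-1,1] with two-point Gauss quadrature and
// assembled with the local-to-global map q(e,r) = e + r (0-based).
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
  const int m = 10;                 // number of elements
  const int nno = m + 1;            // number of nodes
  const double h = 1.0 / m;
  auto f = [](double x) { (void)x; return 1.0; };

  // Reference element quantities: Ntilde_r(xi) and dNtilde_r/dxi.
  auto N     = [](int r, double xi) { return r == 0 ? 0.5 * (1 - xi) : 0.5 * (1 + xi); };
  auto dNdxi = [](int r)            { return r == 0 ? -0.5 : 0.5; };
  const double gp[2] = {-1.0 / std::sqrt(3.0), 1.0 / std::sqrt(3.0)};  // Gauss points
  const double gw[2] = {1.0, 1.0};                                     // Gauss weights
  const double detJ = h / 2.0;      // dx/dxi on a uniform grid

  std::vector<std::vector<double>> A(nno, std::vector<double>(nno, 0.0));
  std::vector<double> b(nno, 0.0), u(nno, 0.0);

  for (int e = 0; e < m; ++e) {                    // loop over elements
    double Ae[2][2] = {{0, 0}, {0, 0}}, be[2] = {0, 0};
    for (int k = 0; k < 2; ++k) {                  // loop over integration points
      const double xi = gp[k], x = (e + 0.5 * (1 + xi)) * h;
      for (int r = 0; r < 2; ++r) {
        for (int s = 0; s < 2; ++s)
          Ae[r][s] += (dNdxi(r) / detJ) * (dNdxi(s) / detJ) * detJ * gw[k];
        be[r] += f(x) * N(r, xi) * detJ * gw[k];
      }
    }
    for (int r = 0; r < 2; ++r) {                  // assemble: q(e,r) = e + r
      for (int s = 0; s < 2; ++s) A[e + r][e + s] += Ae[r][s];
      b[e + r] += be[r];
    }
  }
  // Essential boundary conditions u(0) = u(1) = 0:
  for (int j = 0; j < nno; ++j) A[0][j] = A[nno - 1][j] = 0.0;
  A[0][0] = A[nno - 1][nno - 1] = 1.0;  b[0] = b[nno - 1] = 0.0;

  // Plain Gaussian elimination (the matrix is small and diagonally dominant).
  for (int k = 0; k < nno; ++k)
    for (int r = k + 1; r < nno; ++r) {
      const double factor = A[r][k] / A[k][k];
      for (int col = k; col < nno; ++col) A[r][col] -= factor * A[k][col];
      b[r] -= factor * b[k];
    }
  for (int k = nno - 1; k >= 0; --k) {
    double s = b[k];
    for (int col = k + 1; col < nno; ++col) s -= A[k][col] * u[col];
    u[k] = s / A[k][k];
  }

  double maxerr = 0.0;
  for (int i = 0; i < nno; ++i) {
    const double x = i * h;
    maxerr = std::fmax(maxerr, std::fabs(u[i] - 0.5 * x * (1 - x)));
  }
  std::printf("max nodal error = %g\n", maxerr);
  return 0;
}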
30 Exercise.7 Exercise.7 cont. u (x) = f(x) (α + )x α Good repetition of the previous material A new point: xα N i(x)dx The final discrete equations can be written h[δ xδ xu] i = hb i for internal nodes i =,..., n, where A standard finite difference method gives [δ xδ xu] i = f i = (α + )(x [i] ) α for internal nodes i =,..., n FEM and FDM give slightly different equations Which method is most accurate? By accident, FEM solves this problem exactly (!) b i = ( (x [i ] h ) α+ + (x [i] ) α+ (x [i+] ) α+) α + For i = : u = For i = n: a slightly modified equation Intro to finite elements p. 33 Intro to finite elements p. 34 FEM for the wave equation Time stepping u t = (c u), x, t > u(x, ) = f(x), x u(x, ) t =, x u n =, x Finite differences in time: t u(x, tl) = ul u l + u l+ t + O( t ) u = f(x), x u = u + c t u, x u l+ = u l u l + c t u l, x, l =,,... u l n =, x l =,... u l (x) û l = n u l jn j(x), l =,,,,... j= Spatial problem at each time level (chosen c = const for simplicity): u l+ (x) = u l (x) u l (x) + (c t) u l (x) (can be discretized by finite elements) Intro to finite elements p. 35 Intro to finite elements p. 36 Spatial problems Spatial problems n M i,ju j = f(x)n id, j= n M i,ju j = [û N i ] (c t) N i û d + t c f n NidΓ, j= n [ (û M i,ju l+ j = l (x) û l (x) ) N i (c t) N i û (x)] l d, j= where is the mass matrix M i,j = N in jd FEM for the D wave equation Intro to finite elements p. 37 Alternative notation: where n M i,ju j = b i, j= n n n M i,ju j = M i,ju j + (c t) K i,ju j + b i, j= j= j= n n n n M i,ju l+ j = M i,ju l j M i,ju l j (c t) K i,ju l j, j= j= j= j= M i,j = N in jd from u terms K i,j = N i N jd from u terms Interpreting the mass matrix term () Intro to finite elements p. 38 D: u = u,xx Linear elements Compute element matrices corresponding to the two principal terms (u,tt and u,xx): M (e) ij = h 6 ( ) K (e) ij = c h (mass matrix and stiffness matrix at the element level) Assembling the stiffness matrix: (same as FDM) K i,ju l j = c ( u l h i u l i + u l i+) j ( ) Assembling the mass matrix contributions: j With FDM, only hu l+ i would appear M i,ju l+ j = h ( u l+ i 6 + 4ul+ i + u l+ ) i+ Can rewrite this as ( h u l+ i + ( u l+ i 6 ul+ i + u l+ ) ) i+ or expressed with difference operators: h[u + h 6 δxδxu]l+ i FDM representation + a diffusion term Intro to finite elements p. 39 Intro to finite elements p. 4
31 Interpreting the mass matrix term () Lumping the mass matrix The complete equation: [δ tδ t(u + h 6 δxδxu) = c δ xδ xu] l i FDM representation + [ 6 h δ tδ tδ xδ xu] l i (dispersion) Notice: FEM gives an implicit scheme (must solve a linear system to find u l+ i ) In D: solution of tridiagonal systems is fast In D/3D: solving linear systems slows down the method significantly If we apply nodal-point integration the trapezoidal rule the element mass matrix becomes ( ) h Assembling: j M i,ju l+ j = hu l+ i i.e. the same result as a finite difference method! Making the mass matrix diagonal (by e.g. nodal point integration) is called mass lumping Final lumped scheme: [δ tδ tu = c δ xδ xu] l i Intro to finite elements p. 4 i.e. a standard finite difference scheme Intro to finite elements p. 4 Some questions Analysis of FEM for the wave equation () What is best, consistent or lumped mass matrix? That depends on the equation! We shall make an analysis of discrete wave eqs. Is there any physical justification of lumping? Yes! see Exercise. Derive the discrete equations Look for analytical solutions of the discrete equations Find corresponding analytical solutions of the continuous problem Compare principal quantities, e.g., wave velocity This is called numerical dispersion analysis Intro to finite elements p. 43 Intro to finite elements p. 44 Analysis of FEM for the wave equation () Analysis of FEM for the wave equation (3) The discrete equations from FEM: [δ tδ t(u + h 6 δxδxu) = c δ xδ xu] l i Lumped mass: only δ tδ tu on the left-hand side Inserting a discrete solution: u l j = Ae i(kjh ωl t) = Ae i(kx ωt) Results in a numerical dispersion relation: ω = ω(k, h, t) Can use this for stability and accuracy analysis Numerical dispersion relation follows from (and solving for ω) sin ω t = c t ( h kh ) 3 sin sin kh Truncation error or series expansion of error in wave velocity: τ t ( 4 ) l u t 4 c h i No big difference from FDM Stability: must require real ω ( 4 ) l ( u x 4 + h 4 ) l u i 6 x t i c t h 3 A reducing factor / 3 compared with FDM! Intro to finite elements p. 45 Intro to finite elements p. 46 Consistent vs. lumped mass matrix Other problems FEM w/lumped mass = FDM scheme FEM w/consistent mass: - same order of accuracy as FDM - lower stability (/ 3) - exact solution for C = is not true - must solve linear systems Use lumped mass for this wave equation! The conclusions here apply to D/3D wave equations u t = [c u] Another PDE (uni-directional wave eq.): u t + v u = Here, lumped mass reduces the accuracy significantly, so don t generalize too much Intro to finite elements p. 47 Intro to finite elements p. 48
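The two dispersion relations can be compared directly. Inserting a wave component in [δ_tδ_t(u + (h²/6)δ_xδ_xu) = c²δ_xδ_xu] gives sin(ω̃Δt/2) = C sin(kh/2)/√(1 − (2/3)sin²(kh/2)) for the consistent mass matrix (consistent with the 1/√3 stability limit above), and sin(ω̃Δt/2) = C sin(kh/2) for the lumped mass matrix, i.e. the FDM scheme. The sketch below (plain C++; the Courant number C = 0.5 and the scaling c = h = 1 are choices made here) solves these relations for ω̃ and prints the relative error in the numerical wave velocity, (c̃ − c)/c, as a function of p = kh; these are the quantities plotted on the next two slides.

// Sketch (plain C++): numerical dispersion relations for u_tt = c^2 u_xx
// discretized with P1 elements in space and centered differences in time.
//   lumped mass / FDM : sin(w*dt/2) = C sin(kh/2)
//   consistent mass   : sin(w*dt/2) = C sin(kh/2) / sqrt(1 - (2/3) sin^2(kh/2))
// with C = c*dt/h.  Prints the relative error in the numerical wave velocity.
#include <cmath>
#include <cstdio>

int main()
{
  const double pi = 4.0 * std::atan(1.0);
  const double C = 0.5;       // Courant number, chosen for illustration
  // Scaled variables: c = 1 and h = 1, so dt = C and c_tilde = w/k.
  std::printf("#   p=kh     lumped      consistent\n");
  for (int i = 1; i <= 10; ++i) {
    const double p = i * pi / 10.0;            // p = kh in (0, pi]
    const double s = std::sin(p / 2.0);
    const double arg_lumped = C * s;
    const double arg_cons   = C * s / std::sqrt(1.0 - (2.0 / 3.0) * s * s);
    // Solve sin(w*dt/2) = arg for w; stability requires arg <= 1.
    const double w_lumped = 2.0 / C * std::asin(std::fmin(arg_lumped, 1.0));
    const double w_cons   = 2.0 / C * std::asin(std::fmin(arg_cons, 1.0));
    const double err_lumped = w_lumped / p - 1.0;   // (c_tilde - c)/c
    const double err_cons   = w_cons / p - 1.0;
    std::printf("%8.4f  %10.6f  %10.6f\n", p, err_lumped, err_cons);
  }
  return 0;
}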
32 Error in numerical wave velocity () Error in numerical wave velocity () consistent mass lumped mass or FDM p ( c c)/c as function of p = kh. C = c t/h. Top curve: C = / 3 (max); mid curve: C =.3; bottom curve: C = ( c c)/c as function of p = kh. C = c t/h. C = gives exact solution. Top curve: C =.98 (max); mid curves: C =.9,.5; bottom curve: C =.. Intro to finite elements p. 49 Intro to finite elements p. 5 Software tools for experimentation () Quadratic D elements Propagation of numerical high-frequency noise: src/fdm/waved/steep (steep) σ (sigma) is a steepness parameter in a plug-shaped profile: f(x) =.5 π arctan(σ(x )), x >,.5 + π arctan(σ(x + )), x Vary the resolution and σ. Study the effect on the wave propagation. Error in wave velocity for a sine component: c c = ω ω k k use this information to explain the visual observationbs Piecewise quadratic N i piecewise quadratic û Three nodes per element: one in the middle plus the two at the ends We always have N i(x [j] ) =, i j Previous algorithms and techniques still work Intro to finite elements p. 5 Intro to finite elements p. 5 Quadratic D elements in local coordinates Why quadratic elements? Three nodes per element: Calculation of basis functions: ξ =, ξ =, ξ 3 = Ñ r(ξ) = a rξ + b rξ + c r, 3 equations for a r, b r, c r (r =,, 3) Ñ r(ξ s) = δ rs Ñ (ξ) = ξ(ξ ), Ñ (ξ) = ( + ξ)( ξ), Ñ 3 (ξ) = ξ( + ξ) Isoparametric mapping: PDE: u (x) =, x (, ) Basic error estimate: ( (u û) dx ) = O(h s+ ), s = degree of N i where h is the distance between two neighboring nodes Linear elements: error h Quadratic elements: error h 3 doubling the number of nodes reduces the error by /8 3 x = Ñ r(ξ)x [q(e,r)] r= (could also use the linear mapping if the mid node is in the center) 3 3 element matrix Quadratic elements: example Intro to finite elements p. 53 Implementation of D FE problems Intro to finite elements p. 54 PDE: u (x) =, x (, ) Apply the general elementwise algorithm with n e = 3, quadratic Ñr, analytical integration Element matrix: 3h e h e: physical length of element e. element vector: h e 6 4 Direct implementation in terms of arrays Linear D elements, u = β = const, but still fairly general implementation scan: read n and β, allocate vectors, matrices, etc., call initgrid. initgrid: compute x i and q(e, r), i.e., the finite element grid. makesystem: calculate the linear system. solve: solve linear system by Gaussian elimination. calcelmmatvec: compute element matrix and vector for an element. integrands: evaluate the integrands of the weighted residual statement N: evaluate the basisfunctions in local coordinates. dn: evaluate the derivatives of the basisfunctions in local coordinates. Intro to finite elements p. 55 Intro to finite elements p. 56
33 N = ξ, N = ξ, N 3 = ξ ξ Intro to finite elements p. 64 Extensions of the program D domains D to D (3D): big job Another PDE: easy - integrands Quadratic elements: small, scattered modifications - N, dn, integrands, initgrid... Goal: solver code is independent of element type linear system solver matrix format grid type no of space dimensions Strength of the finite element method: easy to work with geometrically complicated domains Lake Superior with 6 islands, 33 triangles Intro to finite elements p. 57 Intro to finite elements p. 58 Element shapes D rectangular bilinear element Rectangular, triangular Straight or curved sides ξ 3 4 ξ Illegal: illegal node 4 nodes Bilinear functions: Ñ r(ξ, ξ ) = a r + b rξ + c rξ + d rξ ξ Conditions for determining a r, b r, c r, d r: N r(node s) = δ rs 4 equations for a r, b r, c r, d r (fixed r) Intro to finite elements p. 59 Intro to finite elements p. 6 Mapping of D bilinear element D linear 3-node element ξ x ξ x ξ 3 ξ x x local global local global linear Ñ i(ξ, ξ ), straight sides Intro to finite elements p. 6 Intro to finite elements p. 6 Typical D linear basis function Construction of the basis functions ξ (,) 3 (,) (,) ξ reference element Principles for the construction:. Ñ i(ξ, ξ ) is a polynomial. Ñ i = δ ij at local node j 3 constraints Ñ i must be linear Intro to finite elements p. 63
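The construction principle Ñ_r(node s) = δ_rs is easy to verify in code. The sketch below (plain C++; the local node ordering and the sample quadrilateral corners are made up here) evaluates the four bilinear reference basis functions on [−1,1]², checks the Kronecker-delta property at the four local nodes, and uses the same functions as the isoparametric mapping x(ξ) = Σ_r Ñ_r(ξ) x^{[q(e,r)]} to map a reference point into a straight-sided quadrilateral.

// Sketch (plain C++): bilinear reference element on [-1,1]^2 with local nodes
//   r=0: (-1,-1), r=1: (1,-1), r=2: (1,1), r=3: (-1,1)   (my own ordering).
// Ntilde_r(xi1,xi2) = 1/4 (1 + xi1*Xi1_r)(1 + xi2*Xi2_r) equals 1 at node r
// and 0 at the other three nodes; the same functions define the isoparametric
// mapping from the reference square to a straight-sided quadrilateral.
#include <cstdio>

static const double Xi1[4] = {-1.0, 1.0, 1.0, -1.0};
static const double Xi2[4] = {-1.0, -1.0, 1.0, 1.0};

double Ntilde(int r, double xi1, double xi2)
{
  return 0.25 * (1.0 + xi1 * Xi1[r]) * (1.0 + xi2 * Xi2[r]);
}

int main()
{
  // Check Ntilde_r(node s) = delta_rs (prints the 4x4 identity matrix):
  for (int r = 0; r < 4; ++r) {
    for (int s = 0; s < 4; ++s)
      std::printf("%4.1f ", Ntilde(r, Xi1[s], Xi2[s]));
    std::printf("\n");
  }

  // Isoparametric mapping of the reference centre (0,0) into a quadrilateral
  // with made-up corners; the result is the average of the four corners.
  const double xc[4] = {0.0, 2.0, 2.5, 0.5};   // global x-coordinates of corners
  const double yc[4] = {0.0, 0.2, 1.3, 1.0};   // global y-coordinates of corners
  double x = 0.0, y = 0.0;
  for (int r = 0; r < 4; ++r) {
    x += Ntilde(r, 0.0, 0.0) * xc[r];
    y += Ntilde(r, 0.0, 0.0) * yc[r];
  }
  std::printf("centre of the element maps to (%g, %g)\n", x, y);
  return 0;
}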
34 Construction of D basis functions; Example D quadratic 6-node element Example: Ñ (ξ, ξ ) Ñ i(ξ, ξ ) = α i + β iξ + γ iξ Ñ (, ) = α + β + γ ξ x Ñ (, ) = α + β + γ Ñ (, ) = α + β + γ 3 linear equations in 3 unknowns α, β, and γ. Solution: α = β =, γ = Ñ(ξ, ξ ) = ξ local ξ global x quadratic Ñ i(ξ, ξ ), curved sides (parabola) Intro to finite elements p. 65 Intro to finite elements p. 66 D quadratic 9-node element D quadratic 8-node element ξ x ξ x ξ ξ x x local biquadratic Ñ i(ξ, ξ ), curved sides (parabola) global local global biquadratic Ñ i(ξ, ξ ) minus ξ ξ term, curved sides (parabola) Intro to finite elements p. 67 Intro to finite elements p. 68 3D elements Triangular vs. box shape tetrahedron with 4 corner nodes (linear N i) tetrahedron with nodes (quadratic N i), mid-node on each edge box with 8 corner nodes (tri-linear N i) box with nodes (quadratic N i), mod-node on each edge box with 7 nodes (tri-quadratic N i), mod-node on edges and sides Any D geometry can be divided into triangles (if the boundaries are approx. by polygons) Any 3D geometry can be divided into tetrahedra (if the boundaries are approx. by polygons) Many geometries can be divided into rectangles/boxes, but one may need an extra triangle/tetrahedron Different element shapes have different properties, depending on the PDE system, so chosing the right element is not obvious, and it is not only a geometry-approximation thing Intro to finite elements p. 69 Intro to finite elements p. 7 Projects.5. and.5.3 Model problem: u (x) = ɛu (x), x (, ), u() =, u() = Convection-dominated flow ɛ small: boundary layer at x = Standard numerics (i.e. centered differences) will fail! Cure: upwind differences Convection-dominated flow p. 7 Convection-dominated flow p. 7
Notation for difference equations (1). Define [δ_x u]^l_{i+1/2,j,k} ≡ (u^l_{i+1,j,k} − u^l_{i,j,k})/h, with similar definitions of δ_y, δ_z, and δ_t. Another difference: [δ_2x u]^l_{i,j,k} ≡ (u^l_{i+1,j,k} − u^l_{i−1,j,k})/(2h). Compound difference: [δ_x δ_x u]^l_i = (u^l_{i−1} − 2u^l_i + u^l_{i+1})/h².

Notation for difference equations (2). One-sided forward difference: [δ_x^+ u]^l_i ≡ (u^l_{i+1} − u^l_i)/h, and the backward difference: [δ_x^− u]^l_i ≡ (u^l_i − u^l_{i−1})/h. Put the whole equation inside brackets: [δ_x δ_x u = f]_i is a finite difference scheme for u'' = f.

Centered differences. Problem: u'(x) = ε u''(x), x ∈ (0,1), u(0) = 0, u(1) = 1. Centered scheme: (u_{i+1} − u_{i−1})/(2h) = ε (u_{i−1} − 2u_i + u_{i+1})/h², i = 1, ..., n−1, with u_0 = 0, u_n = 1, or in operator notation [δ_2x u = ε δ_x δ_x u]_i. Analytical solution: u(x) = (e^{x/ε} − 1)/(e^{1/ε} − 1); u'(x) > 0, i.e., a monotone function.

Numerical experiments (1)-(4). [Plots of the centered-difference solution ('centered') against the exact solution ('exact') for decreasing h and small ε; the numerical solution oscillates on coarse grids.]

Numerical experiments; summary. The solution is not monotone if h > 2ε. The convergence rate is h² (in agreement with the truncation error analysis) provided h ≤ 2ε. Completely wrong qualitative behavior for h > 2ε.
Analysis. Can find an analytical solution of the discrete problem (!). Method: insert u_i ~ β^i and solve for β; cf. HPL app. A.4.4. Complete solution: u_i = C_1 β_1^i + C_2 β_2^i with β_1 = 1, β_2 = (1 + h/(2ε))/(1 − h/(2ε)). Determine C_1 and C_2 from the boundary conditions: u_i = (β_2^i − 1)/(β_2^n − 1).

Important result. Observe: u_i oscillates if β_2 < 0, and (1 + h/(2ε))/(1 − h/(2ε)) < 0 ⇔ h > 2ε. Must require h ≤ 2ε for u_i to have the same qualitative property as u(x). This explains why we observed oscillations in the numerical solution.

Upwind differences. Problem: u'(x) = ε u''(x), x ∈ (0,1), u(0) = 0, u(1) = 1. Use a backward difference, called an upwind difference, for the u' term: (u_i − u_{i−1})/h. The scheme can be written (u_i − u_{i−1})/h = ε (u_{i−1} − 2u_i + u_{i+1})/h², i = 1, ..., n−1, with u_0 = 0, u_n = 1, or [δ_x^− u = ε δ_x δ_x u]_i.

Numerical experiments (1)-(2). [Plots of the upwind solution ('upwind') against the exact solution ('exact') for small ε.]

Numerical experiments; summary. The solution is always monotone, i.e., always qualitatively correct. The boundary layer is too thick. The convergence rate is h (in agreement with the truncation error analysis).

Analysis. Analytical solution of the discrete equations: u_i = C_1 + C_2 β^i with β = 1 + h/ε. Using the boundary conditions: u_i = (β^i − 1)/(β^n − 1). Since β > 0 (actually β > 1), β^i does not oscillate.

Centered vs. upwind scheme. Truncation error: centered is more accurate than upwind. Exact analysis: centered is more accurate than upwind when centered is stable (i.e., monotone u_i), but otherwise useless. With ε = 10⁻⁶ one needs about 5·10⁵ grid points to make h ≤ 2ε. Upwind gives the best reliability, at the cost of a too thick boundary layer.
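Because the discrete solutions are known in closed form, the centered/upwind comparison needs no linear solver at all. The sketch below (plain C++; ε = 0.01 and n = 10 are chosen so that h > 2ε, i.e. in the regime where the centered scheme oscillates) evaluates u_i = (β^i − 1)/(β^n − 1) with β = (1 + h/(2ε))/(1 − h/(2ε)) for the centered scheme and β = 1 + h/ε for the upwind scheme, and tabulates both against the exact solution.

// Sketch (plain C++): compare the centered and upwind schemes for
//   u'(x) = eps * u''(x),  u(0)=0, u(1)=1,
// using the closed-form solutions of the discrete equations,
//   u_i = (beta^i - 1)/(beta^n - 1),
// with beta = (1 + h/(2 eps))/(1 - h/(2 eps)) for the centered scheme and
// beta = 1 + h/eps for the upwind scheme.  eps and n are chosen so that
// h > 2*eps, where the centered scheme oscillates.
#include <cmath>
#include <cstdio>

int main()
{
  const double eps = 0.01;
  const int n = 10;                       // number of cells, h = 0.1 > 2*eps
  const double h = 1.0 / n;

  const double beta_c = (1.0 + h / (2.0 * eps)) / (1.0 - h / (2.0 * eps));
  const double beta_u = 1.0 + h / eps;

  std::printf("#   x      centered      upwind        exact\n");
  for (int i = 0; i <= n; ++i) {
    const double x = i * h;
    const double uc = (std::pow(beta_c, i) - 1.0) / (std::pow(beta_c, n) - 1.0);
    const double uu = (std::pow(beta_u, i) - 1.0) / (std::pow(beta_u, n) - 1.0);
    const double ue = (std::exp(x / eps) - 1.0) / (std::exp(1.0 / eps) - 1.0);
    std::printf("%5.2f  %12.5f  %12.5f  %12.5g\n", x, uc, uu, ue);
  }
  return 0;
}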
37 An interpretation of the upwind scheme Finite elements for the model problem The upwind scheme or can be rewritten as or u i u i h u i+ u i h ui ui + ui+ = ɛ h [δ x u = ɛδ xδ xu] i = (ɛ + h ui + ui+ )ui h [δ xu = (ɛ + h )δxδxu]i Galerkin formulation of u (x) = ɛu (x), x (, ), u() =, u() = and linear elements leads to a centered scheme (show it!) or u i+ u i h Stability problems when h > ɛ ui ui + ui+ = ɛ, i =,..., n h u =, u n = [δ xu = ɛδ xδ xu] i Upwind = centered + artificial diffusion (h/) Convection-dominated flow p. 89 Convection-dominated flow p. 9 Finite element theory () Finite element theory () Abstract finite element theory starts with a(u, v) = L(v) v V Consider v u = ɛ u Nonsymmetric a(u, v) General best-approximation result: In the estimate, c = ɛ and If v is ɛ, c /c is also large c = ɛ + C sup v(x) x Some indication that the best-approximation property of the Galerkin method is not that much worth u u h V c c u v V v V where c and c are the bounds of a: c v V a(v, v), a(u, v) c u V v V Convection-dominated flow p. 9 Convection-dominated flow p. 9 Finite elements and upwind differences Perturbed weighting functions in D How to construct upwind differences in a finite element context? One possibility: add artificial diffusion (h/) u (x) = (ɛ + h )u (x), x (, ), u() =, u() = Can be solved by a Galerkin method Equivalent strategy: use perturbed weighting functions Take W i(x) = N i(x) + τn i(x) as weighting function for the convective term u : u W idx = u N idx + τn iu dx The new term τn i u is the weak formulation of an artificial diffusion term τn iu With τ = h/ we then get the upwind scheme Convection-dominated flow p. 93 Convection-dominated flow p. 94 Optimal artificial diffusion Multi-dimensional problems Try a weighted sum of a centered and an upwind discretization: Is there an optimal θ? Yes, for [u ] i [θδ x u + ( θ)δ xu] i, θ [θδ x u + ( θ)δ xu = ɛδ xδ xu] i θ(h/ɛ) = coth h ɛ ɛ h we get exact u i (i.e. u exact at nodal points) Equivalent artificial diffusion τ o =.5hθ(h/ɛ) Exact finite element method: W i(x) = N i(x) + τ on i (x) for the convective term u Model problem: often written as v x u x + vy u y = u v u = u Non-physical oscillations occur with centered differences or Galerkin methods when the left-hand side terms are large Remedy: upwind differences Downside: too much diffusion Important result: extra stabilizing diffusion is needed only in the streamline direction (v x, v y) Convection-dominated flow p. 95 Convection-dominated flow p. 96
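The claim above that θ(h/ε) = coth(h/(2ε)) − 2ε/h gives nodally exact values can be checked in a few lines. The sketch below (plain C++; ε and n are arbitrary test values chosen here) forms the equivalent diffusion ε_eff = ε + τ_o with τ_o = 0.5hθ, evaluates the resulting centered-scheme solution through its closed form, and prints the maximum deviation from the exact solution at the nodes; it should be at round-off level even though h > 2ε.

// Sketch (plain C++): check that the weighted upwind/centered scheme for
//   u' = eps u'',  u(0)=0, u(1)=1,
// with theta = coth(h/(2 eps)) - 2 eps/h (equivalent artificial diffusion
// tau_o = 0.5*h*theta) reproduces the exact solution at the nodal points.
// The scheme is a centered scheme with diffusion eps_eff = eps + tau_o, whose
// discrete solution is u_i = (beta^i - 1)/(beta^n - 1) with
// beta = (1 + h/(2 eps_eff))/(1 - h/(2 eps_eff)).
#include <cmath>
#include <cstdio>

int main()
{
  const double eps = 0.05;
  const int n = 8;                 // h = 1/8 > 2*eps: plain centered would oscillate
  const double h = 1.0 / n;

  const double theta = 1.0 / std::tanh(h / (2.0 * eps)) - 2.0 * eps / h;
  const double eps_eff = eps + 0.5 * h * theta;
  const double beta = (1.0 + h / (2.0 * eps_eff)) / (1.0 - h / (2.0 * eps_eff));

  double maxerr = 0.0;
  for (int i = 0; i <= n; ++i) {
    const double x = i * h;
    const double ui = (std::pow(beta, i) - 1.0) / (std::pow(beta, n) - 1.0);
    const double ue = (std::exp(x / eps) - 1.0) / (std::exp(1.0 / eps) - 1.0);
    maxerr = std::fmax(maxerr, std::fabs(ui - ue));
  }
  std::printf("theta = %g, max nodal error = %g\n", theta, maxerr);
  return 0;
}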
38 Streamline diffusion Perturbed weighting functions () Idea: add diffusion in the streamline direction Isotropic diffusion: d d u kδ ij = k u x i= j= i x j kδ ij is the diffusion tensor (same in all directions) Streamline diffusion: d d ( ) u k ij, k ij = τ vivj x i= j= i x j v Implementation: artificial diffusion term or perturbed weighting function Consider the weighting function W i = N i + τ v N i for the convective (left-hand side) term: W i v u d This expands to N iv ud + τ v u v N id The latter term can be viewed as the Galerkin formulation of (write v u = i u/ xi etc.) d d ( ) τ u v iv j x i= j= i x j Convection-dominated flow p. 97 Convection-dominated flow p. 98 Perturbed weighting functions () Consistent SUPG Streamline diffusion can be obtained by perturbing the weighting function Common name: SUPG (streamline-upwind/petrov-galerkin) Why not just add artificial diffusion? Why bother with perturbed weighting functions? In standard FEM, L(u)W id = the exact solution is a solution of the FEM equations (it fulfills L(u)) This no longer holds if we add an artificial diffusion term ( h/) use different weighting functions on different terms Idea: use consistent SUPG no artificial diffusion term same (perturbed) weighting function applies to all terms Convection-dominated flow p. 99 Convection-dominated flow p. 3 A step back to D Choosing τ Let us try to use on both terms in u = ɛu : Problem: last term Remedy: drop it (!) W i(x) = N i(x) + τn i(x) (N iu + (ɛ + τ)n iu )dx + τ N i u dx = Justification: N i = on each linear element Drop nd-order derivatives of N i in D/3D too Consistent SUPG is not so consistent... Choosing τ is a research topic Many suggestions Two classes: τ h τ t (time-dep. problems) Little theory Convection-dominated flow p. 3 Convection-dominated flow p. 3 A test problem () A test problem () y u=.5 u= du/dn= or u= y = x tan θ +.5 v θ u= du/dn= x Methods:. Classical SUPG: Brooks and Hughes: "A streamline upwind/petrov-galerkin finite element formulation for advection domainated flows with particular emphasis on the incompressible Navier-Stokes equations", Comp. Methods Appl. Mech. Engrg., 99-59, 98.. An additional discontinuity-capturing term W i = N i + τ v N i + ˆτ v u u u was proposed in Hughes, Mallet and Mizukami: "A new finite element formulation for computational fluid dynamics: II. Beyond SUPG", Comp. Methods Appl. Mech. Engrg., , 986. Convection-dominated flow p. 33 Convection-dominated flow p. 34
[Figures: solution of the convection-dominated test problem computed with Galerkin's method (left) and with SUPG (right); 3D surface plots of u.]

Time-dependent problems. Model problem: u_t + v·∇u = ε∇²u. Can add an artificial streamline-diffusion term. Can use a perturbed weighting function W_i = N_i + τ v·∇N_i. How to choose τ?

Taylor-Galerkin methods (1). Idea: Lax-Wendroff + Galerkin. Model equation: u_t + U u_x = 0. Lax-Wendroff: 2nd-order Taylor series in time, on all terms: u^{l+1} = u^l + Δt [∂u/∂t]^l + (Δt²/2) [∂²u/∂t²]^l. Replace temporal by spatial derivatives, ∂/∂t = −U ∂/∂x. Result: u^{l+1} = u^l − U Δt [∂u/∂x]^l + (U²Δt²/2) [∂²u/∂x²]^l.

Taylor-Galerkin methods (2). We can write the scheme in the form [δ_t^+ u + U ∂u/∂x = (U²Δt/2) ∂²u/∂x²]^l, i.e., a forward scheme with artificial diffusion. Lax-Wendroff: centered spatial differences, [δ_t^+ u + U δ_2x u = (U²Δt/2) δ_x δ_x u]^l_i. Alternative: Galerkin's method in space gives the same scheme, [δ_t^+ u + U δ_2x u = (U²Δt/2) δ_x δ_x u]^l_i, provided that we lump the mass matrix. This is the Taylor-Galerkin method.

Taylor-Galerkin methods (3). In multi-dimensional problems we have u_t + v·∇u = 0 and (with ∇·v = 0) ∂²u/∂t² = ∇·(v v·∇u) = Σ_r Σ_s ∂/∂x_r (v_r v_s ∂u/∂x_s). This is streamline diffusion with τ = Δt/2: [δ_t^+ u + v·∇u = (Δt/2) ∇·(v v·∇u)]^l.

Taylor-Galerkin methods (4). Can use the Galerkin method in space (gives centered differences). The result is close to that of SUPG, but τ is different. The Taylor-Galerkin method points to τ = Δt/2 for SUPG in time-dependent problems.

Nonlinear PDEs
40 Examples Nonlinear discrete equations; FDM Some nonlinear model problems to be treated next: u (x) = f(u), u() = u L, u() = u R, (λ(u)u ) =, u() = u L, u() = u R [λ(u) u] = g(x), with u or λ u n B.C. Discretization methods: standard finite difference methods standard finite element methods the group finite element method We get nonlinear algebraic equations Solution method: iterate over linear equations Finite differences for u = f(u): (ui ui + ui+) = f(ui) h nonlinear system of algebraic equations F (u) =, or Au = b(u), u = (u,..., u n) T Finite differences for (λ(u)u ) = : ([λ(ui+) + λ(ui)](ui+ ui) h [λ(u i) + λ(u i )](u i u i )) = nonlinear system of algebraic equations F (u) = or A(u)u = b Nonlinear PDEs p. 33 Nonlinear PDEs p. 34 Nonlinear discrete equations; FEM Nonlinearities in the FEM Finite elements for u = f(u): n u û = u kn k(x) k= Galerkin approach: N iû dx = f( N ku k)n idx k (assuming prescribed u() and u()) Left-hand side is easy to assemble: h (ui ui + ui+) = f( u kn k(x))n idx k Note that f( k N k(x)u k) is a complicated function of u,..., u n F.ex.: f(u) = u ( ) N ku k N idx k gives rise to a difference representation h ( u i + u i(u i + u i+) + 6u i + u i+) (compare with f(u i) = u i in FDM!) Must use numerical integration in general Nonlinear PDEs p. 35 Nonlinear PDEs p. 36 The group finite element method FEM for a nonlinear coefficient The group finite element method: f(û) = f( k u kn k(x)) n f(u k)n k k= Resulting term: f(u)nidx = k NiNkf(uk) gives N kn idx or f(u k), k Mf(u) which is a mass matrix-like term: h 6 (f(ui ) + 4f(ui) + f(ui+)) Trapezoidal integration gives an FDM-like term: N kn idx f(u k)dx hf(u i) k similar results as FDM Nonlinear PDEs p. 37 Nonlinear algebraic equations We now look at (λ(u)u ) =, u() = u L, u() = u R Using a finite element method (exercise 4.) results in an integral complicated! λ( u kn k)n in j dx k Linear elements and trapezoidal rule: (λ(ui) + λ(ui+))(ui+ ui) (λ(ui ) + λ(ui))(ui ui ) = FDM with arithmetic mean for λ(u i+/ ) Solving nonlinear algebraic eqs. Nonlinear PDEs p. 38 FEM/FDM for nonlinear PDEs gives nonlinear algebraic equations: (λ(u)u ) = A(u)u = b u = f(u) Au = b(u) In general a nonlinear PDE gives or F (u) = F (u,..., u n) =... F n(u,..., u n) = Have A(u)u b =, Au b(u) =, F (u) = Idea: solve nonlinear problem as a sequence of linear subproblems Must perform some kind of linearization Iterative method: guess u, solve linear problems for u, u,... and hope that lim k uk = u i.e. the iteration converges Nonlinear PDEs p. 39 Nonlinear PDEs p. 3
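As a concrete instance of "iterate over linear equations", the sketch below (plain C++, not Diffpack; the choice λ(u) = u with u(0) = 1, u(1) = 2, for which the continuous problem has the exact solution √(3x+1), is my own test case) applies successive substitution to the finite difference discretization of (λ(u)u')' = 0: each iteration freezes λ at the previous iterate (arithmetic mean at the midpoints, as above) and solves the resulting tridiagonal system. Whether and how fast the iteration converges depends on the problem; here it typically needs only a modest number of iterations.

// Sketch (plain C++): successive substitution (Picard) iteration for the
// nonlinear problem (lambda(u) u')' = 0, u(0)=1, u(1)=2, with lambda(u)=u.
// Each iteration freezes lambda at the previous iterate and solves the
// resulting tridiagonal linear system (arithmetic mean of lambda at i+1/2).
// The exact solution of the continuous problem is u(x) = sqrt(3x+1), which
// the converged discrete solution should approach as h -> 0.
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
  const int n = 20;                       // number of cells
  const double h = 1.0 / n;
  const double uL = 1.0, uR = 2.0;
  auto lambda = [](double u) { return u; };

  std::vector<double> u(n + 1), u_old(n + 1);
  for (int i = 0; i <= n; ++i) u[i] = uL + (uR - uL) * i * h;   // initial guess

  int k = 0;
  double change = 1.0;
  while (change > 1e-10 && k < 100) {
    u_old = u;
    // Assemble tridiagonal system with lambda evaluated at the old iterate.
    std::vector<double> a(n + 1), d(n + 1), c(n + 1), b(n + 1, 0.0);
    for (int i = 1; i < n; ++i) {
      const double lam_m = 0.5 * (lambda(u_old[i - 1]) + lambda(u_old[i]));
      const double lam_p = 0.5 * (lambda(u_old[i]) + lambda(u_old[i + 1]));
      a[i] = lam_m;  c[i] = lam_p;  d[i] = -(lam_m + lam_p);
    }
    d[0] = 1.0; c[0] = 0.0; b[0] = uL;     // essential boundary conditions
    d[n] = 1.0; a[n] = 0.0; b[n] = uR;
    // Thomas algorithm:
    for (int i = 1; i <= n; ++i) {
      const double m = a[i] / d[i - 1];
      d[i] -= m * c[i - 1];
      b[i] -= m * b[i - 1];
    }
    u[n] = b[n] / d[n];
    for (int i = n - 1; i >= 0; --i) u[i] = (b[i] - c[i] * u[i + 1]) / d[i];

    change = 0.0;
    for (int i = 0; i <= n; ++i) change = std::fmax(change, std::fabs(u[i] - u_old[i]));
    ++k;
  }
  double err = 0.0;
  for (int i = 0; i <= n; ++i)
    err = std::fmax(err, std::fabs(u[i] - std::sqrt(3.0 * i * h + 1.0)));
  std::printf("Picard: %d iterations, max|u - sqrt(3x+1)| = %g\n", k, err);
  return 0;
}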
41 Successive substitutions () Successive substitutions () Model problem: A(u)u = b Simple iteration scheme: A(u k )u k+ = b, k =,,... Must provide (good) guess u Termination: u k+ u k ɛ u or using the residual (expensive, req. new A(u k+ )!) b A(u k+ )u k+ ɛ r Model problem: Au = b(u) Simple iteration scheme: Au k+ = b(u k ), k =,,... Relaxation: Au = b(u k ), u k+ = ωu + ( ω)u k (may improve convergence, avoids too large steps) Picard iteration is another name of this method Relative criteria: or (more expensive) u k+ u k ɛ u u k b A(u k+ )u k+ ɛ r b A(u )u Nonlinear PDEs p. 3 Nonlinear PDEs p. 3 Simple method, but sometimes slow convergece Newton s method () Newton s method () The Newton (Newton-Raphson) method for f(x) =, x IR Given an approximation x k Approximate f by a linear function at x k : Find new x k+ such that f(x) M(x; x k ) = f(x k ) + f (x k )(x x k ) M(x k+ ; x k ) = x k+ = x k f(xk ) f (x k ) Systems of nonlinear equations: F (u) =, F (u) M(u; u k ) Multi-dimensional Taylor-series expansion: M(u; u k ) = F (u k ) + J(u u k ), J i,j = Fi u j Iteration no. k: solve linear system J(u k )(δu) k+ = F (u k ) update: u k+ = u k + (δu) k+ Can use relaxation: u k+ = u k + ω(δu) k+ J F Nonlinear PDEs p. 33 Nonlinear PDEs p. 34 The Jacobian matrix; FDM () The Jacobian matrix; FDM () Model: u = f(u) Scheme: Jacobian matrix term (FDM): F i (ui ui + ui+) f(ui) = h Derivation: F i (ui ui + ui+) f(ui) = h J i,i = Fi u i = h F i = contains only u i, u i± Jacobian is sparse (tridiagonal) J i,j = Fi u j J i,i+ = Fi u i+ = h J i,i = Fi u i = h f (u i) Must form the Jacobian in each iteration and solve Jδu k+ = F (u k ) and then update u k+ = u k + ωδu k+ Nonlinear PDEs p. 35 Nonlinear PDEs p. 36 The Jacobian matrix; FEM A D/3D transient nonlinear PDE () u = f(u) + FEM gives F i =, where The Jacobian: becomes F i N in ju j f( u sn s)n i dx j s J i,j = Fi u j [ N in j f ( ] u sn s)n jn i dx s In general, for FE function û = s usns, f(û) = f (û) û = f (û) u sn s = f (û)n j u j u j u j s Nonlinear PDEs p. 37 PDE: ϱc u t = [κ(u) u] (f.ex. u = g on the boundary and u = I at t = ) FDM in time: with λ = κ/(ϱc) u l u l = [λ(u l ) u l] t FEM nonlinear algebraic equations: where F i(u l,..., u l n) =, i =,..., n F i [(ûl û l ) N i + tλ(û l ) û l ] N i d Nonlinear PDEs p. 38
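A minimal Newton solver for the discrete system F_i = (u_{i−1} − 2u_i + u_{i+1})/h² − f(u_i) = 0 can look as follows (plain C++; the test case f(u) = 6u² with u(0) = 1, u(1) = 1/4, whose continuous solution is u(x) = 1/(1+x)², is my own choice). The tridiagonal Jacobian has J_{i,i} = −2/h² − f'(u_i) and J_{i,i±1} = 1/h²; each iteration solves J δu = −F and updates u ← u + δu, and the printed residual norm should shrink roughly quadratically once the iterate is close to the solution.

// Sketch (plain C++): Newton's method for the finite difference discretization
// of u'' = f(u), u(0)=1, u(1)=1/4, with f(u) = 6u^2 (so f'(u) = 12u).
// The exact solution of the continuous problem is u(x) = 1/(1+x)^2.
//   F_i = (u_{i-1} - 2u_i + u_{i+1})/h^2 - f(u_i) = 0,
//   J_{i,i} = -2/h^2 - f'(u_i),  J_{i,i+-1} = 1/h^2.
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
  const int n = 20;
  const double h = 1.0 / n, h2 = h * h;
  const double uL = 1.0, uR = 0.25;
  auto f  = [](double u) { return 6.0 * u * u; };
  auto df = [](double u) { return 12.0 * u; };

  std::vector<double> u(n + 1);
  for (int i = 0; i <= n; ++i) u[i] = uL + (uR - uL) * i * h;   // initial guess

  for (int it = 0; it < 10; ++it) {
    // Residual F and tridiagonal Jacobian (interior unknowns i = 1..n-1):
    std::vector<double> F(n + 1, 0.0), a(n + 1, 0.0), d(n + 1, 1.0), c(n + 1, 0.0);
    double Fnorm = 0.0;
    for (int i = 1; i < n; ++i) {
      F[i] = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / h2 - f(u[i]);
      a[i] = 1.0 / h2;  c[i] = 1.0 / h2;  d[i] = -2.0 / h2 - df(u[i]);
      Fnorm = std::fmax(Fnorm, std::fabs(F[i]));
    }
    std::printf("iteration %d: max|F| = %g\n", it, Fnorm);
    if (Fnorm < 1e-12) break;
    // Solve J*du = -F (boundary rows give du = 0) with the Thomas algorithm:
    std::vector<double> b(n + 1, 0.0);
    for (int i = 1; i < n; ++i) b[i] = -F[i];
    for (int i = 1; i <= n; ++i) {
      const double m = a[i] / d[i - 1];
      d[i] -= m * c[i - 1];
      b[i] -= m * b[i - 1];
    }
    std::vector<double> du(n + 1);
    du[n] = b[n] / d[n];
    for (int i = n - 1; i >= 0; --i) du[i] = (b[i] - c[i] * du[i + 1]) / d[i];
    for (int i = 0; i <= n; ++i) u[i] += du[i];
  }

  double err = 0.0;
  for (int i = 0; i <= n; ++i) {
    const double x = i * h;
    err = std::fmax(err, std::fabs(u[i] - 1.0 / ((1.0 + x) * (1.0 + x))));
  }
  std::printf("max|u - 1/(1+x)^2| = %g\n", err);
  return 0;
}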
42 A D/3D transient nonlinear PDE () Iteration methods at the PDE level Successive substitution: Use old û l,k in λ(û l ) term, solve linear problem for û l,k+, k =,,... Exercise: specify element matrix and vector Newton-Raphson s method: need J, J i,j = Fi u j Exercise: carry out the differentiation, specify element matrix and vector Consider u = f(u) Could introduce a successive substitution at the PDE level: linear problem for u k+ d dx uk+ = f(u k ), k =,,... A PDE-level Newton-Raphson method can also be formulated (see the book for details) We get identical results for our model problem Time-dependent problems: first use finite differences in time, then use an interation method (successive subst. or Newton-Raphson) at the time-discrete PDE level Nonlinear PDEs p. 39 Nonlinear PDEs p. 33 Continuation methods Exercises Challenging nonlinear PDE: ( u q u) = Methods for nonlinear PDEs are best learned through exercises Exercises 4., 4., 4.4, 4.5, 4.6, 4.7, 4.9, 4. For q = this problem is simple Idea: solve a sequence of problems, starting with q =, and increase q towards a target value Sequence of PDEs: ( u r qr u r ) =, r =,,,... with = < q < q < q < < q m = q Start guess for u r is u r (the solution of a simpler problem) CFD: The Reynolds number is often the continuation parameter q Nonlinear PDEs p. 33 Nonlinear PDEs p. 33 Model problem for nonlinear PDEs Discretization in time Let us make software for solving Boundary conditions: u prescribed u t = [λ(u) u] Test solution: λ(u) = u, u = φ(x, t) = dt + j xj at the boundary u = φ everywhere Backward Euler scheme in time: u l u l = [λ(u l ) u l] t recursive set of spatial problems for u l (x) Nonlinear PDEs p. 333 Nonlinear PDEs p. 334 Discretization in space Solution of nonlinear systems FEM for the spatial problems: where u l (x) û l (x) = n u l jn j(x) j= F i(u l,..., u l n) =, i =,..., n, F i [(ûl û l ) W i + tλ(û l ) û l ] W i d. Nonlinear system of algebraic equations for u = (u l,..., u l n) Idea: solve a nonlinear system as a sequence of linear systems Approx. to u l in iteration k: û l,k (x) Successive substitutions: use old û l,k in nonlinear coefficients, λ(û l,k ), and solve for û l,k+ Newton s or Newton-Raphson s method: in iteration k, solve. Jδu k+ = F. u l,k+ = u l,k + δu k+ where J i,j = Fi u j and F i are computed using old values u l,k (F = (F,..., F n), J = J i,j) Nonlinear PDEs p. 335 Nonlinear PDEs p. 336
43 The Jacobian What to implement In our example: [ J i,j Fi u l = W in j + t dλ j dû (ûl,k )N j W i û l,k + t λ(û l,k ) W i N j ]d F i: these are the same terms that appear in a corresponding linear PDE problem In case of Newton-Raphson s method, we also need to implement J i,j (containing terms that are not identical to those in the PDE) Nonlinear PDE solver = Linear PDE solver + an outer nonlinear loop Such expressions must be calculated by hand (or symbolic math software) Nonlinear PDEs p. 337 Nonlinear PDEs p. 338 Implementation in Diffpack Nonlinear systems in Diffpack The evaluation of the Jacobian and right-hand side at an integration follows the same set-up as in linear problems The management of a nonlinear loop is a new component Need information about the type of nonlinear solver Nonlinear solvers are realized as subclasses of a class hierarchy NonLinEqSolver: - SuccessiveSubst - NewtonRaphson Nonlinear solver algorithm: // calling NonLinEqSolver s solve() leads to iteration = ; while (!converged) iteration++; // ask simulator to set up the linear (sub)system to be // solved in this iteration: solver->makeandsolvelinearsystem(); // define the PDE! // perform updates according to the algorithm This loop takes place in the Diffpack libraries Nonlinear PDEs p. 339 Nonlinear PDEs p. 34 makeandsolvelinearsystem Implementation The purpose of makeandsolvelinearsystem() is the same as for a linear problem: makesystem (assemble system) lineq s solve (solve linear system) Hence, the programmer has complete control of the linear system and its solution in each iteration Note: makesystem defines the linear system and hence the PDE (implicitly) This set-up makes it easy to switch between iteration methods/strategies Derive simulator from NonLinEqSolverUDC and FEM Add three new data items: Vec(real) nonlin_solution; Handle(NonLinEqSolver_prm) nlsolver_prm; Handle(NonLinEqSolver) nlsolver; Initialize these objects in scan Call nonlinear solver: nlsolver->solve(); In each iteration, the nonlinear solver jumps back to your virtual void makeandsolvelinearsystem() // essentially makesystem (*dof, *lineq); // set up linear subsystem lineq->solve(); // solve linear subsystem Tip: learn the numerics well before starting with the implementation! Nonlinear PDEs p. 34 Nonlinear PDEs p. 34 A real makeandsolvelinearsystem integrands () void NlHeat:: makeandsolvelinearsystem () dof->vecfield (nonlin_solution, *u); // u = most recent guess if (nlsolver->getcurrentstate().method == NEWTON_RAPHSON) // essential boundary conditions must be set to zero because // the unknown vector in the linear system is a correction // vector (assume that nonlin_solution has correct ess. bc.) dof->fillessbczero(); else // normal (default) treatment of essential boundary cond. 
dof->unfillessbczero(); makesystem (*dof, *lineq); // init start vector for iterative linear solver: if (nlsolver->getcurrentstate().method == NEWTON_RAPHSON) // start for a correction vector (expected to be approx ): linear_solution.fill (.); else // use the most recent nonlinear solution: linear_solution = nonlin_solution; lineq->solve(); // invoke a linear system solver // the solution of the linear system is now available // in the vector linear_solution void NlHeat::integrands(ElmMatVec& elmat,const FiniteElement& fe) const real dt = tip->delta(); // current time step const int nsd = fe.getnospacedim(); // no of space dims const real u_pt = u->valuefem (fe); // interpolate u const real up_pt= u_prev->valuefem (fe); // interpolate u_prev Ptv(real) gradu_pt (nsd); // grad u at present pt. u->derivativefem (gradu_pt, fe); // compute gradu_pt Ptv(real) gradup_pt (nsd); // grad u_prev --"-- u_prev->derivativefem (gradup_pt, fe); // compute gradup_pt const int nbf = fe.getnobasisfunc(); const real detjxw = fe.detjxw(); real gradni_gradnj, gradni_gradu, h; int i,j,s; // no of local nodes Nonlinear PDEs p. 343 Nonlinear PDEs p. 344
44 integrands () Lessons learned if (nlsolver->getcurrentstate().method == NEWTON_RAPHSON) for (i = ; i <= nbf; i++) gradni_gradu = ; for (s = ; s <= nsd; s++) gradni_gradu += fe.dn(i,s)*gradu_pt(s); for (j = ; j <= nbf; j++) gradni_gradnj = ; for (s = ; s <= nsd; s++) gradni_gradnj += fe.dn(i,s)*fe.dn(j,s); h = fe.n(i)*fe.n(j) + dt*( lambda(u_pt)*gradni_gradnj + dlambda(u_pt)*fe.n(j)*gradni_gradu ); elmat.a(i,j) += h*detjxw; h = fe.n(i)*(u_pt - up_pt) + dt*u_pt*gradni_gradu; elmat.b(i) += -h*detjxw; else // error message... not implemented... Transient PDE solver = stationary PDE solver + a time loop and a couple of extra data items Nonlinear PDE solver = linear PDE solver + a hidden nonlinear loop and three extra data items Learn software tools for u = f well; - they can be trivially reused for systems of transient nonlinear PDEs - the Diffpack programming philosophy remains the same The implementational steps from simple to advanced problems can be small Be prepared for major numerical steps when moving to advanced problems Diffpack does not simplify the numerics, just the implementation ( the Diffpack book contains both numerics and software) Nonlinear PDEs p. 345 Nonlinear PDEs p. 346 Linear thermo-elasticity Application area: structural analysis pressure load Elasticity Purpose of simulation: compute deformation and internal forces (stress) Elasticity p. 347 Elasticity p. 348 The deformation and a stress measure Mathematical model () Basic quantities: u(x) displacement field (a vector at each point) σ ij: the stress tensor (3x3 matrix) (needed for evaluating stresses) λ, µ: elasticity coefficients T : temperature deviation Basic equations: Equilibrium: σ = Constitutive law for elasticity (Hooke s law): σ = λ( u)i + µ( u + ( u) T ) α(3λ + µ)t I Elasticity p. 349 Elasticity p. 35 A look at the very basics Mathematical model () Consider elongation of a bar Combining the equations gives: [(λ + µ) u] + [µ u] = [α(3λ + µ)t ] F F or with constant λ and µ: (λ + µ) ( u) + µ u = α(3λ + µ) T F F/ F/ F/ F/ F Primary unknown: u Primary interest: σ T is prescribed or found from a heat eq. Solve for u, find σ from Hooke s law Note: d unknowns per node: u Linear (elliptic) vector PDE for u The stress at the red circle depends on the surface orientation (note: stress is difficult to understand!) Elasticity p. 35 Elasticity p. 35
45 Special versions of the model Notation Full 3D thermo-elasticity D plane strain elasticity u 3 =, / x 3 = D plane stress elasticity: set u 3 =, / x 3 = and modify λ Index notation to condense formulas Rule : a i is vector, a ij is tensor Rule : sum over repeated indices d a ib i a ib i i= Rule 3: comma denotes differentiation f,i f x i a i,k ai x k These rules can be combined, e.g., σ ij,j d j= σ ij x j Elasticity p. 353 Elasticity p. 354 The Kronecker delta Mathematical model with new notation Kronecker delta: δ ij = if i j, δ ij = if i = j With summation convention: δ ii = + + = 3 (!) Without summation convention: δ ii = Rule 4: annihilate expressions with δ ij δ ijv j = v i v jδ ij = v i a ijδ ij = a ii (e.g., set i =, δ jv j = v j + v j + v j) Basic quantities: u i(x j): displacement field σ ij: the stress tensor λ, µ: elasticity coefficients T : temperature deviation Basic equations: Equilibrium: σ ij,j = Constitutive law for elasticity (Hooke s law): σ ij = λu k,kδ ij + µ(u i,j + u j,i) α(3λ + µ)t δ ij Combined into an equation for u i: ((λ + µ)u k,k),i + (µu i,j),j = (α(3λ + µ)t ),i Elasticity p. 355 Elasticity p. 356 The Poisson equation revisited () The Poisson equation revisited () [λ u] = f Written as a first order system q = f q = λ u Starting with this system, and eliminating q after having derived the weighted residual form and performed integration by parts, is the approach we shall use in elasticity (as it simplifies the mathematical details in the elasticity problem) Weighted residual form (Galerkin s method): q N id, Integration by parts: q ˆq = j ˆq N id = q j N j, N i ˆq d + u û = j Insert ˆq = λ û, and obtain standard FEM problem Short notation with indices (q q i): N iˆq n dγ u jn j d ˆq k ˆq = = ˆq k,k, ˆq k = λû,k x k k= Elasticity p. 357 Sum over repeated index and comma denotes differentiation Elasticity p. 358 The Poisson equation revisited (3) FEM in elasticity () Integration by parts in alternative notation: q k,kn id = Linear system (as usual) n j= λ N i,kn j,k N i,kq kd + n q k = λu,k λ u jn j,k j= = Ni Nj d u j = fn id + N in kq kdγ N iλ u,kn k dγ = u n This new notation and use of both q k and u in the derivation makes the numerical details of more complicated problems (e.g. elasticity) easier Equilibrium equation: σ rs,s = Galerkin s method + integration by parts: σ rsn i,sd = d equations, each weighted by N i Replace σ rs by u r: N i σ rsn s dγ b.c. σ rs = λu k,kδ rs + µ(u r,s + u s,r) α(3λ + µ)t δ rs d equations for each i (node) Elasticity p. 359 Elasticity p. 36
46 FEM in elasticity () The element equations Expansion: n u i û i = u i jn j(x,..., x d) j= Element level equations can be written as A rs i,ju s j = b r i, j s r =,..., d, i =,..., n d unknowns at each node: u j,..., ud j Linear system: Kx = b x = (u,..., u d, u,..., u d,..., u n,..., u d n) T Matrices (vectors) consist of d d (d) blocks Equation number: (i, r) d(i ) + r Unknown number: (j, s) d(j ) + s Element matrix: dn dn (n = no of nodes in elem.) For fixed i and j (node numbers), A rs i,j is a d d matrix reflecting the coupling of node i and j Elasticity p. 36 Elasticity p. 36 Derivation of the element equations Derivation; cont. Aim : insert σ rs in σrsni,sd, where and σ rs = λu k,kδ rs + µ(u r,s + u s,r) α(3λ + µ)t δ rs u i û i = n u i jn j(x) Aim : Manipulate expressions to identify element matrix A rs i,j and vector b r i j= Let s look at the first term in σ rs: σ rs = λû k,kδ rs = λ N k,ju k j δ rs in σ rsn i,sd j gives ( ) λ N j,k N i,sδ rsd u k j j s k Elasticity p. 363 Elasticity p. 364 Derivation; cont. Derivation; cont. Try to rewrite on the form j s Ars i,j us j ( ) λ N j,k N i,sδ rsd u k j j s k Step : δ rsφ s = φ r for any vector φ annihilate s: s Ni,sδrs = Ni,r Step : change k with s (dummy summation index) λn i,rn j,sd u s j j s A rs i,j Next term: Step : u r j = δrkuk j ( k ) Step : change k and s Result: Easy to identify A rs i,j Next terms are straightforward µn j,sn i,sd u r j j s ( ) µn i,kn j,k δ rs u s j j s k Elasticity p. 365 Elasticity p. 366 Result of derivation () Result of derivation () General formula for A rs i,j : A rs i,j = This derivation: global level Local level: replace by and d by det J dξ dξ d [ ( ) µ N i,kn j,k δ rs k + µn i,sn j,r + λn i,rn j,s ]d Right-hand side: b r i = [(µ + 3λ) αt N i,r] d + t r = σ rsn s: stress vector at the surface Essential conditions: u r given Natural conditions: t r given N it rdγ Elasticity p. 367 Elasticity p. 368
47 Implementation Entries in the element matrix Standard Poisson/ data The DegFreeFE object is more central (d unknown per node!) FieldsFE u (vector field) Vec(real) solution (solution of linear system) Shuffling u solution using DegFreeFE Initialization: u.rebind (new FieldsFE (*grid,"u")); dof.rebind (new DegFreeFE (*grid, nsd)); //!!! solution.redim (u->getnovalues()); lineq->attach (solution); x matrix, coupling node and 3 entry, coupling local dof in node 3 with local dof in node 4 Elasticity p. 369 Elasticity p. 37 The heart of the integrands routine The important variables in elasticity 4 loops: // matrix: for i =,...,nbf for j =,...,nbf for r =,...,d for s =,...,d add A_i,j^rs into elmat.a (d*(i-)+r, d*(j-)+s) // right-hand side: for i =,...,nbf for r =,...,d add b_i^r into elmat.b (d*(i-)+r) Primary unknowns in the finite element method: the displacement field u r Primary interest: the components of the stress tensor σ rs a norm of σ rs, e.g., m σ ij σ ij where σ rs σ rs 3 σkkδrs Note: σ rs u r/ x s σ rs is discontinuous accross element boundaries Smoothing might be necessary Elasticity p. 37 Elasticity p. 37 Computing derivatives () Computing derivatives () û(x) is a finite element field Define g = û/ x Bilinear û g a + bx Linear û g const g is discontinuous across element boundaries g has optimal accuracy at the reduced Gauss points (=centroid in linear/bilinear elements) g = û/ x (e.g.) Find continuous ĝ = j gjnj(x) as approximate solution of Galerkin or least squares: ĝ = g ( ) N in jd g j = gn id j Integrate rhs with reduced Gauss rule Lump the mass matrix N in jd Efficient solution of diagonal system for g j ĝ is a continuous field Elasticity p. 373 Elasticity p. 374 Computing derivatives in Diffpack Computing derivatives in Diffpack Let m be a stress norm (discontinuous) Representation of m: class FieldsFEatItgPt FieldsFEatItgPt = n f point values of derivatives at each (possibly reduced) integration point in each element (representing m only: n f = ) Class FieldsFEatItgPt has a function void derivedquantitiesatitgpt ( FEM& fesolver, GridFE& grid, int nfields, NumItgPoints pt_tp = GAUSS_POINTS, int relative_order = - // reduced Gauss pts ); that runs through all elements in the grid and their (reduced) integration points, and for each point, fesolver s virtual derivedquantitiesatitgpt is called for defining the values of the nfields discontinuous fields at the current point In class Elasticity: class Elasticity : public FEM... Handle(FieldsFEatItgPts) stress_measures;... ; // m void Elasticity:: calcderivedquantities () // Handle(stress_measures) contains stresses (now only the norm m) stress_measures->derivedquantitiesatitgpt (*this, *grid, /* derived quantity */, GAUSS_POINTS, - /* reduced Gauss-Legendre points */); FEM::smoothFields (*smooth_stress_measures, *stress_measures); void Elasticity:: derivedquantitiesatitgpt (VecSimple(NUMT)& quantities, const FiniteElement& fe) // fill quantities() with the expression for m Elasticity p. 375 Elasticity p. 376
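Stripped of the Diffpack specifics, the four nested loops shown earlier amount to the following sketch (plain C++; the container types, the 0-based degree-of-freedom numbering row = d·i + r and the single-integration-point interface are my own choices). It adds the integrand of A^{rs}_{i,j} = ∫[μ(Σ_k N_{i,k}N_{j,k})δ_rs + μN_{i,s}N_{j,r} + λN_{i,r}N_{j,s}] dΩ, evaluated at one integration point, into a (d·nbf)×(d·nbf) element matrix.

// Sketch (plain C++, not Diffpack): contribution to the elasticity element
// matrix from one integration point.  dN[i][k] holds the derivative of basis
// function N_i with respect to x_k at the point, detJxW is det(J) times the
// integration weight, and the (0-based) degree-of-freedom numbering is
// row = d*i + r for local node i and displacement component r.
#include <vector>

using Matrix = std::vector<std::vector<double>>;

void add_elasticity_point(Matrix& elmatA,
                          const Matrix& dN,     // nbf x d gradients at the point
                          double lambda, double mu, double detJxW)
{
  const int nbf = static_cast<int>(dN.size());
  const int d   = static_cast<int>(dN[0].size());
  for (int i = 0; i < nbf; ++i)
    for (int j = 0; j < nbf; ++j) {
      double gradNi_gradNj = 0.0;                 // sum_k N_{i,k} N_{j,k}
      for (int k = 0; k < d; ++k) gradNi_gradNj += dN[i][k] * dN[j][k];
      for (int r = 0; r < d; ++r)
        for (int s = 0; s < d; ++s) {
          double value = mu * dN[i][s] * dN[j][r] + lambda * dN[i][r] * dN[j][s];
          if (r == s) value += mu * gradNi_gradNj; // the delta_rs term
          elmatA[d * i + r][d * j + s] += value * detJxW;
        }
    }
}

int main()
{
  // Tiny usage example: one linear triangle (3 nodes, d = 2) with made-up
  // gradients; in a real assembly loop the gradients come from the element.
  Matrix dN = {{-1.0, -1.0}, {1.0, 0.0}, {0.0, 1.0}};
  Matrix A(6, std::vector<double>(6, 0.0));
  add_elasticity_point(A, dN, /*lambda=*/1.0, /*mu=*/0.5, /*detJxW=*/0.5);
  return 0;
}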
48 Plate with imperfection Boundary conditions Consider a plate with an elliptic hole: Equation: Navier, D, with λ replaced by σ Plane stress (thin plate) σ Boundary with tension force: stress vector known (t = σi) Inner boundary in hole: no stress λ = λµ λ + µ Upper and lower boundary: no stress conditions at each point on the boundary (recall that σ i3 = by definition of plane strain) (In a 3D formulation of the problem we would trivially get 3 cond. at each point) Elasticity p. 377 Elasticity p. 378 Symmetry Numerical simulations For numerical computations it is crucial to reduce the size of the domain as much as possible Here: symmetry about two lines equivalent stress in deformed configuration σ σ Condition at a symmetry line: vanishing normal displacement: condition no shear stress: (D) or (3D) conditions Elasticity p. 379 Elasticity p. 38 Plate with crack Boundary conditions Let the ellipse collapse to a line σ σ What has actually changed? Nothing; same boundary conditions (stress-free inner surface of the crack) same symmetry properties However: the extreme geometry will lead to infinite stresses at the crack tip Elasticity p. 38 Elasticity p. 38 Numerical simulations Elastic beam with a crack equivalent stress in deformed configuration.3 uniform pressure load crack clamped end Elasticity p. 383 Elasticity p. 384
49 5 Mathematical model Numerical simulation: stress Elasticity Plane strain No temperature effects equivalent stress in deformed configuration Elasticity p. 385 Elasticity p. 386 Tsunamis Shallow water waves Waves in fjords, lakes, or oceans, generated by slide earthquake subsea volcano asteroid human activity, like nuclear detonation, or slides generated by oil drilling, may generate tsunamis Propagation over large distances Hardly recognizable in the open ocean, but wave amplitude increases near shore Run-up at the coasts may result in severe damage Giant events: Dec 6 4 ( 3 killed), 883 (similar to 4), 65 My ago (extinction of the dinosaurs) Shallow water waves p. 387 Shallow water waves p. 388 Norwegian tsunamis Tsunamis in the Pacific Tromsø 7 Bodø 65 SWEDEN Trondheim 6 Bergen NORWAY Oslo Stockholm 5 Circules: Major incidents, > killed; Triangles: Selected smaller incidents; Square: Storegga (5 B.C.) Selected events; slides Shallow water waves p. 389 Scenario: earthquake outside Chile, generates tsunami, propagating at 8 km/h accross the Pacific, run-up on densly populated coasts in Japan; Shallow water waves p. 39 Selected events; earthquakes etc. location year run-up dead Loen 95 4m 6 Tafjord 934 6m 4 Loen m 73 Storegga 5 B.C. m(?)?? Vaiont, Italy 963 7m 6 Litua Bay, Alaska 958 5m Shimabara, Japan 79 m(?) 5 location year strength run-up dead Thera 64 B.C. volcano?? Thera 65 volcano?? Lisboa 755 M=9? 5(?)m? Portugal 969 M=7.9 m Amorgos 956 M=7.4 5(?)m Krakatao 883 volcano 4 m 36 Flores 99 M=7.5 5 m Nicaragua 99 M=7. m 68 Sumatra 4 M=9 5 m 3 The selection is biased wrt. European events; 5 catastrophic tsunami events have been recorded along along the Japanese coast in modern times. Tsunamis: no. 5 killer among natural hazards Shallow water waves p. 39 Shallow water waves p. 39
50 Why simulation? Problem sketch Increase the understanding of tsunamis Assist warning systems Assist building of harbor protection (break waters) Recognize critical coastal areas (e.g. move population) Hindcast historical tsunamis (assist geologists/biologists) H(x,y,t) z y x η(x,y,t) Assume wavelength depth (long waves) Assume small amplitudes relative to depth Appropriate approx. for many ocean wave phenomena Reference: HPL chapter 6. Shallow water waves p. 393 Shallow water waves p. 394 Mathematical model Primary unknowns PDEs: η t u t v t η(x, y, t) : surface elevation = x (uh) ( y (vh) H ) t = η x, x, t > = η y, x, t > u(x, y, t) and v(x, y, t) : horizontal (depth averaged) velocities H(x, y) : stillwater depth (given) Boundary conditions: either η, u or v given at each point Initial conditions: all of η, u and v given Discretization: finite differences Staggered grid in time and space η, u, and v unknown at different points: u l+ i,j+ η l i+,j+, u l+, v l+ i,j+ i+,j+ v l+ i+,j+ η l i+,j+ u l+ i+,j+ v l+ i+,j Shallow water waves p. 395 Shallow water waves p. 396 A global staggered grid Discrete equations; η η t = x (uh) y (vh) Widely used grid in computational fluid dynamics (CFD) Important for Navier-Stokes solvers Basic idea: centered differences in time and space [ ] η l t i+,j+ η l i+,j+ = x y at (i +, j +, l ) [ ] (Hu) l (Hu) l i+,j+ i,j+ [ ] (Hv) l i+,j+ (Hv)l i+,j Shallow water waves p. 397 Shallow water waves p. 398 Discrete equations; u Discrete equations; v u t [ ] u l+ u l t i,j+ i,j+ = η x at (i, j +, l) = [ ] η l x i+,j+ η l i,j+ v t [ ] v l+ t i+,j vl i+,j = η y at (i +, j, l) = [ ] η l y i+,j+ η l i+,j Shallow water waves p. 399 Shallow water waves p. 4
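As a concrete illustration of the staggered discretization above, here is a minimal C++ sketch of one time step for the one-dimensional case eta_t = -(Hu)_x, u_t = -eta_x with constant depth H and unit gravity; the array names, the constant-depth assumption and the wall boundary treatment are choices made for this example, not part of the course code.

#include <vector>

// One time step of the staggered (leap-frog) scheme for the 1D shallow water
// equations  eta_t = -(H u)_x,  u_t = -eta_x  (scaled units, constant depth H).
// eta[i] lives at x_i = i*dx and integer time levels; u[i] lives at x_{i+1/2}
// and half time levels. Walls (zero flux) are assumed at both ends.
void shallowWaterStep(std::vector<double>& eta, std::vector<double>& u,
                      double H, double dt, double dx)
{
  const int n = (int) eta.size();              // eta[0..n-1], u[0..n-2]

  // momentum: u^{l+1/2}_{i+1/2} = u^{l-1/2}_{i+1/2} - dt/dx (eta^l_{i+1} - eta^l_i)
  for (int i = 0; i <= n - 2; ++i)
    u[i] -= dt/dx * (eta[i+1] - eta[i]);

  // continuity: eta^{l+1}_i = eta^l_i - dt/dx H (u^{l+1/2}_{i+1/2} - u^{l+1/2}_{i-1/2})
  for (int i = 1; i <= n - 2; ++i)
    eta[i] -= dt/dx * H * (u[i] - u[i-1]);
  eta[0]   -= dt/dx * H * (u[0] - 0.0);        // zero flux through the left wall
  eta[n-1] -= dt/dx * H * (0.0 - u[n-2]);      // zero flux through the right wall
}

The two-dimensional scheme on the slides follows the same pattern, with a v array staggered in y and the eta points placed at the cell centres of the global staggered grid.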
51 Complicated costline boundary Relation to the wave equation Eliminate u and v (easy!) Eliminate discrete u and v η t = [H(x, y) η] Standard 5-point explicit finite difference scheme for discrete η Saw-tooth approximation to real boundary Successful method, widely used Warning: can lead to nonphysical waves Shallow water waves p. 4 Shallow water waves p. 4 Stability and accuracy Verification of an implementation Centered differences in time and space truncation error: O( x, y, t ) Stability as for the std. wave equation in D: t H x + y How can we verify that the program works? Compare with an analytical solution (if possible) Check that basic physical mechanisms are reproduced in a qualitatively correct way by the program (CFL condition) If H const, exact numerical solution is possible for one-dimensional wave propagation Shallow water waves p. 43 Shallow water waves p. 44 Tsunami due to a slide Tsunami due to faulting Surface elevation ahead of the slide, dump behind Initially, negative dump propagates backwards The surface waves propagate faster than the slide moves Shallow water waves p. 45 The sea surface deformation reflect the bottom deformation Velocity of surface waves (H 5 km): 79 km/h Velocity of seismic waves in the bottom: 6 5 km/h Shallow water waves p. 46 Tsunami approaching the shore Tsunamis experienced from shore The velocity of a tsunami is gh(x, y, t). As a fast tide, with strong currents in fjords A wall of water approaching the beach Wave breaking: the top has larger effective depth and moves faster than the front part (requires a nonlinear PDE) The back part of the wave moves at higher speed the wave becomes more peak-formed Deep water (H 3 km): wave length 4 km, height m Shallow water waves p. 47 Shallow water waves p. 48
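For reference, the wave-equation form of the model and the stability condition discussed above can be written out as follows, assuming the standard CFL condition for a 2D wave equation with local wave speed sqrt(H) (if the equations are kept dimensional, a factor g multiplies H):

\[
\frac{\partial^2\eta}{\partial t^2} = \nabla\cdot\big[H(x,y)\,\nabla\eta\big],
\qquad
\Delta t \le \frac{1}{\sqrt{\max H}}
\left(\frac{1}{\Delta x^2}+\frac{1}{\Delta y^2}\right)^{-1/2},
\]

with truncation error of order O(Delta x^2, Delta y^2, Delta t^2) for the centered scheme.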
52 Viscous fluid flow.. A penalty N-S solver Many processes in science and technology involve viscous fluid flow, and the numerical models then need solvers for the Navier-Stokes (N-S) equations: ( ) v ϱ t + v v v = = p + µ v + ϱb A penalty N-S solver p. 49 A penalty N-S solver p. 4 Different ways of writing the N-S eqs. Numerical methods With vector symbols: With index notation: ( ) v ϱ t + v v v = = p + µ v + ϱb ϱ(v r,t + v sv r,s) = p,r + µv r,ss + ϱb r v s,s = The condition v = and the term p make the N-S equations hard to solve numerically There are numerous approaches: fully implicit artificial compressibility penalty functions operator splitting The latter is preferred when deriving the details of a numerical method (as in the elasticity problem) A penalty N-S solver p. 4 A penalty N-S solver p. 4 Penalty methods Modified N-S equations Firm basis in calculus of variations Main result: Can eliminate the pressure!!! p = λ v, λ Result: a kind of nonlinear transient elasticity problem Very convenient from a numerical point of view λ gives some undesired numerical properties (ill-conditioned matrix systems) Good educational example on using Diffpack p = λ v = λv s,s eliminates p and the eq. v = v s,s = Result: ϱ(αv r,t + v sv r,s) = λv s,sr + µv r,ss + ϱb r eq. of linear elasticity, modulo the acceleration terms on the left-hand side (which add transient and nonlinear effects) Implementation in Diffpack: Extend class Elasticity with a time loop and a nonlinear solver Combine class Poisson, Heat, NlHeat, and Elasticity Reference: HPL chapter 6.3. A penalty N-S solver p. 43 A penalty N-S solver p. 44 Basic steps Discretization (). Derive the weak form. Identify the integrands 3. Get control of the element degrees of freedom, i.e., how the formulas are stacked in the element matrix/vector Strong similarity to the elasticity problem! Then, use class NlHeat as a template for administering the solution process In time: θ-rule θ = : backward Euler, θ =.5: Crank-Nicolson In space: isoparametric finite elements ˆv l r(x, t) = n j= v r,l j N j(x) Weak form: multiply by N i, integrate nd order derivatives by parts Nonlinear system at each time level F r i (v,..., v d, v,..., v d,..., v n,..., v d n) = for i =,..., n (nodes), r =,..., d (loc. dof.) A penalty N-S solver p. 45 A penalty N-S solver p. 46
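Spelled out, the penalty substitution and the resulting modified momentum equation referred to above are (with lambda a large penalty parameter and alpha the coefficient written in front of the time-derivative term on the slide):

\[
p = -\lambda\,\nabla\cdot v = -\lambda\,v_{s,s},\qquad \lambda \gg 1,
\]
\[
\varrho\big(\alpha\,v_{r,t} + v_s v_{r,s}\big)
  = \lambda\,v_{s,sr} + \mu\,v_{r,ss} + \varrho\,b_r ,
\]

which is the linear-elasticity operator acting on v, plus the transient and convective terms on the left-hand side.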
53 Discretization () Entries in the element matrix Newton-Raphson method for F r i = Sequence of linear systems n d A rs i,jδvj s = Fi r j= s= A rs i,j F r i v s,l j 3 4 x matrix, coupling node and 3 entry, coupling local dof in node 3 with local dof in node A penalty N-S solver p. 47 A penalty N-S solver p. 48 Selective reduced integration Computing the pressure The λ term must be integrated by a rule of one order lower than the rule used for the other terms (equiv. to using mixed interpolation). p = λv s,s can be computed when the velocity is known p derivatives of v s p becomes discontinuous Might smooth p: where n M i,jp j = b i j= b i = λ v s,sn id, M i,j = N in jd normally with lumped M i,j Another finite element assembly process Diffpack tool: class FieldsFEatItgPt or integrands functor A penalty N-S solver p. 49 A penalty N-S solver p. 4 The idea of integrand functors Implementation; pressure computation A solver can only have one integrands function What if it needs more than one? if-else tests in integrands external integrands functions as functors Basic structure of an integrand functor: class MyExtraIntegrand : public IntegrandCalc MySim* data; // access to all solver data public: MyExtraIntegrand (MySim* sim) : data(sim) void integrands (ElmMatVec& em, const FiniteElement& fe) // normal integrands function // access physical parameters in the solver by data-> ; Overloaded versions of FEM::makeSystem work with integrand functors (as an alternative to the integrands function in the solver) Integrand functor for b i = λ v s,sn id M i,j can be computed once and for all by makemassmatrix Class FEM has a function smoothfield for solving M i,jp j = functor-defined right-hand side j // integrand functor: class PressureIntg : public IntegrandCalc... ; void NsPenalty:: calcderivedquantities () PressureIntg penalty_integrand (this); FEM::smoothField (*p, penalty_integrand); // calls makemassmatrix (if necessary), makesystem // and solves the diagonal system A penalty N-S solver p. 4 A penalty N-S solver p. 4 Flow in a constricted channel () Flow in a constricted channel () solid wall. uniform inlet profile. outlet solid wall Re= Re= A penalty N-S solver p. 43 A penalty N-S solver p. 44
54 Splitting the N-S equations ϱ(v r,t + v sv r,s) = p,r + µv r,ss + ϱb r v s,s = A fast FE N-S solver Difficulty: p,r and v s,s = Idea: Split N-S into simpler equations Common approach: split N-S into an explicit convection-diffusion equation for v r an (implicit) Poisson equation for p explicit updating formula for v r A fast FE N-S solver p. 45 A fast FE N-S solver p. 46 A nd order algorithm () A nd order algorithm () ϱ(v r,t + v sv r,s) = p,r + µv r,ss + ϱb r v s,s =. Calculation of an intermediate velocity field: k () r = t(v l sv l r,s νv l r,ss) ˆv r = v l r + k () r k r () = t(ˆv sˆv l r,s l νˆv r,ss) l vr = vr l + ( ) k r () + k r (). Solution of a Poisson equation for the new pressure (arising from the incompressibility constraint v l+ s,s = ): p l+ = ϱ t v s,s 3. Correction of the intermediate velocity field: v l+ r = v r (p l+,r ϱb r) t/ϱ Same interpolation for v r and p (no need for mixed finite elements) A fast FE N-S solver p. 47 A fast FE N-S solver p. 48 Discrete equations Implementation First step: for r =,..., d. K: operator a r: nonlinear convective term Explicit updates a la Mk () r = ta r(v,..., v d) ν tkv r v r = v l r + k () r, v r = v l r + ( ) k () r + k () r The pressure Poisson equation: Kp l+ = ϱ t Bsv s Obvious: create a std solver for the Poisson equation and the scalar explicit updates Observation I: original vector equations are split into d independent scalar equations (!) Requires several weak forms integrand functors Observation II: M, K, and B s are independent of time Can speed up the code by precomputing M, K, and B s, and generate the Poisson equation from matrix-vector products only (cf. class Wave) Problem: the nonlinear term a r Solution: precompute as much as possible, multiply by v at the element level and assemble Correcting the velocity field (c r contains body forces): Mv l+ r = Mv r (B rp l+ ϱc r) A fast FE N-S solver p. 49 A fast FE N-S solver p. 43 The importance of linear system solvers Solving linear systems PDE problems often (usually) result in linear systems of algebraic equations Ax = b Special methods utilizing that A is sparse is much faster than Gaussian elimination! Most of the CPU time in a PDE solver is often spent on solving Ax = b Important to use fast methods Solving linear systems p. 43 Solving linear systems p. 43
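The 2nd-order splitting algorithm above, written out with the half-steps made explicit, reads as follows; the signs and the factor 1/2 are those of a standard Heun (2nd-order Runge-Kutta) predictor followed by a pressure-projection step, so read this as a reconstruction in that spirit rather than a quotation of the slides:

\[
k^{(1)}_r = \Delta t\big({-v^l_s v^l_{r,s}} + \nu\,v^l_{r,ss}\big),\qquad
\hat v_r = v^l_r + k^{(1)}_r,
\]
\[
k^{(2)}_r = \Delta t\big({-\hat v_s \hat v_{r,s}} + \nu\,\hat v_{r,ss}\big),\qquad
v^{*}_r = v^l_r + \tfrac12\big(k^{(1)}_r + k^{(2)}_r\big),
\]
\[
\nabla^2 p^{l+1} = \frac{\varrho}{\Delta t}\,v^{*}_{s,s},
\qquad
v^{l+1}_r = v^{*}_r - \frac{\Delta t}{\varrho}\big(p^{l+1}_{,r} - \varrho\,b_r\big).
\]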
55 Example: Poisson eq. on the unit cube () Example: Poisson eq. on the unit cube () u = f on an n = q q q grid FDM/FEM result in Ax = b system FDM: 7 entries pr. row in A are nonzero FEM: 7 (tetrahedras), 7 (trilinear elms.), or 5 (triquadratic elms.) entries pr. row in A are nonzero A is sparse (mostly zeroes) Fraction of nonzeroes: Rq 3 (R is nonzero entries pr. row) Important to work with nonzeroes only! Compare Banded Gaussian elimination (BGE) versus Conjugate Gradients (CG) Work in BGE: O(q 7 ) = O(n.33 ) Work in CG: O(q 3 ) = O(n) (multigrid; optimal), for the numbers below we use incomplete factorization preconditioning: O(n.7 ) n = 7: CG 7 times faster than BGE BGE needs times more memory than CG n = 8 million: CG 7 times faster than BGE BGE needs 487 times more memory than CG Solving linear systems p. 433 Solving linear systems p. 434 Classical iterative methods Convergence Ax = b, A IR n,n, x, b IR n. Split A: A = M N Write Ax = b as Mx = Nx + b, and introduce an iteration Mx k = Nx k + b, k =,,... Mx k = Nx k + b, k =,,... The iteration converges if G = M N has its largest eigenvalue, ϱ(g), less than Rate of convergence: R (G) = ln ϱ(g) To reduce the initial error by a factor ɛ, x x k ɛ x x Systems My = z should be easy/cheap to solve Different choices of M correspond to different classical iteration methods: Jacobi iteration Gauss-Seidel iteration Successive Over Relaxation (SOR) Symmetric Successive Over Relaxation (SSOR) one needs iterations ln ɛ/r (G) Solving linear systems p. 435 Solving linear systems p. 436 Some classical iterative methods Jacobi iteration Split: A = L + D + U L and U are lower and upper triangular parts, D is A s diagonal Jacobi iteration: M = D (N = L U) Gauss-Seidel iteration: M = L + D (N = U) SOR iteration: Gauss-Seidel + relaxation SSOR: two (forward and backward) SOR steps Rate of convergence R (G) for u = f in D with u = as BC: Jacobi: πh / Gauss-Seidel: πh SOR: πh SSOR: > πh SOR/SSOR is superior (h vs. h, h is small) M = D Put everything, except the diagonal, on the rhs D Poisson equation u = f: u i,j + u i,j + u i+,j + u i,j+ 4u i,j = h f i,j Solve for diagonal element and use old values on the rhs: u k i,j = 4 for k =,,... ( u k i,j + uk i,j + uk i+,j + uk i,j+ + ) h f i,j Solving linear systems p. 437 Solving linear systems p. 438 Relaxed Jacobi iteration Relation to explicit time stepping Idea: Computed new x approximation x from Dx = ( L U)x k + b Set x k = ωx + ( ω)x k weighted mean of x k and x k if ω (, ) Relaxed Jacobi iteration for u = f is equivalent with solving α u t = u + f by an explicit forward scheme until u/ t, provided ω = 4 t/(αh) Stability for forward scheme implies ω In this example: ω = best ( largest t) Forward scheme for t is a slow scheme, hence Jacobi iteration is slow Solving linear systems p. 439 Solving linear systems p. 44
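The Jacobi formula above for the 2D Poisson equation, u^k_{i,j} = (1/4)(u^{k-1}_{i-1,j} + u^{k-1}_{i,j-1} + u^{k-1}_{i+1,j} + u^{k-1}_{i,j+1} + h^2 f_{i,j}), translates almost line by line into code. Here is a minimal C++ sketch for a uniform m x m grid with homogeneous Dirichlet values on the boundary; the function and array names are invented for the example.

#include <vector>

// One Jacobi sweep for  -Laplace(u) = f  on a uniform (m x m) grid with
// spacing h and u = 0 on the boundary. u and f are stored row by row;
// unew is the work array receiving the new iterate.
void jacobiSweep(const std::vector<double>& u, std::vector<double>& unew,
                 const std::vector<double>& f, int m, double h)
{
  for (int j = 1; j < m - 1; ++j)
    for (int i = 1; i < m - 1; ++i) {
      int p = j*m + i;
      unew[p] = 0.25*(u[p-1] + u[p+1] + u[p-m] + u[p+m] + h*h*f[p]);
    }
}

Note that Jacobi needs two arrays, the old iterate u and the new one unew; this is exactly what Gauss-Seidel avoids by overwriting values in place.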
56 Gauss-Seidel/SOR iteration Symmetric/double SOR: SSOR M = L + D For our D Poisson eq. scheme: u k i,j = 4 ( u k i,j + u k i,j + u k i+,j + uk i,j+ + ) h f i,j i.e. solve for diagonal term and use the most recently computed values on the right-hand side SOR is relaxed Gauss-Seidel iteration: compute x from Gauss-Seidel it. set x k = ωx + ( ω)x k ω (, ), with ω = O(h ) as optimal choice Very easy to implement! SSOR = Symmetric SOR One (forward) SOR sweep for unknowns,, 3,..., n One (backward) SOR sweep for unknowns n, n, n,..., M can be shown to be M = ( ) ( ) ( ) ω ω D + L ω D ω D + U Notice that each factor in M is diagonal or lower/upper triangular ( very easy to solve systems My = z) Solving linear systems p. 44 Solving linear systems p. 44 Status: classical iterative methods Conjugate Gradient-like methods Jacobi, Gauss-Seidel/SOR, SSOR are too slow for paractical PDE computations The simplest possible solution method for u = f and other stationary PDEs in D/3D is to use SOR Classical iterative methods converge quickly in the beginning but slow down after a few iterations Classical iterative methods are important ingredients in multigrid methods Ax = b, A IR n,n, x, b IR n. Use a Galerkin or least-squares method to solve a linear system (!) Idea: write k x k = x k + α jq j j= α j: unknown coefficients, q j : known vectors Compute the residual: k r k = b Ax k = r k α jaq j j= and apply the ideas of the Galerkin or least-squares methods Solving linear systems p. 443 Solving linear systems p. 444 Galerkin Least squares Residual: k r k = b Ax k = r k α jaq j j= (r k, q i ) = Galerkin s method (r R, q j N j, α j u j): (r k, q i ) =, i =,..., k Residual: k r k = b Ax k = r k α jaq j j= α i (r k, r k ) = Least squares: minimize (r k, r k ) Result: linear system for α j: (, ): Eucledian inner product Result: linear system for α j, k (Aq i, q j )α j = (r k, q i ), j= i =,..., k k (Aq i, Aq j )α j = (r k, Aq i ), j= i =,..., k Solving linear systems p. 445 Solving linear systems p. 446 The nature of the methods Extending the basis Start with a guess x In iteration k: seek x k in a k-dimensional vector space V k Basis for the space: q,..., q k Use Galerkin or least squares to compute the (optimal) approximation x k in V k Extend the basis from V k to V k+ (i.e. find q k+ ) V k is normally selected as a so-called Krylov subspace: V k = spanr, Ar,..., A k r Alternatives for computing q k+ V k+: k q k+ = r k + β jq k j= q k+ = k Ar k + β jq k j= How to choose β j? Solving linear systems p. 447 Solving linear systems p. 448
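For comparison with the Jacobi sketch above, the Gauss-Seidel/SOR sweep described at the top of this page differs only in that the update is done in place (so the most recently computed neighbours are used) and that the new value is blended with the old one through the relaxation parameter omega. Again a minimal sketch with invented names, not Diffpack code:

#include <vector>

// One SOR sweep (omega = 1 gives Gauss-Seidel) for  -Laplace(u) = f  on a
// uniform (m x m) grid with spacing h and u = 0 on the boundary.
void sorSweep(std::vector<double>& u, const std::vector<double>& f,
              int m, double h, double omega)
{
  for (int j = 1; j < m - 1; ++j)
    for (int i = 1; i < m - 1; ++i) {
      int p = j*m + i;
      double gs = 0.25*(u[p-1] + u[p+1] + u[p-m] + u[p+m] + h*h*f[p]);
      u[p] = omega*gs + (1.0 - omega)*u[p];   // relaxed update, in place
    }
}

An SSOR step consists of one such sweep running forward over the unknowns followed by one running backward.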
57 Orthogonality properties Formula for updating the basis vectors Bad news: must solve a k k linear system for α j in each iteration (as k n the work in each iteration approach the work of solving Ax = b!) The coefficient matrix in the α j system: (Aq i, q j ), (Aq i, Aq j ) Idea: make the coefficient matrices diagonal That is, Galerkin: (Aq i, q j ) = for i j Least squares: (Aq i, Aq j ) = for i j Use β j to enforce orthogonality of q i Define and u, v (Au, v) = u T Av [u, v] (Au, Av) = u T A T Av Galerkin: require A-orthogonal q j vectors, which then results in β i = rk, q i q i, q i Least squares: require A T A orthogonal q j vectors, which then results in β i = [rk, q i ] [q i, q i ] Solving linear systems p. 449 Solving linear systems p. 45 Simplifications Symmetric A Galerkin: q i, q j = for i j gives α k = (rk, q k ) q k, q k If A is symmetric (A T = A) and positive definite (positive eigenvalues y T Ay > for any y ), also β i = for i < k need to store q k only (q,..., q k are not used in iteration k) and α i = for i < k (!): x k = x k + α kq k That is, hand-derived formulas for α j Least squares: and α i = for i < k α k = (rk, Aq k ) [q k, q k ] Solving linear systems p. 45 Solving linear systems p. 45 Summary: least squares algorithm Truncation and restart given a start vector x, compute r = b Ax and set q = r. for k =,,... until termination criteria are fulfilled: α k = (r k, Aq k )/[q k, q k ] x k = x k + α kq k r k = r k α kaq k if A is symmetric then β k = [r k, q k ]/[q k, q k ] q k+ = r k β kq k else β j = [r k, q j ]/[q j, q j ], j =,..., k q k+ = r k k j= βjq j The Galerkin-version requires A to be symmetric and positive definite and results in the famous Conjugate Gradient method Problem: need to store q,..., q k Much storage and computations when k becomes large Truncation: work with a truncated sum for x k, x k = x k + where a possible choice is K = 5 k j=k K+ Small K might give convergence problems α jq j Restart: restart the algorithm after K iterations (alternative to truncation) Solving linear systems p. 453 Solving linear systems p. 454 Family of methods Convergence Generalized Conjugate Residual method = least squares + restart Orthomin method = least squares + truncation Conjugate Gradient method = Galerkin + symmetric and positive definite A Conjugate Residuals method = Least squares + symmetric and positive definite A Many other related methods: BiCGStab, Conjugate Gradients Squared (CGS), Generalized Minimum Residuals (GMRES), Minimum Residuals (MinRes), SYMMLQ Common name: Conjugate Gradient-like methods All of these are easily called in Diffpack Conjugate Gradient-like methods converge slowly (but usually faster than SOR/SSOR) To reduce the initial error by a factor ɛ, ln κ ɛ iterations are needed, where κ is the condition number: κ = largest eigenvalue of A smalles eigenvalue of A κ = O(h ) when solving nd-order PDEs (incl. elasticity and Poisson eq.) Solving linear systems p. 455 Solving linear systems p. 456
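For a symmetric, positive definite A the Galerkin variant above reduces to the classical Conjugate Gradient method, where only the latest search direction must be stored. A minimal, matrix-free C++ sketch is given below; the matrix enters only through a user-supplied matrix-vector product, and the names, tolerance handling and convergence test are choices made for this illustration.

#include <cmath>
#include <functional>
#include <vector>

// Unpreconditioned Conjugate Gradient for A x = b, A symmetric positive definite.
// matvec(p, Ap) must compute Ap = A*p. Returns the number of iterations used.
int conjugateGradient(const std::function<void(const std::vector<double>&,
                                               std::vector<double>&)>& matvec,
                      const std::vector<double>& b, std::vector<double>& x,
                      double tol, int maxit)
{
  const std::size_t n = b.size();
  std::vector<double> r(n), p(n), Ap(n);

  matvec(x, Ap);
  for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ap[i];   // r0 = b - A x0
  p = r;                                                     // first search direction

  double rr = 0.0;
  for (std::size_t i = 0; i < n; ++i) rr += r[i]*r[i];

  for (int k = 1; k <= maxit; ++k) {
    matvec(p, Ap);
    double pAp = 0.0;
    for (std::size_t i = 0; i < n; ++i) pAp += p[i]*Ap[i];
    double alpha = rr / pAp;                                 // alpha_k = (r,r)/<p,p>_A
    for (std::size_t i = 0; i < n; ++i) { x[i] += alpha*p[i]; r[i] -= alpha*Ap[i]; }

    double rr_new = 0.0;
    for (std::size_t i = 0; i < n; ++i) rr_new += r[i]*r[i];
    if (std::sqrt(rr_new) <= tol) return k;

    double beta = rr_new / rr;                               // keeps the directions A-orthogonal
    for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta*p[i];
    rr = rr_new;
  }
  return maxit;
}

With a preconditioner M, the same loop is applied to the preconditioned system in a slightly rearranged form; that is the role of the preconditioners discussed further below.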
58 Preconditioning Classical methods as preconditioners Idea: Introduce an equivalent system M Ax = M b solve it with a Conjugate Gradient-like method and construct M such that. κ = O() M A (i.e. fast convergence). M is cheap to compute 3. M is sparse (little storage) 4. systems My = z (occuring in the algorithm due to M Av-like products) are efficiently solved (O(n) op.) Contradictory requirements! The preconditioning business: find a good balance between -4 Idea: solve My = z by one iteration with a classical iterative method (Jacobi, SOR, SSOR) Jacobi preconditioning: M = D (diagonal of A) No extra storage as M is stored in A No extra computations as M is a part of A Efficient solution of My = z But: M is probably not a good approx to A poor quality of this type of preconditioners? Conjugate Gradient method + SSOR preconditioner is widely used Solving linear systems p. 457 Solving linear systems p. 458 M as a factorization of A M as an incomplete factorization of A Idea: Let M be an LU-factorization of A, i.e., M = LU where L and U are lower and upper triangular matrices resp. Implications:. M = A (κ = ): very efficient preconditioner!. M is not cheap to compute (requires Gaussian elim. on A!) 3. M is not sparse (L and U are dense!) 4. systems My = z are not efficiently solved (O(n ) process when L and U are dense) New idea: compute sparse L and U How? compute only with nonzeroes in A Incomplete factorization, M = LÛ LU M is not a perfect approx to A M is cheap to compute and store (O(n) complexity) My = z is efficiently solved (O(n) complexity) This method works well - much better than SOR/SSOR preconditioning Solving linear systems p. 459 Solving linear systems p. 46 How to compute M Numerical experiments Run through a standard Gaussian elimination, which factors A as A = LU Normally, L and U have nonzeroes where A has zeroes Idea: let L and U be as sparse as A Compute only with the nonzeroes of A Such a preconditioner is called Incomplete LU Factorization, ILU Option: add contributions outside A s sparsity pattern to the diagonal, multiplied by ω Relaxed Incomplete Factorization (RILU): ω > Modified Incomplete Factorization (MILU): ω = See algorithm C.3 in the book Two test cases: u = f on the unit cube and FDM u = f on the unit cube and FEM Diffpack makes it easy to run through a series of numerical experiments, using multiple loops, e.g., sub LinEqSolver_prm set basic method = ConjGrad & MinRes ok sub Precond_prm set preconditioning type = PrecRILU set RILU relaxation parameter =. &.4 &.7 &. ok Solving linear systems p. 46 Solving linear systems p. 46 Test case : 3D FDM Poisson eq. Jacobi vs. SOR vs. SSOR Equation: u = Boundary condition: u = 7-pt star standard finite difference scheme Grid size: = 8 points and 3 3 = 7 points Source code: $NOR/doc/Book/src/linalg/LinSys4/ All details in HPL Appendix D Input files: $NOR/doc/Book/src/linalg/LinSys4/experiments Solver s CPU time written to standard output n = 3 = 8 and n = 3 3 = 7 Jacobi: not converged in iterations SOR(ω =.8):.s and 9.s SSOR(ω =.8):.8s and 9.8s Gauss-Seidel: 3.s and 97s SOR s sensitivity to relax. parameter ω:.: 96s,.6: 3s,.7: 6s,.8: 9s,.9: s SSOR s sensitivity to relax. parameter ω:.: 66s,.6: 7s,.7: 3s,.8: 9s,.9: s relaxation is important, great sensitivity to ω Solving linear systems p. 463 Solving linear systems p. 464
59 Conjugate Residuals or Gradients? Different preconditioners Compare Conjugate Residuals with Conjugate Gradients Or: least squares vs. Galerkin Diffpack names: MinRes and ConjGrad MinRes: not converged in iterations ConjGrad:.7s and 3.9s ConjGrad is clearly faster than the best SOR/SSOR Add ILU preconditioner MinRes:.7s and 4s ConjGrad:.6s and.7s The importance of preconditioning grows as n grows ILU, Jacobi, SSOR preconditioners (ω =.) MinRes: Jacobi: not conv., SSOR:.4s, ILU: 4s ConjGrad: Jacobi: 4.8s, SSOR:.8s, ILU:.7s Sensitivity to relax. parameter in SSOR, with ConjGrad as solver:.: 3.3s,.6:.s,.8:.s,.9:.6s Sensitivity to relax. parameter in RILU, with ConjGrad as solver:.:.7s,.6:.4s,.8:.s,.9:.9s,.95:.9s,.:.7s ω slightly less than is optimal, RILU and SSOR are equally fast (here) Solving linear systems p. 465 Solving linear systems p. 466 Test case : 3D FEM Poisson eq. Jacobi vs. SOR vs. SSOR Equation: u = A π sin πx + 4A π sin πy + 9A 3π sin 3πz Boundary condition: u known ElmB8n3D and ElmB7n3D elements Grid size: = 96 nodes and = 979 nodes Source code: $NOR/doc/Book/src/fem/Poisson All details in HPL Chapter 3. and 3.5 Input files: $NOR/doc/Book/src/fem/Poisson/linsol-experiments Solver s CPU time available in casename-summary.txt n = 96 and n = 3 3 = 979, trilinear and triquadratic elms. Jacobi: not converged in iterations SOR(ω =.8): 9.s and 8s, 4s and 338s SSOR(ω =.8): 47s and 48s, 38s and 755s Gauss-Seidel: not converged in iterations SOR s sensitivity to relax. parameter ω:.: not conv.,.6: s,.8: 83s,.9: 57s (n = 979 and trilinear elements) SSOR s sensitivity to relax. parameter ω:.: not conv.,.6: s,.7: 7s,.8: 45s,.9: 435s (n = 979 and trilinear elements) relaxation is important, great sensitivity to ω Solving linear systems p. 467 Solving linear systems p. 468 Conjugate Residuals or Gradients? Different preconditioners Compare Conjugate Residuals with Conjugate Gradients Or: least squares vs. Galerkin Diffpack names: MinRes and ConjGrad MinRes: not converged in iterations 96 vs 979 unknowns, trilinear elements ConjGrad: 5s and s ConjGrad is clearly faster than the best SOR/SSOR! Add ILU preconditioner MinRes: 5s and 8s ConjGrad: 4s and 6s ILU prec. has a greater impact when using triquadratic elements (and when n grows) ILU, Jacobi, SSOR preconditioners (ω =.) MinRes: Jacobi: 68s., SSOR: 57s, ILU: 8s ConjGrad: Jacobi: 9s, SSOR: 4s, ILU: 6s Sensitivity to relax. parameter in SSOR, with ConjGrad as solver:.: 7s,.6: s,.8: 3s,.9: 8s Sensitivity to relax. parameter in RILU, with ConjGrad as solver:.: 6s,.6: 5s,.8: 3s,.9: s,.95: s,.: 6s ω slightly less than is optimal, RILU and SSOR are equally fast (here) Solving linear systems p. 469 Solving linear systems p. 47 More experiments Multigrid methods Convection-diffusion equations: $NOR/doc/Book/src/app/Cd/Verify Files: linsol_a.i etc as for LinSys4 and Poisson Elasticity equations: $NOR/doc/Book/src/app/Elasticity/Verify Files: linsol_a.i etc as for the others Run experiments and learn! Multigrid methods are the most efficient methods for solving linear systems Multigrid methods have optimal complexity O(n) Multigrid can be used as stand-alone solver or preconditioner Multigrid applies a hierarchy of grids Multigrid is not as robust as Conjugate Gradient-like methods and incomplete factorization as preconditioner, but faster when it works Multigrid is complicated to implement Diffpack has a multigrid toolbox that simplifies the use of multigrid dramatically Solving linear systems p. 47 Solving linear systems p. 47
60 p The rough ideas of multigrid Damping in Gauss-Seidel s method () Observation: e.g. Gauss-Seidel methods are very efficient during the first iterations High-frequency errors are efficiently damped by Gauss-Seidel Low-frequence errors are slowly reduced by Gauss-Seidel Idea: jump to a coarser grid such that low-frequency errors get higher frequency Repeat the procedure On the coarsest grid: solve the system exactly Transfer the solution to the finest grid Iterate over this procedure Model problem: u = f by finite differences: solved by Gauss-Seidel iteration: Study the error e l i = ul i u i : u j + u j u j+ = h f j u l j = u l j + u l j+ + h f j e l j = e l j + e l j+ This is like a time-dependent problem, where the iteration index l is a pseudo time Solving linear systems p. 473 Solving linear systems p. 474 Damping in Gauss-Seidel s method () Gauss-Seidel s damping factor Can find e l j with techniques from Appendix A.4: e l j = k A k exp (i(kjh ωl t)) ξ = 5 4 cos p, p = kh [, π] or (easier to work with here): e l j = k A kξ l exp (ikjh), ξ = exp ( i ω t) Inserting a wave component in the scheme: ξ = exp ( i ω t) = exp (ikh) exp ( ikh), ξ = 5 4 cos kh Interpretation of ξ : reduction in the error per iteration Solving linear systems p. 475 Small p = kh h/λ: low frequency (relative to the grid) and small damping Large ( π) p = kh h/λ: high frequency (relative to the grid) and efficient damping Solving linear systems p. 476 More than one grid Transferring the solution between grids From the previous analysis: error components with high frequency are quickly damped Jump to a coarser grid, e.g. h = h p is increased by a factor of, i.e., not so high-frequency waves on the h grid is efficiently damped by Gauss-Seidel on the h grid Repeat the procedure On the coarsest grid: solve by Gaussian elimination Interpolate solution to a finer grid, perform Gauss-Seidel iterations, and repeat until the finest grid is reached From fine to coarser: restriction From coarse to finer: prolongation simple restriction weighted restriction fine grid function q- q interpolated fine grid function coarse grid function q q Solving linear systems p. 477 Solving linear systems p. 478 Smoothers A multigrid algorithm The Gauss-Seidel method is called a smoother when used to damp high-frequency error components in multigrid Other smoothers: Jacobi, SOR, SSOR, incomplete factorization No of iterations is called no of smoothing sweeps Common choice: one sweep Start with the finest grid Perform smoothing (pre-smoothing) Restrict to coarser grid Repeat the procedure (recursive algorithm!) On the coarsest grid: solve accurately Prolongate to finer grid Perform smoothing (post-smoothing) One cycle is finished when reaching the finest grid again Can repeat the cycle Multigrid solves the system in O(n) operations Check out HPL C.4. for details!! Solving linear systems p. 479 Solving linear systems p. 48
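To make the cycle above concrete, here is a small recursive V-cycle for the 1D model problem -u'' = f on [0,1] with homogeneous Dirichlet conditions, using one Gauss-Seidel sweep for pre- and post-smoothing, full weighting for restriction and linear interpolation for prolongation. It is an illustrative sketch only (grid sizes of the form 2^L + 1 are assumed), not the Diffpack multigrid toolbox.

#include <vector>

// Gauss-Seidel smoother for  -u'' = f  on a grid with spacing h, u fixed at the ends.
static void gaussSeidel(std::vector<double>& u, const std::vector<double>& f, double h)
{
  for (std::size_t i = 1; i + 1 < u.size(); ++i)
    u[i] = 0.5*(u[i-1] + u[i+1] + h*h*f[i]);
}

// Recursive V-cycle: n = 2^level + 1 points, h = 1/(n-1). The coarsest grid
// (3 points, one interior unknown) is solved exactly.
void vCycle(std::vector<double>& u, const std::vector<double>& f, double h)
{
  const std::size_t n = u.size();
  if (n == 3) {
    u[1] = 0.5*(u[0] + u[2] + h*h*f[1]);
    return;
  }
  gaussSeidel(u, f, h);                        // pre-smoothing

  // residual r = f + u'' (discrete), then full-weighting restriction to the coarse grid
  std::vector<double> r(n, 0.0);
  for (std::size_t i = 1; i + 1 < n; ++i)
    r[i] = f[i] + (u[i-1] - 2.0*u[i] + u[i+1])/(h*h);

  const std::size_t nc = (n - 1)/2 + 1;
  std::vector<double> rc(nc, 0.0), ec(nc, 0.0);
  for (std::size_t i = 1; i + 1 < nc; ++i)
    rc[i] = 0.25*r[2*i-1] + 0.5*r[2*i] + 0.25*r[2*i+1];

  vCycle(ec, rc, 2.0*h);                       // coarse-grid correction: A_c e_c = r_c

  for (std::size_t i = 1; i + 1 < nc; ++i) {   // prolongate (linear interpolation) and add
    u[2*i]   += ec[i];
    u[2*i-1] += 0.5*(ec[i-1] + ec[i]);
  }
  u[n-2] += 0.5*ec[nc-2];                      // last odd fine point next to the boundary

  gaussSeidel(u, f, h);                        // post-smoothing
}

Repeating the cycle until the residual is small gives an O(n) solver for this model problem, in line with the complexity claims above.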
61 e e e e 5 4.7e 5.78e 5.39e 5 V- and W-cycles Multigrid requires flexible software Different strategies for constructing cycles: q γ q = γ q = 4 3 coarse grid solve smoothing Many ingredients in multigrid: pre- and post-smoother no of smoothing sweeps solver on the coarsest level cycle strategy restriction and prolongation methods how to construct the various grids? There are also other variants of multigrid (e.g. for nonlinear problems) The optimal combination of ingredients is only known for simple model problems (e.g. the Poisson eq.) In general: numerical experimentation is required! (Diffpack has a special multigrid toolbox for this) Solving linear systems p. 48 Solving linear systems p. 48 System of PDEs; coupling simulators Coupling simulators System of PDEs: one momentum equation + one energy equation Develop independent solvers for each PDE Combine solvers in a few lines Coupling simulators p. 483 Coupling simulators p. 484 Physical problem General mathematical model z Equation of continuity: v = Momentum equation: ϱv v = p + P Non-Newtonian fluid Temperature-dependent viscosity Steady flow Straight pipe Constitutive law: P exp ( αt ) γ n ( v + ( v) T ) γ = v : v velocity temperature Energy equation: Cϱv T = k T + c exp ( αt ) γ n Coupling simulators p. 485 Coupling simulators p. 486 Simplified mathematical model Numerical solution methods Assumption: rectilinear flow v = (,, w) Simplified equation system: ( µ w ) + ( µ w ) x x y y µ = µ e αt S(w) n ( w ) ( ) w S(w) = + x y Two nonlinear Poisson equations BC: w = and T = at the walls = const T x + T y = ˆµ e αt S(w) n+ Galerkin finite element method Fully implicit formulation: Sequential solution method Solution of nonlinear systems: Picard iteration Newton-Raphson A(w k, T k )w k = a BT k = b(w k, T k ) A(w k, T k )w k = a BT k = b(w k, T k ) Coupling simulators p. 487 Coupling simulators p. 488
62
Simplified structure in 1D
System of PDEs:
d/dx( e^{-alpha T} |dw/dx|^{n-1} dw/dx ) = const
d^2 T/dx^2 = -e^{-alpha T} |dw/dx|^{n+1}
Nonlinearities depend on n and alpha

Software development
Momentum: a solver for div[mu grad u] = const; Energy: a solver for the Poisson-type equation in u with source f
Class design: Momentum (simple mu) with a subclass for the relevant mu; Energy (simple f) with a subclass for the relevant f; a Manager class that couples the two; a CommonRel class holding the viscosity models
Very little code in Momentum, Energy, Manager

Parallel computing in Diffpack
Idea: add a few statements to a Diffpack solver and get a parallel version
Two approaches:
domain decomposition of the mathematical problem (two-level block Jacobi iteration with coarse grid correction)
parallelization of matrix generation and linear solver
Both approaches start with a sequential Diffpack solver and add parallel features in small subclasses
Current status: upcoming module

Advanced Diffpack features
63
Domain decomposition
DD as solver or preconditioner
Overlapping vs. non-overlapping
The ideas from the OO implementation of multigrid carry over to DD
In fact, an abstract multilevel algorithm constitutes the general software, with multigrid and DD as special cases
Current status: upcoming extension of the Multi-Level Module

Adaptivity
How to implement adaptivity: wind adaptive discretization and solution algorithm (multigrid) together? separate discretization and solvers?
Diffpack always separates discretization and solvers!
Adaptive grids require only a modest amount of extra code: adaptive grids are subclasses of GridFE; adaptivity is a simple loop calling (1) a refinement criterion and (2) grid->refine
Available through the Adaptivity Module

Example: adaptive grids (1)-(4)
[four figures of adaptively refined 2D and 3D grids; only the axis labels survive in the text]

The 2nd Diffpack book on Springer
Advanced Computational Partial Differential Equations - Numerical Methods and Diffpack Programming, edited by H. P. Langtangen and A. Tveito. Integration of some theory, models, and algorithms, with emphasis on Diffpack software.
Basic concepts in parallel computing
Parallel computing with Diffpack
Multilevel methods
Mixed finite elements
Block preconditioning
Stochastic PDEs
Computational medicine
Computational finance
Computational geology
Published 2003

Intro to OOP
64 Traditional programming Programming with objects (OOP) Traditional procedural programming: subroutines/procedures/functions data structures = variables, arrays data are shuffled between functions Problems with procedural approach: Numerical codes are usually large, resulting in lots of functions with lots of arrays (and their dimensions) Too many visible details Little correspondence between mathematical abstraction and computer code Redesign and reimplementation tend to be expensive Programming with objects makes it easier to handle large and complicated codes: Well-known in computer science/industry Can group large amounts of data (arrays) as a single variable Can make different implementations look the same for a user Not much explored in numerical computing (until late 99s) Intro to OOP p. 55 Intro to OOP p. 56 Example: programming with matrices A dense matrix in Fortran 77 Mathematical problem: Matrix-matrix product: C = MB Matrix-vector product: y = Mx Points to consider: What is a matrix? a well defined mathematical quantity, containing a table of numbers and a set of legal operations How do we program with matrices? Do standard arrays in any computer language give good enough support for matrices? Fortran syntax (or C, conceptually) C C integer p, q, r double precision M(p,q), B(q,r), C(p,r) double precision y(p), x(q) matrix-matrix product: C = M*B call prodm(m, p, q, B, q, r, C) matrix-vector product: y = M*x call prodv(m, p, q, x, y) Drawback with this implementation: Array sizes must be explicitly transferred New routines for different precisions Intro to OOP p. 57 Intro to OOP p. 58 Working with a dense matrix in C++ A dense matrix class // given integers p, q, j, k, r MatDense M(p,q); // declare a p times q matrix M(j,k) = 3.54; // assign a number to entry (j,k) MatDense B(q,r), C(p,r); Vector x(q), y(p); // vectors of length q and p C=M*B; // matrix-matrix product y=m*x; // matrix-vector product M.prod(x,y); // matrix-vector product Observe that we hide information about array sizes we hide storage structure (the underlying C array) the computer code is as compact as the mathematical notation class MatDense private: double** A; // pointer to the matrix data int m,n; // A is an m times n matrix public: // --- mathematical interface --- MatDense (int p, int q); // create pxq matrix double& operator () (int i, int j); // M(i,j)=4; s=m(k,l); void operator = (MatDense& B); // M = B; void prod (MatDense& B, MatDense& C); // M.prod(B,C); (C=M*B) void prod (Vector& x, Vector& z); // M.prod(y,z); (z=m*y) MatDense operator * (MatDense& B); // C = M*B; Vector operator * (Vector& y); // z = M*y; void size (int& m, int& n); // get size of matrix ; Notice that the storage format is hidden from the user Intro to OOP p. 59 Intro to OOP p. 5 What is this object or class thing? Extension to sparse matrices A class is a collection of data structures and operations on them An object is a realization (variable) of a class The MatDense object is a good example:. data: matrix size + array entries. operations: creating a matrix, accessing matrix entries, matrix-vector products,.. A class is a new type of variable, like reals, integers etc A class can contain other objects; in this way we can create complicated variables that are easy to program with Matrix for the discretization of u = f. Only 5n out of n entries are nonzero. Store only the nonzero entries! Many iterative solution methods for Au = b can operate on the nonzeroes only Intro to OOP p. 5 Intro to OOP p. 5
65 How to store sparse matrices () How to store sparse matrices () a, a,4 a, a,3 a,5 A = a 3, a 3,3. a 4, a 4,4 a 4,5 a 5, a 5,4 a 5,5 Working with the nonzeroes only is important for efficiency! The nonzeroes can be stacked in a one-dimensional array Need two extra arrays to tell where a row starts and the column index of a nonzero A = (a,, a,4, a,, a,3, a,5,... irow = (, 3, 6, 8,, 4), jcol = (, 4,, 3, 5,, 3,, 4, 5,, 4, 5). more complicated data structures and hence more complicated programs Intro to OOP p. 53 Intro to OOP p. 54 Sparse matrices in Fortran Sparse matrix as a C++ class () Code example for y = Mx integer p, q, nnz integer irow(p+), jcol(nnz) double precision M(nnz), x(q), y(p)... call prodvs (M, p, q, nnz, irow, jcol, x, y) Two major drawbacks: Explicit transfer of storage structure (5 args) Different name for two functions that perform the same task on two different matrix formats class MatSparse private: double* A; // long vector with the nonzero matrix entries int* irow; // indexing array int* jcol; // indexing array int m, n; // A is (logically) m times n int nnz; // number of nonzeroes public: // the same functions as in the example above // plus functionality for initializing the data structures ; void prod (Vector& x, Vector& z); // M.prod(y,z); (z=m*y) Intro to OOP p. 55 Intro to OOP p. 56 Sparse matrix as a C++ class () The jungle of matrix formats What has been gained? Users cannot see the sparse matrix data structure Matrix-vector product syntax remains the same The usage of MatSparse and MatDense is the same Easy to switch between MatDense and MatSparse When solving PDEs by finite element/difference methods there are numerous advantageous matrix formats: - dense matrix - banded matrix - tridiagonal matrix - general sparse matrix - structured sparse matrix - diagonal matrix - finite difference stencil as matrix The efficiency of numerical algorithms is often strongly dependent on the matrix storage scheme Goal: hide the details of the storage schemes Intro to OOP p. 57 Intro to OOP p. 58 Different matrix formats The matrix class hierarchy Matrix MatDense MatSparse MatTriDiag MatBanded Generic interface in base class Matrix Implementation of storage and member functions in the subclasses Generic programming in user code: Matrix& M; M.prod(x,y); // y=m*x i.e., we need not know the structure of M, only that it refers to some concrete subclass object; C++ keeps track of which subclass object! prod must then be a virtual function Intro to OOP p. 59 Intro to OOP p. 5
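As a concrete counterpart to the storage scheme above, here is a small C++ sketch of the matrix-vector product y = Mx in the compressed-row format, using 0-based indices so that irow[i] marks where row i starts in the value array and jcol[k] holds the column of the k-th stored nonzero; this illustrates the format only and is not the actual MatSparse implementation.

#include <vector>

// y = M*x for an m x n sparse matrix stored in compressed-row format:
// A    - the nonzero entries, row by row
// irow - irow[i] is the position in A where row i starts; irow[m] = nnz
// jcol - jcol[k] is the column index of the nonzero stored in A[k]
// y must already have length m.
void sparseProd(const std::vector<double>& A, const std::vector<int>& irow,
                const std::vector<int>& jcol, const std::vector<double>& x,
                std::vector<double>& y)
{
  const int m = (int) irow.size() - 1;
  for (int i = 0; i < m; ++i) {
    double s = 0.0;
    for (int k = irow[i]; k < irow[i+1]; ++k)
      s += A[k] * x[jcol[k]];
    y[i] = s;
  }
}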
66 Object-oriented programming Bad news... Matrix = object Details of storage schemes are hidden Common interface to matrix operations Base class: define operations, no data Subclasses: implement specific storage schemes and algorithms It is possible to program with the base class only! Object-oriented programming do wonderful things, but might be inefficient Adjusted picture: When indexing a matrix, one needs to know its data storage structure because of efficiency In the rest of the code one can work with the generic base class and its virtual functions Object-oriented numerics: balance between efficiency and OO techniques Intro to OOP p. 5 Intro to OOP p. 5 Base class, subclass, inheritance A subclass inherits data and functions from its base class Base class: class X int i,k; void calc(); ; Some Diffpack/C++ programming Subclass: class Y : public X int n; void calc(); ; Class Y has int i,k,n and functions calc,calc Some Diffpack/C++ programming p. 53 Some Diffpack/C++ programming p. 54 Organization of Diffpack vectors The vector class hierarchy class VecSimplest(Type): just a C array with indexing class Type: no requirements plain C array op()(int i) VecSimplest subclass VecSimple(Type): adds operator=, input/output class Type: operator=, operator<<, operator>> op= op<< op>> VecSimple subclass VecSort(Type): adds operator< etc, sorting class Type: operator<, operator<= etc subclass Vec(Type): adds numerical operations on vectors class Type: operator*, operator/ etc op< op<= etc op+ op- op* op/ VecSort Vec ArrayGen ArrayGenSimplest ArrayGenSimple Vec + multiple indices op()(int i, int j) op()(int i, int j, int k) can print, scan, op= Vector ArrayGenSel inactive entries (FDM & non-rect. geom.) Some Diffpack/C++ programming p. 55 Some Diffpack/C++ programming p. 56 Why this vector organization? Matrices revisited Vector of real (=double): Vec(real) Vector of int: VecSort(int) Vec(int) has too many arithmetic op. Vector of grids: VecSimple(Grid) operator= and printing/reading for Grid make sense, but not arithmetic operations or sorting Vector of simulators: VecSimplest(MySim) neither printing/reading, operator=, nor arithmetic operators make sense Want to use same basic array handling code for VecSimplest(Grid) as for Vec(real) Use inheritance to share code and increase reliability Recall the intro example on handling various matrix formats Declare base class Matrix Define virtual functions for mathematical operations Realize dense matrix, diagonal matrix, etc. as subclasses Implement mathematical operations in subclasses only Magic: Program with Matrix, C++ figures out which subclass you really mean! Keywords: virtual functions, inheritance, object-oriented programming Some Diffpack/C++ programming p. 57 Some Diffpack/C++ programming p. 58
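A stripped-down version of the hierarchy above, just to show the mechanics: user code is written against the Matrix base class, and C++ selects the right prod at run time through the virtual function. The class and function names below are simplified stand-ins for the Diffpack classes, not their real interfaces.

#include <vector>

// Minimal illustration of the Matrix hierarchy: the base class declares the
// interface, subclasses implement storage and the virtual operations.
class Matrix {
public:
  virtual ~Matrix() {}
  virtual void prod(const std::vector<double>& x, std::vector<double>& y) const = 0;
};

class MatDense : public Matrix {
  std::vector<double> A;   // m x n entries stored row by row
  int m, n;
public:
  MatDense(int m_, int n_) : A((std::size_t) m_ * n_, 0.0), m(m_), n(n_) {}
  double& operator()(int i, int j) { return A[(std::size_t) i*n + j]; }
  void prod(const std::vector<double>& x, std::vector<double>& y) const
  {
    for (int i = 0; i < m; ++i) {
      double s = 0.0;
      for (int j = 0; j < n; ++j) s += A[(std::size_t) i*n + j]*x[j];
      y[i] = s;
    }
  }
};

// Generic user code: works for any subclass of Matrix (dense, sparse, banded, ...)
void residual(const Matrix& M, const std::vector<double>& x,
              const std::vector<double>& b, std::vector<double>& r)
{
  M.prod(x, r);                                   // r = M*x (virtual call)
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = b[i] - r[i];
}

A MatSparse subclass would implement prod exactly as in the compressed-row sketch earlier, and the residual function would work with it unchanged.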
67 Why emphasize software design? Grid and field abstractions PDE simulator: 5 + code lines Maintainability important Should be easy to extend Should be easy to use/understand Abstractions close to mathematical language are needed The design must be a balance between attractive abstractions and computational efficiency PDE: [λ(x) u(x)] = f(x), Assume some discretization (FDM, FEM,...) Natural abstractions: scalar fields: λ(x), f(x), u(x) (explicit functions, discrete fields) discrete : grid field = grid + values or field = explicit formula discrete operators? x Some Diffpack/C++ programming p. 59 Some Diffpack/C++ programming p. 53 Programming considerations Grids and fields in Diffpack Obvious ideas: collect grid information in a grid class collect field information in a field class Gain: shorter code, closer to the mathematics finite difference methods: minor finite element methods: important big programs: fundamental Assume a finite difference method: Field represented by class FieldLattice: a grid of type GridLattice a set of point values, ArrayGenSel (ArrayGenSel is a subclass of ArrayGen with extra functionality) Grid represented by GridLattice (uniform partition in d dimensions) Some Diffpack/C++ programming p. 53 Some Diffpack/C++ programming p. 53 The GridLattice class The FieldLattice class class GridLattice private: // data that hold grid spacing, size of domain etc public: GridLattice (int nsd); real getpt (int dir, int index); // get coordinate of pt. int getbase (int dir); // loops: start index int getmaxi (int dir); // loops: stop index real Delta (int dir); // grid spacing void scan (Is is); // scan("d= [,] [:4]"); ; // declare a D grid in a program: // GridLattice grid(); // grid.scan("d= [,]x[,] [:]x[-:]"); Some Diffpack/C++ programming p. 533 class FieldLattice private: Handle(GridLattice) grid; // pointer to the grid Handle(ArrayGen(real)) vec; // pointer to the field values public: FieldLattice (GridLattice& grid, const char* fieldname); GridLattice& grid (); // access to the grid ArrayGen(real)& values (); // access to the field values ; // given some D FieldLattice f, set f=sin(f): int i = f.grid().getbase(); // start index, x-dir int in = f.grid().getmaxi(); // stop index, x-dir int j = f.grid().getbase(); // start index, y-dir int jn = f.grid().getmaxi(); // stop index, y-dir int i,j; for (j = j; j <= jn; j++) for (i = i; i <= in; i++) f.values()(i,j) = sin (f.values()(i,j)); Some Diffpack/C++ programming p. 534 Smart pointers (handles) Simulator classes Dynamic memory in C/C++ need pointers Bug no. in C/C++: pointers For example, if 5 fields point to the same grid, when can we safely remove the grid object? Make life easy: use a smart pointer Handle(X) x; x.rebind (new X()); // NULL pointer // x points to new X object // given a function void somefunc (X& xobj): somefunc (*x); somefunc (x()); // send object (not the handle) // alternative syntax // given a Handle(X) y: x.rebind (*y); // x points to y s object x = y; // not recommended (often a bug...) *x = *y; // set x s object equal to y s object x.getref(); x.getptr(); // extract reference to x (same as *x) // extract pointer to x The PDE solver is a class itself Easy to extend/modify solver Enables coupling to optimization, automatic parameter analysis etc. 
Easy to combine solvers (systems of PDEs) Typical look: class MySim protected: // grid and field objects // PDE dependent parameters public: void scan(); // read input and init void solveproblem(); void resultreport(); ; negligible overhead, automatic garbage collection Some Diffpack/C++ programming p. 535 Some Diffpack/C++ programming p. 536
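The Handle(X) type used above is essentially a reference-counted smart pointer (in modern C++ one would reach for std::shared_ptr). The following is a minimal sketch of the idea, with an external counter and only the members needed to mirror the usage shown on the slides; Diffpack's real Handle differs in its details.

// Minimal reference-counted smart pointer, illustrating what Handle(X) does:
// several handles may refer to the same object; the object is deleted when
// the last handle releases it.
template <class X>
class Handle {
  X*   ptr;
  int* count;
  void release() { if (ptr && --(*count) == 0) { delete ptr; delete count; } }
public:
  Handle() : ptr(0), count(0) {}                  // NULL pointer
  ~Handle() { release(); }
  Handle(const Handle& h) : ptr(h.ptr), count(h.count) { if (count) ++(*count); }
  Handle& operator=(const Handle& h)
  {
    if (this != &h) { release(); ptr = h.ptr; count = h.count; if (count) ++(*count); }
    return *this;
  }

  void rebind(X* obj) { release(); ptr = obj; count = new int(1); }  // h.rebind(new X(...))
  X& operator*()  const { return *ptr; }          // *h : the object itself
  X* getPtr()     const { return ptr; }
  X& getRef()     const { return *ptr; }
};

With this in place, five fields can safely share one grid object: the grid is deleted only when the last handle to it goes away, which is exactly the bookkeeping problem described above.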
68 Diffpack naming conventions String vibration revisited Local variables have lower-case letters, words are separated by underscores, e.g., my_variable Functions start with lower-case letters, words are separated by capitals, e.g., myfunction Class and enum names start with a capital letter, words are separated by capitals, e.g., MyClass Macros and enum values have upper-case letters, words are separated by underscores, e.g., MY_MACRO Remark: of course, you can follow your own convention, but it is important to be consistent! Some Diffpack/C++ programming p. 537 Problem: u t = γ u x Explicit finite difference method; loop through (x, t) grid New class-based code: WaveD class WaveD Handle(GridLattice) grid; // lattice grid here D grid Handle(FieldLattice) up; // solution u at time level l+ Handle(FieldLattice) u; // solution u at time level l Handle(FieldLattice) um; // solution u at time level l- Handle(TimePrm) tip; // time discretization parameters (dt etc.) CurvePlotFile plotfile;// for plotting results real C; // the Courant number (appears in the scheme) void setic (); // set initial conditions void timeloop (); // perform time stepping void dumpsolution (); // make a curve plot of u public: WaveD() ~WaveD() void scan (); // read discretization parameters and initialize void solveproblem (); // solve the problem void resultreport () // just dummy here ; Some Diffpack/C++ programming p. 538.h files and.cpp files Class TimePrm The class declaration (listing of data and function) is placed in a separate file, with extension.h (here WaveD.h) #ifndef WaveD_h_IS_INCLUDED #define WaveD_h_IS_INCLUDED #include <FieldLattice.h> #include <TimePrm.h> class WaveD Handle(GridLattice) grid; // lattice grid here D grid Handle(FieldLattice) up; // solution u at time level l+... ; #endif The bodies of the member functions are put in a file with extension.cpp (here WaveD.cpp) Class TimePrm holds time parameters: t, time interval for simulation etc. Initialization: Handle(TimePrm) tip = new TimePrm(); tip.scan ("dt=. t in [,8]"); // only some characters are important: tip.scan ("=. [,8]"); Useful methods: class TimePrm public: real Delta() const; // return time step real time() const; // return current time void inittimeloop(); // initialize bool finished(); // is stop time reached? void increasetime(); // t = t + dt int gettimestepno(); // return time step number ; Some Diffpack/C++ programming p. 539 Some Diffpack/C++ programming p. 54 Reading input Solving the problem Let us read input (C, the grid, and the stop time) from the Unix command line like this:./app -C.8 -g d= [,] [:4] -t 6.5 void WaveD:: scan () // real C is a class member, initialize it here: initfromcommandlinearg ("-C", C,., "Courant number", "R[:]"); String grid_str; initfromcommandlinearg ("-g", grid_str, "d= [,] [:]", "grid", "S"); grid.rebind(new GridLattice()); grid->scan (grid_str); tip.rebind (new TimePrm()); real tstop; initfromcommandlinearg ("-t", tstop,., "tstop", "R[:]"); // construct the proper initialization string from C: tip->scan (aform("dt=%g t in [,%g]", C*grid->Delta(), tstop)); // (we assume unit wave velocity)... 
void WaveD:: solveproblem () timeloop(); void WaveD:: timeloop () tip->inittimeloop(); setic(); const int i = u->grid().getbase(); // start of loop const int n = u->grid().getmaxi(); // end of loop int i; dumpsolution (); // plot initial condition // useful abbreviations (also for efficiency): const ArrayGen(real)& U = u ->values(); const ArrayGen(real)& Um = um->values(); ArrayGen(real)& Up = up->values(); while (!tip->finished()) tip->increasetime(); for (i = i+; i <= n-; i++) Up(i) = *U(i) - Um(i) + sqr(c) * (U(i+) - *U(i) + U(i-)); Up(i) = ; Up(n) = ; // insert boundary values *um = *u; *u = *up; // update for next step, CHANGED // alternative syntax: um() = u(); u() = up(); dumpsolution (); Some Diffpack/C++ programming p. 54 Some Diffpack/C++ programming p. 54 Set initial conditions Dump results and main function void WaveD:: setic () // set initial conditions on u and um const int i = u->grid().getbase(); // start point index const int n = u->grid().getmaxi(); // end point index const real umax =.5; // max amplitude // initialization of up up->fill(.); // initialization of u (the initial displacement of the string) u->fill(.); int i; real x; for (i = i; i <= n; i++) x = grid->getpt(,i); // get x coord of grid point no i if (x <.7) u->values()(i) = (umax/.7) * x; else u->values()(i) = (umax/.3) * ( - x); // initialization of um (the special formula) um->fill(.); for (i = i+; i <= n-; i++) // set the help variable um: um->values()(i) = u->values()(i) +.5*sqr(C) * (u->values()(i+) - *u->values()(i) + u->values()(i-)); void WaveD:: dumpsolution () // automatic dump of a curve plot of a D field: SimResgnuplot::makeCurvePlot (*u, // field to be plotted (D) plotfile, // curve plot manager "displacement", // plot title oform("u(x,%.4f)",tip->time()), // name of function oform("c=%g, h=%g, t=%g", // comment C,u->grid().Delta(),tip->time())); // main.cpp: #include <WaveD.h> int main (int argc, const char* argv[]) initdiffpack (argc, argv); WaveD simulator; simulator.scan (); simulator.solveproblem (); simulator.resultreport (); Some Diffpack/C++ programming p. 543 Some Diffpack/C++ programming p. 544
69 Nice exercise: manual plotting Exercise.7 Let us rewrite dumpsolution: write each data point on the u(x, ) curve to a CurvePlot object: void WaveD:: dumpsolution () CurvePlot curve (plotfile); // tie CurvePlot to CurvePlotFile curve.initpair ("displacement", // title aform("u(x,%.4f)",tip->time()), // curvename "x", // indep.var. aform("c=%g",c)); // comment // loop through all points in the grid, add (x,u) to curve: int i = grid->getbase(); // start index int in = grid->getmaxi(); // stop index real x,uval; for (int i = i; i <= in; i++) x = grid->getpt(,i); // extract x coordinate uval = u->values()(i); curve.addpair (x, uval); curve.finish(); Consider a wave equation with damping: u t + β u t = γ u x Same initial and boundary conditions as in class WaveD Modify the numerical scheme Take a copy of class WaveD Implement the modification Give β on the command line Display a movie of a damped string: curveplotmovie gnuplot SIMULATION.map -.. Some Diffpack/C++ programming p. 545 Some Diffpack/C++ programming p. 546
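As a hint for the exercise, one natural discretization of the damped wave equation u_tt + beta u_t = gamma^2 u_xx (the form of the wave-speed coefficient is assumed here so that it matches the Courant number C = gamma*dt/dx used in class WaveD) applies centered differences to both time-derivative terms; this is a sketch of one possible scheme, not the official solution:

\[
\frac{u^{l+1}_i - 2u^l_i + u^{l-1}_i}{\Delta t^2}
 + \beta\,\frac{u^{l+1}_i - u^{l-1}_i}{2\Delta t}
 = \gamma^2\,\frac{u^l_{i+1} - 2u^l_i + u^l_{i-1}}{\Delta x^2},
\]
which solved for the new value gives
\[
u^{l+1}_i = \frac{2u^l_i - \big(1 - \tfrac{\beta\Delta t}{2}\big)u^{l-1}_i
 + C^2\big(u^l_{i+1} - 2u^l_i + u^l_{i-1}\big)}{1 + \tfrac{\beta\Delta t}{2}},
\qquad C = \frac{\gamma\,\Delta t}{\Delta x}.
\]

Setting beta = 0 recovers the scheme already implemented in WaveD::timeloop, which is a useful check of the modification.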
5.4 The Heat Equation and Convection-Diffusion
5.4. THE HEAT EQUATION AND CONVECTION-DIFFUSION c 6 Gilbert Strang 5.4 The Heat Equation and Convection-Diffusion The wave equation conserves energy. The heat equation u t = u xx dissipates energy. The
Høgskolen i Narvik Sivilingeniørutdanningen STE6237 ELEMENTMETODER. Oppgaver
Høgskolen i Narvik Sivilingeniørutdanningen STE637 ELEMENTMETODER Oppgaver Klasse: 4.ID, 4.IT Ekstern Professor: Gregory A. Chechkin e-mail: [email protected] Narvik 6 PART I Task. Consider two-point
Solution of Linear Systems
Chapter 3 Solution of Linear Systems In this chapter we study algorithms for possibly the most commonly occurring problem in scientific computing, the solution of linear systems of equations. We start
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,
The Fourth International DERIVE-TI92/89 Conference Liverpool, U.K., 12-15 July 2000. Derive 5: The Easiest... Just Got Better!
The Fourth International DERIVE-TI9/89 Conference Liverpool, U.K., -5 July 000 Derive 5: The Easiest... Just Got Better! Michel Beaudin École de technologie supérieure 00, rue Notre-Dame Ouest Montréal
Numerical Analysis Lecture Notes
Numerical Analysis Lecture Notes Peter J. Olver. Finite Difference Methods for Partial Differential Equations As you are well aware, most differential equations are much too complicated to be solved by
On computer algebra-aided stability analysis of dierence schemes generated by means of Gr obner bases
On computer algebra-aided stability analysis of dierence schemes generated by means of Gr obner bases Vladimir Gerdt 1 Yuri Blinkov 2 1 Laboratory of Information Technologies Joint Institute for Nuclear
Gas Dynamics Prof. T. M. Muruganandam Department of Aerospace Engineering Indian Institute of Technology, Madras. Module No - 12 Lecture No - 25
(Refer Slide Time: 00:22) Gas Dynamics Prof. T. M. Muruganandam Department of Aerospace Engineering Indian Institute of Technology, Madras Module No - 12 Lecture No - 25 Prandtl-Meyer Function, Numerical
An Introduction to Applied Mathematics: An Iterative Process
An Introduction to Applied Mathematics: An Iterative Process Applied mathematics seeks to make predictions about some topic such as weather prediction, future value of an investment, the speed of a falling
Finite Difference Approach to Option Pricing
Finite Difference Approach to Option Pricing February 998 CS5 Lab Note. Ordinary differential equation An ordinary differential equation, or ODE, is an equation of the form du = fut ( (), t) (.) dt where
College of the Holy Cross, Spring 2009 Math 373, Partial Differential Equations Midterm 1 Practice Questions
College of the Holy Cross, Spring 29 Math 373, Partial Differential Equations Midterm 1 Practice Questions 1. (a) Find a solution of u x + u y + u = xy. Hint: Try a polynomial of degree 2. Solution. Use
Dynamic Process Modeling. Process Dynamics and Control
Dynamic Process Modeling Process Dynamics and Control 1 Description of process dynamics Classes of models What do we need for control? Modeling for control Mechanical Systems Modeling Electrical circuits
Numerical Methods for Differential Equations
Numerical Methods for Differential Equations Chapter 1: Initial value problems in ODEs Gustaf Söderlind and Carmen Arévalo Numerical Analysis, Lund University Textbooks: A First Course in the Numerical
Oscillations. Vern Lindberg. June 10, 2010
Oscillations Vern Lindberg June 10, 2010 You have discussed oscillations in Vibs and Waves: we will therefore touch lightly on Chapter 3, mainly trying to refresh your memory and extend the concepts. 1
A QUICK GUIDE TO THE FORMULAS OF MULTIVARIABLE CALCULUS
A QUIK GUIDE TO THE FOMULAS OF MULTIVAIABLE ALULUS ontents 1. Analytic Geometry 2 1.1. Definition of a Vector 2 1.2. Scalar Product 2 1.3. Properties of the Scalar Product 2 1.4. Length and Unit Vectors
The continuous and discrete Fourier transforms
FYSA21 Mathematical Tools in Science The continuous and discrete Fourier transforms Lennart Lindegren Lund Observatory (Department of Astronomy, Lund University) 1 The continuous Fourier transform 1.1
Class Meeting # 1: Introduction to PDEs
MATH 18.152 COURSE NOTES - CLASS MEETING # 1 18.152 Introduction to PDEs, Fall 2011 Professor: Jared Speck Class Meeting # 1: Introduction to PDEs 1. What is a PDE? We will be studying functions u = u(x
The Technical Archer. Austin Wargo
The Technical Archer Austin Wargo May 14, 2010 Abstract A mathematical model of the interactions between a long bow and an arrow. The model uses the Euler-Lagrange formula, and is based off conservation
440 Geophysics: Heat flow with finite differences
440 Geophysics: Heat flow with finite differences Thorsten Becker, University of Southern California, 03/2005 Some physical problems, such as heat flow, can be tricky or impossible to solve analytically
5 Numerical Differentiation
D. Levy 5 Numerical Differentiation 5. Basic Concepts This chapter deals with numerical approximations of derivatives. The first questions that comes up to mind is: why do we need to approximate derivatives
Heat equation examples
Heat equation examples The Heat equation is discussed in depth in http://tutorial.math.lamar.edu/classes/de/intropde.aspx, starting on page 6. You may recall Newton s Law of Cooling from Calculus. Just
Lecture 16 - Free Surface Flows. Applied Computational Fluid Dynamics
Lecture 16 - Free Surface Flows Applied Computational Fluid Dynamics Instructor: André Bakker http://www.bakker.org André Bakker (2002-2006) Fluent Inc. (2002) 1 Example: spinning bowl Example: flow in
ORDINARY DIFFERENTIAL EQUATIONS
ORDINARY DIFFERENTIAL EQUATIONS GABRIEL NAGY Mathematics Department, Michigan State University, East Lansing, MI, 48824. SEPTEMBER 4, 25 Summary. This is an introduction to ordinary differential equations.
APPLIED MATHEMATICS ADVANCED LEVEL
APPLIED MATHEMATICS ADVANCED LEVEL INTRODUCTION This syllabus serves to examine candidates knowledge and skills in introductory mathematical and statistical methods, and their applications. For applications
Fourth-Order Compact Schemes of a Heat Conduction Problem with Neumann Boundary Conditions
Fourth-Order Compact Schemes of a Heat Conduction Problem with Neumann Boundary Conditions Jennifer Zhao, 1 Weizhong Dai, Tianchan Niu 1 Department of Mathematics and Statistics, University of Michigan-Dearborn,
Introduction to Partial Differential Equations. John Douglas Moore
Introduction to Partial Differential Equations John Douglas Moore May 2, 2003 Preface Partial differential equations are often used to construct models of the most basic theories underlying physics and
Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004
FSI-02-TN59-R2 Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004 1. Introduction A major new extension of the capabilities of FLOW-3D -- the multi-block grid model -- has been incorporated
Inner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Dimensionless form of equations
Dimensionless form of equations Motivation: sometimes equations are normalized in order to facilitate the scale-up of obtained results to real flow conditions avoid round-off due to manipulations with
Effect of Aspect Ratio on Laminar Natural Convection in Partially Heated Enclosure
Universal Journal of Mechanical Engineering (1): 8-33, 014 DOI: 10.13189/ujme.014.00104 http://www.hrpub.org Effect of Aspect Ratio on Laminar Natural Convection in Partially Heated Enclosure Alireza Falahat
State of Stress at Point
State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,
Introduction to Engineering System Dynamics
CHAPTER 0 Introduction to Engineering System Dynamics 0.1 INTRODUCTION The objective of an engineering analysis of a dynamic system is prediction of its behaviour or performance. Real dynamic systems are
Heat Transfer Prof. Dr. Aloke Kumar Ghosal Department of Chemical Engineering Indian Institute of Technology, Guwahati
Heat Transfer Prof. Dr. Aloke Kumar Ghosal Department of Chemical Engineering Indian Institute of Technology, Guwahati Module No. # 02 One Dimensional Steady State Heat Transfer Lecture No. # 05 Extended
Two-Dimensional Conduction: Shape Factors and Dimensionless Conduction Heat Rates
Two-Dimensional Conduction: Shape Factors and Dimensionless Conduction Heat Rates Chapter 4 Sections 4.1 and 4.3 make use of commercial FEA program to look at this. D Conduction- General Considerations
Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 10
Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Chapter 10 Boundary Value Problems for Ordinary Differential Equations Copyright c 2001. Reproduction
Understanding Poles and Zeros
MASSACHUSETTS INSTITUTE OF TECHNOLOGY DEPARTMENT OF MECHANICAL ENGINEERING 2.14 Analysis and Design of Feedback Control Systems Understanding Poles and Zeros 1 System Poles and Zeros The transfer function
Pacific Journal of Mathematics
Pacific Journal of Mathematics GLOBAL EXISTENCE AND DECREASING PROPERTY OF BOUNDARY VALUES OF SOLUTIONS TO PARABOLIC EQUATIONS WITH NONLOCAL BOUNDARY CONDITIONS Sangwon Seo Volume 193 No. 1 March 2000
Introduction to CFD Basics
Introduction to CFD Basics Rajesh Bhaskaran Lance Collins This is a quick-and-dirty introduction to the basic concepts underlying CFD. The concepts are illustrated by applying them to simple 1D model problems.
AP Physics 1 and 2 Lab Investigations
AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks
Integration of a fin experiment into the undergraduate heat transfer laboratory
Integration of a fin experiment into the undergraduate heat transfer laboratory H. I. Abu-Mulaweh Mechanical Engineering Department, Purdue University at Fort Wayne, Fort Wayne, IN 46805, USA E-mail: [email protected]
General Theory of Differential Equations Sections 2.8, 3.1-3.2, 4.1
A B I L E N E C H R I S T I A N U N I V E R S I T Y Department of Mathematics General Theory of Differential Equations Sections 2.8, 3.1-3.2, 4.1 Dr. John Ehrke Department of Mathematics Fall 2012 Questions
Optimization of Supply Chain Networks
Optimization of Supply Chain Networks M. Herty TU Kaiserslautern September 2006 (2006) 1 / 41 Contents 1 Supply Chain Modeling 2 Networks 3 Optimization Continuous optimal control problem Discrete optimal
Frequency-domain and stochastic model for an articulated wave power device
Frequency-domain stochastic model for an articulated wave power device J. Cândido P.A.P. Justino Department of Renewable Energies, Instituto Nacional de Engenharia, Tecnologia e Inovação Estrada do Paço
Parabolic Equations. Chapter 5. Contents. 5.1.2 Well-Posed Initial-Boundary Value Problem. 5.1.3 Time Irreversibility of the Heat Equation
7 5.1 Definitions Properties Chapter 5 Parabolic Equations Note that we require the solution u(, t bounded in R n for all t. In particular we assume that the boundedness of the smooth function u at infinity
Notes for AA214, Chapter 7. T. H. Pulliam Stanford University
Notes for AA214, Chapter 7 T. H. Pulliam Stanford University 1 Stability of Linear Systems Stability will be defined in terms of ODE s and O E s ODE: Couples System O E : Matrix form from applying Eq.
Does Black-Scholes framework for Option Pricing use Constant Volatilities and Interest Rates? New Solution for a New Problem
Does Black-Scholes framework for Option Pricing use Constant Volatilities and Interest Rates? New Solution for a New Problem Gagan Deep Singh Assistant Vice President Genpact Smart Decision Services Financial
Lecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
How High a Degree is High Enough for High Order Finite Elements?
This space is reserved for the Procedia header, do not use it How High a Degree is High Enough for High Order Finite Elements? William F. National Institute of Standards and Technology, Gaithersburg, Maryland,
Math 120 Final Exam Practice Problems, Form: A
Math 120 Final Exam Practice Problems, Form: A Name: While every attempt was made to be complete in the types of problems given below, we make no guarantees about the completeness of the problems. Specifically,
tegrals as General & Particular Solutions
tegrals as General & Particular Solutions dy dx = f(x) General Solution: y(x) = f(x) dx + C Particular Solution: dy dx = f(x), y(x 0) = y 0 Examples: 1) dy dx = (x 2)2 ;y(2) = 1; 2) dy ;y(0) = 0; 3) dx
Mathematical Modeling and Engineering Problem Solving
Mathematical Modeling and Engineering Problem Solving Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University Reference: 1. Applied Numerical Methods with
NUMERICAL ANALYSIS PROGRAMS
NUMERICAL ANALYSIS PROGRAMS I. About the Program Disk This disk included with Numerical Analysis, Seventh Edition by Burden and Faires contains a C, FORTRAN, Maple, Mathematica, MATLAB, and Pascal program
7 Gaussian Elimination and LU Factorization
7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method
Differential Relations for Fluid Flow. Acceleration field of a fluid. The differential equation of mass conservation
Differential Relations for Fluid Flow In this approach, we apply our four basic conservation laws to an infinitesimally small control volume. The differential approach provides point by point details of
correct-choice plot f(x) and draw an approximate tangent line at x = a and use geometry to estimate its slope comment The choices were:
Topic 1 2.1 mode MultipleSelection text How can we approximate the slope of the tangent line to f(x) at a point x = a? This is a Multiple selection question, so you need to check all of the answers that
N 1. (q k+1 q k ) 2 + α 3. k=0
Teoretisk Fysik Hand-in problem B, SI1142, Spring 2010 In 1955 Fermi, Pasta and Ulam 1 numerically studied a simple model for a one dimensional chain of non-linear oscillators to see how the energy distribution
Solved with COMSOL Multiphysics 4.3
Vibrating String Introduction In the following example you compute the natural frequencies of a pre-tensioned string using the 2D Truss interface. This is an example of stress stiffening ; in fact the
Numerical Methods for Differential Equations
1 Numerical Methods for Differential Equations 1 2 NUMERICAL METHODS FOR DIFFERENTIAL EQUATIONS Introduction Differential equations can describe nearly all systems undergoing change. They are ubiquitous
GeoGebra. 10 lessons. Gerrit Stols
GeoGebra in 10 lessons Gerrit Stols Acknowledgements GeoGebra is dynamic mathematics open source (free) software for learning and teaching mathematics in schools. It was developed by Markus Hohenwarter
A three point formula for finding roots of equations by the method of least squares
A three point formula for finding roots of equations by the method of least squares Ababu Teklemariam Tiruneh 1 ; William N. Ndlela 1 ; Stanley J. Nkambule 1 1 Lecturer, Department of Environmental Health
http://school-maths.com Gerrit Stols
For more info and downloads go to: http://school-maths.com Gerrit Stols Acknowledgements GeoGebra is dynamic mathematics open source (free) software for learning and teaching mathematics in schools. It
FLOODING AND DRYING IN DISCONTINUOUS GALERKIN DISCRETIZATIONS OF SHALLOW WATER EQUATIONS
European Conference on Computational Fluid Dynamics ECCOMAS CFD 26 P. Wesseling, E. Oñate and J. Périaux (Eds) c TU Delft, The Netherlands, 26 FLOODING AND DRING IN DISCONTINUOUS GALERKIN DISCRETIZATIONS
Lecture 8 February 4
ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt
Chapter 6: Solving Large Systems of Linear Equations
! Revised December 8, 2015 12:51 PM! 1 Chapter 6: Solving Large Systems of Linear Equations Copyright 2015, David A. Randall 6.1! Introduction Systems of linear equations frequently arise in atmospheric
Time domain modeling
Time domain modeling Equationof motion of a WEC Frequency domain: Ok if all effects/forces are linear M+ A ω X && % ω = F% ω K + K X% ω B ω + B X% & ω ( ) H PTO PTO + others Time domain: Must be linear
Advanced CFD Methods 1
Advanced CFD Methods 1 Prof. Patrick Jenny, FS 2014 Date: 15.08.14, Time: 13:00, Student: Federico Danieli Summary The exam took place in Prof. Jenny s office, with his assistant taking notes on the answers.
SOLUTION OF Partial Differential Equations. (PDEs)
SOLUTION OF Partial Differential Equations (PDEs) Mathematics is the Language of Science PDEs are the expression of processes that occur across time & space: (x,t), (x,y), (x,y,z), or (x,y,z,t) Partial
Lecture 3. Turbulent fluxes and TKE budgets (Garratt, Ch 2)
Lecture 3. Turbulent fluxes and TKE budgets (Garratt, Ch 2) In this lecture How does turbulence affect the ensemble-mean equations of fluid motion/transport? Force balance in a quasi-steady turbulent boundary
Paper Pulp Dewatering
Paper Pulp Dewatering Dr. Stefan Rief [email protected] Flow and Transport in Industrial Porous Media November 12-16, 2007 Utrecht University Overview Introduction and Motivation Derivation
19.7. Applications of Differential Equations. Introduction. Prerequisites. Learning Outcomes. Learning Style
Applications of Differential Equations 19.7 Introduction Blocks 19.2 to 19.6 have introduced several techniques for solving commonly-occurring firstorder and second-order ordinary differential equations.
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding
Sound propagation in a lined duct with flow
Sound propagation in a lined duct with flow Martien Oppeneer supervisors: Sjoerd Rienstra and Bob Mattheij CASA day Eindhoven, April 7, 2010 1 / 47 Outline 1 Introduction & Background 2 Modeling the problem
1 Completeness of a Set of Eigenfunctions. Lecturer: Naoki Saito Scribe: Alexander Sheynis/Allen Xue. May 3, 2007. 1.1 The Neumann Boundary Condition
MAT 280: Laplacian Eigenfunctions: Theory, Applications, and Computations Lecture 11: Laplacian Eigenvalue Problems for General Domains III. Completeness of a Set of Eigenfunctions and the Justification
Programming Languages & Tools
4 Programming Languages & Tools Almost any programming language one is familiar with can be used for computational work (despite the fact that some people believe strongly that their own favorite programming
Application of Fourier Transform to PDE (I) Fourier Sine Transform (application to PDEs defined on a semi-infinite domain)
Application of Fourier Transform to PDE (I) Fourier Sine Transform (application to PDEs defined on a semi-infinite domain) The Fourier Sine Transform pair are F. T. : U = 2/ u x sin x dx, denoted as U
Dynamical Systems Analysis II: Evaluating Stability, Eigenvalues
Dynamical Systems Analysis II: Evaluating Stability, Eigenvalues By Peter Woolf [email protected]) University of Michigan Michigan Chemical Process Dynamics and Controls Open Textbook version 1.0 Creative
Multigrid preconditioning for nonlinear (degenerate) parabolic equations with application to monument degradation
Multigrid preconditioning for nonlinear (degenerate) parabolic equations with application to monument degradation M. Donatelli 1 M. Semplice S. Serra-Capizzano 1 1 Department of Science and High Technology
MODULE VII LARGE BODY WAVE DIFFRACTION
MODULE VII LARGE BODY WAVE DIFFRACTION 1.0 INTRODUCTION In the wave-structure interaction problems, it is classical to divide into two major classification: slender body interaction and large body interaction.
Continued Fractions and the Euclidean Algorithm
Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction
Lecture 6. Weight. Tension. Normal Force. Static Friction. Cutnell+Johnson: 4.8-4.12, second half of section 4.7
Lecture 6 Weight Tension Normal Force Static Friction Cutnell+Johnson: 4.8-4.12, second half of section 4.7 In this lecture, I m going to discuss four different kinds of forces: weight, tension, the normal
3.2 Sources, Sinks, Saddles, and Spirals
3.2. Sources, Sinks, Saddles, and Spirals 6 3.2 Sources, Sinks, Saddles, and Spirals The pictures in this section show solutions to Ay 00 C By 0 C Cy D 0. These are linear equations with constant coefficients
