Chapter 3

Interpolation

Interpolation is the problem of fitting a smooth curve through a given set of points, generally as the graph of a function. It is useful at least in data analysis (interpolation is a form of regression), industrial design, signal processing (digital-to-analog conversion) and in numerical analysis. It is one of those important recurring concepts in applied mathematics. In this chapter, we will immediately put interpolation to use to formulate high-order quadrature and differentiation rules.

3.1 Polynomial interpolation

Given N + 1 points x_j \in R, 0 \le j \le N, and sample values y_j = f(x_j) of a function at these points, the polynomial interpolation problem consists in finding a polynomial p_N(x) of degree N which reproduces those values:

    y_j = p_N(x_j),    j = 0, ..., N.

In other words the graph of the polynomial should pass through the points (x_j, y_j). A degree-N polynomial can be written as p_N(x) = \sum_{n=0}^{N} a_n x^n for some coefficients a_0, ..., a_N. For interpolation, the number of degrees of freedom (N + 1 coefficients) in the polynomial matches the number of points where the function should be fit. If the degree of the polynomial is strictly less than N, we cannot in general pass it through the points (x_j, y_j). We can still try to pass a polynomial (e.g., a line) in the best approximate manner, but this is a problem in approximation rather than interpolation; we will return to it later in the chapter on least-squares.
Let us first see how the interpolation problem can be solved numerically in a direct way. Use the expression of p_N in the interpolating equations y_j = p_N(x_j):

    \sum_{n=0}^{N} a_n x_j^n = y_j,    j = 0, ..., N.

In these N + 1 equations indexed by j, the unknowns are the coefficients a_0, ..., a_N. We are in presence of a linear system

    V a = y,    \sum_{n=0}^{N} V_{jn} a_n = y_j,

with V the so-called Vandermonde matrix, V_{jn} = x_j^n, i.e.,

    V = \begin{pmatrix} 1 & x_0 & \cdots & x_0^N \\ 1 & x_1 & \cdots & x_1^N \\ \vdots & \vdots & & \vdots \\ 1 & x_N & \cdots & x_N^N \end{pmatrix}.

We can then use numerical software like Matlab to construct the vector of abscissas x_j, the right-hand side of values y_j, the matrix V, and numerically solve the system with an instruction like a = V \ y (in Matlab). This gives us the coefficients of the desired polynomial. The polynomial can now be plotted in between the grid points x_j (on a finer grid), in order to display the interpolant.

Historically, mathematicians such as Lagrange and Newton did not have access to computers to display interpolants, so they found explicit (and elegant) formulas for the coefficients of the interpolation polynomial. It not only simplified computations for them, but also allowed them to understand the error of polynomial interpolation, i.e., the difference f(x) - p_N(x). Let us spend a bit of time retracing their steps. (They were concerned with applications such as fitting curves to celestial trajectories.)

We'll define the interpolation error from the uniform (L^\infty) norm of the difference f - p_N:

    \| f - p_N \|_\infty := \max_x |f(x) - p_N(x)|,

where the maximum is taken over the interval [x_0, x_N].
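For concreteness, the direct Vandermonde approach can be sketched as follows (in Python/NumPy rather than Matlab; the three sample points and the function e^x are illustrative choices, not from the text):

```python
import numpy as np

# Illustrative data: three samples of f(x) = exp(x).
x = np.array([-1.0, 0.0, 1.0])
y = np.exp(x)

# Vandermonde matrix V[j, n] = x_j^n (columns in increasing powers).
V = np.vander(x, increasing=True)

# Solve V a = y for the coefficients a_0, ..., a_N;
# this is the analogue of Matlab's a = V \ y.
a = np.linalg.solve(V, y)

# The resulting polynomial reproduces the samples at the grid points.
p = np.polyval(a[::-1], x)  # polyval expects the highest power first
print(np.allclose(p, y))  # True
```

Note that for large N the Vandermonde matrix becomes badly conditioned in practice, which is one more reason to prefer the explicit Lagrange form discussed next.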
Call P_N the space of real-valued degree-N polynomials:

    P_N = \{ \sum_{n=0}^{N} a_n x^n : a_n \in R \}.

Lagrange's solution to the problem of polynomial interpolation is based on the following construction.

Lemma 1. (Lagrange elementary polynomials) Let {x_j, j = 0, ..., N} be a collection of distinct numbers. For each k = 0, ..., N, there exists a unique degree-N polynomial L_k(x) such that

    L_k(x_j) = \delta_{jk} = 1 if j = k, and 0 if j \ne k.

Proof. Fix k. L_k has roots at x_j for j \ne k, so L_k must be of the form (see footnote 1)

    L_k(x) = C \prod_{j \ne k} (x - x_j).

Evaluating this expression at x = x_k, we get

    1 = C \prod_{j \ne k} (x_k - x_j)  =>  C = \frac{1}{\prod_{j \ne k} (x_k - x_j)}.

Hence the only possible expression for L_k is

    L_k(x) = \frac{\prod_{j \ne k} (x - x_j)}{\prod_{j \ne k} (x_k - x_j)}.

These elementary polynomials form a basis (in the sense of linear algebra) for expanding any polynomial interpolant p_N.

(Footnote 1) That's because, if we fix j, we can divide L_k(x) by (x - x_j), j \ne k. We obtain L_k(x) = (x - x_j) q(x) + r(x), where r(x) is a remainder of lower order than x - x_j, i.e., a constant. Since L_k(x_j) = 0 we must have r(x) = 0. Hence (x - x_j) must be a factor of L_k(x). The same is true of any (x - x_j) for j \ne k. There are N such factors, which exhausts the degree N. The only remaining degree of freedom in L_k is the multiplicative constant.
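The lemma translates directly into code; a sketch (the nodes below are arbitrary illustrative values):

```python
import numpy as np

def lagrange_basis(xs, k, x):
    """Evaluate the Lagrange elementary polynomial L_k(x) for the
    distinct nodes xs: the product over all factors (x - x_j), j != k,
    normalized so that L_k(x_k) = 1."""
    xs = np.asarray(xs, dtype=float)
    others = np.delete(xs, k)
    return np.prod(x - others) / np.prod(xs[k] - others)

xs = [-1.0, 0.0, 1.0]
# The defining property L_k(x_j) = delta_jk, here for k = 1:
vals = [float(lagrange_basis(xs, 1, xj)) for xj in xs]
print(vals)  # a delta at index 1
```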
Theorem 4. (Lagrange interpolation theorem) Let {x_j, j = 0, ..., N} be a collection of distinct real numbers. Let {y_j, j = 0, ..., N} be a collection of real numbers. Then there exists a unique p_N \in P_N such that

    p_N(x_j) = y_j,    j = 0, ..., N.

Its expression is

    p_N(x) = \sum_{k=0}^{N} y_k L_k(x),    (3.1)

where L_k(x) are the Lagrange elementary polynomials.

Proof. The justification that (3.1) interpolates is obvious:

    p_N(x_j) = \sum_{k=0}^{N} y_k L_k(x_j) = y_j,

since L_k(x_j) = \delta_{jk}. It remains to see that p_N is the unique interpolating polynomial. For this purpose, assume that both p_N and q_N take on the value y_j at x_j. Then r_N = p_N - q_N is a polynomial of degree at most N that has a root at each of the N + 1 points x_0, ..., x_N. The fundamental theorem of algebra, however, says that a nonzero polynomial of degree N can only have N (complex) roots. Therefore, the only way for r_N to have N + 1 roots is that it is the zero polynomial. So p_N = q_N.

By definition,

    p_N(x) = \sum_{k=0}^{N} f(x_k) L_k(x)

is called the Lagrange interpolation polynomial of f at the x_j.

Example 7. Linear interpolation through (x_1, y_1) and (x_2, y_2):

    L_1(x) = \frac{x - x_2}{x_1 - x_2},    L_2(x) = \frac{x - x_1}{x_2 - x_1},

    p_1(x) = y_1 L_1(x) + y_2 L_2(x)
           = \frac{y_2 - y_1}{x_2 - x_1} x + \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1}
           = y_1 + \frac{y_2 - y_1}{x_2 - x_1} (x - x_1).
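Formula (3.1) can be evaluated directly; the following sketch reproduces Example 7's linear case with made-up data points:

```python
import numpy as np

def lagrange_interp(xs, ys, x):
    """Evaluate the Lagrange interpolation polynomial (3.1) at x.
    The nodes xs must be distinct; O(N^2) work per evaluation point."""
    xs = np.asarray(xs, dtype=float)
    p = 0.0
    for k in range(len(xs)):
        others = np.delete(xs, k)
        Lk = np.prod(x - others) / np.prod(xs[k] - others)
        p += ys[k] * Lk
    return p

# Example 7 in action: the line through (0, 1) and (2, 5),
# evaluated at the midpoint x = 1.
print(lagrange_interp([0.0, 2.0], [1.0, 5.0], 1.0))  # 3.0
```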
Example 8. (Example (6.1) in Suli-Mayers) Consider f(x) = e^x, and interpolate it by a parabola (N = 2) from three samples at x_0 = -1, x_1 = 0, x_2 = 1. We build

    L_0(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} = \frac{1}{2} x(x - 1).

Similarly,

    L_1(x) = 1 - x^2,    L_2(x) = \frac{1}{2} x(x + 1).

So the quadratic interpolant is

    p_2(x) = e^{-1} L_0(x) + e^0 L_1(x) + e^1 L_2(x)
           = 1 + \sinh(1) x + (\cosh(1) - 1) x^2
           \approx 1 + 1.1752 x + 0.5431 x^2.

Another polynomial that approximates e^x reasonably well on [-1, 1] is the Taylor expansion about x = 0:

    t_2(x) = 1 + x + \frac{x^2}{2}.

Manifestly, p_2 is not very different from t_2. (Insert picture here)

Let us now move on to the main result concerning the interpolation error of smooth functions.

Theorem 5. Let f \in C^{N+1}[a, b] for some N > 0, and let {x_j : j = 0, ..., N} be a collection of distinct reals in [a, b]. Consider p_N the Lagrange interpolation polynomial of f at the x_j. Then for every x \in [a, b] there exists \xi(x) \in [a, b] such that

    f(x) - p_N(x) = \frac{f^{(N+1)}(\xi(x))}{(N + 1)!} \pi_{N+1}(x),

where

    \pi_{N+1}(x) = \prod_{j=0}^{N} (x - x_j).

An estimate on the interpolation error follows directly from this theorem. Set

    M_{N+1} = \max_{x \in [a, b]} |f^{(N+1)}(x)|
(which is well defined since f^{(N+1)} is continuous by assumption, hence reaches its lower and upper bounds.) Then

    |f(x) - p_N(x)| \le \frac{M_{N+1}}{(N + 1)!} |\pi_{N+1}(x)|.

In particular, we see that the interpolation error is zero when x = x_j for some j, as it should be. Let us now prove the theorem.

Proof. (Can be found in Suli-Mayers, Chapter 6)

In conclusion, the interpolation error:

- depends on the smoothness of f via the high-order derivative f^{(N+1)};
- has a factor 1/(N + 1)! that decays fast as the order N \to \infty;
- and is directly proportional to the value of |\pi_{N+1}(x)|, indicating that the interpolant may be better in some places than others.

The natural follow-up question is that of convergence: can we always expect convergence of the polynomial interpolant as N \to \infty? In other words, does the factor 1/(N + 1)! always win over the other two factors? Unfortunately, the answer is no in general. There are examples of very smooth (analytic) functions for which polynomial interpolation diverges, particularly so near the boundaries of the interpolation interval. This behavior is called the Runge phenomenon, and is usually illustrated by means of the following example.

Example 9. (Runge phenomenon) Let f(x) = 1/(1 + x^2) for x \in [-5, 5]. Interpolate it at equispaced points x_j = 10j/N, where j = -N/2, ..., N/2 and N is even. It is easy to check numerically that the interpolant diverges near the edges of [-5, 5], as N \to \infty. See the Trefethen textbook on page 44 for an illustration of the Runge phenomenon. (Figure here)

If we had done the same numerical experiment for x \in [-1, 1], the interpolant would have converged. This shows that the size of the interval matters. Intuitively, there is divergence when the size of the interval is larger than the features, or characteristic length scale, of the function (here the width of the bump near the origin.)
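The divergence in Example 9 is easy to reproduce numerically; a sketch that measures the maximum error on a fine grid (evaluating the interpolant in Lagrange form, to sidestep the ill-conditioning of the Vandermonde approach):

```python
import numpy as np

def lagrange_eval(xj, yj, x):
    """Evaluate the Lagrange interpolant of the data (xj, yj) at scalar x."""
    p = 0.0
    for k in range(len(xj)):
        others = np.delete(xj, k)
        p += yj[k] * np.prod(x - others) / np.prod(xj[k] - others)
    return p

def runge_max_error(N):
    """Max error interpolating f(x) = 1/(1+x^2) on [-5, 5]
    from N+1 equispaced samples."""
    xj = np.linspace(-5.0, 5.0, N + 1)
    yj = 1.0 / (1.0 + xj**2)
    xs = np.linspace(-5.0, 5.0, 1001)
    return max(abs(lagrange_eval(xj, yj, x) - 1.0 / (1.0 + x**2))
               for x in xs)

# The error near the edges grows with N instead of decaying:
print(runge_max_error(8) < runge_max_error(16))  # True
```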
The analytical reason for the divergence in the example above is due in no small part to the very large values taken on by \pi_{N+1}(x) far away from the origin, in contrast to the relatively small values it takes on near the origin. This is a problem intrinsic to equispaced grids. We will be more quantitative about this issue in the section on Chebyshev interpolants, where a remedy involving non-equispaced grid points will be explained.

As a conclusion, polynomial interpolants can be good for small N, and on small intervals, but may fail to converge (quite dramatically) when the interpolation interval is large.

3.2 Polynomial rules for integration

In this section, we return to the problem of approximating \int_a^b u(x) dx by a weighted sum of samples u(x_j), also called a quadrature. The plan is to form interpolants of the data, integrate those interpolants, and deduce corresponding quadrature formulas. We can formulate rules of arbitrarily high order this way, although in practice we almost never go beyond order 4 with polynomial rules.

3.2.1 Polynomial rules

Without loss of generality, consider the local interpolants of u(x) formed near the origin, with x_0 = 0, x_1 = h and x_{-1} = -h. The rectangle rule does not belong in this section: it is not formed from an interpolant.

The trapezoidal rule, where we approximate u(x) by a line joining (0, u(x_0)) and (h, u(x_1)) in [0, h]. We need 2 derivatives to control the error:

    u(x) = p_1(x) + \frac{u''(\xi(x))}{2} x(x - h),

    p_1(x) = u(0) L_0(x) + u(h) L_1(x),

    L_0(x) = \frac{h - x}{h},    L_1(x) = \frac{x}{h},

    \int_0^h L_0(x) dx = \int_0^h L_1(x) dx = \frac{h}{2},    (areas of triangles)

    \left| \int_0^h \frac{u''(\xi(x))}{2} x(x - h) dx \right| \le C \max_\xi |u''(\xi)| h^3.
The result is

    \int_0^h u(x) dx = h \frac{u(0) + u(h)}{2} + O(h^3).

As we have seen, the terms combine as

    \int_0^1 u(x) dx = h \left( \frac{1}{2} u(x_0) + \sum_{j=1}^{N-1} u(x_j) + \frac{1}{2} u(x_N) \right) + O(h^2).

Simpson's rule, where we approximate u(x) by a parabola through (-h, u(x_{-1})), (0, u(x_0)), and (h, u(x_1)) in [-h, h]. We need three derivatives to control the error:

    u(x) = p_2(x) + \frac{u'''(\xi(x))}{6} (x + h) x (x - h),

    p_2(x) = u(-h) L_{-1}(x) + u(0) L_0(x) + u(h) L_1(x),

    L_{-1}(x) = \frac{x(x - h)}{2h^2},    L_0(x) = -\frac{(x + h)(x - h)}{h^2},    L_1(x) = \frac{(x + h) x}{2h^2},

    \int_{-h}^{h} L_{-1}(x) dx = \int_{-h}^{h} L_1(x) dx = \frac{h}{3},    \int_{-h}^{h} L_0(x) dx = \frac{4h}{3},

    \left| \int_{-h}^{h} \frac{u'''(\xi(x))}{6} (x + h) x (x - h) dx \right| \le C \max_\xi |u'''(\xi)| h^4.

The result is

    \int_{-h}^{h} u(x) dx = h \frac{u(-h) + 4 u(0) + u(h)}{3} + O(h^4).

The composite Simpson's rule is obtained by using this approximation on [0, 2h] \cup [2h, 4h] \cup ... \cup [1 - 2h, 1], adding the terms, and recognizing that the samples at 2nh (except 0 and 1) are represented twice:

    \int_0^1 u(x) dx = h \frac{u(0) + 4u(h) + 2u(2h) + 4u(3h) + ... + 2u(1 - 2h) + 4u(1 - h) + u(1)}{3}.

It turns out that the error is in fact O(h^5) on [-h, h], and O(h^4) on [0, 1], a result that can be derived by using symmetry considerations (or canceling the terms from Taylor expansions in a tedious way.) For this, we need u to be four times differentiable (the constant in front of h^5 involves u''''.)
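The composite trapezoidal and Simpson rules, together with their O(h^2) and O(h^4) error behavior, can be checked numerically. A sketch (the integrand e^x on [0, 1] is an illustrative choice):

```python
import numpy as np

def trapezoid(u, a, b, N):
    """Composite trapezoidal rule on N subintervals."""
    x = np.linspace(a, b, N + 1)
    h = (b - a) / N
    y = u(x)
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

def simpson(u, a, b, N):
    """Composite Simpson rule on N subintervals (N even):
    weights 1, 4, 2, 4, ..., 2, 4, 1 times h/3."""
    x = np.linspace(a, b, N + 1)
    h = (b - a) / N
    y = u(x)
    return h / 3 * (y[0] + y[-1] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum())

exact = np.e - 1.0  # integral of exp over [0, 1]
et = [abs(trapezoid(np.exp, 0.0, 1.0, N) - exact) for N in (10, 20)]
es = [abs(simpson(np.exp, 0.0, 1.0, N) - exact) for N in (10, 20)]
# Halving h divides the error by about 2^2 and 2^4 respectively:
print(round(et[0] / et[1]), round(es[0] / es[1]))  # 4 16
```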
The higher-order variants of polynomial rules, called Newton-Cotes rules, are not very interesting because the Runge phenomenon kicks in again. Also, the weights (like 1, 4, 2, 4, 2, etc.) become negative, which leads to unacceptable error magnification if the samples of u are not known exactly.

Piecewise spline interpolation is not a good choice for numerical integration either, because of the two leftover degrees of freedom, whose arbitrary choice affects the accuracy of quadrature in an unacceptable manner. (We'll study splines in a later section.)

We'll return to (useful!) integration rules of arbitrarily high order in the scope of spectral methods.

3.3 Polynomial rules for differentiation

A systematic way of deriving finite difference formulas of higher order is to view them as derivatives of a polynomial interpolant passing through a small number of points neighboring x_j. For instance (again we use -h, 0, h as reference points without loss of generality):

The forward difference at 0 is obtained from the line joining (0, u(0)) and (h, u(h)):

    p_1(x) = u(0) L_0(x) + u(h) L_1(x),

    L_0(x) = \frac{h - x}{h},    L_1(x) = \frac{x}{h},

    p_1'(0) = \frac{u(h) - u(0)}{h}.

We already know that u'(0) - p_1'(0) = O(h).

The centered difference at 0 is obtained from the line joining (-h, u(-h)) and (h, u(h)) (a simple exercise), but it is also obtained from differentiating the parabola passing through the points (-h, u(-h)), (0, u(0)), and (h, u(h)). Indeed,

    p_2(x) = u(-h) L_{-1}(x) + u(0) L_0(x) + u(h) L_1(x),

    L_{-1}(x) = \frac{x(x - h)}{2h^2},    L_0(x) = -\frac{(x + h)(x - h)}{h^2},    L_1(x) = \frac{(x + h) x}{2h^2},

    p_2'(x) = u(-h) \frac{2x - h}{2h^2} - u(0) \frac{2x}{h^2} + u(h) \frac{2x + h}{2h^2},
    p_2'(0) = \frac{u(h) - u(-h)}{2h}.

We already know that u'(0) - p_2'(0) = O(h^2).

Other examples can be considered, such as the centered second difference (-1 2 -1), the one-sided first difference (-3 4 -1), etc. Differentiating one-sided interpolation polynomials is a good way to obtain one-sided difference formulas, which comes in handy at the boundaries of the computational domain.

The following result establishes that the order of the finite difference formula matches the order of the polynomial being differentiated.

Theorem 6. Let f \in C^{N+1}[a, b], {x_j, j = 0, ..., N} some distinct points, and p_N the corresponding interpolation polynomial. Then

    f'(x) - p_N'(x) = \frac{f^{(N+1)}(\xi)}{N!} \pi_N(x),

for some \xi \in [a, b], and where \pi_N(x) = (x - \eta_1) ... (x - \eta_N) for some \eta_j \in [x_{j-1}, x_j].

The proof of this result is an application of Rolle's theorem that we leave out. (It is not simply a matter of differentiating the error formula for polynomial interpolation, because we have no guarantee on d\xi/dx.)

A consequence of this theorem is that the error in computing the derivative is a O(h^N) (which comes from a bound on the product \pi_N(x).)

It is interesting to notice, at least empirically, that the Runge phenomenon is absent when the derivative is evaluated at the center of the interval over which the interpolant is built.

3.4 Piecewise polynomial interpolation

The idea of piecewise polynomial interpolation, also called spline interpolation, is to subdivide the interval [a, b] into a large number of subintervals [x_{j-1}, x_j], and to use low-degree polynomials over each subinterval. This helps avoid the Runge phenomenon. The price to pay is that the interpolant is no longer a C^\infty function; instead, we lose differentiability at the junctions between subintervals, where the polynomials are made to match.
If we use polynomials of order n, then the interpolant is piecewise C^\infty, and overall C^k, with k \le n - 1. We could not expect to have k = n, because it would imply that the polynomials are identical over each subinterval. We'll see two examples: linear splines when n = 1, and cubic splines when n = 3. (We have seen in the homework why it is a bad idea to choose n = 2.)

3.4.1 Linear splines

We wish to interpolate a continuous function f(x) of x \in [a, b], from the knowledge of f(x_j) at some points x_j, j = 0, ..., N, not necessarily equispaced. Assume that x_0 = a and x_N = b. The piecewise linear interpolant is built by tracing a straight line between the points (x_{j-1}, f(x_{j-1})) and (x_j, f(x_j)); for j = 1, ..., N the formula is simply

    s_L(x) = f(x_{j-1}) \frac{x_j - x}{x_j - x_{j-1}} + f(x_j) \frac{x - x_{j-1}}{x_j - x_{j-1}},    x \in [x_{j-1}, x_j].

In this case we see that s_L(x) is a continuous function of x, but that its derivative is not continuous at the junction points, or nodes, x_j.

If the function is at least twice differentiable, then piecewise linear interpolation has second-order accuracy, i.e., an error O(h^2).

Theorem 7. Let f \in C^2[a, b], and let h = \max_{j=1,...,N} (x_j - x_{j-1}) be the grid diameter. Then

    \| f - s_L \|_{L^\infty[a,b]} \le \frac{h^2}{8} \| f'' \|_{L^\infty[a,b]}.

Proof. Let x \in [x_{j-1}, x_j] for some j = 1, ..., N. We can apply the basic result of accuracy of polynomial interpolation with n = 1: there exists \xi \in [x_{j-1}, x_j] such that

    f(x) - s_L(x) = \frac{1}{2} f''(\xi) (x - x_{j-1})(x - x_j),    x \in [x_{j-1}, x_j].

Let h_j = x_j - x_{j-1}. It is easy to check that the product |(x - x_{j-1})(x - x_j)| takes its maximum value at the midpoint (x_{j-1} + x_j)/2, and that this value is h_j^2/4. We then have

    |f(x) - s_L(x)| \le \frac{h_j^2}{8} \max_{\xi \in [x_{j-1}, x_j]} |f''(\xi)|,    x \in [x_{j-1}, x_j].

The conclusion follows by taking a maximum over j.
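Theorem 7's bound is easy to test numerically; the following sketch uses f(x) = sin(x) on [0, pi] (an illustrative choice, for which max |f''| = 1) and NumPy's np.interp for the piecewise linear interpolant:

```python
import numpy as np

f = np.sin
a, b, N = 0.0, np.pi, 8
xj = np.linspace(a, b, N + 1)   # equispaced nodes
h = (b - a) / N

xs = np.linspace(a, b, 2001)    # fine evaluation grid
sL = np.interp(xs, xj, f(xj))   # piecewise linear interpolant s_L
err = np.max(np.abs(f(xs) - sL))

bound = h**2 / 8 * 1.0          # (h^2 / 8) * max|f''|, with max|sin''| = 1
print(err <= bound)  # True
```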
We can express any piecewise linear interpolant as a superposition of tent basis functions \phi_k(x):

    s_L(x) = \sum_k c_k \phi_k(x),

where \phi_k(x) is the piecewise linear function equal to 1 at x = x_k, and equal to zero at all the other grid points x_j, j \ne k. Or in short, \phi_k(x_j) = \delta_{jk}. On an equispaced grid x_j = jh, an explicit formula is

    \phi_k(x) = \frac{1}{h} S^{(1)}(x - x_{k-1}),

where

    S^{(1)}(x) = x_+ - 2 (x - h)_+ + (x - 2h)_+,

and where x_+ denotes the positive part of x (i.e., x if x \ge 0, and zero if x < 0.) Observe that we can simply take c_k = f(x_k) above. The situation will be more complicated when we pass to higher-order polynomials.

3.4.2 Cubic splines

Let us now consider the case n = 3 of an interpolant which is a third-order polynomial, i.e., a cubic, on each subinterval. The most we can ask is that the value of the interpolant, its derivative, and its second derivative be continuous at the junction points x_j.

Definition 5. (Interpolating cubic spline) Let f \in C[a, b], and {x_j, j = 0, ..., N} \subset [a, b]. An interpolating cubic spline is a function s(x) such that

1. s(x_j) = f(x_j);
2. s(x) is a polynomial of degree 3 over each segment [x_{j-1}, x_j];
3. s(x) is globally C^2, i.e., at each junction point x_j, we have the relations s(x_j^-) = s(x_j^+), s'(x_j^-) = s'(x_j^+), s''(x_j^-) = s''(x_j^+), where the notations s(x_j^-) and s(x_j^+) refer to the adequate limits on the left and on the right.
Let us count the degrees of freedom. A cubic polynomial has 4 coefficients. There are N + 1 points, hence N subintervals, for a total of 4N numbers to be specified.

The interpolating conditions s(x_j) = f(x_j) specify two degrees of freedom per polynomial: one value at the left endpoint x_{j-1}, and one value at the right endpoint x_j. That's 2N conditions. Continuity of s(x) follows automatically. The continuity conditions on s', s'' are imposed only at the interior grid points x_j for j = 1, ..., N - 1, so this gives rise to 2(N - 1) additional conditions, for a total of 4N - 2 equations.

There is a mismatch between the number of unknowns (4N) and the number of conditions (4N - 2). Two more degrees of freedom are required to completely specify a cubic spline interpolant, hence the precaution to write "an" interpolant in the definition above, and not "the" interpolant.

The most widespread choices for fixing these two degrees of freedom are:

- Natural splines: s''(x_0) = s''(x_N) = 0. If s(x) measures the displacement of a beam, a condition of vanishing second derivative corresponds to a free end.
- Clamped spline: s'(x_0) = p_0, s'(x_N) = p_N, where p_0 and p_N are specified values that depend on the particular application. If s(x) measures the displacement of a beam, a condition of vanishing derivative corresponds to a horizontal clamped end.
- Periodic spline: assuming that s(x_0) = s(x_N), then we also impose s'(x_0) = s'(x_N), s''(x_0) = s''(x_N).

Let us now explain the algorithm most often used for determining a cubic spline interpolant, i.e., the 4N coefficients of the N cubic polynomials, from the knowledge of f(x_j). Let us consider the natural spline.

It is advantageous to write a system of equations for the second derivatives at the grid points, that we denote \sigma_j = s''(x_j). Because s(x) is piecewise cubic, we know that s''(x) is piecewise linear. Let h_j = x_j - x_{j-1}. We can write

    s''(x) = \sigma_{j-1} \frac{x_j - x}{h_j} + \sigma_j \frac{x - x_{j-1}}{h_j},    x \in [x_{j-1}, x_j].
Hence

    s(x) = \sigma_{j-1} \frac{(x_j - x)^3}{6 h_j} + \sigma_j \frac{(x - x_{j-1})^3}{6 h_j} + \alpha_j (x - x_{j-1}) + \beta_j (x_j - x).

We could have written ax + b for the effect of the integration constants in the equation above, but writing it in terms of \alpha_j and \beta_j makes the algebra that follows simpler. The two interpolation conditions for [x_{j-1}, x_j] can be written

    s(x_{j-1}) = f(x_{j-1})  =>  f(x_{j-1}) = \frac{\sigma_{j-1} h_j^2}{6} + \beta_j h_j,

    s(x_j) = f(x_j)  =>  f(x_j) = \frac{\sigma_j h_j^2}{6} + \alpha_j h_j.

One then isolates \alpha_j, \beta_j in these equations; substitutes those values in the equation for s(x); and evaluates the relation s'(x_j^-) = s'(x_j^+). Given that \sigma_0 = \sigma_N = 0, we end up with a system of N - 1 equations in the N - 1 unknowns \sigma_1, ..., \sigma_{N-1}. Skipping the algebra, the end result is

    h_j \sigma_{j-1} + 2 (h_{j+1} + h_j) \sigma_j + h_{j+1} \sigma_{j+1} = 6 \left( \frac{f(x_{j+1}) - f(x_j)}{h_{j+1}} - \frac{f(x_j) - f(x_{j-1})}{h_j} \right).

We are in presence of a tridiagonal system for \sigma_j. It can be solved efficiently with Gaussian elimination, yielding an LU decomposition with bidiagonal factors. Unlike in the case of linear splines, there is no way around the fact that a linear system needs to be solved.

Notice that the tridiagonal matrix of the system above is diagonally dominant (each diagonal element is strictly greater than the sum of the other elements on the same row, in absolute value), hence it is always invertible.

One can check that the interpolation error for cubic splines is O(h^4) well away from the endpoints. This requires an analysis that is too involved for the present set of notes.

Finally, like in the linear case, let us consider the question of expanding a cubic spline interpolant as a superposition of basis functions

    s(x) = \sum_k c_k \phi_k(x).

There are many ways of choosing the functions \phi_k(x), so let us specify that they should have as small a support as possible, that they should have the
same smoothness C^2 as s(x) itself, and that they should be translates of each other when the grid x_j is equispaced with spacing h.

The only solution to this problem is the cubic B-spline. Around any interior grid point x_k, it is supported in the interval [x_{k-2}, x_{k+2}], and is given by the formula

    \phi_k(x) = \frac{1}{4 h^3} S^{(3)}(x - x_{k-2}),

where

    S^{(3)}(x) = x_+^3 - 4 (x - h)_+^3 + 6 (x - 2h)_+^3 - 4 (x - 3h)_+^3 + (x - 4h)_+^3,

and where x_+ is the positive part of x. One can check that \phi_k(x) takes the value 1 at x_k, 1/4 at x_{k \pm 1}, zero outside of [x_{k-2}, x_{k+2}], and is C^2 at each junction. It is a bell-shaped curve. It is an interesting exercise to check that it can be obtained as the convolution of two tent basis functions of linear interpolation:

    S^{(3)}(x) = c S^{(1)}(x) * S^{(1)}(x),

where c is some constant, and the symbol * denotes convolution:

    f * g(x) = \int_R f(y) g(x - y) dy.

Now with cubic B-splines, we cannot put c_k = f(x_k) anymore, since \phi_k(x_j) \ne \delta_{jk}. The particular values of c_k are the result of solving a linear system, as mentioned above. Again, there is no way around solving a linear system. In particular, if f(x_k) changes at one point, or if we add a datum point (x_k, f_k), then the update requires re-computing the whole interpolant. Changing one point has ripple effects throughout the whole interval [a, b]. This behavior is ultimately due to the more significant overlap between neighboring \phi_k in the cubic case than in the linear case.
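The stated values of the cubic B-spline are quick to verify from the formula; a sketch (with h = 1 and the spline centered at x_k = 0 for convenience, so that x_{k-2} = -2):

```python
import numpy as np

def S3(x, h):
    """S^(3)(x) = x_+^3 - 4(x-h)_+^3 + 6(x-2h)_+^3
                  - 4(x-3h)_+^3 + (x-4h)_+^3."""
    pos3 = lambda t: np.maximum(t, 0.0) ** 3
    return (pos3(x) - 4 * pos3(x - h) + 6 * pos3(x - 2 * h)
            - 4 * pos3(x - 3 * h) + pos3(x - 4 * h))

h = 1.0
phi = lambda x: S3(x - (-2 * h), h) / (4 * h**3)  # phi_k with x_k = 0

# Value 1 at x_k, 1/4 at x_{k +- 1}, 0 at the edges of the support:
print([float(phi(v)) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)])
# [0.0, 0.25, 1.0, 0.25, 0.0]
```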
MIT OpenCourseWare
http://ocw.mit.edu

18.330 Introduction to Numerical Analysis
Spring 2012

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.