3.5 Spline interpolation

3.5 Spline interpolation Given a tabulated function f k = f(x k ), k = 0,... N, a spline is a polynomial between each pair of tabulated points, but one whose coefficients are determined slightly non-locally. The non-locality is designed to guarantee global smoothness in the interpolated function up to some order of derivative. Cubic splines are the most popular. They produce an interpolated function that is continuous through to the second derivative. Splines tend to be stabler than fitting a polynomial through the N + points, with less possibility of wild oscillations between the tabulated points. We shall explain how spline interpolation works by first going through the theory and then applying it to interpolate the function below: 0.5 f 5 f f 0-4 -3-2 - 2 3 4 f 4 f 7 f f 8 f 2 f 3-0.5 Figure 3.: The function f(x) = +x 2 sin(x) between [ 4, 4]. The values of f(x k ) are given for i = 0,..., 8, x k = 4, 3, 2,, 0,, 2, 3, 4. Figure 3. shows the function f(x) = sin x in the region x [ 4, 4]. We are given the value of the +x 2 function at the points f i = f(x k ), k = 0,..., 8, where x k = 4, 3,..., 0,..., 3, 4. These are what is called the interpolating nodes. 3.5. Linear Spline Let s focus attention on one particular interval (x k, x k+ ). Linear interpolation in that interval gives the interpolation formula f = Af k + Bf k+ (3.) where A x k+ x x k+ x k, B A = x x k x k+ x k (3.2)

This is just like the piecewise Lagrange polynomial interpolation we looked at earlier. Figure 3.2 shows the piecewise interpolated function over the full range [ 4, 4]. 0.5 f 5 f f 0-4 -3-2 - 2 3 4 f 4 f 7 f f 8 f 2 f 3-0.5 Figure 3.2: Piecewise linear interpolation with 9 nodes (solid) for the function +x 2 sin x between [ 4, 4] (dotted). We can see that linear interpolation works quite well for larger values of x but does particularly badly in (, ) as it fails to capture the curvature of the function. The accuracy can be improved by using more interpolating nodes, but an important issue is that the first derivatives of the interpolating function are discontinuous at the nodes. 3.5.2 Cubic spline interpolation The goal of cubic spline interpolation is to get an interpolation formula that is continuous in both the first and second derivatives, both within the intervals and at the interpolating nodes. This will give us a smoother interpolating function. In general, if the function you want to approximate is smooth, then cubic splines will do better than piecewise linear interpolation. Before you read on, I d like you to clear your mind a little. The following is a derivation of the cubic interpolation formula from Chapter 3 of Numerical Recipes by Press et al., which is a slight variation on the B 0 splines we discussed in lectures, but in my opinion illustrates the fundamental idea of spline interpolation more clearly. If you understand this, then you will find working with the B 0 splines much easier. Let s restate the problem. We have a function f(x) that is tabulated at the N + points f k = f(x k ), k = 0,..., N. In each interval (x k, x k+ ), we can fit a straight line through the points (x k, f k ) and (x k+, f k+ ) using the formula given by (3.). The problem is that with a linear function, the first derivative is not contin- 2

uous at the boundary between two adjacent intervals, while we want the second derivative to be continuous, even at the boundary! Now suppose, contrary to fact, that in addition to the tabulated values of f i, we also have tabulated values for the function s second derivatives, that is, a set of numbers f i. Then within each interval (x k, x k+ ), we can add to the right-hand side of equation (3.) a cubic polynomial whose second derivative varies linearly from a value f k on the left to a value f k+ on the right. Doing so, we will have the desired continuous second derivative. If we also construct the cubic polynomial to have zero values at x k and x k+, then adding it in will not spoil the agreement with the tabulated functional values f k and f k+ at the endpoints x k and x k+. A little side calculation shows that there is only one way to arrange this construction, namely replacing equation (3.) by where A and B are defined as before and f = Af k + Bf k+ + C f k + D f k+ (3.3) C (A3 A)(x k+ x k ) 2, D (B3 B)(x k+ x k ) 2 (3.4) Note that since A and B are linearly dependent on x, C and D (through A and B) have cubic x-dependence. We can readily check that f is in fact the second derivative of the new interpolating polynomial. We take derivatives of equation (3.3) with respect to x, using the definitions of A, B, C and D to compute da/dx, db/dx, dc/dx and dd/dx. The result is for the first derivative, and d f dx = f k+ f k 3A2 (x k+ x k ) f k x k+ x k + 3B2 (x k+ x k ) f k+ (3.5) d 2 f dx 2 = Af k + Bf k+ (3.) for the second derivative. Since A = at x k and A = 0 at x k+, and B = 0 at x k and B = at x k+, (3.) shows that f is just the tabulated second derivative, and also that the second derivative will be continuous across the boundary between two intervals, say (x k, x k ) and (x k, x k+ ). The only problem now is that we supposed the f k s to be known, when actually, they are not. However, we have not yet required that the first derivative, computed from (3.5), be continuous across the boundary between two intervals. The key idea of a cubic spline is to require this continuity and to use it to get equations for the second derivatives f k. The required equations are obtained by setting equation (3.5) evaluated for x = x k in the interval (x k, x k ) equal to the same equation evaluated for x = x k but in the interval (x k, x k+ ). With some rearrangement, this give (for k =,..., N ) x k x k f k + x k+ x k f k 3 + x k+ x k f k+ = f k+ f k f k f k (3.7) x k+ x k x k x k These are N linear equations in the N + unknowns f i, i = 0,..., N. Therefore there is a two-parameter family of possible solutions. For a unique solution, we need to specify further conditions, typically taken as boundary conditions at x 0 and x N. The most common ways of doing this is to set both f 0 and f N equal to zero, giving the so-called 3

natural cubic spline, which has zero second derivatives on both of its boundaries. Now we have the solution for f k, k = 0,..., N, we can substitute back into equation (3.3) to give the cubic interpolation formula in each interval (x k, x k+ ). This is probably the easiest approach to use if you want to write a numerical code to calculate cubic spline interpolation. So what are B-splines? In the above approach, we started with a functional form for the interpolation formula ( f = Af k + Bf k+ + C f k + D f k+ ), and had to use constraints ( f continuous at interval boundaries) to solve for f k. As mathematicians, we like to build things up from fundamental units. In this case, we can use a set of piecewise cubic polynomials defined on some sub-interval of [x 0, x N ], which are by construction continuous through to second derivative at the boundaries of intervals. They would form a set of basis functions, since linear combinations of these functions would also satisfy the continuity properties at the boundaries between adjacent intervals. To construct the cubic spline over the whole range [x 0, x N ], we would then simply need match the sum of the basis functions with tabulated values of f i at the interpolating nodes x i, i = 0,..., N. The B-splines are the basis functions that satisfy our continuity conditions. If the interpolation is over the region [x 0, x N ], then B 0 is defined by B 0 (x) = 0 x x 0 2h (2h + (x x 0)) 3 x 0 2h x x 0 h 2h 3 3 2 (x x 0) 2 (2h + (x x 0 )) x 0 h x x 0 2h 3 3 2 (x x 0) 2 (2h (x x 0 )) x 0 x x 0 + h (2h (x x 0)) 3 x 0 + h x x 0 + 2h 0 x x 0 + 2h (3.8) where h = x k+ x k = x N x 0 N is the width between interpolating nodes (assumed to be equal). Note that B 0 has non-zero values over four intervals. We can easily check that B 0 satisfy the continuity conditions at the boundaries of the intervals, namely, B 0, B 0 and B 0 are continuous at 2h, h, 0, h and 2h. B k- (x) B k (x) B k+ (x) B k+2 (x) x k-2 x k- x k x k+ x k+2 Figure 3.3: Cubic B-spline 4

We define B k (x) = B 0 (x kh + x 0 ) (i.e. the B 0 functions shifted to the right by k nodes). This is illustrated in Figure 3.3, which shows B k, B k, B k+ and B k+2. Note that all four of these basis functions are non-zero in the interval (x k, x k+ ). The cubic spline function, S 3 (x) in [x 0, x N ] is then written as the linear combination of the B k s: S 3 (x) = N+ a k B k (x) (3.9) k= The sum is from to N +, since B is nonzero in the interval (x 0, x ), and B N+ is nonzero in the interval (x N, x N ), so we need to take into account these functions. Is the problem now completely specified? We know that we need 4 conditions (coefficients) to uniquely specify a cubic, and in [x 0, x N ] there are N intervals, so altogether a total of 4N conditions are needed. The continuity conditions are automatically satisfied in the N interior points, since the B k s satisfy the continuity conditions (that s 3(N ) conditions). The other requirement is that S 3 must match the tabulated points, i.e. S 3 (x k ) = f(x k ), for k = 0,..., N (N + conditions). So there are two unspecified conditions, and like before we can take S 3 (x 0 ) = S 3 (x N ) = 0 (i.e., curvature equals zero at the boundaries, natural spline ). Now evaluating S 3 (x) at x k : S 3 (x k ) = a k B k (x k ) + a k B k (x k ) + a k+ B k+ (x k ) + a k+2 B k+2 (x k ) = f(x k ) (3.0) (all other B k s are zero) From the definitions of B k (x) and B 0 (x), we find B k (x k ) = B 0 (x 0 ) = 2h3 3 B k (x k ) = B 0 (x 0 + h) = h3 B k+ (x k ) = B 0 (x 0 h) = h3 B k+2 (x k ) = B 0 (x 0 2h) = 0 Substituting into (3.0), we get the recurrence relation for the coefficients a k a k + 4a k + a k+ = h 3 f(x k) (3.) for k = 0,..., N. What happens at the boundaries? Remember we set S 3 (x 0 ) = S 3 (x N ) = 0. Differentiating S 3 (x) gives So we need to find the second derivatives of B k. S 3 N+ (x) = a k B k (x) (3.2) k= 5

Differentiating B 0 twice, we get 0 (x) = B 0 x x 0 2h 2h + (x x 0 ) x 0 2h x x 0 h 2h 3(x x 0 ) x 0 h x x 0 2h + 3(x x 0 ) x 0 x x 0 + h 2h (x x 0 ) x 0 + h x x 0 + 2h 0 (3.3) So as before, we have 0 = S 3 (x 0 ) = a B (x 0 ) + a 0 B 0 (x 0 ) + a B (x 0 ) + a 2 B 2 (x 0 ) = a B 0 (x 0 + h) + a 0 B 0 (x 0 ) + a B (x 0 h) = a h 2ha 0 + a h = a 2a 0 + a (3.4) From (3.0), we have a + 4a 0 + a = h 3 f(x 0) (3.5) Subtract (3.4) from (3.5), we obtain Similarly, we find We now can write a matrix equation to solve for the coefficients a 0,..., a N : a 0 = h 3 f(x 0) (3.) a N = h 3 f(x N) (3.7) 0 0 4 0 0 0 0 4 0 0 0 0 4 0 0 0 a 0 a a 2 a N a N = h 3 f(x 0 ) f(x ) f(x 2 ) f(x N ) f(x N ) (3.8) This set of equations is tridiagonal and can be solved in O(N) operations by the tridiagonal algorithm. The final two coefficients to completely determine S 3 are a = 2a 0 a a N+ = 2a N a N (3.9)

Worked Example Now we can try to apply cubic spline interpolation to the function f(x) = +x 2 sin x, 4 x 4, with the 9 nodes x k = 4, 3, 2,, 0,, 2, 3, 4. Here x 0 = 4, x N = 4 and h =. So from equation (3.8), the basis function B 0 (x) is given by B 0 (x) = 0 x (x + )3 x 5 2 3 2 (x + 4)2 (x + ) 5 x 4 2 3 + 2 (x + 4)2 (x + 2) 4 x 3 (x + 2)3 3 x 2 0 x 2 (3.20) The cubic spline S 3 (x) = 9 k= a k B k (x), and we can find the coefficients a k, k = 0,,..., 8 by solving the linear system 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 a 0 a a 2 a 3 a 4 a 5 a a 7 a 8 f( 4) f( 3) f( 2) f( ) = f(0) f() f(2) f(3) f(4) (3.2) This can be readily solved to give a 0,..., a 8. Additionally, a = 2a 0 a, a 9 = 2a 8 a 7. The numerical values are a = 0.05849270, a 9 = 0.05849270. a 0 = 0.03433382 a = 0.0080972 a 2 = 0.2539589 a 3 = 0.59975424 a 4 = 4.9493829 0 7 a 5 = 0.59975424 a = 0.2539589 a 7 = 0.0080972 a 8 = 0.0343382 We shall just show the construction of S 3 (x) in the interval [ 4, 3]: (3.22) S 3 (x) = a B (x) + a 0 B 0 (x) + a B (x) + a 2 B 2 (x) = a B 0 (x + ) + a 0 B 0 (x) + a B 0 (x ) + a 2 B 0 (x 2) = a ((x + ) + 2)3 + a 0 ( 2 3 + 2 (x + 4)2 (x + 2)) +a ( 2 3 2 ((x ) + 4)2 ((x ) + )) + a 2 ( ((x 2) + )3 ) So need to work out the functional form of S 3 (x) for each interval. The whole process is somewhat tedious, but as shown in the next figure, it does give the right result! 7

0.5-4 -3-2 - 2 3 4-0.5 Figure 3.4: Here s the function f(x) = +x 2 sin x again (dotted), and now with the cubic spline interpolation for [ 4, 3] superimposed on top. 8