# Between a cubic spline and the line of regression

Given some data points ${(x_i,y_i)}$ one can devise various functions that extract information from them. Such as

A smoothing spline is something in between the above extremes: it insists on neither being a line (i.e., having zero second derivative) nor on passing through given points (i.e., having zero residuals). Instead, it minimizes a weighted sum of both things: the integral of second derivative squared, and the sum of residuals squared. Like this plot, where the red points are given but the spline chooses to interpolate the green ones:

I’ll demonstrate a version of a smoothing spline that might not be exactly canonical, but is very easy to implement in Matlab or Scilab (which I prefer to use). As in Connecting dots naturally, assume the knots ${a=x_0 equally spaced at distance ${h=(b-a)/n}$. The natural cubic spline is determined by the values of its second derivative at ${x_1,\dots,x_{n-1}}$, denoted ${z_1,\dots,z_n}$. As explained earlier, these can be found from the linear system $\displaystyle \frac{h}{6}\begin{pmatrix} 4 & 1 & 0 & 0 & \ldots & 0 & 0 \\ 1 & 4 & 1 & 0 & \ldots & 0 & 0 \\ 0 & 1 & 4 & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \ldots & \ldots \\ 0 & 0 & 0 & 0 & \ldots & 1 & 4 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_{n-1}\end{pmatrix} = - \frac{1}{h} \begin{pmatrix} \Delta^2 y_1 \\ \Delta^2 y_2 \\ \vdots \\ \Delta^2 y_{n-1}\end{pmatrix}$

where the column on the right contains the amounts by which the derivative of the piecewise linear interpolant through given points jumps at every knot. The notation ${\Delta^2y }$ means the second difference of ${(y_i)}$, for example ${\Delta^2y_1 =y_2-2y_1+y_0}$.

A smoothing spline is also a natural spline, but for a different set of points ${(x_i,\tilde y_i)}$. One has to find ${\tilde y_i}$ that minimize a weighted sum of ${\sum_i (\tilde y_i-y_i)^2}$ and of ${\int_a^b (f''(x))^2\,dx}$. The latter integral is easily computed in terms of ${z_i}$: it is equal to ${\frac{h}{3}\sum_{i=1}^{n}(z_i^2+z_{i-1}^2+z_iz_{i-1})}$. Since this quadratic form is comparable to ${\sum_{i=1}^{n}z_i^2}$, I’ll work with the latter instead.

The idea is to set up an underdetermined system for ${z_i}$ and ${\tilde y_i-y_i}$, and let Scilab find a least squares solution of that. Let’s introduce a weight parameter ${\lambda>0}$ that will determine the relative importance of curvature vs residuals. It is convenient to let ${\tilde y_i=y_i+\lambda h^2 \delta_i}$, so that both ${\delta_i}$ and ${z_i}$ (second derivative) scale the same way. The terms ${\delta_i}$ contributes to the linear system for ${z_i}$, since the right hand side now has ${\tilde y_i}$ instead of ${y_i}$. This contribution is ${- \lambda h \Delta^2 \delta}$. Moving it to the left hand-side (since ${\delta_i}$ are unknowns) we obtain the following system. $\displaystyle \begin{pmatrix} A & | & B \end{pmatrix} \begin{pmatrix} z \\ \delta \end{pmatrix} = - \frac{1}{h} \Delta^2 y$

where ${A}$ is the same tridiagonal matrix as above, and ${B}$ is the rectangular Laplacian-type matrix $\displaystyle B = \frac{\lambda}{h} \begin{pmatrix} -1 & 2 & -1 & 0 & \ldots & 0 & 0 \\ 0 & -1 & 2 & -1 & \ldots & 0 & 0 \\ 0 & 0 & -1 & 2 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \ldots & \ldots \\ 0 & 0 & 0 & 0 & \ldots & 2 & -1 \end{pmatrix}$

All in all, the system has ${2n }$ unknowns ${z_1,\dots,z_{n-1},\delta_0,\dots,\delta_n}$, and ${(n-1)}$ equations, reflecting the continuity of first derivative at each interior knot. The lsq command of Scilab finds the solution with the least sum of squares of the unknowns, which is what we are after.

Time for some examples. ${\lambda=0}$ and ${\lambda=0.05}$ can be seen above. Here are more:

As ${\lambda}$ increases, the spline approaches the regression line:

Finally, the Scilab code. It is almost the same as for natural spline; the difference is in five lines from B=... to newy=... The code after that is merely evaluation and plotting of the spline.

a = 0; b = 10
y = [3 1 4 1 5 9 2 6 5 3 5]
lambda = 0.1

n = length(y)-1
h = (b-a)/n
jumps = diff(y,2)/h
A = (h/6)*(diag(4*ones(1,n-1))+diag(ones(1,n-2),1)+diag(ones(1,n-2),-1))

B = lambda/h*(diag(-2*ones(1,n+1))+diag(ones(1,n),1)+diag(ones(1,n),-1))
C = [A, B(2:$-1,:)] sol = lsq(C,-jumps')' z = [0,sol(1:n-1),0] newy = y + lambda*h^2*sol(n:$)

allx = []
spl = []
for i=1:n
xL = a+h*(i-1)
xR = a+h*i
x = linspace(xL,xR,100)
linear = newy(i)*(xR-x)/h + newy(i+1)*(x-xL)/h
correction = ((z(i+1)+2*z(i))*(xR-x)+(2*z(i+1)+z(i))*(x-xL)).*(xR-x).*(x-xL)/(6*h)
allx = [allx, x]
spl = [spl, linear+correction]
end

plot(allx, spl)
plot(a:h:b, newy, 'g*')
plot(a:h:b, y, 'r*')

This site uses Akismet to reduce spam. Learn how your comment data is processed.