# B-splines and probability

If one picks two real numbers ${X_1,X_2}$ from the interval ${[0,1]}$ (independent, uniformly distributed), their sum ${S_2=X_1+X_2}$ has the triangular distribution.

The sum ${S_3}$ of three such numbers has a differentiable probability density function:

And the density of ${S_4=X_1+X_2+X_3+X_4}$ is smoother still: the p.d.f. has two continuous derivatives.

As the number of summands increases, these distributions converge to the normal distribution if they are translated and scaled properly. But I am not going to do that. Let’s keep the number of summands to four at most.

The p.d.f. of ${S_n}$ is a piecewise polynomial of degree ${n-1}$. Indeed, for ${S_1=X_1}$ the density is piecewise constant, and the formula (in which ${S_n}$ also denotes the density of the random variable ${S_n}$)

$\displaystyle S_n(x) = \int_{x-1}^x S_{n-1}(t)\,dt$

provides the inductive step.
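To make the recurrence concrete, here is a minimal Python sketch. It assumes the standard closed form for the density of a sum of uniform variables (the Irwin-Hall distribution), which the post itself does not use. The names `density` and `density_by_recurrence` are mine. The second function approximates the integral in the recurrence by the midpoint rule, so the two printed columns should agree closely.

```python
from math import comb, factorial, floor

def density(n, x):
    """Density of the sum of n independent uniform [0, 1] variables."""
    if x < 0 or x >= n:
        return 0.0
    # Closed form (Irwin-Hall): alternating sum of shifted powers of degree n - 1.
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def density_by_recurrence(n, x, steps=1000):
    """Midpoint-rule approximation of the integral of density(n - 1, t) over [x - 1, x]."""
    h = 1.0 / steps
    return h * sum(density(n - 1, x - 1 + (i + 0.5) * h) for i in range(steps))

# Compare both computations at the peak x = n / 2.
for n in (2, 3, 4):
    print(n, density(n, n / 2), density_by_recurrence(n, n / 2))
```

The peak values are 1, 3/4 and 2/3 for ${n=2,3,4}$, which matches the pictures getting flatter as they get smoother.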

For each ${n}$, the integer translates of the function ${S_n}$ form a partition of unity:

$\displaystyle \sum_{k\in\mathbb Z}S_n(x-k)\equiv 1$

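The integral recurrence relation gives an easy proof of this, because the intervals ${[x-k-1,x-k]}$, ${k\in\mathbb Z}$, tile the real line:

$\displaystyle \sum_{k\in\mathbb Z}S_n(x-k) = \sum_{k\in\mathbb Z}\int_{x-k-1}^{x-k} S_{n-1}(t)\,dt = \int_{\mathbb R} S_{n-1}(t)\,dt = 1$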
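The identity is easy to check numerically. The sketch below again assumes the closed-form Irwin-Hall density, with `density` and `sum_of_translates` being names of my own choosing. It adds up only the finitely many translates that can be nonzero at a given point.

```python
from math import comb, factorial, floor

def density(n, x):
    """Density of the sum of n independent uniform [0, 1] variables."""
    if x < 0 or x >= n:
        return 0.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def sum_of_translates(n, x):
    """Sum of density(n, x - k) over the integers k where the term can be nonzero."""
    m = floor(x)
    # density(n, x - k) vanishes unless 0 <= x - k < n, that is, m - n < k <= m.
    return sum(density(n, x - k) for k in range(m - n, m + 1))

for n in (1, 2, 3, 4):
    print(n, [sum_of_translates(n, x) for x in (-2.3, 0.25, 3.9)])
```

Every printed value should equal 1 up to rounding error.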

And here is the picture for the quadratic case:

A partition of unity can be used to approximate functions by piecewise polynomials: just multiply each partition element by the value of the function at the center of the corresponding interval, and add the results.

Doing this with ${S_2}$ amounts to piecewise linear interpolation: the original function ${f(x)=e^{-x/2}}$ is in blue, the weighted sum of hat functions in red.

With ${S_4}$ we get a smooth curve.
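Here is a minimal sketch of this approximation on the unit-spaced grid, assuming the closed-form Irwin-Hall density as a stand-in for ${S_n}$. The function names `density` and `approximate` are hypothetical. The translate ${S_n(x-k)}$ is supported on ${[k,k+n]}$, so the function is sampled at the center ${k+n/2}$.

```python
from math import comb, exp, factorial, floor

def density(n, x):
    """Density of the sum of n independent uniform [0, 1] variables."""
    if x < 0 or x >= n:
        return 0.0
    total = sum((-1) ** k * comb(n, k) * (x - k) ** (n - 1)
                for k in range(floor(x) + 1))
    return total / factorial(n - 1)

def approximate(f, n, x):
    """Weighted sum of integer translates of density(n, .), weights sampled at support centers."""
    m = floor(x)
    # The translate density(n, x - k) lives on [k, k + n], whose center is k + n / 2.
    return sum(f(k + n / 2) * density(n, x - k) for k in range(m - n, m + 1))

def f(x):
    return exp(-x / 2)

# Columns: x, f(x), approximation with S_2, approximation with S_4.
for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, f(x), approximate(f, 2, x), approximate(f, 4, x))
```

With `n = 2` the sum reproduces ${f}$ at the integers and is linear in between. With `n = 4` the value at an integer ${j}$ is the weighted average ${(f(j-1)+4f(j)+f(j+1))/6}$, so the curve stays close to the data without passing through it.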

Unlike interpolating splines, this curve does not attempt to pass through the given points exactly. However, it has several advantages over interpolating splines:

• It is easier to calculate: there is no linear system to solve;
• It yields a positive function for positive data;
• It yields a monotone function for monotone data.
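
A short justification of the last two properties, as a sketch in notation of my own: write the approximation as ${g(x)=\sum_k c_k S_n(x-k)}$, where ${c_k}$ are the sampled values. Positivity is immediate, since every ${S_n}$ is nonnegative and the translates sum to 1, so ${g(x)}$ is a weighted average of the ${c_k}$. For monotonicity, differentiate the integral recurrence to get

$\displaystyle S_n'(x) = S_{n-1}(x) - S_{n-1}(x-1)$

and then shift the index in the second sum:

$\displaystyle g'(x) = \sum_{k\in\mathbb Z} c_k \bigl(S_{n-1}(x-k) - S_{n-1}(x-k-1)\bigr) = \sum_{k\in\mathbb Z} (c_k - c_{k-1})\, S_{n-1}(x-k)$

If the data ${c_k}$ are nondecreasing, every term on the right is nonnegative, hence ${g'\ge 0}$. For ${n=2}$ the derivative exists only away from the integers, which is enough.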
