Boundary value problems: not all explicit solutions are useful

Consider this linear differential equation: {y''(t) + 4y'(t) + 2t y(t) = 7} with boundary conditions {y(0)=1} and {y(1)=0}. Nothing looks particularly scary here. Just one nonconstant coefficient, and it’s a simple one. Entering this problem into Wolfram Alpha produces the following explicit solution:

Wolfram Alpha's explicit solution

I am not sure how anyone could use this formula for any purpose.

Let us see what simple linear algebra can do here. The differential equation can be discretized by placing, for example, {4} equally spaced interior grid points on the interval: {t_k = k/5}, {k=1, \dots, 4}. The as-yet-unknown values of {y} at these points are denoted {y_1,\dots, y_4}. Standard finite-difference formulas provide approximate values of {y'} and {y''}:

{\displaystyle y'(t) \approx \frac{y(t+h) - y(t-h)}{2h}}

{\displaystyle y''(t) \approx \frac{y(t+h) - 2y(t) + y(t-h)}{h^2}}

where {h} is the step size, {1/5} in our case. Stick all this into the equation and we get 4 linear equations, one for each interior point. Namely, at {t_1 = 1/5} it is

{\displaystyle \frac{y_2 - 2y_1 + 1}{(1/5)^2} + 4 \frac{y_2 - 1}{2/5} + 2\cdot \frac15 y_1 = 7 }

(notice how the condition {y(0)=1} is used above), at {t_2 = 2/5} it is

{\displaystyle \frac{y_3 - 2y_2 + y_1}{(1/5)^2} + 4 \frac{y_3 - y_1}{2/5} + 2 \cdot \frac25 y_2 = 7 }

and so on. Clean up this system and put it in matrix form:

{\displaystyle \begin{pmatrix} -49.6 & 35 & 0 & 0 \\ 15 & -49.2 & 35 & 0 \\ 0 & 15 & -48.8 & 35 \\ 0 & 0 & 15 & -48.4 \end{pmatrix} \vec y = \begin{pmatrix} -8 \\ 7 \\ 7 \\ 7 \end{pmatrix} }

This isn’t too hard to solve even with pencil and paper. The solution is

{\displaystyle y = \begin{pmatrix} -0.2859 \\ -0.6337\\ -0.5683\\ -0.3207 \end{pmatrix}}

It can be visualized by plotting the 4 points {(t_k, y_k)}:

Discrete solution

Not particularly impressive, is it? And why are all these negative y-values in a problem with the boundary condition {y(0)=1}? They do not really look like they want to approach {1} at the left end of the interval. But let us go ahead and plot them together with the boundary values, using linear interpolation in between:

Linear algebra + linear interpolation

Or better, use cubic spline interpolation, which only adds another step of linear algebra (see Connecting dots naturally) to our computations.

Same points, cubic spline interpolation
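With SciPy, the spline step is one call to `CubicSpline`. A sketch, using the four computed values together with the two boundary values (I chose the natural boundary conditions, in the spirit of "Connecting dots naturally"):

```python
import numpy as np
from scipy.interpolate import CubicSpline

t = np.linspace(0, 1, 6)                        # 0, 0.2, ..., 1
y = [1, -0.2859, -0.6337, -0.5683, -0.3207, 0]  # boundary values + computed points
spline = CubicSpline(t, y, bc_type='natural')   # natural cubic spline

# the spline passes through all six points exactly
print(float(spline(0.0)), float(spline(0.2)), float(spline(1.0)))
```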

This begins to look believable. For comparison, I used a heavier tool: the BVP solver from SciPy. Its output is the red curve below.

Cubic spline and BVP solver
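A sketch of that reference computation with `scipy.integrate.solve_bvp` (the mesh and the zero initial guess are my choices):

```python
import numpy as np
from scipy.integrate import solve_bvp

# Rewrite y'' + 4y' + 2t*y = 7 as a first-order system:
#   y0' = y1,  y1' = 7 - 4*y1 - 2*t*y0
def rhs(t, y):
    return np.vstack([y[1], 7 - 4 * y[1] - 2 * t * y[0]])

def bc(ya, yb):
    return np.array([ya[0] - 1, yb[0]])   # y(0) = 1, y(1) = 0

t = np.linspace(0, 1, 11)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))
print(sol.status, sol.sol(0.4)[0])        # status 0 means success
```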

Those four points we got from a 4-by-4 system, solvable by hand, pretty much tell the whole story. At any rate, they tell a better story than the explicit solution does.

Graphics made with SciPy and Matplotlib in Google Colab.

IVP vs BVP

A typical engineering-oriented course in ordinary differential equations focuses on solving initial value problems (IVP): first by elementary methods, then with power series (if nobody has updated the syllabus since the 1950s), then with the Laplace transform. One also considers the stability of equilibrium solutions and draws various diagrams to show how other solutions flow around equilibria. And then the semester is over.

And so boundary value problems (BVP) are likely to go unnoticed. Let's compare:

  • IVP answers the question: if I aim at this angle, who will be shot?
  • BVP answers the question: how should I aim to shoot this guy?

Hm, which question is more practical?

  • IVP has a unique solution, as long as the functions involved are not too bad. This was observed already by the Cheshire Cat: you’re sure to get somewhere if you only walk long enough.
  • BVP can easily have no solutions, more than one, or infinitely many. The Scholarpedia article on BVP makes this fact intuitive with the example of the motion of a projectile: there may be one, two, or zero ways to hit the target, depending on how far it is.
From Scholarpedia, http://www.scholarpedia.org/article/Boundary_value_problem
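The projectile picture is easy to make concrete. With launch speed {v}, a target at horizontal distance {d} is hit exactly when {\sin 2\theta = gd/v^2}, which has two, one, or zero solutions {\theta \in (0, \pi/2)} depending on how far the target is. A toy sketch (not taken from the article):

```python
import math

def launch_angles(v, d, g=9.81):
    """Angles theta in (0, pi/2) with range v^2*sin(2*theta)/g == d."""
    s = g * d / v**2                  # need sin(2*theta) = s
    if s > 1:
        return []                     # target out of reach: no solution
    if s == 1:
        return [math.pi / 4]          # exactly at maximum range: one solution
    a = math.asin(s) / 2
    return [a, math.pi / 2 - a]       # low and high trajectory: two solutions

print(len(launch_angles(10, 5)))      # 2: target within range
print(len(launch_angles(10, 20)))     # 0: target too far
```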

The stability is also quite different for IVP and BVP. For example,

  • the IVP {x''=x}, {x(0)=0}, {x'(0)=0} has a unique solution {x(t)\equiv 0}, which is unstable
  • the BVP {x''=x}, {x(0)=0}, {x(1)=0} has a unique solution {x(t)\equiv 0}, which is stable

The instability of the IVP means that a small perturbation of the initial values, such as {\tilde x(0)=\epsilon}, {\tilde x'(0)=0}, produces a solution {\tilde x} that moves away from {x}. Indeed, {\tilde x(t)=\epsilon(e^t+e^{-t})/2} and the difference {|\tilde x-x|} grows exponentially in time. One can also see this instability without solving the equation. Just convert it into a 2×2 system of first-order ODEs and find that the Jacobian matrix {\displaystyle \begin{pmatrix} 0 & 1 \\ 1& 0\end{pmatrix}} has eigenvalues {\pm 1}. The positive eigenvalue makes the equilibrium unstable.
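That eigenvalue claim is a one-line check:

```python
import numpy as np

# x'' = x as the first-order system (x, x')' = J (x, x')
J = np.array([[0, 1],
              [1, 0]])
print(np.sort(np.linalg.eigvals(J)))   # eigenvalues are -1 and 1
```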

But if we perturb the boundary values instead, for example {\tilde x(0)=\epsilon}, {\tilde x(1)=0}, the new solution is {\tilde x(t)=\epsilon(e^{1-t}-e^{t-1})/(e-e^{-1})}. The plot is below; the important thing is that {|\tilde x-x|\le \epsilon}, and this would remain true even if we considered the BVP on the interval {[0,1000]} instead of {[0,1]}.

Perturbed BVP (epsilon=0.1)
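The bound {|\tilde x-x|\le \epsilon} is easy to confirm numerically (here {x\equiv 0}, so {\tilde x} itself must stay within {\epsilon}):

```python
import numpy as np

eps = 0.1
t = np.linspace(0, 1, 1001)
x_tilde = eps * (np.exp(1 - t) - np.exp(t - 1)) / (np.e - 1 / np.e)

print(x_tilde[0], x_tilde[-1])                 # boundary values: epsilon and 0
print(np.abs(x_tilde).max() <= eps + 1e-12)    # True: never exceeds epsilon
```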

The same happens if the boundary derivatives are prescribed instead: for example, {\tilde x'(0)=0.1} and {\tilde x'(1)=-0.2}:

Another perturbed BVP

Notice how the perturbed solution tries to approach the unperturbed {x(t)=0} in the middle of the plot.
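The perturbed solution plotted above can be computed by writing {\tilde x(t) = Ae^t + Be^{-t}} and solving a 2×2 system for the prescribed derivatives. A sketch:

```python
import numpy as np

e = np.e
# x = A*exp(t) + B*exp(-t), so x' = A*exp(t) - B*exp(-t)
# x'(0) = A - B = 0.1,  x'(1) = A*e - B/e = -0.2
M = np.array([[1, -1], [e, -1 / e]])
A, B = np.linalg.solve(M, [0.1, -0.2])

t = np.linspace(0, 1, 1001)
x = A * np.exp(t) + B * np.exp(-t)
print(np.abs(x).argmin() / 1000)   # |x| is smallest in the interior, near t = 0.34
```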

Where does this BVP stability come from? It turns out that the equation {x''=x} is the Euler-Lagrange equation for the energy functional {\displaystyle E(x)=\int_0^1 ((x')^2+x^2)\,dt}. Although the Wikipedia page explains why, let's verify this directly. First, perform the variation by replacing {x} with {x+\epsilon \eta}, where {\eta} is some small “bump” function that vanishes near the endpoints. We get

{\displaystyle E(x+\epsilon \eta)=\int_0^1 ((x'+\epsilon\eta')^2+(x+\epsilon\eta)^2)\,dt}

Expand, take the derivative with respect to {\epsilon}, then set it equal to zero at {\epsilon=0} (if {x} is a critical point, the derivative must be zero). All of this amounts to keeping only the terms in which {\eta} appears to the first power:

{\displaystyle \int_0^1 (x'\eta'+x\eta)\,dt=0}

In the term {x'\eta'} we want to move the derivative from {\eta} onto {x}. That is, integrate by parts (the boundary term vanishes because {\eta} does):

{\displaystyle \int_0^1 (-x''\eta+x\eta)\,dt=0}

Now {\eta} factors out, and since it was arbitrary, the Euler-Lagrange equation is {-x''+x=0}, which is what we had. This explains why the solution was slightly bent toward the {x=0} axis: it tries to minimize the integral of {x^2} while keeping the first derivative small, so as not to have a large {(x')^2} term.
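SymPy automates exactly this variation-and-integration-by-parts computation via `euler_equations`:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
x = sp.Function('x')
L = sp.diff(x(t), t)**2 + x(t)**2   # the integrand of E(x)

(eq,) = euler_equations(L, x(t), t)
print(eq)                           # equivalent to x'' = x
```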

The boundary conditions determine a space {V} of all functions that satisfy them. The energy functional {\displaystyle E(x)=\int_0^1 ((x')^2+x^2)\,dt} is strictly convex on {V}. Therefore it has only one critical point on {V}, which is the global minimum of energy, and hence a stable equilibrium of whatever object is described by the energy {E}.
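This variational picture can also be checked numerically: discretize {E}, fix the boundary values, and minimize over the interior values. A sketch (the grid size and optimizer are my choices); the discrete minimizer reproduces the solution {\epsilon(e^{1-t}-e^{t-1})/(e-e^{-1})} found earlier.

```python
import numpy as np
from scipy.optimize import minimize

eps, n = 0.1, 50
h = 1 / n
t = np.linspace(0, 1, n + 1)

def energy(interior):
    # x with the boundary values x(0) = eps, x(1) = 0 held fixed
    x = np.concatenate(([eps], interior, [0.0]))
    # discrete version of int (x')^2 + x^2 dt
    return np.sum(np.diff(x)**2 / h) + np.sum(x**2) * h

res = minimize(energy, np.zeros(n - 1))
x_min = np.concatenate(([eps], res.x, [0.0]))
exact = eps * (np.exp(1 - t) - np.exp(t - 1)) / (np.e - 1 / np.e)
print(np.abs(x_min - exact).max())   # small: the energy minimizer is the BVP solution
```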