Green envelopes

Green functions of the Laplacian are bounded in one-dimensional domains, a reflection of the fact that square integrability of the derivative implies boundedness in one dimension but not in two or more. So it’s possible to say something about the envelope {\sup_y g(x,y)} obtained by taking the supremum over the source location {y}.

In order to satisfy the distributional equation {-\frac{d^2}{dx^2}g(x,y) = \delta_{x=y}}, Green functions have a corner at y, where the derivative jumps down by 1. Combining this property with the boundary conditions identifies the function. Let’s use the interval (0,1) as the domain.

The Dirichlet boundary condition {g(0,y) = 0 = g(1,y)} leads to the formula {g(x,y)=\min(x,y)-xy}:


The upper envelope is the parabola {x(1-x)}. This fact becomes more evident if we use more functions.

Dirichlet envelope
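This is easy to check numerically. Here is a minimal numpy sketch (the grid sizes are arbitrary) comparing the discrete envelope with {x(1-x)}:

import numpy as np

# Dirichlet Green function on (0,1): g(x,y) = min(x,y) - x*y
x = np.linspace(0, 1, 501)
ys = np.linspace(0.001, 0.999, 999)            # source locations
G = np.minimum.outer(x, ys) - np.outer(x, ys)  # G[i,j] = g(x[i], ys[j])
envelope = G.max(axis=1)                       # sup over y for each x

print(np.abs(envelope - x*(1-x)).max())        # small, limited by the y-grid spacing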

Mixed Dirichlet-Neumann conditions {g(0,y) = 0 = g_x'(1,y)} yield an even simpler formula {g(x,y) = \min(x,y)}. The envelope is boring in this case:

Dirichlet-Neumann envelope

The pure Neumann condition {g_x'(0,y) = 0 = g_x'(1,y)} requires an adjustment to the definition of Green function, since it is incompatible with {-\frac{d^2}{dx^2}g(x,y) = \delta_{x=y}}. Instead, let’s ask for {-\frac{d^2}{dx^2}g(x,y) = \delta_{x=y} - c} where the constant c is the reciprocal of the length of the interval. This ensures that the integral of the second derivative is zero. Also, it is more convenient to use the interval (-1,1) here. The Neumann Green function for this interval is {g(x,y) = (x^2+y^2)/4 - |x-y|/2}, up to an additive constant.

Neumann envelope

So the envelope is also determined up to a constant. In the picture above it is {x^2/2}: for fixed {x}, the function {y\mapsto g(x,y)} increases on {(-1,x)} and decreases on {(x,1)}, so the supremum is {g(x,x)=x^2/2}.

Finally, let’s consider the periodic conditions {g(-1,y) = g(1,y)}, {g_x'(-1,y) = g_x'(1,y)}, again on the interval (-1,1). As with the Neumann condition, the definition is adjusted so that the integral of the second derivative is zero. We get {g(x,y) = (1+x\wedge y)(1-x\vee y)/2 + (x^2+y^2)/4} using the notation {x\wedge y = \min(x,y)} and {x\vee y = \max(x,y)} for brevity. Note that the first term, {(1+x\wedge y)(1-x\vee y)/2}, is the Dirichlet Green function for this interval. The additional quadratic term brings the integral of the second derivative to zero, which ensures the periodic condition for the first derivative. And the upper envelope is, up to a constant…

Periodic envelope

a constant. Not exciting, but it makes perfect sense, considering that we are dealing with the Green function of a circle, where moving the source amounts to shifting the function. Indeed, the supremum over {y} is attained at {y=x}, where {g(x,x) = (1-x^2)/2 + x^2/2 = 1/2}.

Oscillatory explosion

Nonlinear differential equations don’t necessarily have globally defined solutions, even if the equation is autonomous (no time is involved explicitly). The simplest example is {y'=y^2} with initial condition {y(0)=1}: the solution is {y(t) = 1/(1-t)}, which ceases to exist at {t=1}, escaping to infinity.


This kind of behavior can often be demonstrated without solving the ODE. Consider {y''=y^3} with {y(0)=1}, {y'(0)=0}. I’ve no idea what the explicit solution is, but it’s clear that the quantity {E(t) = \frac12 (y')^2 - \frac14 y^4} remains constant: indeed, {E'(t) = y''y' - y^3 y' = 0}. (Here, {E} is analogous to the sum of kinetic and potential energy in physics.)

The solution {y} is increasing and convex. Let {t_n} be the time at which {y(t_n)=n}; in particular, {t_1=0}. The invariance of {E} yields {y'(t_n) = \sqrt{(n^4-1)/2} \ge 2^{-1/2} (n^2 - 1)}. Since {y'} is increasing, the mean value theorem gives {t_{n+1}-t_n \le 1/y'(t_n) \le \sqrt{2}(n^2-1)^{-1}} for {n=2,3,\dots}. Since the series on the right converges, the limit {T = \lim_{n\rightarrow\infty }t_n} is finite; this is the blow-up time.
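In fact, separation of variables gives the blow-up time exactly: since {\frac12 (y')^2 - \frac14 y^4 \equiv -\frac14}, we have {dt = \sqrt{2}\,(y^4-1)^{-1/2}\,dy}, so {T = \int_1^\infty \sqrt{2}\,(y^4-1)^{-1/2}\,dy}. A rough numerical evaluation (a scipy sketch; the integral is split at {y=2} only to keep the endpoint singularity away from the infinite tail):

import numpy as np
from scipy.integrate import quad

# T = integral from 1 to infinity of sqrt(2)/sqrt(y^4 - 1) dy
f = lambda y: np.sqrt(2.0/(y**4 - 1.0))
T = quad(f, 1, 2)[0] + quad(f, 2, np.inf)[0]
print(T)   # approximately 1.85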


But no blow-up occurs for the equation {y''=-y^3}, where the nonlinear term pushes back toward equilibrium. Indeed, the invariant energy is now {E= \frac12 (y')^2 + \frac14 y^4}, which implies that {y} and {y'} stay bounded. In the phase space {(y,y')} the solution with initial values {y(0)=1}, {y'(0)=0} traces out this curve:

Closed trajectory in the phase space (y,y')

Accordingly, the solution is a periodic function of {t} (although this is not a trigonometric function):

Periodic solution, globally defined

Everything so far has been pretty straightforward. But here is a stranger animal: {y'''=-y^3}. As in the previous example, the nonlinear term pushes toward equilibrium. Using initial conditions {y(0)=1}, {y'(0)=y''(0)=0}, I get this numerical solution up to time {T=5}:

0<t<5

As in the previous example, {y} oscillates. But the speed and amplitude of oscillation are increasing.

0<t<5.4
0<t<5.7

Rapidly increasing:

0<t<5.9
0<t<6

In the plane {(y,y')} the solution spirals out:

The plane (y,y')

The plots make it clear that the solution ceases to exist in finite time, but I don’t have a proof. The issue is that the function {y} does not escape to infinity in just one direction. Indeed, if {y(t_0)>0}, then regardless of the values of {y'} and {y''} at {t_0}, the negative third derivative will eventually make the function {y} decrease, and so {y} will attain the value {0} at some {t_1>t_0}. After that, the third derivative is positive, guaranteeing the existence of time {t_2>t_1} when {y} returns to {0} again. I haven’t succeeded in proving that the limit {\lim t_n} is finite, giving the time when oscillatory explosion occurs.


The plots were made in SageMathCloud:

var('t y y1 y2')
# y''' = -y^3 written as a first-order system: y' = y1, y1' = y2, y2' = -y^3
P = desolve_system_rk4([y1,y2,-y^3],[y,y1,y2], ics=[0,1,0,0],ivar=t,end_points=5,step=0.01)
# keep only the (t, y) pairs from each record [t, y, y1, y2]
Q = [[i,j] for i,j,k,l in P]
list_plot(Q, plotjoined=True)
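For readers without Sage, here is a rough scipy equivalent (a sketch; the step control is arbitrary):

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

# y''' = -y^3 written as a first-order system in (y, y', y'')
def f(t, Y):
    y, y1, y2 = Y
    return [y1, y2, -y**3]

sol = solve_ivp(f, (0, 5), [1, 0, 0], max_step=0.01)
plt.plot(sol.t, sol.y[0])
plt.show()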

Unreasonable effectiveness of Runge-Kutta 2

I used to give this example of an initial-value problem whenever numerical solutions of ODE came up in class:

\displaystyle    y' = -y^2, \quad y(0)=1

The exact solution is {y(t)=1/(1+t)}. Using Euler’s method with step {h=1} (admittedly large) we get, starting with {y_0=1}:

\displaystyle    y_1 = y_0 + h (-y_0^2) = 0,\quad y_2 = y_1 + h (-y_1^2) = 0, \dots

The numerical solution drops down to zero at once and stays there. Not good at all.

Euler's method

The trapezoidal version of RK2 (also known as improved Euler, or Heun’s method) treats the above value of {y_1} merely as prediction {\tilde y_1}, and computes the actual {y_1} using the average of the slopes at {y_0} and {\tilde y_1}:

\displaystyle    y_1 = y_0 + \frac{h}{2} (-y_0^2 - \tilde y_1^2) = 1 + \frac12 (-1 - 0 ) = \frac12

… which is right on target. Let’s do it again: the prediction {\tilde y_2 = y_1 + h (-y_1^2) = 1/4} and correction

\displaystyle    y_2 = y_1 + \frac{h}{2} (-y_1^2 - \tilde y_2^2) = \frac12 + \frac12 \left(-\frac{1}{4}-\frac{1}{16} \right) =    \frac{11}{32}

… which is {\approx 0.34}, compared to the exact solution {1/3}. Nice!

Trapezoidal RK2

I like the example, because all numbers are manageable with mental arithmetic for two steps. And it’s clear that {11/32} is pretty close to {1/3 = 11/33}.


But the fact that the relative error stays below {3.7\%} throughout the computation despite the large step size {h=1} seems… too good to be true. Consider that for the exact solution, the expression {y^2} can differ by a factor of {4} between two {t}-values at distance {1}.

The midpoint version of RK2, which generally performs about as well as the trapezoidal version, is nowhere near as close in this example. Indeed, using the same data as above, we get

\displaystyle    y_1 = y_0 - h \left(\frac{y_0+\tilde y_1}{2}\right)^2 = \frac{3}{4}

and so forth: the relative error reaches {59\%} at the second step. That’s 16 times worse than the trapezoidal version.

Midpoint RK2
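The calculations above are easy to reproduce. Here is a minimal Python sketch of the three methods with step {h=1} (not a general-purpose solver; the two printed steps match the hand computations):

f = lambda y: -y**2            # right-hand side of y' = -y^2
h = 1.0

def euler_step(y):
    return y + h*f(y)

def trapezoidal_step(y):       # improved Euler / Heun
    pred = y + h*f(y)
    return y + (h/2)*(f(y) + f(pred))

def midpoint_step(y):
    return y + h*f(y + (h/2)*f(y))

y_eu = y_tr = y_mi = 1.0
for n in range(1, 3):
    y_eu, y_tr, y_mi = euler_step(y_eu), trapezoidal_step(y_tr), midpoint_step(y_mi)
    print(n, y_eu, y_tr, y_mi, 1/(1 + n))   # exact value 1/(1+t) for comparison
# step 1: 0.0, 0.5, 0.75, exact 0.5
# step 2: 0.0, 0.34375 (= 11/32), 0.5303, exact 0.3333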

What happens here has less to do with numerical analysis than with the algebra of rational functions. Using {h=1} in trapezoidal RK2, we are in effect iterating the function

\displaystyle \varphi(y) = y + \frac12 (-y^2-(y-y^2)^2) = y-y^2+y^3-\frac12 y^4

The exact solution would be obtained by iterating

\displaystyle \Phi (y) = \frac{y}{1+y} = y-y^2+ y^3 - y^4 + \dots

The two functions just happen to coincide at {y=1}, which is our starting point here.

Two iterands

From there we get to {0.5}, and on {[0,0.5]} the two are really close anyway, due to another coincidence: the truncation error {|\varphi(y)-\Phi(y)|} is {O(y^4)} instead of the {O(y^3)} one would normally expect from a second-order method.

The midpoint method with {h=1} amounts to iterating

\displaystyle \psi(y) = y - \left(\frac{y+y-y^2}{2} \right)^2 = y-y^2+y^3-\frac14 y^4

which is not substantially further away from {\Phi}, but does not enjoy the lucky coincidence at {y=1}.
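These expansions, and the coincidence at {y=1}, can be checked with sympy (a quick sketch):

import sympy as sp

y = sp.symbols('y')
phi = y + sp.Rational(1, 2)*(-y**2 - (y - y**2)**2)   # trapezoidal RK2 with h = 1
psi = y - ((y + (y - y**2))/2)**2                     # midpoint RK2 with h = 1
Phi = y/(1 + y)                                       # exact time-1 map

print(sp.expand(phi))                   # y - y^2 + y^3 - y^4/2
print(sp.expand(psi))                   # y - y^2 + y^3 - y^4/4
print(sp.series(Phi, y, 0, 5))          # y - y^2 + y^3 - y^4 + O(y^5)
print(phi.subs(y, 1), psi.subs(y, 1), Phi.subs(y, 1))  # 1/2, 3/4, 1/2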


The tables are turned with the initial value {y_0=3}. The exact solution is {y(t) = 3/(1+3t)}, which drops sharply from {t=0} to {t=1}; its slope decreases by a factor of {16} during one step. Still, midpoint RK2 does a decent job with {h=1}:

Midpoint RK2 starting with y=3

while trapezoidal RK2 outputs {y_1=-19.5}, {y_2=-80110}, {y_3 = -2\cdot 10^{19}} and promptly overflows.
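These values are easy to reproduce with the same predictor-corrector formula (a small Python sketch):

f = lambda y: -y**2
y = 3.0
for n in range(1, 4):
    pred = y + f(y)                   # Euler predictor with h = 1
    y = y + 0.5*(f(y) + f(pred))      # trapezoidal corrector
    print(n, y)
# 1: -19.5   2: about -80110   3: about -2e19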


With a reasonable step size, like {h=0.1}, normality is restored: both methods perform about equally well, with {0.13\%} error for trapezoidal and {0.21\%} for midpoint.

Nested powers and collisions: final chapter

This is where the story of nested powers of metric spaces and of collision-inducing ODE reaches its logical conclusion.

Theorem 1. For every {d\ge 1} and every {n\ge 2} there exists a Lipschitz retraction {r\colon {\mathbb R}^d(n)\rightarrow {\mathbb R}^d(n-1)}.

Compared to previous posts, I changed notation to improve readability: {X(n)} now stands for the set of all nonempty subsets of {X} with at most {n} elements. The metric on {X(n)} is the Hausdorff metric {d_H}. The map {r} was defined in Collisions II as follows:

Given a set {A\in {\mathbb R}^d(n)\setminus {\mathbb R}^d(n-1)}, enumerate its elements {a_1,\dots,a_n} and consider the initial-value problem

\displaystyle    \dot{x}_i = \sum_{j\ne i}\frac{x_j-x_i}{|x_j-x_i|}, \quad x_i(0)=a_i, \quad i=1,\dots,n

Define {r(A) = \{x_i(t_c)\}}, where {t_c} is the time of the first collision between particles in this system. Also define {r(A)=A} when {A\in{\mathbb R}^d(n-1)}.

I claim that

\displaystyle    d_H(r(A),r(B))\le (2n-1)\sqrt{n} \,d_H(A,B)   \ \ \ \ \ (1)


for all {A,B\in{\mathbb R}^d(n)}. Interestingly, the Lipschitz constant does not depend on the dimension of ambient space. (Well, I’m not so sure it has to depend on cardinality either.)

Proof. Let

\displaystyle \delta(A)=\begin{cases} \min_{i\ne j}|a_i-a_j|,\quad  A\in{\mathbb R}^d(n)\setminus {\mathbb R}^d(n-1) \\ 0,\quad \quad A\in \mathbb R^d(n-1) \end{cases}

It is easy to check that the function {A\mapsto \delta(A)} is {2}-Lipschitz on {{\mathbb R}^d(n)}.

Earlier I proved that { t_c\le \dfrac{1}{2}\delta(A)} and, as a consequence,

\displaystyle    d_H(r(A),A)\le \frac{n-1}{2} \delta(A)   \ \ \ \ \ (2)

Let {\rho=d_H(A,B)}. If {\delta(A)+\delta(B)\le 4\,\rho}, then (2) already gives (1). From now on assume (swapping {A} and {B} if necessary) that {\delta(A)> 2\,\rho}; since {\delta} is {2}-Lipschitz, {\delta(B)>0} as well. The geometric meaning of these inequalities is that the points of {A} are spread out in space, yet each of them is close to a point of {B}. Therefore, we can enumerate them {a_i} and {b_i} in such a way that {|a_i-b_i|\le \rho} for all {i}.

Let {(x_i)} and {(y_i)} be the solutions of our ODE with initial data {(a_i)} and {(b_i)}, respectively. Here comes a monotonicity lemma.

Lemma 2. {\sum_{i=1}^n |x_i-y_i|^2} is nonincreasing with time {t}.

Proof: Encode the sequence {(x_i)} by a single point {\mathbf{x}\in {\mathbb R}^{nd}}. This point moves by the negative gradient flow of the convex function {\Phi(\mathbf{x}) = \sum_{i<j}|x_i-x_j|}. Since the gradient of a convex function is monotone, we have {\langle {\mathbf{ \dot x}}-{\mathbf{\dot y}}, \mathbf{x}-\mathbf{y}\rangle\le 0}. Since {\frac{d}{dt}|\mathbf{x}-\mathbf{y}|^2 = 2\langle \mathbf{\dot x}-\mathbf{\dot y}, \mathbf{x}-\mathbf{y}\rangle}, it follows that {|\mathbf{x}-\mathbf{y}|^2} is nonincreasing. \Box

By Lemma 2, {|x_i-y_i|\le \sqrt{n}\,\rho } holds for all {i} until a collision occurs in either system. We may assume that the {x}-particles collide first. Then, at the time of their collision they form the set {r(A)} while the {y}-particles form some set {B'}, with {d_H(r(A),B')\le \sqrt{n}\,\rho}. Since r(A) has fewer than n elements, {\delta(B')\le 2d_H(r(A),B')}. From (2) we obtain {d_H(r(B'),B')\le 2(n-1)\sqrt{n}\,\rho}. Note that {r(B')=r(B)} because our ODE system is autonomous. The final estimate is {d_H(r(A),r(B))\le d_H(r(A),B')+d_H(r(B'),B')\le (2n-1)\sqrt{n}\,\rho}, which proves (1). {\Box}
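The key inequality {\langle \mathbf{\dot x}-\mathbf{\dot y}, \mathbf{x}-\mathbf{y}\rangle \le 0} is also easy to test numerically. A minimal numpy sketch (random configurations; this is only an illustration, separate from the Scilab routine below):

import numpy as np

rng = np.random.default_rng(1)

def velocity(P):
    # \dot p_i = sum over j != i of (p_j - p_i)/|p_j - p_i|
    V = np.zeros_like(P)
    for i in range(len(P)):
        diff = P - P[i]
        dist = np.linalg.norm(diff, axis=1)
        mask = dist > 1e-12            # skip the i = j term (and exact collisions)
        V[i] = (diff[mask] / dist[mask, None]).sum(axis=0)
    return V

# monotonicity of the (negative) gradient of a convex function
for _ in range(1000):
    X, Y = rng.random((6, 3)), rng.random((6, 3))   # two configurations of 6 points in R^3
    assert np.sum((velocity(X) - velocity(Y)) * (X - Y)) <= 1e-9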

Here is how a 20-point set gets retracted onto {{\mathbb R}^2(19)}:

20 particles, stopped

The plot was produced by a version of the familiar Scilab routine, modified to stop the computation at the first collision. It was called with

gravity3(grand(1,20,"unf",0,2),grand(1,20,"def"),300)
function gravity3(vx,vy,Nsteps)
    // Moves the particles by Euler steps of the unit-attraction ODE and
    // stops at the first collision (pairwise distance below tolerance).
    Npoints = length(vx);
    h = 0.01/Npoints;                  // step size
    tolerance = h*(Npoints-1)/3;       // collision threshold
    collided = 0;
    x = zeros(Nsteps,Npoints);
    y = zeros(Nsteps,Npoints);
    x(1,:) = vx;
    y(1,:) = vy;
    m = 1;
    while (m<Nsteps & collided==0) do
        x(m+1,:)= x(m,:);
        y(m+1,:)= y(m,:);
        for i = 1:Npoints
            for j = (i+1):Npoints
                v = [x(m,j),y(m,j)]-[x(m,i),y(m,i)];
                if norm(v) < tolerance then 
                    collided=1; 
                else
                    // move particles i and j toward each other along the unit vector
                    x(m+1,i) = x(m+1,i) + h*v(1)/norm(v);
                    x(m+1,j) = x(m+1,j) - h*v(1)/norm(v);
                    y(m+1,i) = y(m+1,i) + h*v(2)/norm(v);
                    y(m+1,j) = y(m+1,j) - h*v(2)/norm(v);
                end
            end
        end
        m = m+1;
    end
    clf();
    plot(x(1:m,:),y(1:m,:),'o');       // plot only the rows that were computed
endfunction

Clustering by collisions

This construction looks like it should answer the question raised at the end of the previous post, but instead of checking whether it really works, I decided to post some pictures.

Given {n} distinct points {p_1,\dots,p_n} in a Euclidean space {{\mathbb R}^d}, consider the ODE system

\displaystyle    \dot{x}_i = \sum_{j\ne i}\frac{x_j-x_i}{|x_j-x_i|},\quad i=1,\dots,n

with the initial condition {x_i(0)=p_i}. The solution exists and is well-behaved until the first collision: {x_i=x_j} for some {i\ne j}. We can stop the process then (thus obtaining a set of at most {n-1} points) or remove the duplicates and restart. Let’s try the latter approach. To simplify things (at a cost), we can write the system as follows:

\displaystyle    \dot{x}_i = \sum_{x_j\ne x_i}\frac{x_j-x_i}{|x_j-x_i|},\quad i=1,\dots,n.

Now the solution exists for all times, but becomes stationary when all points come together. The cost to which I alluded above is that keeping multiple copies of collided points distorts the post-collision behavior, but I hope that the qualitative picture survives.

I used Scilab to plot the trajectories of this system for several initial configurations. For purposes of numeric computation, the ODE was modified once again:

\displaystyle    \dot{x}_i = \sum_{|x_j- x_i|>\epsilon }\frac{x_j-x_i}{|x_j-x_i|},\quad i=1,\dots,n.

where {\epsilon} is a tolerance related to the step size used by the solver (Euler’s method works fine).

Three points
Four points: vertices of a rectangle
Four points: vertices of a long rectangle
Five points
Six points

In each case the particles obviously want to collide: points that start out close together move toward each other. Indeed, suppose that {p_1} and {p_2} are much closer to each other than to the other points. The ODE system yields

\displaystyle    \frac{d}{dt}(x_1-x_2) = -2\,\frac{x_1-x_2}{|x_1-x_2|} + \text{(small terms)}

where the smallness is due to the near cancellation of {\displaystyle \frac{x_j-x_1}{|x_j-x_1|}} and {\displaystyle \frac{x_j-x_2}{|x_j-x_2|}} for {j\ne 1,2}. As long as the sum of small terms is bounded by a constant less than {2}, the distance {|x_1-x_2|} will decrease at a linear rate, heading toward zero.

Here is the (hastily written) Scilab function. It takes three parameters: the vector of x-coordinates, the vector of y-coordinates, and the number of steps. I use 100 for the latter, increasing it if the points do not have enough time to collide. For example, try gravity([0,1,2,3],[2,3,0,0.5],100).


function gravity(vx,vy,Nsteps)
    // Euler's method for dx_i/dt = sum over {|x_j-x_i| > tolerance} of (x_j-x_i)/|x_j-x_i|
    h = 0.01;                          // step size
    Npoints = length(vx);
    tolerance = h*(Npoints-1)/2;       // treat closer points as already collided
    x = zeros(Nsteps,Npoints);
    y = zeros(Nsteps,Npoints);
    x(1,:) = vx;
    y(1,:) = vy;
    for m = 1:(Nsteps-1)
        x(m+1,:)= x(m,:);
        y(m+1,:)= y(m,:);
        for i = 1:Npoints
            for j = 1:Npoints
                v = [x(m,j),y(m,j)]-[x(m,i),y(m,i)];
                if norm(v) > tolerance then 
                    // move x_i by h in the direction of x_j
                    x(m+1,i) = x(m+1,i) + h*v(1)/norm(v);
                    y(m+1,i) = y(m+1,i) + h*v(2)/norm(v);
                end
            end
        end
    end
    clf();
    plot(x(:,:),y(:,:));
endfunction

IVP vs BVP

A typical engineering-oriented course in ordinary differential equations focuses on solving initial value problems (IVP): first by elementary methods, then with power series (if nobody has updated the syllabus since the 1950s), then with the Laplace transform. One also considers the stability of equilibrium solutions and draws various diagrams to show how other solutions flow around equilibria. And then the semester is over.

And so boundary value problems (BVP) are likely to go unnoticed. Let’s compare:

  • IVP answers the question: if I aim at this angle, who will be shot?
  • BVP answers the question: how should I aim to shoot this guy?

Hm, which question is more practical?

  • IVP has a unique solution, as long as the functions involved are not too bad. This was observed already by the Cheshire Cat: you’re sure to get somewhere if you only walk long enough.
  • BVP can easily have no solutions, more than one, or infinitely many. The Scholarpedia article on BVP makes this fact intuitive with the example of the motion of a projectile: there may be one, two, or zero ways to hit the target, depending on how far it is.
From Scholarpedia, http://www.scholarpedia.org/article/Boundary_value_problem

The stability is also quite different for IVP and BVP. For example,

  • the IVP x''=x, x(0)=0, x'(0)=0 has a unique solution x(t)\equiv 0, which is unstable
  • the BVP x''=x, x(0)=0, x(1)=0 has a unique solution x(t)\equiv 0, which is stable

The instability of IVP means that a small perturbation of the initial values such as \tilde x(0)=\epsilon, \tilde x'(0)=0, will produce a solution \tilde x that moves away from x. Indeed, \tilde x(t)=\epsilon(e^t+e^{-t})/2 and the difference |\tilde x-x| grows exponentially in time. One can also see this instability without solving the equation. Just convert it into a 2×2 system of 1st order ODE and find that the Jacobian matrix \displaystyle \begin{pmatrix} 0 & 1 \\ 1& 0\end{pmatrix} has eigenvalues \pm 1. The positive eigenvalue makes the equilibrium unstable.

But if we perturb the boundary values, for example \tilde x(0)=\epsilon, \tilde x(1)=0, the new solution is \tilde x(t)=\epsilon(e^{1-t}-e^{t-1})/(e-e^{-1}). The plot is below: the important thing is that |\tilde x-x|\le \epsilon, and this would remain true even if we considered the BVP on the interval [0,1000] instead of [0,1].

Perturbed BVP (epsilon=0.1)
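The size of the perturbed solution is also easy to confirm numerically. A minimal scipy sketch (solve_bvp, with \epsilon=0.1 as in the plot above; the mesh is arbitrary):

import numpy as np
from scipy.integrate import solve_bvp

eps = 0.1

def rhs(t, X):                  # x'' = x as a first-order system (x, x')
    return np.vstack((X[1], X[0]))

def bc(Xa, Xb):                 # boundary conditions x(0) = eps, x(1) = 0
    return np.array([Xa[0] - eps, Xb[0]])

t = np.linspace(0, 1, 11)
sol = solve_bvp(rhs, bc, t, np.zeros((2, t.size)))
print(np.abs(sol.sol(np.linspace(0, 1, 200))[0]).max())   # stays at (or below) eps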

The same happens if the boundary derivatives are prescribed: for example, \tilde x'(0)=0.1 and \tilde x'(1)=-0.2:

Another perturbed BVP

Notice how the perturbed solution tries to approach the unperturbed x(t)=0 in the middle of the plot.

Where does this BVP stability come from? It turns out that the equation x''=x is the Euler-Lagrange equation for the energy functional \displaystyle E(x)=\int_0^1 ((x')^2+x^2)\,dt. Although the Wikipedia page explains why, let’s verify this directly. First, perform the variation by replacing x with x+\epsilon \eta, where \eta is some small “bump” function that vanishes near the endpoints. We get

\displaystyle E(x+\epsilon \eta)=\int_0^1 ((x'+\epsilon\eta')^2+(x+\epsilon\eta)^2)\,dt

Expand, take the first derivative with respect to \epsilon, then equate it to zero at \epsilon=0 (if x is a critical point, the derivative must be zero). One notices that all this amounts to keeping only the terms in which \eta appears to the first power:

\displaystyle \int_0^1 (x'\eta'+x\eta)\,dt=0

In the term x'\eta' we want to move the derivative from \eta onto x. That is, integrate by parts:

\displaystyle \int_0^1 (-x''\eta+x\eta)\,dt=0

Now \eta factors out, and since it was arbitrary, the Euler-Lagrange equation is -x''+x=0, which is what we had. This explains why the solution was slightly bent toward the x=0 axis: it tried to make the integral of x^2 smaller while keeping the first derivative small, so as not to create a large term (x')^2.

The boundary conditions determine the space V of all functions that satisfy them. The energy functional \displaystyle E(x)=\int_0^1 ((x')^2+x^2)\,dt is a strictly convex function from V to the real numbers. Therefore, it has only one critical point on V, which is the global minimum of energy and therefore a stable equilibrium of whatever object is described by the energy E.

Uniqueness for an ODE system

Consider the following system of second-order equations

\displaystyle \begin{pmatrix} \ddot{x} \\ \ddot{y} \\ \ddot{z} \end{pmatrix} = A\begin{pmatrix}\sin x \\ \sin y \\ \sin z \end{pmatrix}

where x,y,z are functions of the variable t. The matrix A is constant. Despite the linear-ish appearance, the system is nonlinear due to the presence of sines. We are unlikely to find the solution in analytic form; even this simplified problem leaves WolframAlpha stumped. So if you want to find a solution, you’ll have to do it numerically. But the theory of ODEs may still give us a qualitative description of solutions.

The first thing to recall is that all systems of ODE can be made first-order systems. We can simply introduce new unknown functions u=\dot x, v=\dot y, w=\dot z and rewrite the system as 6 equations with 6 unknown functions:

(*)\qquad \displaystyle \begin{pmatrix} \dot{u} \\ \dot{v} \\ \dot{w} \end{pmatrix} = A\begin{pmatrix}\sin x \\ \sin y \\ \sin z \end{pmatrix}, \qquad \dot x = u, \ \dot y=v, \ \dot z=w

Think of these equations as describing the motion of a “particle” (x,y,z,u,v,w) in 6-dimensional space \mathbb R^6. The system prescribes the velocity of the particle at every point: we visualize this by imagining a tiny arrow placed at every point of the space. (We are lucky that the system is autonomous: the time t does not appear explicitly. Otherwise we’d have to imagine that the arrows change their length and direction as time goes on. As it is, they are fixed in place.)

Here is a two-dimensional illustration: it corresponds to the system \dot x=y, \dot y=\sin x, which comes from the second-order ODE \ddot x=\sin x.

2D field plot, using http://www.falstad.com/vector/
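A similar field plot can be produced in a few lines of Python (a matplotlib sketch; the plotting window is arbitrary):

import numpy as np
import matplotlib.pyplot as plt

# direction field of the system x' = y, y' = sin x
x, y = np.meshgrid(np.linspace(-7, 7, 29), np.linspace(-3, 3, 15))
plt.quiver(x, y, y, np.sin(x))
plt.xlabel('x'); plt.ylabel('y')
plt.show()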

We can notice three different zones in this graph:

  • If y is large and positive, the point (x,y) moves to the right with only a bit of wobbling. In terms of x, this means that if \dot x is large to begin with, the solution x(t) will keep on increasing indefinitely.
  • If y is large and negative, the point (x,y) moves to the left: so, x(t) will keep on decreasing indefinitely.
  • If |y| is small, an oscillatory pattern emerges. It’s only in this zone that the sine term has a substantial effect on the solutions.

Similar conclusions can be drawn about the original 6-dimensional system. One can further identify the equilibrium points, the type of each equilibrium (from the plot they look like saddles and centers), and the local stability of the equilibria. But let’s first settle a more basic question: do we have uniqueness of solutions?

The precise form of this question is: given an initial value (x_0,y_0,z_0,u_0,v_0,w_0)\in\mathbb R^6, is there a unique function (x,y,z,u,v,w) that satisfies both the equation (*) and the initial condition? Since it’s tiresome to keep typing these six letters xyzuvw, let’s switch to vector notation: writing

\displaystyle \vec X=\begin{pmatrix} x \\ y \\z\\u\\v\\w\end{pmatrix},\qquad   \vec X_0=\begin{pmatrix} x_0 \\ y_0 \\z_0\\u_0\\v_0\\w_0\end{pmatrix}

we bring the initial-value problem into a concise form

\displaystyle (**)\qquad \qquad \frac{d\vec X}{dt}=\vec F(\vec X), \qquad \vec X(0)=\vec X_0

Here \vec F is a six-dimensional vector field that captures the right-hand side of the system:

\displaystyle \vec F(\vec X)=\begin{pmatrix} u \\ v \\ w \\ a_{11}\sin x+a_{12}\sin y+a_{13}\sin z \\ a_{21}\sin x+a_{22}\sin y+a_{23}\sin z \\ a_{31}\sin x+a_{32}\sin y+a_{33}\sin z \end{pmatrix}

The basic uniqueness theorem says that the system (**) has a unique solution provided that the field \vec F satisfies the Lipschitz condition: namely, there exists a constant L such that |\vec F(\vec x)-\vec F(\vec y)|\le L|\vec x-\vec y| for all vectors \vec x and \vec y in \mathbb R^6. The following fact helps:

If a function has bounded first-order partial derivatives, then it is Lipschitz.

There are six variables, so we must look at six partial derivatives of each of the six components of \vec F. One can organize them into a 6-by-6 matrix of partial derivatives, the Jacobian matrix. Here it is:

\displaystyle \frac{\partial \vec F}{\partial \vec X}=\begin{pmatrix} 0&0&0&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \\ a_{11}\cos x & a_{12}\cos y & a_{13}\cos z &0&0&0 \\ a_{21}\cos x&a_{22}\cos y&a_{23}\cos z &0&0&0 \\ a_{31}\cos x&a_{32}\cos y&a_{33}\cos z &0&0&0 \end{pmatrix}

Everything here is bounded, thanks to the cosine being bounded. Thus, the solution of (**) exists and is unique.
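With uniqueness settled, computing a solution numerically takes only a few lines. A scipy sketch (the matrix A and the initial values below are arbitrary examples):

import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[ 0.0, 1.0, -1.0],     # an arbitrary sample matrix
              [ 1.0, 0.0,  2.0],
              [-1.0, 2.0,  0.0]])

def F(t, X):
    pos, vel = X[:3], X[3:]          # (x, y, z) and (u, v, w)
    return np.concatenate((vel, A @ np.sin(pos)))

X0 = np.array([1.0, 0.0, -1.0, 0.0, 0.5, 0.0])   # initial values at t = 0
sol = solve_ivp(F, (0, 20), X0, max_step=0.01)
print(sol.y[:3, -1])                              # (x, y, z) at t = 20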

Oddly, I had a bit of difficulty locating a reference for the uniqueness theorem cited above. Basic ODE books like Boyce & DiPrima do not have it: they discuss uniqueness only for the scalar equation. The ODE book by Hartman obscures this topic with generalizations. Back in the day I learned this stuff from the two-volume Analyse mathématique by Laurent Schwartz (in Russian translation), but this was a peculiarity of my education.

I hear good things about Theory of Ordinary Differential Equations by Coddington and Levinson, but don’t have it on my shelf. So, here is the best reference I can give now:

Basic Theory of Ordinary Differential Equations by Po-Fang Hsieh and Yasutaka Sibuya (at Google books), Theorem I-1-4 on page 3.

There are more general uniqueness theorems, but in our example \vec F is so nice that they were not needed.