Nearest point projection, part II: beware of two-dimensional gifts

To avoid the complications from the preceding post, let’s assume that X is a uniformly convex Banach space: such a space is automatically reflexive (so nearest points exist) and strictly convex (so they are unique), and therefore any closed subspace A has a well-defined nearest point projection P\colon X\to A.

Recalling that in a Hilbert space P is a linear operator (and a self-adjoint one at that), we might ask if our P is linear. Indeed, the invariance of distance under translations shows that P(x+a)=P(x)+a for all x\in X, a\in A. Consequently, all fibers P^{-1}(a) are translates of one another. The map P is also homogeneous: P(\lambda x)=\lambda P(x), which follows from the homogeneity of the norm. In particular, P^{-1}(0) is a two-sided cone: it’s closed under multiplication by scalars.
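Both invariances are easy to test numerically. Here is a minimal sketch (the helper name project, the choice p = 4, and the sample vectors are mine), projecting onto the diagonal line A = \lbrace x_1=x_2=x_3\rbrace in \mathbb{R}^3 with the \ell_p norm, anticipating the example below; the nearest point is (t,t,t) where t minimizes \sum_i |x_i-t|^p, found by bisection on the derivative.

```python
# Sketch: nearest point projection onto the diagonal A = {x1 = x2 = x3}
# in R^3 with the l_p norm. P(x) = (t, t, t) where t minimizes
# sum_i |x_i - t|^p; the minimizer is found by bisection on the
# (increasing) derivative of that strictly convex function.

def project(x, p=4.0, tol=1e-12):
    def g(t):  # derivative of d(t) = sum |x_i - t|^p, up to the factor p
        return sum((1 if t > xi else -1) * abs(t - xi) ** (p - 1) for xi in x)
    lo, hi = min(x), max(x)    # the minimizer lies between min and max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

x = [1.0, 2.0, 5.0]
t = project(x)
# Translation invariance along A: P(x + (c, c, c)) = P(x) + c
assert abs(project([xi + 3.0 for xi in x]) - (t + 3.0)) < 1e-6
# Homogeneity: P(2x) = 2 P(x)
assert abs(project([2 * xi for xi in x]) - 2 * t) < 1e-6
```

Both assertions pass: translating x along the diagonal shifts the minimizer by the same amount, and scaling x scales it.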

In the special case \dim (X/A)=1 we conclude that P^{-1}(0) is a line, and the direct sum decomposition X= A+P^{-1}(0) identifies P as a (possibly skewed) linear projection.

Well, one of the things that the geometry of Banach spaces teaches us is that 2-dimensional examples are often too simple to show what is really going on, while a 3-dimensional example may suffice. For instance, the \ell_1 and \ell_\infty norms define isometric spaces in 2 dimensions, but not in 3 or more.

So, let’s take X to be 3-dimensional with the \ell_p norm \|x\|^p=|x_1|^p+|x_2|^p+|x_3|^p, where 1<p<\infty. Let A=\lbrace x_1=x_2=x_3\rbrace, so that the codimension of A is 2. What is the set P^{-1}(0)? We know it is a ruled surface: with each point it contains the line through that point and the origin. More precisely, P(x)=0 exactly when the minimum of d(t):=|x_1-t|^p+|x_2-t|^p+|x_3-t|^p is attained at t=0. (The minimum point is unique, since the function is strictly convex.) Differentiation reveals that

\displaystyle P^{-1}(0)=\lbrace x\colon |x_1|^{p-2}x_1+|x_2|^{p-2}x_2+|x_3|^{p-2}x_3 = 0\rbrace

which is a plane only when p=2. Here is this surface for p=4, when the equation simplifies to x_1^3+x_2^3+x_3^3=0:

A fiber of the nearest point projection

The entire 3-dimensional space is foliated by the translates of this surface in the direction of the vector (1,1,1).
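The characterization of the fiber is easy to sanity-check numerically. Here is a hedged sketch (the helper name argmin_d, the grid search, and the sample points are my own choices) for p = 4: a point on the cubic surface x_1^3+x_2^3+x_3^3=0 should minimize d(t) at t=0, while a point off the surface should not.

```python
# Sketch: for p = 4 the fiber P^{-1}(0) should be the surface
# x1^3 + x2^3 + x3^3 = 0. Minimize d(t) = sum |x_i - t|^4 by brute
# grid search and check where the minimum lands.

def argmin_d(x, p=4.0):
    ts = [i / 10000 - 5 for i in range(100001)]   # grid on [-5, 5]
    return min(ts, key=lambda t: sum(abs(xi - t) ** p for xi in x))

on_surface = [1.0, 1.0, -2.0 ** (1 / 3)]   # 1 + 1 - 2 = 0 (cube root of 2)
off_surface = [1.0, 1.0, 1.0]              # 1 + 1 + 1 = 3, not on the surface
assert abs(argmin_d(on_surface)) < 1e-3    # minimized at t = 0: P(x) = 0
assert abs(argmin_d(off_surface)) > 0.5    # minimized at t = 1: P(x) != 0
```

The grid search is crude but sufficient here, since d is strictly convex and the grid contains the exact minimizers 0 and 1.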

The nearest point projection is likely to be the first nonlinear map one encounters in functional analysis. It is not even Lipschitz in general, although in decent spaces such as \ell_p for 1<p<\infty it is Hölder continuous (I think the optimal exponent is \frac{p\wedge 2}{p\vee 2}).

After a little thought, the nonlinearity of the nearest point projection is not so surprising: minimizing the distance amounts to solving an equation involving the gradient of the norm, and this gradient is nonlinear unless the norm is a quadratic function, i.e., unless we are in a Hilbert space.
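The failure of additivity can be seen directly. A hedged numeric sketch (the helper proj_t and the test vectors are mine): for projection onto the diagonal in \ell_p, additivity of P reduces to additivity of the scalar minimizer t, which fails for p = 4 but holds for p = 2, where the minimizer is just the mean.

```python
# Sketch: the projection onto the diagonal {x1 = x2 = x3} is not
# additive for p = 4, but is additive for p = 2 (the minimizer of
# sum (x_i - t)^2 is the mean, a linear function of x).

def proj_t(x, p):
    """Minimize sum |x_i - t|^p by bisection on the increasing derivative."""
    def g(t):
        return sum((1 if t > xi else -1) * abs(t - xi) ** (p - 1) for xi in x)
    lo, hi = min(x), max(x)
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (lo, mid) if g(mid) > 0 else (mid, hi)
    return (lo + hi) / 2

x, y = [1.0, 0.0, 0.0], [0.0, 0.0, 3.0]
z = [xi + yi for xi, yi in zip(x, y)]
assert abs(proj_t(x, 4) + proj_t(y, 4) - proj_t(z, 4)) > 1e-3   # p = 4: fails
assert abs(proj_t(x, 2) + proj_t(y, 2) - proj_t(z, 2)) < 1e-6   # p = 2: holds
```

For p = 4 the two sides differ by roughly 0.3 on these vectors, so the nonlinearity is by no means a borderline effect.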

Nearest point projection, part I: eschewing transliterations

A subset A of a metric space (X,d) is called a Чебышёв (Chebyshev) set if for every x\in X there exists a\in A such that d(a',x)>d(a,x) for all a'\in A\setminus \{a\}. In words, for each point of the space there is a well-defined nearest point of A. We can then define the nearest point projection P\colon X\to A by sending x to a. This is consistent with the notion of orthogonal projection onto a subspace of a Euclidean space. More generally, every closed convex subset of a Hilbert space is a Чебышёв set. Whether the converse is true is unknown. It’s been proved in finite dimensions and in some other special cases, such as for weakly closed sets. (Yes, being weakly closed is a stronger property than being closed…)

How could a nonconvex set have NPP?

There isn’t much one can prove about Чебышёв sets in general metric spaces. They must be closed, and that’s about it. Let’s specialize to X being a real Banach space, and A a closed subspace thereof. There are two ways in which A could fail to be Чебышёв:

  1. the infimum \inf_{a\in A} d(a,x) is attained at more than one point a\in A.
  2. the infimum is not attained at all.

Examples of the first kind are easy to produce. Let X be the plane with the taxicab norm \|x\|=|x_1|+|x_2|. The point (1,0) is at distance 1 from the line \lbrace x_1=x_2\rbrace, and this distance is realized by any point (t,t) with 0\le t\le 1.
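A quick numeric check of this example (a minimal sketch; the helper name taxicab is mine):

```python
# Sketch: in the taxicab norm, every point (t, t) with 0 <= t <= 1
# is a nearest point of the line {x1 = x2} to (1, 0).

def taxicab(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

x = (1.0, 0.0)
dists = [taxicab(x, (t / 10, t / 10)) for t in range(11)]   # t in [0, 1]
assert all(abs(d - 1.0) < 1e-12 for d in dists)   # all at distance exactly 1
assert taxicab(x, (1.5, 1.5)) > 1.0               # outside [0, 1]: farther
```

Indeed, for 0\le t\le 1 the distance is |1-t|+t=1, a whole segment of minimizers.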

It’s also easy to see exactly how we can avoid such examples: assume that the space X is strictly convex, i.e., its unit sphere does not contain any line segments.

Examples of the second kind are not as simple. To attain the infimum, it is enough to extract from a minimizing sequence (a_n) a subsequence converging to a point of A. So if the space is reflexive, we can use the weak compactness of closed balls [note in passing: weak compactness is weaker than compactness] to obtain a weakly convergent minimizing sequence. Its weak limit belongs to A (a closed subspace is weakly closed), and since the norm is weakly lower semicontinuous, the limit minimizes the distance.

The converse is also true: if every closed subspace is Чебышёв, then X is reflexive. Indeed, by James' theorem every non-reflexive space has a linear functional \varphi that is not norm-attaining, that is, |\varphi(x)|<\|\varphi\|\|x\| for all x\ne 0. Let A=\mathrm{ker}\,\varphi. I claim that no point x\notin A has a nearest point in A.

Fix a point x\notin A and suppose that a\in A is the corresponding nearest point. We may assume \|x-a\|=1. Since |\varphi(x)|=|\varphi(x-a)|<\|\varphi \|, there exists a unit vector y\in X such that \varphi(y)>|\varphi(x)|. The vector a' = x-\frac{\varphi(x)}{\varphi(y)}\,y belongs to A (indeed, \varphi(a')=\varphi(x)-\varphi(x)=0), and \|x-a'\| = \frac{|\varphi(x)|}{\varphi(y)}\,\|y\| = \frac{|\varphi(x)|}{\varphi(y)} < 1, a contradiction.

For a concrete example, take X=\ell_1 and \displaystyle A=\lbrace x\in\ell_1\colon \sum_{n=1}^\infty \frac{n}{n+1}x_n=0\rbrace. One can check that the unit vector e_1\in\ell_1 has \mathrm{dist}(e_1,A)=1/2, and this distance is not attained. The sequence \displaystyle a_n=e_1-\frac{n+1}{2n}\,e_n lies in A and has \|e_1-a_n\|_1=\frac{n+1}{2n}\to 1/2, but contains no weakly convergent subsequence. Its weak* limit is e_1, which is outside of A. Like this:

Closed subspace is always weakly closed, but not necessarily weak*ly closed
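The arithmetic here can be checked with exact rational arithmetic. A small sketch, using the convenient minimizing sequence a_n = e_1 - \frac{n+1}{2n}e_n (my choice of parametrization): each a_n lies in the kernel of \varphi, and the \ell_1 distance to e_1 exceeds 1/2 but tends to it.

```python
# Sketch: verify that a_n = e_1 - ((n+1)/(2n)) e_n lies in
# A = ker(phi), where phi(e_k) = k/(k+1), and that
# ||e_1 - a_n||_1 = (n+1)/(2n) -> 1/2 without ever reaching 1/2.
from fractions import Fraction

def phi_coeff(k):                  # phi(e_k) = k / (k + 1)
    return Fraction(k, k + 1)

for n in range(2, 200):
    c = Fraction(n + 1, 2 * n)     # coefficient of e_n in a_n
    # phi(a_n) = phi(e_1) - c * phi(e_n)
    assert phi_coeff(1) - c * phi_coeff(n) == 0    # a_n lies in A
    dist = c                       # ||e_1 - a_n||_1 = |1 - 1| + c
    assert dist > Fraction(1, 2)                   # never equals 1/2
    assert abs(float(dist) - 0.5) < 1 / n          # but tends to 1/2
```

So the infimum 1/2 is approached but never attained, exactly as the James-theorem argument predicts.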

Just as the name Чебышёв has no nearest point projection onto the Latin alphabet.