Quasi-projections and isometries

A typical scenario: given a subset {E} of a metric space {X} and a point {x\in X}, we look for a point {y\in E} that is nearest to {x}: that is, {d(x, y) = \mathrm{dist}\,(x, E)}. Such a point is generally not unique: for example, if {E} is the graph of cosine function and {x = (\pi, \pi/2)}, then both {(\pi/2, 0)} and {(3\pi/2, 0)} qualify as nearest to {x}. This makes the nearest-point projection onto {E} discontinuous: moving {x} slightly to the left or to the right will make its projection onto {E} jump from one point to another. Not good.

Discontinuous nearest-point projection

Even when the nearest point projection is well-defined and continuous, it may not be the kind of projection we want. For example, in a finite-dimensional normed space with strictly convex norm we have a continuous nearest-point projection onto any linear subspace, but it is in general a nonlinear map.

Let’s say that {P\colon X\to E} is a quasi-projection if {d(x, P(x)) \le C \mathrm{dist}\,(x, E)} for some constant {C} independent of {x}. Such maps are much easier to construct: indeed, every Lipschitz continuous map {P\colon X\to E} such that {P(x)=x} for {x \in E} is a quasi-projection. For example, one quasi-projection onto the graph of cosine is the map {(x, y)\mapsto (x, \cos x)} shown below.

Continuous quasi-projection

If {X} is a Banach space and {E} is its subspace, then any idempotent operator with range {E} is a quasi-projection onto {E}. Not every subspace admits such an operator but many do (these are complemented subspaces; they include all subspaces of finite dimension or finite codimension). By replacing “nearest” with “close enough” we gain linearity. And even some subspaces that are not linearly complemented admit a continuous quasi-projection.

Here is a neat fact: if {M} and {N} are subspaces of a Euclidean space and {\dim M = \dim N}, then there exists anĀ isometric quasi-projection of {M} onto {N} with constant {C=\sqrt{2}}. This constant is best possible: for example, an isometry from the {y}-axis onto the {x}-axis has to send {(0, 1)} to one of {(\pm 1, 0)}, thus moving it by distance {\sqrt{2}}.

An isometry must incur sqrt(2) distance cost

Proof. Let {k} be the common dimension of {M} and {N}. Fix some orthonormal bases in {M} and {N}. In these bases, the orthogonal (nearest-point) projection from {M} to {N} is represented by some {k\times k} matrix {P} of norm at most {1}. We need an orthogonal {k\times k} matrix {Q} such that the map {M\to N} that it defines is a {\sqrt{2}}-quasi-projection. What exactly does this condition mean for {Q}?

Let’s say {x\in M}, {y\in N} is the orthogonal projection of {x} onto {N}, and {z\in N} is where we want to send {x} by an isometry. Our goal is {\|x-z\|\le \sqrt{2}\|x-y\|}, in addition to {\|z\|=\|x\|}. Squaring and expanding inner products yields {2\langle x, y\rangle - \langle x, z \rangle \le \|y\|^2}. Since both {y} and {z} are in {N}, we can replace {x} on the left by its projection {y}. So, the goal simplifies to {\|y\|^2 \le \langle y, z\rangle}. Geometrically, this means placing {z} so that its projection onto the line through {y} lies on the continuation of this line beyond {y}.

So far so good, but the disappearance of {x} from the inequality is disturbing. Let’s bring it back by observing that {\|y\|^2 \le \langle y, z\rangle} is equivalent to {4(\|y\|^2 - \langle y, z\rangle) + \|z\|^2 \le \|x\|^2}, which is simply {\|2y-z\| \le \|x\|}. So that’s what we want to do: map {x} so that the distance from its image to {2y} does not exceed {\|x\|}. In terms of matrices and their operator norm, this means {\|2P-Q\|\le 1}.

It remains to show that every square matrix of norm at most {2} (such as {2P} here) is within distance {1} of some orthogonal matrix. Let {2P = U\Sigma V^T} be the singular value decomposition, with {U, V} orthogonal and {\Sigma} a diagonal matrix with the singular values of {2P} on the diagonal. Since the singular values of {2P} are between {0} and {2}, it follows that {\|\Sigma-I\|\le 1}. Hence {\|2P - UV^T\|\le 1}, and taking {Q=UV^T} concludes the proof.

(The proof is based on a Stack Exchange post by user hypernova.)

Subspaces and projections

Let M be a closed subspace of a Banach space X. In general, there is no linear projection P\colon X\to M, the canonical example being c_0 in \ell^\infty. At least we can construct a projection when M is finite-dimensional. The one-dimensional case is easy: take a unit vector x_1\in M, pick a norming functional f and define P(x)=f(x)x_1. If M is n-dimensional, one can construct P as the sum of n rank-one projections, achieving \|P\| \le n. Which is pretty bad: distorting distances by a factor comparable to dimension may render high-dimensional data useless. One usually seeks estimates that are logarithmic in dimension (or better yet, dimension-independent).

Recall that a retraction is a continuous map f\colon X\to M which is the identity on M. A projection is a linear retraction. The linearity is quite a rigid condition. We may have better luck with retractions in other classes, such as Lipschitz maps. And indeed, there is a 2-Lipschitz retraction from \ell^\infty onto c_0. Given x\in \ell^\infty, let d=\limsup|x_n| and define r(x)=y with y_n=(|x_n|-d)^+\mathrm{sign}\,x_n. Since d is a 1-Lipschitz function, r is 2-Lipschitz. It is a retraction because d=0 when x\in c_0.

One of many open problems in the book Geometric Nonlinear Functional Analysis I (Benyamini and Lindenstrauss) is whether every Banach space is a Lipschitz retract of its bidual. This problem is also mentioned in 2007 survey by Nigel Kalton.

One thing to like about linear projections is their openness: any linear surjection between Banach spaces is an open map. This is not the case for Lipschitz surjections: for instance, f(x)=(|x|-1)^+\mathrm{sign}\,x is a Lipschitz surjection \mathbb R\to\mathbb R which maps (-1,1) to a point. This example resembles the retraction r above. And indeed, r is not open either: the image of a small open ball centered at (0,1,1,1,1,\dots) is contained in the hyperplane x_1=0.

In the context of Lipschitz maps it is natural to quantify openness in the same way as continuity: i.e., by requiring the image of a ball B(x,r) to contain B(f(x),r/C) with C independent of x,r. This defines Lipschitz quotients, which appear to be the right concept of “nonlinear projection”. However, it remains unknown whether there is a Lipschitz quotient Q\colon \ell^\infty\to c_0. [Benyamini and Lindenstrauss]