Riesz projection on polynomials

Consider a trigonometric polynomial of degree {n} with complex coefficients, represented as a Laurent polynomial {L(z) = \sum\limits_{k=-n}^n c_k z^k} where {|z|=1}. The Riesz projection of {L} is just the regular part of {L}, the one without negative powers: {R(z) = \sum\limits_{k=0}^n c_k z^k}. Let’s compare the supremum norm of {L}, { \|L\|=\max\limits_{|z|=1} |L(z)|}, with the norm of {R}.

The ratio {\|R\|/\|L\|} may exceed {1}. By how much?

The extreme example for {n=1} appears to be {L(z) = 2z + 4 - z^{-1}}, pictured below together with {R(z)=2z+4}. The polynomial {L} is in blue, {R} is in red, and the point {0} is marked for reference.

Laurent polynomial in blue, its regular part in red

Since {R(z)=2z+4} has positive coefficients, its norm is just {R(1)=6}. To compute the norm of {L}, let’s rewrite {|L(z)|^2 = L(z)L(z^{-1})} as a polynomial of {x=\mathrm{Re}\,(z)}. Namely, {|L(z)|^2 = -2z^2 + 4z + 21 + 4z^{-1} - 2z^{-2}} which simplifies to {27 - 2(2x-1)^2} in terms of {x}. Hence {\|L\| = \sqrt{27}} and {\|R\|/\|L\| = 6/\sqrt{27} = 2/\sqrt{3}\approx 1.1547}.

The best example for {n=2} appears to be vaguely binomial: {L(z) = 2z^2 + 4z + 6 - 4z^{-1} + z^{-2}}. Note that the range of {R} is a cardioid.

Laurent polynomial in blue, its regular part in red

Once again, {R(z) = 2z^2 + 4z + 6} has positive coefficients, hence {\|R\| = R(1) = 12}. And once again, {|L(z)|^2} is a polynomial of {x=\mathrm{Re}\,(z)}, specifically {|L(z)|^2 = 81 - 8(1-x^2)(2x-1)^2}. Hence {\|L\| = 9} and {\|R\|/\|L\| = 12/9 = 4/3\approx 1.3333}.

I do not have a symbolic candidate for the extremal polynomial of degree {n= 3}. Numerically, it should look like this:

Laurent polynomial in blue, its regular part in red

Is the maximum of {\|R\|/\|L\|} attained by polynomials with real, rational coefficients (which can be made integer)? Do they have some hypergeometric structure? Compare with the Extremal Taylor polynomials which is another family of polynomials which maximize the supremum norm after elimination of some coefficients.

Riesz projection as a contraction

To have some proof content here, I add a 2010 theorem by Marzo and Seip: {\|R\|_4 \le \|L\|} where {\|R\|_p^p = \int_0^1 |R(e^{2\pi i t})|^p\,dt}.

The theorem is not just about polynomials: it says the Riesz projection is a contraction (has norm {1}) as an operator {L^\infty\to L^4}.

Proof. Let {S=L-R}, the singular part of {L}. The polynomial {R-S} differs from {L} only by the sign of the singular part, hence {\|R-S\|_2 = \|L\|_2} by Parseval’s theorem.

Since {S^2} consists of negative powers of {z}, while {R^2} does not contain any negative powers, these polynomials are orthogonal on the unit circle. By the Pythagorean theorem, {\|R^2-S^2\|_2 \ge \|R^2\|_2 = \|R\|_4^2}. On the other hand, {R^2-S^2 = (R+S)(R-S)=L(R-S)}. Therefore, {\|R^2-S^2\|_2 \le \|L\| \|R-S\|_2 = \|L\| \|L\|_2 \le \|L\|^2}, completing the proof.

This is so neat. And the exponent {4} is best possible: the Riesz projection is not a contraction from {L^\infty} to {L^p} when {p>4} (the Marzo-Seip paper has a counterexample).

Quasi-projections and isometries

A typical scenario: given a subset {E} of a metric space {X} and a point {x\in X}, we look for a point {y\in E} that is nearest to {x}: that is, {d(x, y) = \mathrm{dist}\,(x, E)}. Such a point is generally not unique: for example, if {E} is the graph of cosine function and {x = (\pi, \pi/2)}, then both {(\pi/2, 0)} and {(3\pi/2, 0)} qualify as nearest to {x}. This makes the nearest-point projection onto {E} discontinuous: moving {x} slightly to the left or to the right will make its projection onto {E} jump from one point to another. Not good.

proj1
Discontinuous nearest-point projection

Even when the nearest point projection is well-defined and continuous, it may not be the kind of projection we want. For example, in a finite-dimensional normed space with strictly convex norm we have a continuous nearest-point projection onto any linear subspace, but it is in general a nonlinear map.

Let’s say that {P\colon X\to E} is a quasi-projection if {d(x, P(x)) \le C \mathrm{dist}\,(x, E)} for some constant {C} independent of {x}. Such maps are much easier to construct: indeed, every Lipschitz continuous map {P\colon X\to E} such that {P(x)=x} for {x \in E} is a quasi-projection. For example, one quasi-projection onto the graph of cosine is the map {(x, y)\mapsto (x, \cos x)} shown below.

proj2
Continuous quasi-projection

If {X} is a Banach space and {E} is its subspace, then any idempotent operator with range {E} is a quasi-projection onto {E}. Not every subspace admits such an operator but many do (these are complemented subspaces; they include all subspaces of finite dimension or finite codimension). By replacing “nearest” with “close enough” we gain linearity. And even some subspaces that are not linearly complemented admit a continuous quasi-projection.

Here is a neat fact: if {M} and {N} are subspaces of a Euclidean space and {\dim M = \dim N}, then there exists anĀ isometric quasi-projection of {M} onto {N} with constant {C=\sqrt{2}}. This constant is best possible: for example, an isometry from the {y}-axis onto the {x}-axis has to send {(0, 1)} to one of {(\pm 1, 0)}, thus moving it by distance {\sqrt{2}}.

proj3
An isometry must incur sqrt(2) distance cost

Proof. Let {k} be the common dimension of {M} and {N}. Fix some orthonormal bases in {M} and {N}. In these bases, the orthogonal (nearest-point) projection from {M} to {N} is represented by some {k\times k} matrix {P} of norm at most {1}. We need an orthogonal {k\times k} matrix {Q} such that the map {M\to N} that it defines is a {\sqrt{2}}-quasi-projection. What exactly does this condition mean for {Q}?

Let’s say {x\in M}, {y\in N} is the orthogonal projection of {x} onto {N}, and {z\in N} is where we want to send {x} by an isometry. Our goal is {\|x-z\|\le \sqrt{2}\|x-y\|}, in addition to {\|z\|=\|x\|}. Squaring and expanding inner products yields {2\langle x, y\rangle - \langle x, z \rangle \le \|y\|^2}. Since both {y} and {z} are in {N}, we can replace {x} on the left by its projection {y}. So, the goal simplifies to {\|y\|^2 \le \langle y, z\rangle}. Geometrically, this means placing {z} so that its projection onto the line through {y} lies on the continuation of this line beyond {y}.

So far so good, but the disappearance of {x} from the inequality is disturbing. Let’s bring it back by observing that {\|y\|^2 \le \langle y, z\rangle} is equivalent to {4(\|y\|^2 - \langle y, z\rangle) + \|z\|^2 \le \|x\|^2}, which is simply {\|2y-z\| \le \|x\|}. So that’s what we want to do: map {x} so that the distance from its image to {2y} does not exceed {\|x\|}. In terms of matrices and their operator norm, this means {\|2P-Q\|\le 1}.

It remains to show that every square matrix of norm at most {2} (such as {2P} here) is within distance {1} of some orthogonal matrix. Let {2P = U\Sigma V^T} be the singular value decomposition, with {U, V} orthogonal and {\Sigma} a diagonal matrix with the singular values of {2P} on the diagonal. Since the singular values of {2P} are between {0} and {2}, it follows that {\|\Sigma-I\|\le 1}. Hence {\|2P - UV^T\|\le 1}, and taking {Q=UV^T} concludes the proof.

(The proof is based on a Stack Exchange post by user hypernova.)

Subspaces and projections

Let M be a closed subspace of a Banach space X. In general, there is no linear projection P\colon X\to M, the canonical example being c_0 in \ell^\infty. At least we can construct a projection when M is finite-dimensional. The one-dimensional case is easy: take a unit vector x_1\in M, pick a norming functional f and define P(x)=f(x)x_1. If M is n-dimensional, one can construct P as the sum of n rank-one projections, achieving \|P\| \le n. Which is pretty bad: distorting distances by a factor comparable to dimension may render high-dimensional data useless. One usually seeks estimates that are logarithmic in dimension (or better yet, dimension-independent).

Recall that a retraction is a continuous map f\colon X\to M which is the identity on M. A projection is a linear retraction. The linearity is quite a rigid condition. We may have better luck with retractions in other classes, such as Lipschitz maps. And indeed, there is a 2-Lipschitz retraction from \ell^\infty onto c_0. Given x\in \ell^\infty, let d=\limsup|x_n| and define r(x)=y with y_n=(|x_n|-d)^+\mathrm{sign}\,x_n. Since d is a 1-Lipschitz function, r is 2-Lipschitz. It is a retraction because d=0 when x\in c_0.

One of many open problems in the book Geometric Nonlinear Functional Analysis I (Benyamini and Lindenstrauss) is whether every Banach space is a Lipschitz retract of its bidual. This problem is also mentioned in 2007 survey by Nigel Kalton.

One thing to like about linear projections is their openness: any linear surjection between Banach spaces is an open map. This is not the case for Lipschitz surjections: for instance, f(x)=(|x|-1)^+\mathrm{sign}\,x is a Lipschitz surjection \mathbb R\to\mathbb R which maps (-1,1) to a point. This example resembles the retraction r above. And indeed, r is not open either: the image of a small open ball centered at (0,1,1,1,1,\dots) is contained in the hyperplane x_1=0.

In the context of Lipschitz maps it is natural to quantify openness in the same way as continuity: i.e., by requiring the image of a ball B(x,r) to contain B(f(x),r/C) with C independent of x,r. This defines Lipschitz quotients, which appear to be the right concept of “nonlinear projection”. However, it remains unknown whether there is a Lipschitz quotient Q\colon \ell^\infty\to c_0. [Benyamini and Lindenstrauss]