The displacement set of nonlinear maps in vector spaces

Given a vector space {V} and a map {f\colon V\to V} (linear or not), consider the displacement set of {f}, denoted {D(f) = \{f(x)-x\colon x\in V\}}. For linear maps this is simply the range of the operator {f-I} and therefore is a subspace.

The essentially nonlinear operations of taking the inverse or composition of maps become almost linear when the displacement set is considered. Specifically, if {f} has an inverse, then {D(f^{-1}) = -D(f)}, which is immediate from the definition. Also, {D(f\circ g)\subset D(f)+D(g)}.
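Both identities can be checked mechanically on concrete maps. The following is a minimal sketch on {V=\mathbb R}, with the maps `f`, `f_inv`, `g` and the helper `displacement` chosen purely for illustration (they are not part of the discussion above); exact rational arithmetic keeps the comparisons free of rounding.

```python
from fractions import Fraction

def f(x):
    # Affine, hence not linear; invertible with inverse f_inv.
    return 2 * x + 1

def f_inv(y):
    return (y - 1) / 2

def g(x):
    # A genuinely nonlinear map.
    return x ** 3

def displacement(h, x):
    # The element h(x) - x of the displacement set D(h).
    return h(x) - x

samples = [Fraction(k, 3) for k in range(-6, 7)]

# D(f^{-1}) = -D(f): the displacement of f^{-1} at y = f(x) is -(f(x) - x).
inverse_ok = all(
    displacement(f_inv, f(x)) == -displacement(f, x) for x in samples
)

# D(f o g) is contained in D(f) + D(g), because
# f(g(x)) - x = [f(g(x)) - g(x)] + [g(x) - x].
composition_ok = all(
    displacement(lambda t: f(g(t)), x)
    == displacement(f, g(x)) + displacement(g, x)
    for x in samples
)
print(inverse_ok, composition_ok)
```

The second check is really just the telescoping identity written in the comment, which is why the inclusion holds in any vector space.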

When {V} is a topological vector space, the maps for which {D(f)} has compact closure are of particular interest: these are compact perturbations of the identity, for which degree theory can be developed. The consideration of {D(f)} makes it very clear that if {f} is an invertible compact perturbation of the identity, then {f^{-1}} is in this class as well.

It is also of interest to consider the maps for which {D(f)} is either bounded, or is bounded away from {0}. Neither case can occur for linear operators (apart from {f=I}, whose displacement set is {\{0\}}), so this is essentially nonlinear analysis. In the nonlinear case, the boundedness assumption for linear operators is usually replaced by the Lipschitz condition. Let us say that {f} is {(L, \ell)}-bi-Lipschitz if {\ell\|x-y\|\le \|f(x)-f(y)\|\le L\|x-y\|} for all {x, y} in the domain of {f}.

Brouwer’s fixed point theorem fails in infinite-dimensional Hilbert spaces, but it is not yet clear how badly it can fail. The strongest possible counterexample would be a bi-Lipschitz automorphism of the unit ball with displacement bounded away from 0. The existence of such a map is unknown. If it does not exist, that would imply that the unit ball and the unit sphere in the Hilbert space are not bi-Lipschitz equivalent, because the unit sphere does have such an automorphism: {x\mapsto -x}.

Concerning the maps with bounded displacement, here is a theorem from Patrick Biermann’s thesis (Theorem 3.3.2): if {f} is an {(L, \ell)}-bi-Lipschitz map in a Hilbert space, {L/\ell < \pi/\sqrt{8}}, and {f} has bounded displacement, then {f} is onto. The importance of bounded displacement is illustrated by the forward shift map {S(x_1, x_2, \dots) = (0, x_1, x_2, \dots)} for which {L=\ell=1} but surjectivity nonetheless fails.
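To see concretely why the shift does not contradict the theorem, here is a small sketch on finitely supported sequences stored as tuples. The helpers `shift`, `norm2` and `diff` are illustrative names, and the finite truncation is an assumption made only so that the computation is possible.

```python
import math

def shift(x):
    # Forward shift S(x_1, x_2, ...) = (0, x_1, x_2, ...) on finite tuples.
    return (0.0,) + tuple(x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def diff(x, y):
    # Pad the shorter sequence with zeros before subtracting.
    n = max(len(x), len(y))
    x = tuple(x) + (0.0,) * (n - len(x))
    y = tuple(y) + (0.0,) * (n - len(y))
    return tuple(a - b for a, b in zip(x, y))

x, y = (3.0, -1.0, 2.0), (0.5, 4.0)

# S preserves distances, so L = l = 1.
isometry_gap = abs(norm2(diff(shift(x), shift(y))) - norm2(diff(x, y)))

# The displacement at x = c * e_1 is (-c, c, 0, ...), of norm c * sqrt(2).
displacements = [norm2(diff(shift((c,)), (c,))) for c in (1.0, 10.0, 100.0)]
print(isometry_gap, displacements)
```

The displacement norms grow linearly in `c`, so {D(S)} is unbounded, and every value of `shift` has first coordinate zero, so the first basis vector is never attained.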

It would be nice to get rid of the assumption {L/\ell < \pi/\sqrt{8}} in the preceding paragraph. I guess any bi-Lipschitz map with bounded displacement should be surjective, at least in Hilbert spaces, but possibly in general Banach spaces as well.

Orthogonality in normed spaces

For a vector {x} in a normed space {X}, define the orthogonal complement {x^\perp} to be the set of all vectors {y} such that {\|x+ty\|\ge \|x\|} for all scalars {t}. In an inner product space (real or complex), this agrees with the usual definition of orthogonality because {\|x+ty\|^2 - \|x\|^2 = 2\,\mathrm{Re}\,\langle x, ty\rangle + o(t)} as {t\to 0}, and the right hand side can be nonnegative for all small {t} only if {\langle x, y\rangle=0}.

Let’s see what properties of orthogonal complement survive in a general normed space. For one thing, {x^\perp=X} if and only if {x=0}. Another trivial property is that {0\in x^\perp} for all {x}. More importantly, {x^\perp} is a closed set that contains some nonzero vectors.

  • Closed because the complement is open: if {\|x+ty\| < \|x\|} for some {t}, the same will be true for vectors close to {y}.
  • Contains a nonzero vector because the Hahn-Banach theorem provides a norming functional for {x}, i.e., a unit-norm linear functional {f\in X^*} such that {f(x)=\|x\|}. Any {y\in \ker f} is orthogonal to {x}, because {\|x+ty\|\ge f(x+ty) = f(x) = \|x\|}.

In general, {x^\perp} is not a linear subspace; it need not even have empty interior. For example, consider the orthogonal complement of the first basis vector in the plane with {\ell_1} (taxicab) metric: it is {\{(x, y)\colon |y|\ge |x|\}}.

The orthogonal complement of a horizontal vector in the taxicab plane

This example also shows that orthogonality is not symmetric in general normed spaces: {(1,1)\in (1,0)^\perp} but {(1,0)\notin (1,1)^\perp}. This is why I avoid using notation {y \perp x} here.
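The asymmetry is easy to probe numerically. The sketch below tests the defining inequality {\|x+ty\|\ge\|x\|} on a finite grid of real {t}; the function name `in_perp`, the grid range and the tolerance are assumptions of this illustration. A grid can only refute membership, so a passing result is evidence rather than proof.

```python
def norm1(v):
    # Taxicab norm on the plane.
    return abs(v[0]) + abs(v[1])

def in_perp(x, y, norm=norm1, t_max=10.0, steps=4001):
    # Grid test of ||x + t y|| >= ||x|| for t in [-t_max, t_max].
    base = norm(x)
    for k in range(steps):
        t = -t_max + 2 * t_max * k / (steps - 1)
        if norm((x[0] + t * y[0], x[1] + t * y[1])) < base - 1e-12:
            return False
    return True

first = in_perp((1, 0), (1, 1))    # is (1,1) in (1,0)^perp ?
second = in_perp((1, 1), (1, 0))   # is (1,0) in (1,1)^perp ?
print(first, second)
```

The second test fails already at {t=-1}, where {\|(1,1)-(1,0)\|_1 = 1 < 2}.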

In fact, {x^\perp} is the union of kernels of all norming functionals of {x}, so it is only a linear subspace when the norming functional is unique. Containment in one direction was already proved. Conversely, suppose {y\in x^\perp} and define a linear functional {f} on the span of {x,y} so that {f(ax+by) = a\|x\|}. Since {y\in x^\perp}, we have {\|ax+by\|\ge |a|\|x\| = |f(ax+by)|}, with equality when {b=0}; thus {f} has norm 1. Its Hahn-Banach extension is a norming functional for {x} that vanishes on {y}.

Consider {X=L^p[0,1]} as an example. A function {f} satisfies {1\in f^\perp} precisely when its {p}th moment is minimal among all translates {f+c}. This means, by definition, that its “{L^p}-estimator” is zero. In the special cases {p=1,2,\infty} the {L^p} estimator is known as the median, mean, and midrange, respectively. Increasing {p} gives more influence to outliers, so {1\le p\le 2} is the more useful range for it.
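For a finite data set (a discrete stand-in for a function on {[0,1]}, which is an assumption of this sketch) the {L^p} estimator can be found by minimizing the {p}th moment over the shift {c}. The names `lp_cost` and `lp_estimator` are illustrative; the minimization uses ternary search, which is legitimate because the cost is convex in {c}.

```python
def lp_cost(data, c, p):
    # p-th moment of the translate data - c; p = inf means the sup norm.
    if p == float("inf"):
        return max(abs(x - c) for x in data)
    return sum(abs(x - c) ** p for x in data)

def lp_estimator(data, p, iters=200):
    # Ternary search for a minimizer of the convex function c -> lp_cost.
    lo, hi = min(data), max(data)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if lp_cost(data, m1, p) <= lp_cost(data, m2, p):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

data = [0.0, 1.0, 2.0, 10.0]
estimates = {p: lp_estimator(data, p) for p in (1, 2, float("inf"))}
print(estimates)
```

For this sample the mean is 3.25 and the midrange is 5, while for {p=1} every point of {[1,2]} is a median, so the search may return any of them. The outlier 10 pulls the estimate further as {p} grows, in line with the remark about outliers.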

Measurability of Banach space valued functions

There is only an indirect proof of the existence of a function {f\colon [0, 1]\to \mathbb R} that is not Lebesgue measurable. But it’s easy to give an explicit example when the codomain of {f} is a Banach space: just let {b(t)} be the sequence of the binary digits of {t}, considered as an element of the sequence space {\ell_\infty}.

Why is {b} not measurable? Recall that a Banach space-valued function {f} is (Bochner) measurable iff there is a sequence of simple functions {\sum v_k \chi_{A_k}} (finite sum, measurable {A_k}, arbitrary vectors {v_k}) that converges to {f} almost everywhere. This property implies that, with the exception of a null set, the range of {f} lies in the separable subspace spanned by all the vectors {v_k} used in the sequence of simple functions. But {b} has the property {\|b(t)-b(s)\|=1} whenever {t\ne s}, so the image of any uncountable set under {b} is nonseparable.
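The separation property {\|b(t)-b(s)\|=1} can be observed on truncated digit sequences. In the sketch below, `binary_digits` and `sup_distance` are illustrative names, the truncation to 64 digits is an assumption, and dyadic rationals are given their terminating expansion.

```python
from fractions import Fraction

def binary_digits(t, n):
    # First n binary digits of t in [0, 1), computed exactly.
    digits = []
    for _ in range(n):
        t *= 2
        d = int(t >= 1)
        digits.append(d)
        t -= d
    return digits

def sup_distance(s, t, n=64):
    # Sup norm of the difference of the truncated digit sequences. This is a
    # lower bound for ||b(s) - b(t)||, and the full norm never exceeds 1.
    ds, dt = binary_digits(s, n), binary_digits(t, n)
    return max(abs(a - b) for a, b in zip(ds, dt))

points = [Fraction(1, 3), Fraction(1, 5), Fraction(1, 2), Fraction(2, 3), Fraction(0)]
distances = [
    sup_distance(s, t) for i, s in enumerate(points) for t in points[i + 1:]
]
print(distances)
```

Two distinct points must differ in some digit, and a single differing digit already forces the sup norm of the difference to equal 1.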

Another way to look at this: on the interval [0, 1) the function {b} is injective and its range has discrete topology, which implies that every subset of [0, 1) is the preimage of some open subset of {\ell_\infty} under {b}.

The binary-digits function can also be used to illustrate an issue with the duality of Lebesgue-Bochner spaces {L_p(0, 1; X)} where {X} is a Banach space and {1\le p<\infty}. (So, {f} belongs to this space iff it is Bochner measurable and the {L^p} norm of {\|f\|\colon [0, 1]\to [0, \infty)} is finite.) In general we do not have the expected relation {L_p(0, 1; X)^* = L_q(0, 1; X^*)} with {1/p+1/q=1}. The natural isometric embedding of {L_q(0, 1; X^*)} into {L_p(0, 1; X)^*} is still there: any {g\in L_q(0, 1; X^*)} acts on {L_p(0, 1; X)} by {f\mapsto \int \langle f(t), g(t) \rangle\, dt}. But unless {X^*} has the Radon–Nikodym property, there are more bounded linear functionals on {L_p(0, 1; X)}.

To construct such a functional, let {b_n(t)} be the {n}-th binary digit of {t}. Given {f\in L_1(0, 1; \ell_1)}, write it in coordinates as {(f_1, f_2, \dots)} and define {\varphi(f) = \sum_n \int_0^1 f_n b_n}. This is a bounded linear functional, since {|\varphi(f)|\le \sum_n \int_0^1 |f_n| = \|f\|}. But there is no function {g\in L_\infty(0, 1; \ell_\infty)} that represents it, i.e., {\varphi(f) = \int_0^1 \langle f(t), g(t)\rangle \,dt = \sum_n \int_0^1 f_n g_n }. Indeed, if such {g} existed then by considering {f} with only one nonzero coordinate, we find that {g_n} must be {b_n}, using the duality {L_1^* =  L_\infty} in the scalar case. But the function {[0, 1]\to \ell_\infty} with the components {(b_1, b_2, \dots)} is not measurable, as shown above.

This example, which applies to all {1\le p<\infty}, also serves as a reminder that the duality relation {L_p(0, 1; X)^* = L_q(0, 1; X^*)} depends on the dual space {X^*} having the Radon-Nikodym property (RNP), not {X} itself. Indeed, {X=\ell_1} has the RNP; its dual does not.

The importance of {X^*} having the RNP becomes clear once one tries to follow the usual proof of {L_p^*=L_q}. Given {\varphi\in L_p(0,1;X)^*}, we can define an {X^*}-valued measure {\tau} on {[0, 1]} by {\tau(A)(v) = \varphi( v\chi_A)} where {A\subset [0, 1]} is Lebesgue measurable and {v\in X}. This measure has reasonable finiteness and continuity properties coming from {\varphi} being a bounded functional. Still, the existence of a density {g\colon [0, 1]\to X^*} of the measure {\tau} depends on the structure of the Banach space {X^*}.

Modes of convergence on abstract (and not so abstract) sets

In how many ways can a series of real-valued functions on an abstract set converge? Having no measure on the domain eliminates the infinitude of modes of convergence based on integral norms. I can think of five modes of convergence of {\sum f_n} where {f_n\colon X\to \mathbb R}:

  • (P) Pointwise convergence: {\sum f_n(x)} converges for each {x\in X}.
  • (U) Uniform convergence: the partial sums {S_n} of the series converge to some function {f\colon X\to\mathbb R} uniformly, i.e., {\sup |S_n-f| \to 0}.
  • (PA) Pointwise convergence of absolute values: {\sum |f_n(x)|} converges for each {x\in X}.
  • (UA) Uniform convergence of absolute values: like uniform, but for {\sum |f_n|}.
  • (M) Weierstrass M-test convergence: {\sum \sup |f_n| } converges.

Implications (all easy): (M) implies (UA), which implies both (U) and (PA). Neither (U) nor (PA) implies the other one, but each of them implies (P).

Perhaps (U) and (PA) deserve an illustration, being incomparable. Let {X = [0, 1]}. The constant functions {f_n(x)=(-1)^n/n} form a series that converges uniformly but not in the sense (PA). In the opposite direction, a series of triangles with height 1 and disjoint supports converges (PA) but not (U).
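Both examples can be made concrete. In the sketch below the triangles are placed on the intervals {[2^{-n}, 2^{1-n}]}, which is one possible choice of disjoint supports and an assumption of this illustration; `triangle` and `tail_at_peak` are illustrative names.

```python
def triangle(n, x):
    # Tent of height 1 supported on [2**-n, 2**(1-n)], n = 1, 2, ...
    left, right = 2.0 ** -n, 2.0 ** (1 - n)
    mid = (left + right) / 2
    if x <= left or x >= right:
        return 0.0
    if x <= mid:
        return (x - left) / (mid - left)
    return (right - x) / (right - mid)

def tail_at_peak(N):
    # Value of sum_{n > N} |f_n| at the peak of f_{N+1}.
    # Only the term n = N + 1 is nonzero there.
    x = 1.5 * 2.0 ** -(N + 1)
    return sum(triangle(n, x) for n in range(N + 1, N + 50))

# Uniform convergence would force the tails to shrink in sup norm.
tails = [tail_at_peak(N) for N in (1, 5, 20)]

# For the constants (-1)**n / n the series of absolute values is harmonic.
harmonic = sum(1.0 / n for n in range(1, 10**5))
print(tails, harmonic)
```

At each fixed {x} at most one triangle is nonzero, so (PA) holds trivially, yet the sup norm of every tail stays equal to 1. The harmonic partial sums keep growing, which is the failure of (PA) for the alternating constants.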


Notably, the sum of the latter series is not a continuous function. This had to happen: by Dini’s theorem, if a series of continuous functions on a compact space is (PA)-convergent and the sum {\sum |f_n|} is continuous, then it is (UA)-convergent. This “self-improving” property of (PA) convergence will come up again in the second part of this post.

From abstract sets to normed spaces

In functional analysis, the elements of a normed space {E} can often be usefully interpreted as functions on the unit ball of the dual space {E^*}. Indeed, each {x\in E} induces {f_x(z) = z(x)} for {z\in E^*}. Applying the aforementioned modes of convergence to {\sum x_n} with {x_n\in E}, we arrive at

  • (P) ⇔ Convergence in the weak topology of E.
  • (U) ⇔ Convergence in the norm topology of E.
  • (PA) ⇔ Unconditional convergence in the weak topology of E.
  • (UA) ⇔ Unconditional convergence in the norm topology of E.
  • (M) ⇔ Absolute convergence, i.e., {\sum \|x_n\|} converges.

The equivalences (P), (U), (M) are straightforward exercises, but the unconditional convergence merits further discussion. For one thing, there are subtly different approaches to formalizing the concept. Following “Normed Linear Spaces” by M. M. Day, let’s say that a series {\sum x_n} is

  • (RC) Reordered convergent if there exists {x} such that {\sum x_{\pi(n)} =x} for every bijection {\pi:\mathbb{N}\to\mathbb{N}}.
  • (UC) Unordered convergent if there exists {x} such that for every neighborhood {U} of {x} there exists a finite set {E\subset \mathbb{N}} with the property that {\sum_{n\in F}x_n\in U} for every finite set {F} containing {E}.
  • (SC) Subseries convergent if for every increasing sequence of integers {(n_k)} the series {\sum x_{n_k}} converges.
  • (BC) Bounded-multiplier convergent if for every bounded sequence of scalars {(a_n)}, the series {\sum a_n x_n} converges.

In a general locally convex space, (BC) ⇒ (SC) ⇒ (UC) ⇔ (RC). The issue with reversing the first two implications is that they involve the existence of a sum for some new series, and if the space lacks completeness, the sum might fail to exist for no good reason. All four properties are equivalent in sequentially complete spaces (those where every Cauchy sequence converges).

Let’s prove the interpretation of (PA) stated above, using the (BC) form of unconditional convergence. Suppose {\sum x_n} converges in the sense (PA), that is, for every linear functional {z} the series {\sum |z(x_n)|} converges. Then it’s clear that {\sum a_n x_n} has the same property for any bounded scalar sequence {(a_n)}. That is, (PA) implies bounded-multiplier convergence in the weak topology. Conversely, suppose {\sum x_n} enjoys weak bounded-multiplier convergence and let {z\in E^*}. Multiplying each {x_n} by a suitable unimodular factor {a_n} we can get {z(a_n x_n) = |z(x_n)|\ge 0} for all {n}. Now the weak convergence of {\sum a_n x_n} yields the pointwise convergence of {\sum |z(x_n)|}.

A theorem of Orlicz, proved in the aforementioned book by Day, says that (SC) convergence in the weak topology of a Banach space is equivalent to (SC) convergence in the norm topology. Thanks to completeness, in the norm topology of a Banach space all forms of unconditional convergence are equivalent. The upshot is that (PA) automatically upgrades to (UA) in the context of the elements of a Banach space being considered as functions on the dual unit ball.

Retraction by contraction

The Euclidean space {\mathbb R^n} has a nice property: every closed convex subset {C\subset \mathbb R^n} is the image of the whole space under a map {f} that is simultaneously:

  • a contraction, meaning {\|f(a)-f(b)\|\le \|a-b\|} for all {a,b\in\mathbb R^n};
  • a retraction, meaning {f(a)=a} for all {a\in C}.

Indeed, we can simply define {f(x)} to be the (unique) nearest point of {C} to {x}; it takes a bit of work to verify that {f} is a contraction, but not much.


In other normed spaces, this nearest point projection does not work that well. For example, take {X=\ell_1^2}, the two-dimensional space with the Manhattan metric. Consider the line {C=\{(2t,t)\colon t\in\mathbb R\}} which is a closed and convex set. The nearest point of {C} to {(2,0)} is {(2,1)}: moving straight up, since changing the first coordinate doesn’t pay off. Since {(0,0)} remains fixed, the nearest point projection increases some pairwise distances, in this case from {2} to {3}.

However, there is a contractive retraction onto this line, given by the formula {T(x_1,x_2) = \left(\frac23(x_1+x_2), \frac13(x_1+x_2)\right)}. Indeed, this is a linear map that fixes the line {x_1=2x_2} pointwise and has norm {1} because

{\displaystyle \|T(x)\|_1 = |x_1+x_2|\le \|x\|_1}
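A quick randomized check of both claims, as a sketch: the sampling range, the seed and the variable names are assumptions of the illustration, and a random test supports the inequality without proving it.

```python
import random

def norm1(v):
    return abs(v[0]) + abs(v[1])

def T(v):
    # The linear projection onto the line x1 = 2 * x2 from the text.
    s = v[0] + v[1]
    return (2 * s / 3, s / 3)

random.seed(0)
worst_ratio = 0.0
for _ in range(10000):
    a = (random.uniform(-5, 5), random.uniform(-5, 5))
    b = (random.uniform(-5, 5), random.uniform(-5, 5))
    d = norm1((a[0] - b[0], a[1] - b[1]))
    Ta, Tb = T(a), T(b)
    if d > 0:
        ratio = norm1((Ta[0] - Tb[0], Ta[1] - Tb[1])) / d
        worst_ratio = max(worst_ratio, ratio)

# Nearest-point projection sends (2, 0) to (2, 1) and fixes (0, 0).
before = norm1((2 - 0, 0 - 0))
after = norm1((2 - 0, 1 - 0))
print(worst_ratio, T((4.0, 2.0)), before, after)
```

The observed ratio never exceeds 1 (up to rounding), the point {(4,2)} on the line is fixed, and the nearest-point projection stretches the distance between {(0,0)} and {(2,0)} from 2 to 3.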


More generally, in every normed plane, every closed convex subset admits a contractive retraction. To prove this, it suffices to consider closed halfplanes, since a closed convex set is an intersection of such, and contractions form a semigroup. Furthermore, it suffices to consider lines, because having a contractive retraction onto a line, we can redefine it to be the identity map on one side of the line, and get a contractive retraction onto a halfplane.

Such a retraction onto a line, which is a linear map, is illustrated below.

Every line in a normed plane is 1-complemented

Given the unit circle (black) and a line (blue), draw supporting lines (red) to the unit circle at the points where it meets the line. Project onto the blue line along the red ones. By construction, the image of the unit disk under this projection is contained in the unit disk. This precisely means that the map has operator norm {1}.


In spaces of dimensions {3} or higher, there are closed convex subsets without a contractive retraction. For example, consider the plane in {\ell_\infty^3} passing through the points {A = (2,2,0)}, {B= (2,0,2)}, and {C= (0,2,2)}. This plane has equation {x_1+x_2+x_3=4}. The point {D=(1,1,1)} is at distance {1} from each of A, B, C, and it does not lie on the plane. For any point E other than D, at least one of the distances AE, BE, CE exceeds 1. Hence a contractive retraction onto the plane cannot exist: it would fix A, B, C and so would have to send D to a point of the plane within distance 1 of each of them. More precisely, the best place to project D is {(4/3, 4/3, 4/3)}, which is at distance {4/3} from A, B, and C.
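The numbers in this example can be verified with exact arithmetic. In the sketch below the names `dist_inf`, `P` and `box` are illustrative; the key observation is that unit balls of {\ell_\infty^3} are cubes, so their intersection is computed coordinate by coordinate.

```python
from fractions import Fraction

def dist_inf(p, q):
    # Distance in the sup norm on R^3.
    return max(abs(a - b) for a, b in zip(p, q))

A, B, C = (2, 2, 0), (2, 0, 2), (0, 2, 2)
D = (1, 1, 1)
P = (Fraction(4, 3),) * 3   # lies on the plane x1 + x2 + x3 = 4

dist_D = [dist_inf(D, V) for V in (A, B, C)]
dist_P = [dist_inf(P, V) for V in (A, B, C)]

# Coordinate-wise intersection of the three closed unit balls (cubes).
box = [
    (max(V[i] - 1 for V in (A, B, C)), min(V[i] + 1 for V in (A, B, C)))
    for i in range(3)
]
print(dist_D, dist_P, box)
```

Each interval in `box` degenerates to the single value 1, confirming that D is the only point within distance 1 of all three of A, B, C.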


Two natural questions: (a) is there a nice characterization of normed spaces that admit a contractive retraction onto every closed convex subset? (b) what is the smallest constant {L=L(n)} such that every {n}-dimensional normed space admits an {L}-Lipschitz retraction onto every closed convex subset?

(The answers may well be known, but not to me at present.)

The shortest circle is a hexagon

Let {\|\cdot\|} be some norm on {{\mathbb R}^2}. The norm induces a metric, and the metric yields a notion of curve length: the supremum of sums of distances over partitions. The unit circle {C=\{x\in \mathbb R^2\colon \|x\|=1\}} is a closed curve; how small can its length be under the norm?

For the Euclidean norm, the length of the unit circle is {2\pi\approx 6.28}. But it can be less than that: if {C} is a regular hexagon, its length is exactly {6}. Indeed, each of the sides of {C} is a unit vector with respect to the norm defined by {C}, being a parallel translate of a vector connecting the center to a vertex.

Hexagon as unit disk

To show that {6} cannot be beaten, suppose that {C} is the unit circle for some norm. Fix a point {p\in C}. Draw the circle {\{x\colon \|x-p\|=1\}}; it will cross {C} at some point {q}. The points {p,q,q-p, -p, -q, p-q} are vertices of a hexagon inscribed in {C}. Since every side of the hexagon has length {1}, the length of {C} is at least {6}.
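For polygonal unit circles the length is a finite sum, so it can be computed directly. The sketch below builds the norm whose unit ball is a given convex polygon and measures the polygon's perimeter in that norm; `polygon_norm` and `circle_length` are illustrative names, and the vertices are assumed to be listed counterclockwise around the origin.

```python
import math

def polygon_norm(vertices):
    # Gauge of a convex polygon with 0 in its interior: ||x|| is the max
    # over edges of <x, n> / <v, n>, where n is an outward normal of the
    # edge starting at the vertex v.
    edges = []
    m = len(vertices)
    for k in range(m):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % m]
        n = (y1 - y0, x0 - x1)   # outward normal for counterclockwise order
        edges.append((n, n[0] * x0 + n[1] * y0))
    return lambda x: max((n[0] * x[0] + n[1] * x[1]) / h for n, h in edges)

def circle_length(vertices):
    # Length of the polygonal unit circle, measured in its own norm.
    norm = polygon_norm(vertices)
    m = len(vertices)
    total = 0.0
    for k in range(m):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % m]
        total += norm((x1 - x0, y1 - y0))
    return total

hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
diamond = [(1, 0), (0, 1), (-1, 0), (0, -1)]    # unit circle of the l_1 norm
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]   # unit circle of the l_inf norm

lengths = [circle_length(p) for p in (hexagon, diamond, square)]
print(lengths)
```

The hexagon gives 6 up to rounding, while the two square-shaped unit circles both have length 8, comfortably above the lower bound.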

It takes more effort to prove that the regular hexagon and its affine images are the only unit circles of length {6}; a proof can be found in Geometry of Spheres in Normed Spaces by Juan Jorge Schäffer.

Nonlinear Closed Graph Theorem

If a function {f\colon {\mathbb R}\rightarrow {\mathbb R}} is continuous, then its graph {G_f = \{(x,f(x))\colon x\in{\mathbb R}\}} is closed. The converse is false: a counterexample is given by any extension of {y=\tan x} to the real line.

Closed graph, not continuous

The Closed Graph Theorem of functional analysis states that a linear map between Banach spaces is continuous whenever its graph is closed. Although the literal extension to nonlinear maps fails, it’s worth noting that linear maps are either continuous or discontinuous everywhere. Hence, if one could show that a nonlinear map with a closed graph has at least one point of continuity, that would be a nonlinear version of the Closed Graph Theorem.


Here is an example of a function with a closed graph and an uncountable set of discontinuities. Let {C\subset {\mathbb R}} be an uncountable closed set with empty interior (for example, the Cantor set), and define

{\displaystyle f(x) = \begin{cases} 0,\quad & x\in C \\ \textrm{dist}\,(x,C)^{-1},\quad & x\notin C \end{cases}}

For a general function, the set of discontinuities is an Fσ set. When the graph is closed, we can say more: the set of discontinuities is closed. Indeed, suppose that a function {f} is bounded in a neighborhood of {a} but is not continuous at {a}. Then there are two sequences {x_n\rightarrow a} and {y_n\rightarrow a} such that both sequences {f(x_n)} and {f(y_n)} converge but have different limits. Since at least one of these limits must be different from {f(a)}, the graph of {f} is not closed. Conclusion: a function with a closed graph is continuous at {a} if and only if it is bounded in a neighborhood of {a}. In particular, the set of discontinuities is closed.


Furthermore, the set of discontinuities has empty interior. Indeed, suppose that {f} is discontinuous at every point of a nontrivial closed interval {[a,b]}. Let {A_n = G_f \cap ([a,b]\times [-n,n])}; this is a closed bounded set, hence compact. Its projection onto the {x}-axis is also compact, and this projection is exactly the set {B_n=\{x\in [a,b] : |f(x)|\le n\}}. Thus, {B_n} is closed. The set {B_n} has empty interior, since otherwise {f} would be continuous at its interior points. Finally, {\bigcup B_n=[a,b]}, contradicting the Baire Category theorem.

Summary: for closed-graph functions on {\mathbb R}, the sets of discontinuity are precisely the closed sets with empty interior. In particular, every such function has a point of continuity. The proof works just as well for maps from {\mathbb R^n} to any metric space.


However, the above result does not extend to the setting of Banach spaces. Here is an example of a map {F\colon X\rightarrow X} on a Banach space {X} such that {\|F(x)-F(y)\|=1} whenever {x\ne y}; this property implies that the graph is closed, despite {F} being discontinuous everywhere.

Let {X} be the space of all bounded functions {\phi \colon (0,1]\rightarrow\mathbb R} with the supremum norm. Let {(q_n)_{n=1}^\infty} be an enumeration of all rational numbers. Define the function {\psi =F(\phi )} separately on each subinterval {(2^{-n},2^{1-n}]}, {n=1,2,\dots} as

{\displaystyle \psi(t) = \begin{cases} 1 \quad &\text{if } \phi (2^nt-1) > q_n \\ 0 \quad &\text{if } \phi (2^n t-1)\le q_n\end{cases}}

For any two distinct elements {\phi_1,\phi_2} of {X} there is a point {s\in (0,1]} and a number {n\in\mathbb N} such that {q_n} is strictly between {\phi_1(s)} and {\phi_2(s)}. According to the definition of {F} this implies that the functions {F(\phi_1)} and {F(\phi_2)} take on different values at the point {t=2^{-n}(s+1)}. Thus the norm of their difference is {1}.


So much for the Nonlinear Closed Graph Theorem. However, the space {X} in the above example is nonseparable. Is there a nowhere continuous map between separable Banach spaces such that its graph is closed?