Nonlinear Closed Graph Theorem

If a function {f\colon {\mathbb R}\rightarrow {\mathbb R}} is continuous, then its graph {G_f = \{(x,f(x))\colon x\in{\mathbb R}\}} is closed. The converse is false: a counterexample is given by any extension of {y=\tan x} to the real line (the values at the points {\pi/2+\pi k} can be assigned arbitrarily, since {\tan} blows up there).

Closed graph, not continuous

The Closed Graph Theorem of functional analysis states that a linear map between Banach spaces is continuous whenever its graph is closed. Although the literal extension to nonlinear maps fails, it’s worth noting that linear maps are either continuous or discontinuous everywhere. Hence, if one could show that a nonlinear map with a closed graph has at least one point of continuity, that would be a nonlinear version of the Closed Graph Theorem.

Here is an example of a function with a closed graph and an uncountable set of discontinuities. Let {C\subset {\mathbb R}} be an uncountable closed set with empty interior (such as the Cantor set), and define

\displaystyle f(x) = \begin{cases} 0,\quad & x\in C \\ \textrm{dist}\,(x,C)^{-1},\quad & x\notin C \end{cases}
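A quick numerical sketch of this construction, with the hypothetical simplification {C=\mathbb Z} (closed, with empty interior; for uncountably many discontinuities one would take the Cantor set instead):

```python
import math

# Sketch with the simplest hypothetical choice C = Z (the integers):
# a closed set with empty interior. For an uncountable set of
# discontinuities, C would instead be the Cantor set.
def dist_to_C(x):
    frac = x - math.floor(x)
    return min(frac, 1.0 - frac)   # distance to the nearest integer

def f(x):
    d = dist_to_C(x)
    return 0.0 if d == 0.0 else 1.0 / d

# f vanishes on C and blows up as x approaches C from outside, so the
# graph loses no limit points: it is closed.
print(f(0.5))    # farthest point from C: f = 1/0.5 = 2
print(f(0.001))  # close to the point 0 of C: f is large
```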

For a general function, the set of discontinuities is an Fσ set. When the graph is closed, we can say more: the set of discontinuities is closed. Indeed, suppose that a function {f} is bounded in a neighborhood of {a} but is not continuous at {a}. Then there is a sequence {x_n\rightarrow a} along which {f(x_n)} stays bounded away from {f(a)}; by boundedness, we may pass to a subsequence on which {f(x_n)} converges to some limit {L\ne f(a)}. Then {(a,L)} is a limit point of the graph that does not belong to it, so the graph of {f} is not closed. Conclusion: a function with a closed graph is continuous at {a} if and only if it is bounded in a neighborhood of {a}. In particular, the set of discontinuities is closed, being the set of points near which {f} is unbounded.

Furthermore, the set of discontinuities has empty interior. Indeed, suppose that {f} is discontinuous at every point of a nontrivial closed interval {[a,b]}. Let {A_n = G_f \cap ([a,b]\times [-n,n])}; this is a closed bounded set, hence compact. Its projection onto the {x}-axis is also compact, and this projection is exactly the set {B_n=\{x\in [a,b] : |f(x)|\le n\}}. Thus, {B_n} is closed. The set {B_n} has empty interior: otherwise {f} would be bounded on some interval, hence continuous at the interior points of that interval by the previous paragraph. Finally, {\bigcup B_n=[a,b]}, contradicting the Baire Category Theorem.

Summary: for closed-graph functions on {\mathbb R}, the sets of discontinuity are precisely the closed sets with empty interior. In particular, every such function has a point of continuity. The proof works just as well for maps from {\mathbb R^n} into {\mathbb R^m}, or into any metric space in which bounded sequences have convergent subsequences.

However, the above result does not extend to the setting of Banach spaces. Here is an example of a map {F\colon X\rightarrow X} on a Banach space {X} such that {\|F(x)-F(y)\|=1} whenever {x\ne y}. This property forces the graph to be closed: if {F(x_n)} converges, the sequence {x_n} must be eventually constant. Yet {F} is discontinuous everywhere.

Let {X} be the space of all bounded functions {\phi \colon (0,1]\rightarrow\mathbb R} with the supremum norm. Let {(q_n)_{n=1}^\infty} be an enumeration of all rational numbers. Define the function {\psi =F(\phi )} separately on each subinterval {(2^{-n},2^{1-n}]}, {n=1,2,\dots}, as

{\displaystyle \psi(t) = \begin{cases} 1 \quad &\text{if } \phi (2^nt-1) > q_n \\ 0 \quad &\text{if } \phi (2^n t-1)\le q_n\end{cases}}

For any two distinct elements {\phi_1,\phi_2} of {X} there is a point {s\in (0,1]} and a number {n\in\mathbb N} such that {q_n} is strictly between {\phi_1(s)} and {\phi_2(s)}. According to the definition of {F} this implies that the functions {F(\phi_1)} and {F(\phi_2)} take on different values at the point {t=2^{-n}(s+1)}. Thus the norm of their difference is {1}.
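In finite terms, the construction can be sketched like this (the enumeration {(q_n)} is truncated to a few hypothetical initial values, so the sketch only covers {t>2^{-5}}):

```python
# Sketch of F with a truncated, hypothetical start of the enumeration (q_n).
qs = [0.5, 0.0, 1.0, -1.0, 1.0 / 3.0]

def F(phi):
    """psi = F(phi), defined piecewise on (2^-n, 2^(1-n)], n = 1..len(qs)."""
    def psi(t):
        n = 1
        while t <= 2.0 ** -n:      # find n with 2^-n < t <= 2^(1-n)
            n += 1
        s = 2.0 ** n * t - 1.0     # s runs over (0, 1]
        return 1.0 if phi(s) > qs[n - 1] else 0.0
    return psi

phi1 = lambda s: 0.0
phi2 = lambda s: 1.0
# Here q_1 = 1/2 lies strictly between phi1(s) and phi2(s) for every s, so
# F(phi1) and F(phi2) differ at every t in (1/2, 1]; since both only take
# the values 0 and 1, the sup-norm distance is exactly 1.
print(F(phi2)(0.75) - F(phi1)(0.75))   # 1.0
```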

So much for a Nonlinear Closed Graph Theorem. However, the space {X} in the above example is nonseparable. Is there a nowhere continuous map between separable Banach spaces whose graph is closed?

3 calculus 3 examples

The function {f(x,y)=\dfrac{xy}{x^2+y^2}} might be the world’s most popular example demonstrating that the existence of partial derivatives does not imply differentiability.


But in my opinion, it is somewhat extreme and potentially confusing, with discontinuity added to the mix. I prefer

\displaystyle  f(x,y)=\frac{xy}{\sqrt{x^2+y^2}}

pictured below.


This one is continuous. In fact, it is Lipschitz continuous, because the first-order partials {f_x} and {f_y} are bounded. The restriction of {f} to the line {y=x} is {f(x,x)=x^2/\sqrt{2x^2} = |x|/\sqrt{2}}, a familiar single-variable example of a nondifferentiable function.
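Both claims are easy to test numerically; here is a short sketch:

```python
import math

def f(x, y):
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else x * y / r

# Restriction to the line y = x equals |x|/sqrt(2):
assert abs(f(0.3, 0.3) - 0.3 / math.sqrt(2)) < 1e-12

# One-sided derivatives of t -> f(t, t) at 0 are +1/sqrt(2) and -1/sqrt(2),
# so the restriction has a corner: f is not differentiable at the origin.
h = 1e-6
right = f(h, h) / h        # approx +1/sqrt(2)
left = f(-h, -h) / (-h)    # approx -1/sqrt(2)
print(right, left)
```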

To unify the analysis of such examples, let {f(x,y)=xy\,g(x^2+y^2)}. Then

\displaystyle    f_x = y g+ 2x^2yg'

With {g(t)=t^{-1/2}}, where {t=x^2+y^2}, we get

\displaystyle    f_x = O(t^{1/2}) t^{-1/2} + O(t^{3/2})t^{-3/2} = O(1),\quad t\rightarrow 0

By symmetry, {f_y} is bounded as well.
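A numerical sanity check of the formula and of the boundedness, with {g(t)=t^{-1/2}} (the helper names are of course ad hoc):

```python
import math

def g(t):  return t ** -0.5           # g(t) = t^(-1/2)
def gp(t): return -0.5 * t ** -1.5    # g'(t)

def f(x, y):
    return x * y * g(x * x + y * y)

def fx(x, y):                          # f_x = y g + 2 x^2 y g'
    t = x * x + y * y
    return y * g(t) + 2 * x * x * y * gp(t)

# The formula agrees with a central difference quotient:
x, y, h = 0.3, -0.7, 1e-6
fd = (f(x + h, y) - f(x - h, y)) / (2 * h)
print(abs(fd - fx(x, y)))              # tiny

# f_x stays bounded near the origin; along x = y it equals 1/(2 sqrt 2):
print(fx(1e-8, 1e-8))
```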

My favorite example from this family is more subtle, with a deceptively smooth graph:

Looks like xy

The formula is

\displaystyle    f(x,y)=xy\sqrt{-\log(x^2+y^2)}

Since {f} decays almost quadratically near the origin, it is differentiable at {(0,0)}. Indeed, the first order derivatives {f_x} and {f_y} are continuous, as one may observe using {g(t)=\sqrt{-\log t}} above.

And the second-order partials {f_{xx}} and {f_{yy}} are also continuous, if just barely. Indeed,

\displaystyle    f_{xx} = 6xy g'+ 4x^3yg''

Since the growth of {g} is sub-logarithmic, it follows that {g'(t)=o(t^{-1})} and {g''(t)=o(t^{-2})}. Hence,

\displaystyle    f_{xx} = O(t) o(t^{-1}) + O(t^{2}) o(t^{-2}) = o(1),\quad t\rightarrow 0

So, {f_{xx}(x,y)\rightarrow 0 = f_{xx}(0,0)} as {(x,y)\rightarrow (0,0)}. Even though the graph of {f_{xx}} looks quite similar to the first example in this post, this one is continuous. Can’t trust these plots.

Despite its appearance, f_{xx} is continuous

By symmetry, {f_{yy}} is continuous as well.
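Both the formula for {f_{xx}} and its slow decay can be checked numerically. In the sketch below, the diagonal values behave like {-1/\sqrt{-\log(2r^2)}}: they tend to zero, but only logarithmically slowly.

```python
import math

def g(t):   return math.sqrt(-math.log(t))
def gp(t):  return -1.0 / (2.0 * t * math.sqrt(-math.log(t)))
def gpp(t):
    u = -math.log(t)
    return (0.5 * u ** -0.5 - 0.25 * u ** -1.5) / (t * t)

def f(x, y):
    return x * y * g(x * x + y * y)

def fxx(x, y):                         # f_xx = 6 x y g' + 4 x^3 y g''
    t = x * x + y * y
    return 6 * x * y * gp(t) + 4 * x ** 3 * y * gpp(t)

# The formula matches a second central difference at a generic point:
h = 1e-5
fd = (f(0.3 + h, 0.2) - 2 * f(0.3, 0.2) + f(0.3 - h, 0.2)) / (h * h)
print(abs(fd - fxx(0.3, 0.2)))         # small

# |f_xx| -> 0 along the diagonal x = y = r, but very slowly:
vals = [abs(fxx(r, r)) for r in (1e-2, 1e-20, 1e-70)]
print(vals)                            # decreasing toward 0
```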

But the mixed partial {f_{xy}} does not exist at {(0,0)}, and tends to {+\infty} as {(x,y)\rightarrow (0,0)}. The first claim is obvious once we notice that {f_x(0,y)= y\, g(y^2)} and {g} blows up at {0}. The second one follows from

\displaystyle    f_{xy} = g + 2(x^2+y^2) g' + 4x^2y^2 g''

where {g\rightarrow\infty} while the other two terms tend to zero, as in the estimate for {f_{xx}}. Here is the graph of {f_{xy}}.

Up, up and away
Up, up and away
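A numerical look at the blow-up along the diagonal (same ad hoc helper names as before): the leading term {g(t)=\sqrt{-\log t}} dominates the two decaying terms.

```python
import math

def g(t):   return math.sqrt(-math.log(t))
def gp(t):  return -1.0 / (2.0 * t * math.sqrt(-math.log(t)))
def gpp(t):
    u = -math.log(t)
    return (0.5 * u ** -0.5 - 0.25 * u ** -1.5) / (t * t)

def fxy(x, y):          # f_xy = g + 2(x^2 + y^2) g' + 4 x^2 y^2 g''
    t = x * x + y * y
    return g(t) + 2 * t * gp(t) + 4 * x * x * y * y * gpp(t)

# Along x = y = r the value grows like sqrt(-log(2 r^2)):
vals = [fxy(r, r) for r in (1e-2, 1e-20, 1e-70)]
print(vals)             # increasing without bound
```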

This example is significant for the theory of partial differential equations, because it shows that a solution of the Poisson equation {f_{xx}+f_{yy} = h } with continuous {h} may fail to be in {C^2} (twice differentiable, with continuous second-order derivatives). The expected gain of two derivatives does not materialize here.

The situation is rectified by upgrading the continuity condition to Hölder continuity. Then, by the Schauder estimates, {f} indeed gains two derivatives: if {h\in C^\alpha} for some {\alpha\in (0,1)}, then {f\in C^{2,\alpha}}. In particular, the Hölder continuity of {f_{xx} } and {f_{yy} } implies the Hölder continuity of {f_{xy} }.

How much multivariable calculus can be done along curves?

Working with functions of two (or more) real variables is significantly harder than with functions of one variable. It is tempting to reduce the complexity by considering the restrictions of a multivariate function to lines passing through a point of interest. But standard counterexamples of Calculus III, such as

\displaystyle f(x,y)=\frac{xy^2}{x^2+y^4}, \quad f(0,0)=0,

show that lines are not enough: this function f is not continuous at (0,0), even though its restriction to every line is continuous. It takes a parabola, such as x=y^2, to detect the discontinuity.
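A quick numerical illustration: along any line through the origin the values tend to 0, while along the parabola x = y^2 they are identically 1/2.

```python
def f(x, y):
    return 0.0 if x == 0.0 and y == 0.0 else x * y * y / (x * x + y ** 4)

# Along the line y = 2x the values tend to f(0,0) = 0 ...
line_vals = [f(t, 2 * t) for t in (1e-1, 1e-3, 1e-6)]
print(line_vals)             # -> 0

# ... but along the parabola x = y^2 the value is constantly 1/2:
parab_vals = [f(t * t, t) for t in (1e-1, 1e-3, 1e-6)]
print(parab_vals)
```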

Things look brighter if we do allow parabolas and other curves into consideration.

Continuity: f is continuous at a\in\mathbb R^n if and only if f\circ \gamma is continuous at 0 for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma is continuous at 0.

Proof: If f is not continuous at a, we can find a sequence a_n\to a such that f(a_n)\not\to f(a), and run \gamma through these points: say, \gamma(1/n)=a_n and \gamma(0)=a, with piecewise linear interpolation in between. Then \gamma is continuous at 0 but f\circ\gamma is not.

Having been successful at the level of continuity, we can hope for a similar differentiability result:

Differentiability, take 1: f is differentiable at a\in\mathbb R^n if and only if f\circ \gamma is differentiable at 0 for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma'(0) exists.

Alas, this is false. Take a continuous function g\colon S^{n-1}\to \mathbb R which preserves antipodes (i.e., g(-x)=-g(x)) and extend it to \mathbb R^n via f(tx)=tg(x). Consider \gamma as above, with a\in \mathbb R^n being the origin. If \gamma'(0)=0, then (f\circ \gamma)'(0)=0 because |f(x)|\le C|x| with C=\max |g|. If \gamma'(0)\ne 0, we can rescale the parameter so that \gamma'(0) is a unit vector. It is easy to see that \displaystyle \frac{f(\gamma(t))}{t}= \frac{f(\gamma(t))}{|\gamma(t)|\,\mathrm{sign}\,t}\, \frac{|\gamma(t)|}{|t|}\to g(\gamma'(0)), hence f\circ \gamma is differentiable at 0. However, f is not differentiable at a unless g happens to be the restriction of a linear map.
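To make this concrete, here is the counterexample in coordinates for the hypothetical choice n = 2 and g(\cos a, \sin a) = \cos^3 a, which preserves antipodes; the homogeneous extension is f(x,y) = x^3/(x^2+y^2).

```python
import math

# Hypothetical instance: g(cos a, sin a) = cos(a)^3 on the circle (odd under
# the antipodal map); the extension f(tx) = t g(x) is f(x,y) = x^3/(x^2+y^2).
def f(x, y):
    r2 = x * x + y * y
    return 0.0 if r2 == 0.0 else x ** 3 / r2

# Every directional derivative at 0 exists: f(tv)/t = g(v) for unit v.
d_e1 = f(0.5, 0.0) / 0.5                  # g(e1) = 1
d_e2 = f(0.0, 0.5) / 0.5                  # g(e2) = 0
c = 1.0 / math.sqrt(2.0)
d_diag = f(0.5 * c, 0.5 * c) / 0.5        # g((e1+e2)/sqrt 2) = 2^(-3/2)

# But v -> g(v) is not linear: linearity would force d_diag = (d_e1 + d_e2)*c.
print(d_diag, (d_e1 + d_e2) * c)          # they differ, so f is not
                                          # differentiable at the origin
```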

I can’t think of a way to detect the nonlinearity of the directional derivative by probing f with curves. Apparently, linearity has to be imposed artificially.

Differentiability, take 2: f is differentiable at a\in\mathbb R^n if and only if there exists a linear map T such that (f\circ \gamma)'(0)=T\gamma'(0) for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma'(0) exists.

Note that the only viable candidate for T is given by partial derivatives, and those are computed along lines. Thus, we are able to determine the first-order differentiability of f using only the tools of single-variable calculus.

Proof goes along the same lines as for continuity, with extra care taken in forming \gamma.

  1. We may assume that T=0 by subtracting Tx from our function. Also assume a=0.
  2. Suppose f is not differentiable at 0. Pick a sequence v_k\to 0 such that |f(v_k)|\ge \epsilon |v_k| for all k.
  3. Passing to a subsequence, make sure that v_k/|v_k| tends to a unit vector v, and also that |v_{k+1}|\le 2^{-k}|v_k|.
  4. Connect the points v_k by line segments. Parametrize this piecewise-linear curve by arc length.
  5. The distance from v_{k+1} to v_k is bounded by |v_{k+1}|+|v_k|\le (1+2^{-k})|v_k|, by the triangle inequality. Hence, the total length of the curve between 0 and v_k does not exceed \sum_{m\ge k}(1+2^{-m})|v_m| \le (1+c_k)|v_k|, where c_k\to 0 as k\to \infty.
  6. By 3, 4, and 5 the constructed curve \gamma has a one-sided derivative when it reaches 0. Shift the parameter so that \gamma(0)=0. Extend \gamma linearly to get two-sided derivative at 0.
  7. By assumption, |f(\gamma (t))|/|t|\to 0 as t\to 0. This contradicts 2 and 5.

Can one go further and detect second-order differentiability by probing f with paths? The trouble is that the second derivative is not a pointwise asymptotic condition: it requires the first derivative to exist in a neighborhood. The pointwise second derivative might be possible to detect, but I’m not sure… and it’s getting late.