f(f(x)) = 4x

There are plenty of continuous functions {f} such that {f(f(x)) \equiv x}. Besides the trivial examples {f(x)=x} and {f(x)=-x}, one can take any equation {F(x,y)=0} that is symmetric in {x,y} and has a unique solution for one variable in terms of the other. For example: {x^3+y^3-1 =0 } leads to {f(x) = (1-x^3)^{1/3}}.

[Figure: the curve x^3+y^3 = 1]
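
A quick numerical sanity check of this involution (a sketch assuming numpy; not part of the original post):

import numpy as np

def f(x):
    # the involution obtained from x^3 + y^3 = 1; np.cbrt takes real cube roots
    return np.cbrt(1 - x**3)

x = np.linspace(-3, 3, 13)
print(np.allclose(f(f(x)), x))  # True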

I can’t think of an explicit example that is also differentiable, but implicitly one can be defined by {x^3+y^3+x+y=1}, for example. In principle, this can be made explicit by solving the cubic equation for {x}, but I’d rather not.

[Figure: the curve x^3+y^3+x+y = 1]
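
While I would rather not write down the solution of the cubic, a machine can find it numerically. Here is a sketch (the helper is my own, assuming numpy) that evaluates this implicit f as the unique real root of y^3 + y = 1 - x - x^3:

import numpy as np

def f(x):
    # y^3 + 0*y^2 + y + (x^3 + x - 1) = 0 has exactly one real root,
    # since t -> t^3 + t is strictly increasing
    roots = np.roots([1, 0, 1, x**3 + x - 1])
    return roots[np.argmin(np.abs(roots.imag))].real

x = 0.3
print(f(f(x)))  # recovers 0.3, as an involution should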

At the time of writing, I could not think of any diffeomorphism {f\colon \mathbb R \rightarrow \mathbb R} such that both {f} and {f^{-1}} have a nice explicit form. But Carl Feynman pointed out in a comment that the hyperbolic sine {f(x)= \sinh x = (e^x-e^{-x})/2} has the inverse {f^{-1}(x) = \log(x+\sqrt{x^2+1})} which certainly qualifies as nice and explicit.
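
For the record, a one-line numerical check of this pair (math.asinh is the library version of the same inverse):

import math

y = math.sinh(0.7)
print(math.log(y + math.sqrt(y*y + 1)))  # 0.7
print(math.asinh(y))                     # same value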


Let’s change the problem to {f(f(x))=4x}. There are still two trivial, linear solutions: {f(x)=2x} and {f(x)=-2x}. Are there any others? The new equation imposes stronger constraints on {f}: for example, it implies

\displaystyle f(4x) = f(f(f(x))) = 4f(x)

But here is a reasonably simple nonlinear continuous example: define

\displaystyle f(x) = \begin{cases} 2^x,\quad & 1\le x\le 2 \\ 4\log_2 x,\quad &2\le x\le 4 \end{cases}

and extend to all {x} by {f(\pm 4x) = \pm 4f(x)}. The result looks like this, with the line {y=2x} drawn in red for comparison.

[Figure: f(f(x)) = 4x]

To check that this works, notice that {2^x} maps {[1,2]} to {[2,4]}, which the function {4\log_2 x} maps to {[4,8]}, and of course {4\log _2 2^x = 4x}.

From the plot, this function may appear to be differentiable for {x\ne 0}, but it is not. For example, at {x=2} the left derivative is {4\ln 2 \approx 2.8} while the right derivative is {2/\ln 2 \approx 2.9}.
This could be fixed by picking another building block instead of {2^x}, but it is not worth the effort. After all, the property {f(4x)=4f(x)} is inconsistent with differentiability at {0} as long as {f} is nonlinear: if {f'(0)} existed, then {f(x)/x = f(4^{-n}x)/(4^{-n}x) \rightarrow f'(0)} as {n\rightarrow\infty}, which would force {f(x)=f'(0)\,x} for all {x}.

The plots were made in Sage, with the function f defined thus:

import math

def f(x):
    # f(0) = 0; f is odd and satisfies f(4x) = 4f(x)
    if x == 0:
        return 0
    xa = abs(x)
    m = math.floor(math.log(xa, 2))  # xa lies in [2^m, 2^(m+1))
    if m % 2 == 0:
        # rescaled copy of 2^x: f(xa) = 2^m * 2^(xa / 2^m)
        return math.copysign(2**(m + xa/2**m), x)
    else:
        # rescaled copy of 4*log_2(x): f(xa) = 2^(m+1) * (log_2(xa) - m + 1)
        return math.copysign(2**(m+1) * (math.log(xa, 2) - m + 1), x)
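
A quick check that this implementation satisfies both the functional equation and the scaling property (the test values are mine):

for x in [0.1, -0.7, 1.5, 3, -12.25, 100]:
    assert abs(f(f(x)) - 4*x) < 1e-9
    assert abs(f(4*x) - 4*f(x)) < 1e-9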

3 calculus 3 examples

The function {f(x,y)=\dfrac{xy}{x^2+y^2}} might be the world’s most popular example demonstrating that the existence of partial derivatives does not imply differentiability.

[Figure: xy/(x^2+y^2)]

But in my opinion, it is somewhat extreme and potentially confusing, with discontinuity added to the mix. I prefer

\displaystyle  f(x,y)=\frac{xy}{\sqrt{x^2+y^2}}

pictured below.

[Figure: xy/sqrt(x^2+y^2)]

This one is continuous. In fact, it is Lipschitz continuous because the first-order partials {f_x} and {f_y} are bounded. The restriction of {f} to the line {y=x} is {f(x,x)=x^2/\sqrt{2x^2} = |x|/\sqrt{2}}, which is a familiar single-variable example of a nondifferentiable function.

To unify the analysis of such examples, let {f(x,y)=xy\,g(x^2+y^2)}. Then

\displaystyle    f_x = y g+ 2x^2yg'

With {g(t)=t^{-1/2}}, where {t=x^2+y^2}, we get

\displaystyle    f_x = O(t^{1/2}) t^{-1/2} + O(t^{3/2})t^{-3/2} = O(1),\quad t\rightarrow 0

By symmetry, {f_y} is bounded as well.
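
As a side check, sympy will happily confirm that {f_x} is bounded in this case by simplifying it to a closed form (a sketch of mine, not part of the original post):

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x*y/sp.sqrt(x**2 + y**2)
fx = sp.simplify(sp.diff(f, x))
print(fx)  # y**3/(x**2 + y**2)**(3/2), which is bounded by 1 in absolute value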

My favorite example from this family is more subtle, with a deceptively smooth graph:

[Figure: Looks like xy]

The formula is

\displaystyle    f(x,y)=xy\sqrt{-\log(x^2+y^2)}

Since {f} decays almost quadratically near the origin, it is differentiable at {(0,0)}. Indeed, the first order derivatives {f_x} and {f_y} are continuous, as one may observe using {g(t)=\sqrt{-\log t}} above.

And the second-order partials {f_{xx}} and {f_{yy}} are also continuous, if just barely. Indeed,

\displaystyle    f_{xx} = 6xy g'+ 4x^3yg''

Since the growth of {g} is sub-logarithmic, it follows that {g'(t)=o(t^{-1})} and {g''(t)=o(t^{-2})}. Hence,

\displaystyle    f_{xx} = O(t) o(t^{-1}) + O(t^{2}) o(t^{-2}) = o(1),\quad t\rightarrow 0

So, {f_{xx}(x,y)\rightarrow 0 = f_{xx}(0,0)} as {(x,y)\rightarrow (0,0)}. Even though the graph of {f_{xx}} looks quite similar to the first example in this post, this one is continuous. Can’t trust these plots.

[Figure: Despite its appearance, f_{xx} is continuous]

By symmetry, {f_{yy}} is continuous as well.
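
For the skeptical reader, here is a sympy sketch (my own verification, with g substituted explicitly) confirming the formula for {f_{xx}} and its limit along the diagonal:

import sympy as sp

x, y, s = sp.symbols('x y s', positive=True)
g = sp.sqrt(-sp.log(s))
f = x*y*g.subs(s, x**2 + y**2)
fxx = sp.diff(f, x, 2)
claimed = (6*x*y*sp.diff(g, s) + 4*x**3*y*sp.diff(g, s, 2)).subs(s, x**2 + y**2)
print(sp.simplify(fxx - claimed))           # 0
print(sp.limit(fxx.subs(y, x), x, 0, '+'))  # 0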

But the mixed partial {f_{xy}} does not exist at {(0,0)}, and tends to {+\infty} as {(x,y)\rightarrow (0,0)}. The first claim is obvious once we notice that {f_x(0,y)= y\, g(y^2)} and {g} blows up at {0}. The second one follows from

\displaystyle    f_{xy} = g + 2(x^2+y^2) g' + 4x^2y^2 g''

where {g\rightarrow\infty} while the other two terms tend to zero, as in the estimate for {f_{xx}}. Here is the graph of {f_{xy}}.

[Figure: Up, up and away]
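
The blow-up can also be checked symbolically along the diagonal (same caveats as the sketch above):

import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = x*y*sp.sqrt(-sp.log(x**2 + y**2))
fxy = sp.diff(f, x, y)
print(sp.limit(fxy.subs(y, x), x, 0, '+'))  # oo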

This example is significant for the theory of partial differential equations, because it shows that a solution of the Poisson equation {f_{xx}+f_{yy} = h } with continuous {h} may fail to be in {C^2} (twice differentiable, with continuous derivatives). The expected gain of two derivatives does not materialize here.

The situation is rectified by upgrading the continuity condition to Hölder continuity. Then {f} indeed gains two derivatives: if {h\in C^\alpha} for some {\alpha\in (0,1)}, then {f\in C^{2,\alpha}}. In particular, the Hölder continuity of {f_{xx} } and {f_{yy} } implies the Hölder continuity of {f_{xy} }.

How much multivariable calculus can be done along curves?

Working with functions of two (or more) real variables is significantly harder than with functions of one variable. It is tempting to reduce the complexity by considering the restrictions of a multivariate function to lines passing through a point of interest. But standard counterexamples of Calculus III, such as \displaystyle f(x,y)=\frac{xy^2}{x^2+y^4}, f(0,0)=0, show that lines are not enough: this function f is not continuous at (0,0), even though its restriction to every line is continuous. It takes a parabola, such as x=y^2, to detect the discontinuity.
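
A few numerical samples make the contrast visible (a small demo of my own, assuming numpy):

import numpy as np

def f(x, y):
    # f(0,0) = 0 by definition; we only evaluate away from the origin
    return x*y**2/(x**2 + y**4)

t = np.array([0.1, 0.01, 0.001])
print(f(t, t))     # along the line x = y: tends to 0
print(f(t**2, t))  # along the parabola x = y^2: identically 1/2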

Things look brighter if we allow parabolas and other curves into consideration.

Continuity: f is continuous at a\in\mathbb R^n if and only if f\circ \gamma is continuous at 0 for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma is continuous at 0.

Proof: The “only if” direction is immediate. For the “if” direction: if f is not continuous at a, we can find a sequence a_n\to a such that f(a_n)\not\to f(a), and run \gamma through these points, for example in a piecewise linear way.

Having been successful at the level of continuity, we can hope for a similar differentiability result:

Differentiability, take 1: f is differentiable at a\in\mathbb R^n if and only if f\circ \gamma is differentiable at 0 for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma'(0) exists.

Alas, this is false. Take a continuous function g\colon S^{n-1}\to \mathbb R which preserves antipodes (i.e., g(-x)=-g(x)) and extend it to \mathbb R^n via f(tx)=tg(x). Consider \gamma as above, with a\in \mathbb R^n being the origin. If \gamma'(0)=0, then (f\circ \gamma)'(0)=0 because f is Lipschitz. If \gamma'(0)\ne 0, we can rescale the parameter so that \gamma'(0) is a unit vector. It is easy to see that \displaystyle \frac{f(\gamma(t))}{t}= \frac{f(\gamma(t))}{|\gamma(t)|\mathrm{sign}\,t} \frac{|\gamma(t)|}{|t|}\to g(\gamma'(0)), hence f\circ \gamma is differentiable at 0. However, f is not differentiable at a unless g happens to be the restriction of a linear map.
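
For a concrete instance in the plane (the choice of g here is mine), take g(\cos\theta,\sin\theta)=\cos 3\theta; its homogeneous extension is f(x,y)=(x^3-3xy^2)/(x^2+y^2), and the directional derivative at the origin in direction \theta is \cos 3\theta, which is not linear in the direction:

import math

def f(x, y):
    # degree-1 homogeneous extension of g(cos t, sin t) = cos(3t);
    # g preserves antipodes because cos(3(t + pi)) = -cos(3t)
    return (x**3 - 3*x*y**2)/(x**2 + y**2)

# by homogeneity, the directional derivative at 0 in direction v is f(v)
print(f(1.0, 0.0))  # 1.0
print(f(0.0, 1.0))  # 0.0
c = math.cos(math.pi/4)
print(f(c, c))      # cos(3*pi/4) = -0.707..., while linearity would predict +0.707...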

I can’t think of a way to detect the nonlinearity of the directional derivative by probing f with curves. Apparently, it has to be imposed artificially.

Differentiability, take 2: f is differentiable at a\in\mathbb R^n if and only if there exists a linear map T such that (f\circ \gamma)'(0)=T\gamma'(0) for every map \gamma\colon \mathbb R\to \mathbb R^n such that \gamma(0)=a and \gamma'(0) exists.

Note that the only viable candidate for T is given by partial derivatives, and those are computed along lines. Thus, we are able to determine the first-order differentiability of f using only the tools of single-variable calculus.

Proof goes along the same lines as for continuity, with extra care taken in forming \gamma.

  1. We may assume that T=0 by subtracting Tx from our function. Also assume a=0.
  2. Suppose f is not differentiable at 0. Pick \epsilon>0 and a sequence v_k\to 0 such that |f(v_k)|\ge \epsilon |v_k| for all k.
  3. Passing to a subsequence, make sure that v_k/|v_k| tends to a unit vector v, and also that |v_{k+1}|\le 2^{-k}|v_k|.
  4. Connect the points v_k by line segments. Parametrize this piecewise-linear curve by arc length.
  5. The distance from v_{k+1} to v_k is bounded by |v_{k+1}|+|v_k|\le (1+2^{-k})|v_k|, by the triangle inequality. Hence, the total length between 0 and v_k does not exceed \sum_{m\ge k}(1+2^{-m})|v_m| \le (1+c_k)|v_k|, where c_k\to 0 as k\to \infty.
  6. By 3, 4, and 5 the constructed curve \gamma has a one-sided derivative when it reaches 0. Shift the parameter so that \gamma(0)=0. Extend \gamma linearly to get two-sided derivative at 0.
  7. By assumption, |f(\gamma (t))|/|t|\to 0 as t\to 0. This contradicts 2 and 5.

Can one go further and detect the second order differentiability by probing f with paths? But the second derivative is not a pointwise asymptotic condition: it requires the first derivative to exist in a neighborhood. The pointwise second derivative might be possible to detect, but I’m not sure… and it’s getting late.