How to measure the nonlinearity of a function $f\colon I\to\mathbb{R}$ where $I\subset\mathbb{R}$ is an interval? A natural way is to consider the smallest possible deviation from a line $y=kx+b$, that is $\inf_{k,b}\sup_{x\in I}|f(x)-kx-b|$. It turns out to be convenient to divide this by $|I|$, the length of the interval $I$. So, let $\mathrm{NL}(f;I)=\frac{1}{|I|}\inf_{k,b}\sup_{x\in I}|f(x)-kx-b|$. (This is similar to the β-numbers of Peter Jones, except the deviation from a line is measured only in the vertical direction.)
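The quantity just described (smallest uniform deviation from a line, divided by the length of the interval) can be estimated numerically. The following is only a sketch on a uniform grid, not a definitive implementation: the name `nonlinearity`, the grid size and the iteration count are assumptions made for illustration. It uses the fact that for a fixed slope the best intercept is the mid-range of the residual, and that the resulting error is a convex function of the slope.

```python
def nonlinearity(f, a, b, samples=2001, iters=100):
    """Approximate the nonlinearity of f on [a, b] using a uniform grid."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    ys = [f(x) for x in xs]

    def half_range(k):
        # For a fixed slope k the best intercept is the mid-range of
        # f(x) - k*x, and the uniform error is half the range.
        r = [y - k * x for x, y in zip(xs, ys)]
        return (max(r) - min(r)) / 2

    # The optimal slope lies between the smallest and largest chord slopes
    # of adjacent grid points; half_range is convex in k, so a ternary
    # search over that bracket converges to the minimum.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(samples - 1)]
    lo, hi = min(slopes), max(slopes)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if half_range(m1) <= half_range(m2):
            hi = m2
        else:
            lo = m1
    return half_range((lo + hi) / 2) / (b - a)


print(nonlinearity(abs, -1, 1))                  # approximately 0.25
print(nonlinearity(lambda x: 3 * x + 1, 0, 2))   # approximately 0
```

Since only grid points are examined, the returned value can slightly underestimate the true supremum for functions that vary a lot between samples.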
Relation with derivatives
The definition of the derivative immediately implies that if $f'(a)$ exists, then $\mathrm{NL}(f;I)\to 0$ as $I$ shrinks to $a$ (that is, gets smaller while containing $a$). A typical construction of a nowhere differentiable continuous function is based on making $\mathrm{NL}(f;I)$ bounded away from zero; it is enough to do this for dyadic intervals, and that can be done by adding wiggly terms like $\operatorname{dist}(x, 2^{-n}\mathbb{Z})$: see the blancmange curve.
The converse is false: if $\mathrm{NL}(f;I)\to 0$ as $I$ shrinks to $a$, the function $f$ may still fail to be differentiable at $a$. The reason is that the affine approximation may have different slopes at different scales. An example is $f(x)=x\sin\sqrt{-\log|x|}$ in a neighborhood of $0$. Consider a small interval $[-\delta,\delta]$. The line $y=kx$ with $k=\sin\sqrt{-\log\delta}$ is a good approximation to $f$ because $f(x)/x\approx k$ on most of the interval except for a very small part near $0$, and on that part $f$ is very close to $0$ anyway.
Why the root of logarithm? Because $\sin\log|x|$ has a fixed amount of change on a fixed proportion of $[-\delta,\delta]$, independently of $\delta$. We need a function slower than the logarithm, so that as $\delta$ decreases, there is a smaller amount of change on a larger part of the interval $[-\delta,\delta]$.
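The quality of this approximation can be made quantitative. The estimate below is a sketch under the assumption that the example is $f(x)=x\sin\sqrt{-\log|x|}$ with $f(0)=0$, compared with the line $y=kx$, $k=\sin\sqrt{-\log\delta}$, on $[-\delta,\delta]$ with $0<\delta<1$. Write $T=\log(1/\delta)$ and $x=\delta e^{-u}$ with $u\ge 0$. Since the sine is $1$-Lipschitz and $ue^{-u}\le 1/e$,

$$|f(x)-kx| = x\,\bigl|\sin\sqrt{T+u}-\sin\sqrt{T}\bigr| \le \delta e^{-u}\bigl(\sqrt{T+u}-\sqrt{T}\bigr) \le \frac{\delta\,u e^{-u}}{2\sqrt{T}} \le \frac{\delta}{2e\sqrt{T}}.$$

Both $f$ and the line are odd, so the same bound holds for $x<0$. Dividing by the length $2\delta$ of the interval gives

$$\mathrm{NL}(f;[-\delta,\delta]) \le \frac{1}{4e\sqrt{\log(1/\delta)}} \to 0 \quad\text{as } \delta\to 0,$$

while the slopes $k=\sin\sqrt{\log(1/\delta)}$ keep oscillating between $-1$ and $1$, so $f'(0)$ does not exist.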
Nonlinearity of Lipschitz functions
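Suppose $f$ is a Lipschitz function, that is, there exists a constant $L$ such that $|f(x)-f(y)|\le L|x-y|$ for all $x,y\in I$. It’s easy to see that $\mathrm{NL}(f;I)\le L/2$, by taking the mid-range approximation $y=\frac12(\max_I f+\min_I f)$. But the sharp bound is $\mathrm{NL}(f;I)\le L/4$, whose proof is not as trivial. The sharpness is shown by $f(x)=|x|$ with $I=[-1,1]$.
Proof. Let $k$ be the slope of the linear function that agrees with $f$ at the endpoints of $I$. Subtracting this linear function from $f$ gives us a Lipschitz function $g$ such that $-L-k\le g'\le L-k$ almost everywhere and $\int_I g'=0$. (If $|k|=L$, then $g$ is constant and there is nothing to prove.) Let $A=\int_I (g')^+=\int_I (g')^-$. Chebyshev’s inequality gives lower bounds for the measures of the sets $\{g'>0\}$ and $\{g'<0\}$: namely, $|\{g'>0\}|\ge A/(L-k)$ and $|\{g'<0\}|\ge A/(L+k)$. By adding these, we find that $|I|\ge 2LA/(L^2-k^2)\ge 2A/L$. Since $\max_I g-\min_I g\le A$, the mid-range approximation to $g$ has error at most $A/2\le L|I|/4$. Hence $\mathrm{NL}(f;I)=\mathrm{NL}(g;I)\le L/4$.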
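The approximation used in this proof (subtract the chord through the endpoints, then take the constant mid-range of what is left) is easy to try numerically. The sketch below is illustrative only: `chord_midrange_bound` is a hypothetical helper name, and the maximum and minimum are taken over a uniform grid rather than the whole interval. The returned number is an upper bound for the nonlinearity measure obtained from this particular line, and by the argument above it should not exceed a quarter of the Lipschitz constant.

```python
import math


def chord_midrange_bound(f, a, b, samples=2001):
    """Error of the proof's line (chord slope, mid-range intercept) over (b - a)."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    k = (f(b) - f(a)) / (b - a)                   # slope of the chord
    g = [f(x) - f(a) - k * (x - a) for x in xs]   # f minus the chord
    return (max(g) - min(g)) / 2 / (b - a)        # mid-range error, normalized


# Both functions are 1-Lipschitz, so the values should be at most 1/4.
print(chord_midrange_bound(abs, -1, 1))      # the extremal example
print(chord_midrange_bound(math.sin, 0, 3))
```

For $|x|$ on $[-1,1]$ the chord is horizontal and the bound is attained, which matches the sharpness claim.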
Turns out, the graph of every Lipschitz function has relatively large almost-flat pieces. That is, there are subintervals of nontrivial size where the measure of nonlinearity is much smaller than the Lipschitz constant. This result is a special (one-dimensional) case of Theorem 2.3 in Affine approximation of Lipschitz functions and nonlinear quotients by Bates, Johnson, Lindenstrauss, Preiss, and Schechtman.
Theorem AA (for “affine approximation”): For every $\epsilon>0$ there exists $\delta>0$ with the following property. If $f\colon I\to\mathbb{R}$ is an $L$-Lipschitz function, then there exists an interval $J\subset I$ with $|J|\ge\delta|I|$ and $\mathrm{NL}(f;J)\le\epsilon L$.
Theorem AA should not be confused with Rademacher’s theorem, which says that a Lipschitz function is differentiable almost everywhere. The point here is a lower bound on the size of the interval $J$. Differentiability does not provide that. In fact, if we knew that $f$ is smooth, or even a polynomial, the proof of Theorem AA would not become any easier.
Proof of Theorem AA
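We may assume $I=[-1,1]$ and $L=1$. For $t\in(0,2]$ let $L(t)=\sup\{|f(x)-f(y)|/|x-y| \colon x,y\in I,\ |x-y|\ge t\}$. That is, $L(t)$ is the restricted Lipschitz constant, one that applies for distances at least $t$. It is a decreasing function of $t$, and $L(0+)\le 1$.
Note that $|f(-1)-f(1)|\le 2L(1)$ and that every value of $f$ is within $2L(1)$ of either $f(-1)$ or $f(1)$, because every point of $I$ is at distance at least $1$ from one of the endpoints. Hence, the oscillation of $f$ on $I$ is at most $6L(1)$. If $L(1)\le\epsilon/3$, then the constant mid-range approximation on $I$ has error at most $3L(1)\le\epsilon$ and gives the desired conclusion, with $J=I$. From now on $L(1)>\epsilon/3$.
The sequence $L_k=L(4^{-k})$ is increasing toward $L(0+)\le 1$, which implies $L_{k+1}\le(1+\epsilon)L_k$ for some $k$. Pick an interval $[a,b]\subset I$ that realizes $L_k$, that is $b-a\ge 4^{-k}$ and $|f(b)-f(a)|=L_k(b-a)$. Without loss of generality $f(b)>f(a)$ (otherwise consider $-f$). Let $J$ be the middle half of $[a,b]$. Since each point of $J$ is at distance at least $4^{-k-1}$ from both $a$ and $b$, it follows that $f(b)+L_{k+1}(x-b)\le f(x)\le f(a)+L_{k+1}(x-a)$ for all $x\in J$.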
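So far we have pinched $f$ between two affine functions of equal slope. Let us consider their difference:
$$\bigl(f(a)+L_{k+1}(x-a)\bigr)-\bigl(f(b)+L_{k+1}(x-b)\bigr)=(L_{k+1}-L_k)(b-a).$$
Recall that $L_{k+1}\le(1+\epsilon)L_k$, which gives a bound of $\epsilon L_k(b-a)=2\epsilon L_k|J|\le 2\epsilon L|J|$ for the difference. Approximating $f$ by the average of the two affine functions we conclude that $\mathrm{NL}(f;J)\le\epsilon L$ as required.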
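It remains to consider the size of $J$, about which we only know $|J|\ge 4^{-k}/2$ so far. Naturally, we want to take the smallest $k$ such that $L_{k+1}\le(1+\epsilon)L_k$ holds. Let $m$ be this value; then $L_m\ge(1+\epsilon)^m L_0$. Here $L_m\le 1$ and $L_0=L(1)>\epsilon/3$. The conclusion is that $(1+\epsilon)^m<3/\epsilon$, hence $m<\log(3/\epsilon)/\log(1+\epsilon)$. Since $|I|=2$, this finally yields $\delta=4^{-\log(3/\epsilon)/\log(1+\epsilon)}/4$ as an acceptable choice, completing the proof of Theorem AA.
A large amount of work has been done on quantifying $\delta$ in various contexts; see, for example, Heat flow and quantitative differentiation by Hytönen and Naor.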
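The recipe of the proof can be followed step by step on a computer. The code below is a rough sketch under several assumptions: the function is taken to be 1-Lipschitz on $[-1,1]$, the restricted Lipschitz constant is evaluated only over pairs of grid points (so it may be underestimated), and the names `delta`, `restricted_lipschitz` and `almost_flat_interval` as well as the grid size are made up for illustration.

```python
import math
from itertools import combinations


def delta(eps):
    """Relative size from the proof: 4**(-log(3/eps)/log(1+eps)) / 4."""
    return 0.25 * 4.0 ** (-math.log(3 / eps) / math.log(1 + eps))


def restricted_lipschitz(xs, ys, t):
    """Largest chord slope over grid pairs at distance >= t, with such a pair."""
    best = (0.0, xs[0], xs[-1])
    for (x, fx), (y, fy) in combinations(zip(xs, ys), 2):
        if y - x >= t:
            slope = abs(fy - fx) / (y - x)
            if slope > best[0]:
                best = (slope, x, y)
    return best


def almost_flat_interval(f, eps, n=129, max_k=3):
    """Return J = (left, right) chosen as in the proof, or None if the grid is too coarse."""
    xs = [-1 + 2 * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    prev = restricted_lipschitz(xs, ys, 1.0)           # L_0 = L(1)
    if prev[0] <= eps / 3:
        return (-1.0, 1.0)                             # the case J = I
    for k in range(max_k):
        nxt = restricted_lipschitz(xs, ys, 4.0 ** -(k + 1))   # L_{k+1}
        if nxt[0] <= (1 + eps) * prev[0]:
            _, a, b = prev                             # [a, b] realizes L_k
            quarter = (b - a) / 4
            return (a + quarter, b - quarter)          # middle half of [a, b]
        prev = nxt
    return None


print(delta(0.5), delta(0.1))
print(almost_flat_interval(abs, 0.1))
print(almost_flat_interval(math.sin, 0.1))
```

For typical functions the interval found this way is far larger than the guaranteed fraction `delta(eps)`, which reflects how pessimistic the worst-case bound is.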