Folded letters and uniformly bounded function sequences

We know that every bounded sequence of real numbers has a convergent subsequence. For a sequence of functions, say {f_n\colon [0, 1]\to \mathbb R}, the notion of boundedness can be stated as: there exists a constant {M} such that {|f_n(x)|\le M} for every {n} and for all {x\in [0, 1]}. Such a sequence is called uniformly bounded.

Once we fix some point {x\in [0, 1]}, the boundedness assumption provides a subsequence {\{f_{n_k}\}} which converges at that point. But since different points may require different subsequences, it is not obvious whether we can pick a subsequence {\{f_{n_k}\}} which converges for all {x\in [0, 1]}. (Such a subsequence is called pointwise convergent.)

It is easy to give an example of a uniformly bounded sequence with no uniformly convergent subsequence (uniform convergence {f_n\to f} requires {\sup |f_n-f|\to 0}, which is stronger than {f_n(x) \to f(x)} for every {x}). Indeed, {f_n(x) = x^n} does the job. This sequence is uniformly bounded (by {M=1}) and converges pointwise to a discontinuous function {g} such that {g(1)=1 } and {g(x)=0} elsewhere. Any subsequence {f_{n_k}} has the same pointwise limit {g} and since {g} is not continuous, the convergence cannot be uniform.

But what would be an example of a uniformly bounded sequence of continuous functions with no pointwise convergent subsequence? In Principles of Mathematical Analysis Rudin gives {f_n(x) = \sin nx} as such an example but then uses the Lebesgue Dominated Convergence Theorem to prove the non-existence of pointwise convergent subsequences. I do not want to use the DCT.

The simplest example I could think of is based on the letter-folding function {F\colon [0, 1]\to [0, 1]} defined by

{\displaystyle F(x) = \begin{cases} 3x, & x\in [0, 1/3] \\ 2 - 3x, & x\in [1/3, 2/3] \\  3x - 2, & x \in [2/3, 1]  \end{cases} }

or by a magic one-line formula if you prefer: {F(x) = 3x - 1 - |3x-1| + |3x - 2|}.

Letter-folding function

Let {\{f_n\}} be the sequence of the iterates of {L}, that is {f_0(x) = x} and {f_{n+1}(x) = F(f_n(x))}. This means {f_1 = F}, {f_2 = F\circ F}, {f_3 = F\circ F \circ F}, and so on.

The second and third iterates of F

By construction, the sequence {\{f_n\}} is uniformly bounded ({M=1}). It is somewhat similar to the example {\{\sin nx\}} in that we have increasingly rapid oscillations. But the proof that {\{f_n\}} has no pointwise convergent subsequence is elementary. It is based on the following observations.

(A) Suppose that {\{a_j\}} is a sequence such that {a_j\in \{0, 2\}} for each {j \in \mathbb N}. Then the number {\displaystyle x = \sum_{j=1}^\infty \frac{a_j}{3^j}} satisfies {x\in [0, 1/3]} if {a_1=0} and {x\in [2/3, 1]} if {a_1 = 2}.

The proof of (A) amounts to summing a geometric series. Incidentally, observe that {a_1, a_2, \dots} are the digits of {x} in base {3}.

(B) For {x} as above we have {\displaystyle F(x) = \sum_{j=1}^\infty \frac{a_{j+1}}{3^j}}. In other words, {F} shifts the ternary digits of {x} to the left. As a consequence, {f_n} shifts them to the left by {n} places.

(C) Given any subsequence {\{f_{n_k}\}}, let {a_j = 2} if {j = n_{2k} + 1} for some {k}, and {a_j=0} otherwise. By part (B), {\displaystyle f_{n_k}(x) = \sum_{j=1}^\infty \frac{a_{j + n_k}}{3^j} } which means the first ternary digit of {f_{n_k}(x)} is {a_{n_k + 1}}. By construction, this digit is {2} when {k} is even, and {0} when {k} is odd. By part (A) we have {f_{n_k}(x) \ge 2/3} when {k} is even, and {f_{n_k}(x) \le 1/3} when {k} is odd. Thus, {\{f_{n_k}(x)\}} does not converge. This completes the proof.


The set of all points {x} of the form considered in (A), (B), (C), i.e., those with all ternary digits {0} or {2}, is precisely the standard Cantor set {C}.

The function {F} magnifies each half of {C} by the factor of {3}, and maps it onto {C}.

The important part of the formula for {F} is that {F(x) = 3x\bmod 1} when {x\in C}. The rest of it could be some other continuous extension to {[0, 1]}.

A similar example could be based on the tent map {T(x) = 1 - |2x-1|}, whose graph is shown below.

The tent map

However, in case of the tent map it is more convenient to use the sequence of even iterates: {T\circ T}, {T\circ T\circ T\circ T}, and so on.

The second and fourth iterates of T

Indeed, since {T(T(x)) = 4x \bmod 1} when {x\in [0, 1/4]\cup [1/2, 3/4]}, one can simply replace base {3} with base {4} in all of the above computations and arrive at the same conclusion, except the orbit of {x} will now be jumping between the intervals {[0, 1/4]} and {[1/2, 3/4]}.

The tent map is conjugate to the logistic map {L(x) = 4x(1-x)}, shown below. This conjugacy is {L\circ \varphi = \varphi\circ T} where {\varphi(x) = (1 - \cos \pi x)/2}.

Logistic map

The conjugacy shows that the {n}th iterate {L^{n}} is {\varphi\circ T^n \circ \varphi^{-1}}. Therefore, the sequence of the even iterates of {L} has no pointwise convergent subsequence. This provides an example with smooth functions, even polynomials.

The second and fourth iterates of L

One could use the sequence of all iterates (not just the even ones) in the constructions based on {T} and {L}, it just takes a bit of extra work that I prefer to avoid.

Pi and Python: how 22/7 morphs into 355/113

This is a brief return to the topic of Irrational Sunflowers. The sunflower associated with a real number {a} is the set of points with polar coordinates {r=\sqrt{k}} and {\theta = a(2\pi k)}, {k=1, 2, \dots, n}. A sunflower reduces to {n} equally spaced rays if and only if {a} is a rational number written in lowest terms as {m/n}.

Here is the sunflower of {a=\pi} of size {n = 10000}.

Click to see the larger image

Seven rays emanate from the center because {\pi \approx 22/7}, then they become spirals, and spirals rearrange themselves into 113 rays because {\pi \approx 355/113}. Counting these rays is boring, so here is a way to do this automatically with Python (+NumPy as np):

a = np.pi
n = 5000
x = np.mod(a*np.arange(n, 2*n), 1)
np.sum(np.diff(np.sort(x)) > 1/n)

This code computes the polar angles of sunflower points indexed {5000\le k<10000}, sorts them and counts the relatively large gaps between the sorted values. These correspond to the gaps between sunflower rays, except that one of the gaps gets lost when the circle is cut and straightened onto the interval {[0, 2\pi)}. So the program output (112) means there are 113 rays.

Here is the same sunflower with the points alternatively colored red and blue.

Click to see the larger image

The colors blur into purple when the rational approximation pattern is strong. But they are clearly seen in the transitional period from 22/7 approximation to 355/113.

  1. How many points would we need to see the next rational approximation after 355/113?
  2. What will that approximation be? Yes, 22/7 and 355/113 and among the convergent of the continued fraction of {\pi}. But so is 333/106 which I do not see in the sunflower. Are some convergents better than others?

Finally, the code I used to plot sunflowers.

import numpy as np
import matplotlib.pyplot as plt
a = np.pi
k = np.arange(10000)
r = np.sqrt(k)
t = a*2*np.pi*k 
plt.plot(r*np.cos(t), r*np.sin(t), '.')

Transcendental-free Riemann-Lebesgue lemma

Calculus books tend to introduce transcendental functions (trigonometric, exponential, logarithm) early. Analysis textbooks such as Principles of Mathematical Analysis by Rudin tend to introduce them later, because of how long it takes to develop enough of the theory of power series.

The Riemann-Lebesgue lemma involves either trigonometric or exponential functions. But the following version works with the “late transcendentals” approach.

Transcendental-free Riemann-Lebesgue Lemma

TFRLL. Suppose that {f\colon [a, b]\to \mathbb R} and {g\colon \mathbb R\to \mathbb R} are continuously differentiable functions, and {g} is bounded. Then {\int_a^b f(x)g'(nx)\,dx \to 0} as {n\to\infty}.

The familiar form of the lemma is recovered by letting {g(x) = \sin x} or {g(x) = \cos x}.

Proof. By the chain rule, {g'(nx)} is the derivative of {g(nx)/n}. Integrate by parts:

{\displaystyle  \int_a^b f(x)g'(nx)\,dx = \frac{f(b)g(nb)}{n} - \frac{f(a)g(na)}{n} - \int_a^b f'(x)\frac{g(nx)}{n}\,dx }

By assumption, there exists a constant {M} such that {|g|\le M} everywhere. Hence {\displaystyle \left| \frac{f(b)g(nb)}{n}\right| \le \frac{|f(b)| M}{n} }, {\displaystyle \left|\frac{f(a)g(na)}{n}\right| \le \frac{|f(a)| M}{n}}, and {\displaystyle \left|\int_a^b f'(x)\frac{g(nx)}{n}\,dx\right| \le \frac{M}{n} \int_a^b |f'(x)|\,dx}. By the triangle inequality,

{\displaystyle \left|\int_a^b f(x)g'(nx)\,dx \right| \le \frac{M}{n}\left(|f(b)|+|f(a)| + \int_a^b |f'(x)|\,dx \right) \to 0 }

completing the proof.

As a non-standard example, TFRLL applies to, say, {g(x) = \sin (x^2) } for which {g'(x) = 2x\cos (x^2)}. The conclusion is that {\displaystyle \int_a^b f(x) nx \cos (n^2 x^2) \,dx \to 0 }, that is, {\displaystyle  \int_a^b xf(x) \cos (n^2 x^2) \,dx = o(1/n)} which seems somewhat interesting. When {0\notin [a, b]}, the factor of {x} can be removed by applying the result to {f(x)/x}, leading to {\displaystyle \int_a^b f(x) \cos (n^2 x^2) \,dx = o(1/n)}.

What if we tried less smoothness?

Later in Rudin’s book we encounter the Weierstrass theorem: every continuous function on {[a, b]} is a uniform limit of polynomials. Normally, this would be used to make the Riemann-Lebesgue lemma work for any continuous function {f}. But the general form given above, with an unspecified {g}, presents a difficulty.

Indeed, suppose {f} is continuous on {[a, b]}. Given {\epsilon > 0 }, choose a polynomial {p} such that {|p-f|\le \epsilon} on {[a, b]}. Since {p} has continuous derivative, it follows that {\int_a^b p(x)g'(nx)\,dx \to 0}. It remains to show that {\int_a^b p(x)g'(nx)\,dx} is close to {\int_a^b f(x)g'(nx)\,dx}. By the triangle inequality,

{\displaystyle \left| \int_a^b (p(x) - f(x))g'(nx)\,dx \right| \le \epsilon \int_a^b |g'(nx)|\,dx }

which is bounded by … um. Unlike for {\sin } and {\cos}, we do not have a uniform bound for {|g'|} or for its integral. Indeed, with {g(x) = \sin x^2} the integrals {\displaystyle  \int_0^1 |g'(nx)| \,dx = \int_0^1 2nx |\cos (n^2x^2)| \,dx  } grow linearly with {n}. And this behavior would be even worse with {g(x) = \sin e^x}, for example.

At present I do not see a way to prove TFRLL for continuous {f}, let alone for integrable {f}. But I do not have a counterexample either.

Partitions of unity and monotonicity-preserving approximation

There are many ways to approximate a given continuous function {f\colon [a, b]\to \mathbb R} (I will consider the interval {[a, b]=[0, 1]} for convenience.) For example, one can use piecewise linear interpolation through the points {(k/n, f(k/n))}, where {k=0, 1, \dots, n}. The resulting piecewise linear function {g} has some nice properties: for example, it is increasing if {f} is increasing. But it is not smooth.

A convenient way to represent piecewise linear interpolation is the sum {g(x) = \sum_{k=0}^n f(k/n) \varphi_k(x)} where the functions {\varphi_k} are the triangles shown below: {\varphi_k(x) = \max(0, 1 - |nx-k|)}.

Triangular basis functions

The functions {{\varphi_k}} form a partition of unity, meaning that {\sum_k \varphi_k \equiv 1} and all {\varphi_k} are nonnegative. This property leads to the estimate

{\displaystyle |f(x) - g(x)| = \left| \sum_{k=0}^n (f(x) - f(k/n)) \varphi_k(x)\right| \le \sum_{k=0}^n |f(x) - f(k/n)| \varphi_k(x) }

The latter sum is small because when {x} is close to {k/n}, the first factor {|f(x) - f(k/n)|} is small by virtue of continuity, while the second factor {\varphi_k(x)} is bounded by {1}. When {x} is far from {k/n}, the second factor {\varphi_k(x)} is zero, so the first one is irrelevant. The upshot is that {f-g} is uniformly small.

But if we want a smooth approximation {g}, we need a smooth partition of unity {{\varphi_k}}. But not just any set of smooth nonnegative functions that add up to {0} is equally good. One desirable property is preserving monotonicity: if {f} is increasing, then {g} should be increasing, just as this works for piecewise linear interpolation. What does this condition require of our partition of unity?

An increasing function can be expressed as a limit of sums of the form {\sum_{j} c_j [x \ge t_j]} where {c_j>0} and {[\cdots ]} is the Iverson bracket: 1 if true, 0 if false. By linearity, it suffices to have increasing {g} for the case {f(x) = [x \ge t]}. In this case {g} is simply {s_m := \sum_{k=m}^n \varphi_k} for some {m}, {0\le m\le n}. So we want all {s_m} to be increasing functions. Which is the case for the triangular partition of unity, when each {s_m} looks like this:

Also known as a piecewise linear activation function

One smooth choice is Bernstein basis polynomials: {\displaystyle \varphi_k(x) = \binom{n}{k} x^k (1-x)^{n-k}}. These are nonnegative on {[0, 1]}, and the binomial formula shows {\displaystyle \sum_{k=0}^n \varphi_k(x) = (x + 1-x)^n \equiv 1}. Are the sums {\displaystyle s_m(x) = \sum_{k=m}^n \binom{n}{k} x^k (1-x)^{n-k}} increasing with {x}? Let’s find out. By the product rule,

{\displaystyle s_m'(x) = \sum_{k=m}^n \binom{n}{k} k x^{k-1} (1-x)^{n-k} - \sum_{k=m}^n \binom{n}{k} (n-k) x^{k} (1-x)^{n-k - 1}}

In the second sum the term with {k=n} vanishes, and the terms with {k<n} can be rewritten as {\displaystyle \frac{n!}{k! (n-k)!} (n-k) x^{k} (1-x)^{n-k - 1}}, which is {\frac{n!}{(k+1)! (n-k-1)!} (k+1) x^{k} (1-x)^{n-k - 1}}, which is {\binom{n}{k+1} (k+1) x^{k} (1-x)^{n-k - 1} }. After the index shift {k+1\mapsto k} this becomes identical to the terms of the first sum and cancels them out (except for the first one). Thus,

{\displaystyle s_m'(x) = \binom{n}{m} m x^{m-1} (1-x)^{n-m} \ge 0 }

To summarize: the Bernstein polynomials {\displaystyle B_n(x) = \sum_{k=0}^n f(k/n) \binom{n}{k} x^k (1-x)^{n-k}} are monotone whenever {f} is. On the other hand, the proof that {B_n\to f} uniformly is somewhat complicated by the fact that the polynomial basis functions {\varphi_k} are not localized the way that the triangle basis functions are: the factors {\varphi_k(x)} do not vanish when {x} is far from {k/n}. I refer to Wikipedia for a proof of convergence (which, by the way, is quite slow).

Bernstein polynomial basis

Is there some middle ground between non-smooth triangles and non-localized polynomials? Yes, of course: piecewise polynomials, splines. More specifically, B-splines which can be defined as follows: B-splines of degree {1} are the triangle basis functions shown above; a B-spline of degree {d+1} is the moving averages of a {B}-spline of degree {d} with a window of length {h = 1/n}. The moving average of {F} can be written as {\frac{1}{h} \int_{x-h/2}^{x+h/2} f}. We get a partition of unity because the sum of moving averages is the moving average of a sum, and averaging a constant function does not change it.

The splines of even degrees are awkward to work with… they are obtained from the triangles by taking those integrals with {h/2} an odd number of times, which makes their knots fall in the midpoints of the uniform grid instead of the grid points themselves. But I will use {d=2} anyway, because this degree is enough for {C^1}-smooth approximation.

Recall that a triangular basis function {\varphi_k} has slope {\pm n} and is supported on an interval {[(k-1)h, (k+1)h]} where {h=1/n}. Accordingly, its moving average {\psi_k} will be supported on {[(k-3/2)h, (k+3/2)h]}. Since {\psi_k'(x) = n(\phi_k(x+h/2) - \phi_k(x-h/2))}, the second derivative {\psi_k''} is {n^2} when {-3/2< nx-k< -1/2}, is {-2n^2} when {|nx-k| < 1/2}, and is {n^2} again when {1/2 < nx-k < 3/2}. This is enough to figure out the formula for {\psi_k}:

{\displaystyle \psi_k(n) = \begin{cases} (nx-k+3/2)^2 / 2, & -3/2\le nx -k\le -1/2 \\ 3/4 -(nx-k)^2, & -1/2\le nx-k \le 1/2 \\ (nx-k-3/2)^2 / 2, & 1/2\le nx -k \le 3/2 \\ \end{cases} }

These look like:

Is this right?

Nice! But wait a moment, the sum near the endpoints is not constant: it is less than 1 because we do not get a contributions of two splines to the left and right of the interval. To correct for this boundary effect, replace {\psi_0} with {\psi_0 + \psi_{-1}} and {\psi_n} with {\psi_n + \psi_{n+1}}, using “ghost” elements of the basis that lie outside of the actual grid. Now the quadratic B-spline basis is correct:

A smooth partition of unity by quadratic splines

Does this partition of unity preserve monotinicity? Yes, it does: {\displaystyle \sum_{k\ge m}\psi_k'(x) = n\sum_{k\ge m} (\phi_k(x+h/2) - \phi_k(x-h/2)) = n(s(x+h/2) - s(x-h/2))} which is nonnegative because the sum {s := \sum_{k\ge m} \phi_k} is an increasing piecewise linear function, as noted previously. Same logic works for B-splines of higher degree.

In conclusion, here is a quadratic B-spline approximation (orange) to a tricky increasing function (blue).

Smooth approximation

One may wonder why the orange curve deviates from the line at the end – did we miss some boundary effect there? Yes, in a way… the spline actually approximates the continuous extension of our original function by constant values on the left and right. Imagine the blue graph continuing to the right as a horizontal line: this creates a corner at {x=1} and the spline is smoothing that corner. To avoid this effect, one may want to extend {f} in a better way and then work with the extended function, not folding the ghosts {\psi_{-1}, \psi_{n+1}} into {\psi_0, \psi_n}.

But even so, B-spline achieves a better approximation than the Bernstein polynomial with the same number of basis functions (eight):

Bernstein polynomial

The reason is the non-local nature of the polynomial basis {\varphi_k}, which was noted above. Bernstein polynomials do match the function perfectly at the endpoints, but this is small consolation.

The closure of periodic functions and sums of waves with incommensurable periods

Consider the space {C(\mathbb R)} of all bounded continuous functions {f\colon \mathbb R\to\mathbb R}, with the uniform norm {\|f\| = \sup |f|}. Let {P} be its subset that consists of all periodic continuous functions: recall that {f} is periodic if there exists {T>0} such that {f(x+T)=f(x)} for all {x\in \mathbb R}.

The set {P} is not closed in the topology of {C(\mathbb R)}. Indeed, let {d(x) = \mathrm{dist}\,(x, \mathbb Z)} be the distance from {x} to nearest integer. The function {d} is periodic with {T=1}. Therefore, each sum of the form {\displaystyle \sum_{k=0}^n 2^{-k} d(2^{-k} x)} is periodic with {T=2^n}. Hence the sum of the infinite series {\displaystyle f(x) = \sum_{k=0}^\infty 2^{-k} d(2^{-k} x) } is a uniform limit of periodic functions. Yet, {f} is not periodic, because {f(0)=0} and {f(x)>0 } for {x\ne 0} (for every {x\ne 0} there exists {k} such that {2^{-k}x} is not an integer).

Uniform limit of periodic functions

The above example (which was suggested to me by Yantao Wu) is somewhat similar to the Takagi function, which differs from it by the minus sign in the exponent: {\displaystyle T(x) = \sum_{k=0}^\infty 2^{-k} d(2^{k} x) }. Of course, the Takagi function is periodic with period {1}.

Takagi function

Do we really need an infinite series to get such an example? In other words, does the set {\overline{P}\setminus P} contain an elementary function?

A natural candidate is the sum of trigonometric waves with incommensurable periods (that is, the ratio of periods must be irrational). For example, consider the function {g(x) = \cos (x) + \cos (\sqrt{2}x)} whose graph is shown below.

Looks somewhat periodic, does not it?

Since {g(0)=2} and {g(x) < 2} for all {x\ne 0}, the function {g} is not periodic. Its graph looks vaguely similar to the graph of {f}. Is {g} a uniform limit of periodic functions?

Suppose {h\colon \mathbb R\to\mathbb R} is a {T}-periodic function such that {\|h-g\|<\epsilon}. Then {h(0) > 2-\epsilon}, hence {h(nT)>2-\epsilon} for all {n\in \mathbb Z}, hence {g(nT) > 2- 2\epsilon }. By the definition of {g} this implies {\cos (nT) > 1-2\epsilon} and {\cos (\sqrt{2}nT) > 1-2\epsilon} for all {n\in \mathbb Z}. The following lemma shows a contradiction between these properties.

Lemma. If a real number {t} satisfies {\cos nt > -1/2} for all {n\in \mathbb Z}, then {t} is an integer multiple of {2\pi}.

Proof. Suppose {t} is not an integer multiple of {2\pi}. We can assume {0 < t < \pi} without loss of generality, because {t} can be replaced by {t - 2\pi k} to get it in the interval {(0, 2\pi)} and then by {2\pi - t} to get it in {(0, \pi)}. Since {\cos t > -1/2}, we have {t\in (0, 2\pi/3)}. Let {k} be the smallest positive integer such that {2^k t \ge 2\pi/3}. The minimality of {k} implies {2^{k-1} t < 2\pi/3}, hence {2^k t \in [2\pi/3, 4\pi/3)}. But then {\cos (2^k t) \le -1/2}, a contradiction. {\quad \Box}

The constant {-1/2} in the lemma is best possible, since {\cos (2n\pi/3)\ge -1/2} for all {n\in \mathbb Z}.

Returning to the paragraph before the lemma, choose {\epsilon=3/4} so that {1-2\epsilon = -1/2}. The lemma says that both {T} and {\sqrt{2} T} must be integer multiples of {2\pi}, which is impossible since they are incommensurable. This contradiction shows that {\|g-h\|\ge 3/4} for any periodic function {h}, hence {g} is not a uniform limit of periodic functions.

The above result can be stated as {\mathrm{dist}(g, P) \ge 3/4}. I guess {\mathrm{dist}(g, P)} is actually {1}. It cannot be greater than {1} since {|g(x)-\cos x|\le 1} for all {x}. (Update: Yantao pointed out that the density of irrational rotations implies the distance is indeed equal to 1.)

Note: the set {\overline{P}} is a proper subset of the set of (Bohr / Bochner / uniform) almost periodic functions (as Yemon Choi pointed out in a comment). The latter is a linear space while {\overline{P}} is not. I was confused by the sentence “Bohr defined the uniformly almost-periodic functions as the closure of the trigonometric polynomials with respect to the uniform norm” on Wikipedia. To me, a trigonometric polynomial is a periodic function of particular kind. What Bohr called Exponentialpolynom is a finite sum of the form {\sum a_n e^{\lambda_n x}} where {\lambda_n} can be any real numbers. To summarize: the set considered above is the closure of {P} while the set of almost periodic functions is the closed linear span of {P}. The function {g(x)=\cos (x) + \cos(\sqrt{2} x)} is an example of the latter, not of the former.

Points of maximal curvature

In Calculus I students are taught how to find the points at which the graph of a function has zero curvature (that is, the points of inflection). The points of maximal curvature are usually not discussed. This post attempts to rectify this omission.

The (signed) curvature of {y=f(x)} is

{\displaystyle \kappa = \frac{y''}{(1+(y')^2)^{3/2}}}

We want to maximize the absolute value of {\kappa}, whether the function is positive or negative there (so both maxima and minima of {\kappa} can be of interest). The critical points of {\kappa} are the zeros of

{\displaystyle \kappa' = \frac{y''' (1+(y')^2) - 3 y' (y'')^2 }{(1+(y')^2)^{5/2}}}

So we are lead to consider a polynomial of the first three derivatives of {y}, namely {p := y''' (1+(y')^2) - 3 y' (y'')^2 }.

Begin with some simple examples:

{y = x^2} has {p = -24x} so the curvature of a parabola is maximal at its vertex. No surprise there.

{y = x^3} has {p = 6(1 - 45x^4)}, indicating two symmetric points of maximal curvature, {x\approx \pm 0.386}, pretty close to the point of inflection.

y = x^3


{y=x^4} has {p = 24 x (1 - 56 x^6)}. This has three real roots, but {x=0} actually minimizes curvature (it vanishes there).

y = x^4

More generally, {y=x^n} with positive integer {n} yields {\displaystyle p = n(n-1)x^{n-3} (n - 2 - (2n^3-n^2) x^{2n-2})} indicating two points {\displaystyle x = \pm \left(\frac{n-2}{2n^3-n^2} \right)^{1/(2n-2)}} which tend to {\pm 1} as {n} grows.

The graph of a polynomial of degree {n} can have at most {n-2} points of zero curvature, because the second derivative vanishes at those. How many points of maximal curvature can it have? The degree of expression {p} above is {3n-5} but it is not obvious whether all of its roots can be real and distinct, and also be the maxima of {|\kappa|} (unlike {x=0} for {y=x^4}). For {n=2} we do get {3n-5 = 1} point of maximal curvature. But for {n=3}, there can be at most {2} such points, not {3n-5 = 4}. Edwards and Gordon (Extreme curvature of polynomials, 2004) conjectured that the graph of a polynomial of degree {n} has at most {n-1} points of maximal curvature. This remains open despite several partial results: see the recent paper Extreme curvature of polynomials and level sets (2017).

A few more elementary functions:

{y = e^x} has {p = e^x(1-2e^{2x})}, so the curvature is maximal at {x=-\log(2)/2}. Did I expect the maximum of curvature to occur for a negative {x}? Not really.

y = exp(x)

{y=\sin x} has {p = (\cos 2x - 3)\cos x}. The first factor is irrelevant: the points of maximum curvature of a sine wave are at its extrema, as one would guess.

{y = \tan x} has {p = -6\tan^8(x) - 16\tan^6(x) - 6\tan^4(x) + 8\tan(x)^2 + 4} which is zero at… ahem. The expression factors as

{\displaystyle p = -2(\tan^2 x+1)(3\tan^6(x) + 5\tan^4(x) - 2\tan^2(x) - 2)}

Writing {u = \tan^2(x)} we can get a cubic equation in {u}, but it is not a nice one. Or we could do some trigonometry and reduce {p=0} to the equation {8 - 41\cos 2x + \cos 6x =0}. Either way, a numerical solution is called for: {x \approx \pm 0.6937} (and {+\pi n} for other periods).

y = tan(x)

Index of questions

This is a meta-post which collects links to other posts on this blog with (sometimes implicit) questions that were left unanswered there. This does not necessarily mean that nobody has an answer, just that I did not have one when writing the post. The collection is in reverse chronological order.

Fundamental limits as differences of reciprocals

The three fundamental limits (as they were taught to me) are

{\displaystyle \lim_{x\to 0} \frac{e^x-1}{x} = \lim_{x\to 0} \frac{\log(1+x)}{x} = \lim_{x\to 0} \frac{\sin x}{x} = 1}

(exponential, logarithmic, trigonometric). Usually they come with pictures like

exp(x) – 1 and x

which show that the graph of the numerator in each limit, say {y=f(x)}, is indeed close to the line {y=x}.

But there are so many different degrees of “close”: for example, {0.0103\approx 0.01} but the reciprocals of these numbers differ by almost {3}. As a stress-test of the approximation {f(x)\approx x}, let us consider the behavior of {\displaystyle \frac{1}{f(x)} - \frac{1}{x}} for each of the fundamental limits.

Exponential limit

Expressed as a difference: {\displaystyle \frac{1}{e^x - 1} - \frac{1}{x}}

1/(exp(x)-1) – 1/x

Always negative, which is obvious once we recall that the graph of {e^x-1} lies above its tangent line. The graph has central symmetry about {(0, -1/2)} which is not as obvious but is an easy algebra exercise. Since the asymptote on the right is {y=0}, the other asymptote is {y=-1}. This looks a lot like the logistic function {1/(1+\exp(-x))} … but it is not, because the logistic function approaches its asymptotes exponentially fast, while our function does so at the rate {1/x}.
For comparison, here is the logistic curve {\displaystyle y = \frac{1}{1+\exp(-x/3)} - 1} (in green) scaled to match the behavior at {0}.

Too slow to be logistic

This could be potentially useful if one needs a logistic-type function that behaves like a rational function at infinity. Simply using a rational function for this purpose would not do: it cannot have two distinct horizontal asymptotes.

Logarithmic limit

{\displaystyle \frac{1}{\log(x+1)} - \frac{1}{x}}

1/log(1+x) – 1/x

This one is always positive, and has a vertical asymptote at {x=-1} in addition to the horizontal asymptote at {0}. At a glance it may look like shifted/scaled hyperbola {y=1/x}. Indeed, {y = (1+x/3)/(2+x)} is a decent approximation to it, shown in green below.

Not really a hyperbola

Trigonometric limit

{\displaystyle \frac{1}{\sin x} - \frac{1}{x}}

1/sin(x) – 1/x

Unlike in the previous cases, the difference of reciprocals vanished at {0} due to the approximation {\sin x \approx x} being of higher order: the error term is cubic rather than quadratic. The graph looks like the tangent function but it cannot be exactly that since the nearest vertical asymptotes are {x=\pm \pi} rather than {x=\pm \pi/2}. (Not to mention other reasons such as non-periodicity.) A stretched and rescaled tangent, namely {\displaystyle \frac{1}{3}\tan \frac{x}{2}}, sort of fits:

Like a tangent?

Bonus limit

{\tan x \approx x}, with the reciprocal difference being {\displaystyle \frac{1}{\tan x} - \frac{1}{x}}.

1/tan(x) – 1/x

The limit {\displaystyle \lim_{x\to 0} \frac{\tan x}{x} = 1} adds nothing new compared to the previous one, since {\cos 0=1}. But the difference of reciprocals is another story. For one thing, the principal error term for it is twice as large as for the sine limit: {-x/3} versis {x/6}. Accordingly, the graph looks more like {\displaystyle -\frac{2}{3}\tan \frac{x}{2}}.

Not really like a tangent

Taylor series

All of the above differences have derivatives of all orders at the origin, which is not easy to prove with the standard calculus tools. Complex analysis makes the situation clear: the reciprocal of a holomorphic function with a zero at {z=0} can be expanded into a Laurent series, and subtracting {1/z} eliminates the principal part of that series, leaving a convergent power series, i.e., another holomorphic function. Let us take a look at these series:

{\displaystyle \frac{1}{e^x - 1} - \frac{1}{x} = - \frac{1}{2} + \frac{x}{12} - \frac{x^{3}}{720} + \frac{x^{5}}{30240} - \frac{x^{7}}{1209600} + \frac{x^{9}}{47900160} -\cdots }

Nice, the coefficients have alternating signs and all of them have numerator {1}… oh no, next term is {\displaystyle - \frac{691 x^{11}}{1307674368000}}. The signs do continue to alternate. Apart from the constant term, only odd powers of {x} appear, according to the central symmetry noted above. The coefficient of {x^{n-1}} in this series is {B_n/n!} where {B_n} is the {n}th Bernoulli number. These are the “modern” Bernoulli numbers, with {B_1 = -1/2} rather than {1/2}.

{\displaystyle \frac{1}{\log(x+1)} - \frac{1}{x} = \frac{1}{2} - \frac{x}{12} + \frac{x^{2}}{24} - \frac{19 x^{3}}{720} + \frac{3 x^{4}}{160} - \frac{863 x^{5}}{60480} + \cdots }

Also alternating but not omitting even powers, and not decaying nearly as fast as the coefficients of the previous series. (Of course: this function has a singularity at {-1} while the nearest singularities of the exponential thing are at {\pm 2\pi i}.)  These are Gregory coefficients, which according to Wikipedia “often appear in works of modern authors who do not recognize them”. I would not recognize them either without OEIS.

{\displaystyle \frac{1}{\sin x} - \frac{1}{x} = \frac{x}{6} + \frac{7 x^{3}}{360} + \frac{31 x^{5}}{15120} + \frac{127 x^{7}}{604800} + \frac{73 x^{9}}{3421440} + \cdots }

These are all positive for some reason.

{\displaystyle \frac{1}{\tan x} - \frac{1}{x} = - \frac{x}{3} - \frac{x^{3}}{45} - \frac{2 x^{5}}{945} - \frac{x^{7}}{4725} - \frac{2 x^{9}}{93555} - \cdots }

These are all negative for some reason.

Discrete maximum principle for polynomials

The polynomially convex hull of a compact set {K\subset \mathbb C} is defined as the set of all points {z\in \mathbb C} such that the inequality {|p(z)|\le \sup_K |p|} (a form of the maximum principle) holds for every polynomial {p}. For example, the polynomially convex hull of a simple closed curve is the union of that curve with its interior region. In general, this process fills up the holes in the set {K}, resulting in the complement of the unbounded connected component of {\mathbb C\setminus K}.

We can recover the usual convex hull from this construction by restricting {p} to the polynomials of first degree. Indeed, when {p} is linear, the set {|p|\le M} is a closed disk, and we end up with the intersection of all closed disks that contain {K}. This is precisely the convex hull of {K}.

What if we restrict {p} to the polynomials of degree at most {n}? Let’s call the resulting set the degree-{n} convex hull of {K}, denoted {K_n}. By definition, it is contained in the convex hull and contains the polynomially convex hull. To exactly compute {K_n} for general {K} appears to be difficult even when {n=2}.

Consider finite sets. When {K} has at most {n} points, we have {K_n=K} because there is a polynomial of degree {n} whose zero set is precisely {K}. So, the first nontrivial case is of {K} having {n+1} points. Let us write {K=\{z_0, \dots, z_n\}}.

Depending on the location of the points, {K_n} may be strictly larger than {K}. For example, if {K} consists of the vertices of a regular {(n+1)}-gon, then {K_n} also contains its center. Here is why. By a linear transformation, we can make sure that {K=\{\zeta^k\colon k=0, \dots, n\}} where {\zeta = \exp(2\pi i/(n+1))}. For {j=1, \dots, n} we have {\sum_{k=0}^n \zeta^{kj} = (\zeta^{(n+1)j}-1)/(\zeta^j - 1) = 0}. Hence, for any polynomial {p} of degree at most {n}, the sum {\sum_{k=0}^n p(\zeta^k)} is equal to {(n+1)p(0)}. This implies {|p(0)|\le \max_{0\le k\le n}|p(\zeta^k)|}, a kind of a discrete maximum principle.

A more systematic approach is to use the Lagrange basis polynomials, that is {L_j(z) = \prod_{k\ne j} (z-z_k)/(z_j-z_k)}, which satisfy {L_j(z_k) = \delta_{jk}}. Since {p = \sum_j p(z_j)L_j} for any polynomial of degree at most {n}, it follows that {z\in K_n} if and only if {\left|\sum_j c_j L_j(z)\right|\le \max |c_j| } holds for every choice of scalars {c_0, \dots, c_n}. The latter is equivalent to the inequality {\sum_j |L_j(z)|\le 1}.

This leads us to consider the function {S=\sum_j |L_j|}, the sum of the absolute values of the Lagrange basis polynomials. (Remark: S is called a Lebesgue function for this interpolation problem.) Since {\sum_j L_j\equiv 1}, it follows that {S\ge 1} everywhere. By construction, {S=1} on {K}. At a point {z\notin K}, the equality {S(z)=1} holds if and only if {\arg L_j(z)} is the same for all {j}.

In the trivial case {K=\{z_0, z_1\}}, the function {S(z) = (|z-z_0|+|z-z_1|)/|z_0-z_1|} is equal to {1} precisely on the linear segment with endpoints {z_0, z_1}. Of course, this only repeats what we already knew: the degree-1 convex hull is the ordinary convex hull.

If {K=\{x_0, \dots, x_n\}} with {x_0<\cdots<x_n} real and {n\ge 2}, then {K_n=K}. Indeed, if {x\in K_n\setminus K}, then {x} lies in the convex hull of {K}, and therefore {x_{j-1}<x<x_j} for some {j}. The basis polynomial {L_j} is positive at {x}, since it is equal to {1} at {x_j} and does not vanish outside of {K}. Since a polynomial changes its sign at every simple zero, it follows that {L_{j+1}(x) < 0}. Well, there is no {L_{j+1}} if {j=n}, but in that case, the same reasoning applies to {L_{j-2}(x)<0}. In any case, the conclusion is that {\arg L_k(x)} cannot be the same for all {k}.

At this point one might guess that the vertex set of a regular polygon is the only kind of finite sets that admit a nontrivial discrete maximum principle for polynomials. But this is not so: the vertices of a rhombus work as well. Indeed, if {K=\{a, -a, ib, -ib\}} with {a, b>0}, then {L_j(0)>0} for all {j}, hence {S(0)=1}.

The vertices of a non-square rectangle do not work: if {K} is the set of these vertices, the associated function {S} is strictly greater than 1 on the complement of {K}.

Are there any other finite sets that support a discrete maximum principle for polynomials?

Institutions ranked by the number of AMS Fellows, 2020 edition

Only those with the count of 4 or greater are included. Not counting the deceased. Considering CUNY as a single institution. Source of data.

Top 10 changes compared to 2019 ranking: MIT takes sole possession of the 3rd place with Berkeley dropping into 4th, UIUC rises from 8th to 6th, Princeton drops from 6th to 8th, Stanford rises from 11th to 9th, Wisconsin-Madison drops from 8th to 10th, Illinois-Chicago rises from 13th to 10th. Due to ties, the “top 10” are actually top 12.

Honorable mention: Texas A&M rises from 19th to 16th. Having once set “top 20” as the goal of their “Vision 2020” campaign, they achieved it at least by this measure.

  1. Rutgers The State University of New Jersey New Brunswick : 44
  2. University of California, Los Angeles : 39
  3. Massachusetts Institute of Technology : 35
  4. University of California, Berkeley : 32
  5. University of Michigan : 31
  6. Cornell University : 26
  7. University of Illinois, Urbana-Champaign : 26
  8. Princeton University : 25
  9. Stanford University : 24
  10. Brown University : 23
  11. University of Illinois at Chicago : 23
  12. University of Wisconsin, Madison : 23
  13. New York University, Courant Institute : 22
  14. University of California, San Diego : 22
  15. University of Texas at Austin : 22
  16. Texas A&M University : 21
  17. University of Chicago : 21
  18. The City University of New York : 20
  19. University of Washington : 20
  20. Stony Brook University : 19
  21. University of Minnesota-Twin Cities : 19
  22. Purdue University : 17
  23. University of California, Santa Barbara : 17
  24. University of Pennsylvania : 17
  25. Duke University : 16
  26. Indiana University, Bloomington : 16
  27. Ohio State University, Columbus : 16
  28. University of Maryland : 16
  29. Georgia Institute of Technology : 15
  30. Northwestern University : 15
  31. Pennsylvania State University : 15
  32. University of California, Irvine : 14
  33. University of Utah : 14
  34. Johns Hopkins University, Baltimore : 13
  35. University of British Columbia : 12
  36. Boston University : 11
  37. Harvard University : 11
  38. University of California, Davis : 11
  39. University of Notre Dame : 11
  40. University of Toronto : 11
  41. Eidgenössische Technische Hochschule Zürich (ETH Zürich) : 10
  42. University of North Carolina at Chapel Hill : 10
  43. University of Virginia : 10
  44. Vanderbilt University : 10
  45. Brandeis University : 9
  46. Columbia University : 9
  47. Institute for Advanced Study : 9
  48. University of Oregon : 9
  49. Michigan State University : 8
  50. Rice University : 8
  51. Tel Aviv University : 8
  52. University of Georgia : 8
  53. University of Nebraska-Lincoln : 8
  54. University of Southern California : 8
  55. California Institute of Technology : 7
  56. Ecole Polytechnique Fédérale de Lausanne (EPFL) : 7
  57. Microsoft Research : 7
  58. North Carolina State University : 7
  59. University of Oxford : 7
  60. University of Rochester : 7
  61. Williams College : 7
  62. Carnegie Mellon University : 6
  63. Imperial College : 6
  64. Louisiana State University, Baton Rouge : 6
  65. The Hebrew University of Jerusalem : 6
  66. Université Pierre et Marie Curie (Paris VI) : 6
  67. University of Arizona : 6
  68. Harvey Mudd College : 5
  69. Northeastern University : 5
  70. Temple University : 5
  71. Université Paris-Diderot : 5
  72. University of California, Riverside : 5
  73. University of Colorado, Boulder : 5
  74. Virginia Polytechnic Institute and State University : 5
  75. Boston College : 4
  76. Florida State University : 4
  77. NYU Polytechnic School of Engineering : 4
  78. Norwegian University of Science and Technology : 4
  79. Rutgers The State University of New Jersey Newark : 4
  80. Université Paris-Sud (Paris XI) : 4
  81. University of California, Santa Cruz : 4
  82. University of Cambridge : 4
  83. University of Connecticut, Storrs : 4
  84. University of Missouri-Columbia : 4
  85. University of Tennessee, Knoxville : 4
  86. University of Warwick : 4
  87. Washington University : 4
  88. Weizmann Institute of Science : 4
  89. Yale University : 4