Extremal Taylor polynomials

Suppose {f(z)=a_0+a_1z+a_2z^2+\cdots} is a holomorphic function in the unit disk {|z|<1} such that {|f|\le 1} in the disk. How large can its Taylor polynomial {T_n(z)=a_0+a_1z+\cdots +a_n z^n} be in the disk?

We should not expect {T_n} to be bounded by 1 as well. Indeed, the Möbius transformation {f(z)=(z+1/2)/(1+z/2)} has Taylor expansion {(z+1/2)(1-z/2+O(z^2)) = 1/2 + (3/4)z + O(z^2)}, so {T_1(1)=5/4} in this case. This turns out to be the worst case: in general {T_1} is bounded by 5/4 in the disk.

For the second-degree polynomial {T_2} the sharp bound is {89/64}, attained when {f(z) = (8z^2 + 4z + 3)/(3z^2 + 4z + 8)}; the image of the unit circle under the extremal {T_2} is shown below. Clearly, there is something nontrivial going on.

Extremal T_2 attains 89/64 > 1.39
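The two examples above can be checked by hand, but a few lines of exact arithmetic do the same job. The sketch below is only an illustration: the helper name `series_quotient` is made up for this post, and it simply performs power series division with `Fraction` to recover the first Taylor coefficients of a rational function. Summing them gives {T_1(1)} and {T_2(1)}, which come out to 5/4 and 89/64.

```python
from fractions import Fraction

def series_quotient(num, den, order):
    """First order+1 Taylor coefficients of num(z)/den(z) at z = 0.

    num and den list coefficients by increasing power; den[0] must be nonzero.
    (Hypothetical helper, not taken from any library.)
    """
    coeffs = []
    for k in range(order + 1):
        acc = Fraction(num[k]) if k < len(num) else Fraction(0)
        # Subtract the contribution of the coefficients already found.
        for j in range(1, min(k, len(den) - 1) + 1):
            acc -= den[j] * coeffs[k - j]
        coeffs.append(acc / den[0])
    return coeffs

# f(z) = (z + 1/2) / (1 + z/2) = (1 + 2z) / (2 + z)
t1 = series_quotient([1, 2], [2, 1], 1)
# f(z) = (8z^2 + 4z + 3) / (3z^2 + 4z + 8)
t2 = series_quotient([3, 4, 8], [8, 4, 3], 2)

print(t1, sum(t1))  # coefficients of T_1 and the value T_1(1)
print(t2, sum(t2))  # coefficients of T_2 and the value T_2(1)
```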

Edmund Landau established the sharp bound for {|T_n|} in his paper Abschätzung der Koeffizientensumme einer Potenzreihe, published in Archiv der Mathematik und Physik (3) 21 in 1913. Confusingly, there are two papers with the same title in the same issue of the journal: one on pages 42-50, the other on pages 250-255, and they appear in different volumes of Landau’s Collected Works. The sharp bound is in the second paper.

First steps

By rotation, it suffices to bound {|T_n(1)|}, which is {|a_0+\cdots +a_n|}. As is often done, we rescale {f} a bit so that it’s holomorphic in a slightly larger disk, enabling the use of the Cauchy integral formula on the unit circle {\mathbb T}. The Cauchy formula says {2\pi i a_k = \int_{\mathbb T} z^{-k-1} f(z) \,dz}. Hence

{\displaystyle 2\pi |T_n(1)| = \left| \int_{\mathbb T} z^{-n-1}(1+z+\dots+z^n) f(z) \,dz \right|}

It is natural to use {|f(z)|\le 1} now, which leads to

{\displaystyle 2\pi |T_n(1)| \le \int_{\mathbb T} |1+z+\dots+z^n|\, |dz| }

Here we can use the geometric sum formula and try to estimate the integral of {|(1-z^{n+1})/(1-z)|} on the unit circle. This is what Landau does in the first of the two papers; the result is {O(\log n)}, which is the correct rate of growth (this is essentially the Dirichlet kernel estimate from the theory of Fourier series). But there is a way to do better and get the sharp bound.

Key ideas

First idea: the factor {1+z+\dots+z^n} could be replaced by any polynomial {Q} as long as the coefficients of powers up to {n} stay the same. Higher powers contribute nothing to the integral that evaluates {T_n(1)}, but they might reduce the integral of {|Q|}.

Second idea: we should choose {Q} to be the square of some polynomial, {Q=P^2}, because {(2\pi)^{-1}\int_{\mathbb T} |P(z)|^2\, |dz|} can be computed exactly: it is just the sum of squares of the coefficients of {P}, by Parseval’s formula.

Implementation

Since {1+z+\dots+z^n} is the {n}-th degree Taylor polynomial of {(1-z)^{-1}}, it is natural to choose {P} to be the {n}-th degree Taylor polynomial of {(1-z)^{-1/2}}. Indeed, if {P_n(z) = (1-z)^{-1/2} + O(z^{n+1})}, then {P_n(z)^2 = (1-z)^{-1} + O(z^{n+1}) = 1+z+\dots+z^n + O(z^{n+1})} as desired (asymptotics as {z\to 0}). The binomial formula tells us that
{\displaystyle P_n(z)=\sum_{k=0}^n (-1)^k\binom{-1/2}{k}z^k }

The coefficient of {z^k} here can be written out as {(2k-1)!!/(2k)!!} or rewritten as {4^{-k}\binom{2k}{k}} which shows that in lowest terms, its denominator is a power of 2. To summarize, {|T_n(1)|} is bounded by the sum of squares of the coefficients of {P_n}. Such sums are referred to as the Landau constants,

{\displaystyle G_n = 1+ \left(\frac{1}{2}\right)^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 + \cdots + \left(\frac{(2n-1)!!}{(2n)!!}\right)^2 }

A number of asymptotic and non-asymptotic formulas have been derived for {G_n}. For example, Brutman (1982) showed that {G_n - (1/\pi)\log(n+1)} lies between 1 and 1.0663.
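For a quick look at the numbers, here is a minimal sketch that evaluates {G_n} exactly from the formula {4^{-k}\binom{2k}{k}} for the coefficients and prints the gap {G_n - (1/\pi)\log(n+1)}. The function name `landau_constant` is my own; `math.comb` needs Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, log, pi

def landau_constant(n):
    """G_n = sum over k = 0..n of (binom(2k, k) / 4^k)^2, as an exact fraction."""
    return sum(Fraction(comb(2 * k, k), 4 ** k) ** 2 for k in range(n + 1))

for n in (0, 1, 2, 5, 20, 100):
    g = landau_constant(n)
    gap = float(g) - log(n + 1) / pi
    # Print n, the decimal value of G_n, and the gap bounded by Brutman.
    print(n, float(g), gap)
```

The first three values are 1, 5/4 and 89/64, matching the bounds for {T_0}, {T_1} and {T_2} discussed earlier.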

Sharpness

To demonstrate the sharpness of the bound {|T_n|\le G_n}, we want {|f|\equiv 1} and {P_n(z)^2f(z)/z^n\ge 0} on the unit circle. Both are arranged by taking {f(z) = z^n P_n(1/z) / P_n(z)} which is a Blaschke product of degree {n}. Note that the term {P_n(1/z)} can also be written as {\overline{P_n(1/\bar z)}}. Hence {P_n(z)^2f(z)/z^n = P_n(z) \overline{P_n(1/\bar z)}} which is simply {|P_n(z)|^2} when {|z|=1}. Equality holds in all the estimates above, so they are sharp.
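The claim that equality holds can be tested in exact arithmetic. The sketch below (helper names `p_coeffs` and `extremal_taylor` are invented for this illustration) expands {f(z) = z^n P_n(1/z)/P_n(z)} into a power series by division and compares {a_0+\cdots+a_n} with the sum of squares of the coefficients of {P_n}.

```python
from fractions import Fraction
from math import comb

def p_coeffs(n):
    """Coefficients of P_n, the n-th Taylor polynomial of (1 - z)^(-1/2)."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(n + 1)]

def extremal_taylor(n):
    """Coefficients a_0..a_n of f(z) = z^n P_n(1/z) / P_n(z)."""
    p = p_coeffs(n)
    num = p[::-1]  # z^n P_n(1/z) has the coefficients of P_n in reverse order
    a = []
    for k in range(n + 1):
        acc = num[k]
        for j in range(1, k + 1):
            acc -= p[j] * a[k - j]
        a.append(acc)  # no division needed because p[0] == 1
    return a

for n in (1, 2, 5, 10):
    t_at_1 = sum(extremal_taylor(n))
    g = sum(c * c for c in p_coeffs(n))
    print(n, t_at_1 == g)
```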

Here are the images of the unit circle under extremal Taylor polynomials {T_5} and {T_{20}}.

Extremal Taylor polynomial of 5th degree
Extremal Taylor polynomial of 20th degree

These polynomials attain large values only on a short subarc of the circle; most of the time they oscillate at levels less than 1. Indeed, the mean value of {|T_n|^2} on the circle cannot exceed the mean of {|f|^2}, which is at most 1. Here are plots of the roots of the extremal {T_n}: they are nearly uniformly distributed around the circle, except for a gap near 1.

Roots of extremal T_10
Roots of extremal T_20

But we are not done…

Wait a moment. Does {f(z) = z^n P_n(1/z) / P_n(z)} define a holomorphic function in the unit disk? We are dividing by {P_n} here. Fortunately, {P_n} has no zeros in the unit disk, because its coefficients are positive and decreasing as the exponent {k} increases. Indeed, if {p(z)=c_0+c_1z+\cdots + c_nz^n} with {c_0>c_1>\dots>c_n > 0}, then {(1-z)p(z)} has constant term {c_0} and other coefficients {c_1-c_0}, {c_2-c_1}, … {c_n-c_{n-1}}, {-c_n}. Summing the absolute values of the coefficients of nonconstant terms we get {c_0}. So, when these coefficients are attached to {z^k} with {|z|<1}, the sum of nonconstant terms is strictly less than {c_0} in absolute value. This proves {P_n\ne 0} in the unit disk. Landau credits Adolf Hurwitz with this proof.
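The two facts that drive Hurwitz's argument are easy to confirm for a specific {n}: the coefficients of {P_n} are positive and strictly decreasing, and the absolute values of the nonconstant coefficients of {(1-z)P_n(z)} add up to {c_0}. Here is a small exact check, with helper names of my own choosing.

```python
from fractions import Fraction
from math import comb

def p_coeffs(n):
    """Coefficients of P_n, the n-th Taylor polynomial of (1 - z)^(-1/2)."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(n + 1)]

def times_one_minus_z(c):
    """Coefficients of (1 - z) * p(z), where p has coefficient list c."""
    return [c[0]] + [c[k] - c[k - 1] for k in range(1, len(c))] + [-c[-1]]

n = 20
c = p_coeffs(n)
d = times_one_minus_z(c)

# c_0 > c_1 > ... > c_n > 0
decreasing = all(c[k] > c[k + 1] > 0 for k in range(n))
# |c_1 - c_0| + ... + |c_n - c_{n-1}| + |-c_n| telescopes to c_0
tail = sum(abs(x) for x in d[1:])

print(decreasing, tail == c[0])
```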

In fact, the zeros of {P_n} (Taylor polynomials of {(1-z)^{-1/2}}) lie just outside of the unit disk.

Zeros of P_20
Zeros of P_50

The zeros of the Blaschke products formed from {P_n} are the reciprocals of the zeros of {P_n}, so they lie just inside the unit circle, much like the zeros of {T_n} (though the two sets of zeros are different).

Quadrature rules and quadrature domains

The midpoint rule of numerical integration

\displaystyle \int_a^b f(x)\,dx \approx (b-a)f\left(\frac{a+b}{2}\right)

is approximate in general, but exact for linear functions (polynomials of degree at most one).

Midpoint Rule

With two sample points we can integrate any cubic polynomial exactly. The choice of sample points is not obvious: they are to be placed at distance {\dfrac{1}{2\sqrt{3}}(b-a)} from the midpoint. In the example below, {a=0} and {b=1}, so the sample points are {x_{1,2} = \dfrac12 \pm \dfrac{1}{2\sqrt{3}}}. The integral is equal to {\dfrac12 f(x_1)+\dfrac12 f(x_2)}. One can say that each sample point has weight {1/2} in this rule.

Two-point quadrature: areas of yellow rectangles add up to the integral

Three sample points are enough for polynomials of degrees up to and including five. This time, the weights attached to the sample points are not equal. The midpoint is used again, now with weight {4/9} (as a fraction of {b-a}). The two other sample points are at distance {\dfrac{\sqrt{3}}{2\sqrt{5}}(b-a)} from the midpoint, and their weights are {5/18} each.

Three-point quadrature: rectangles of unequal width

Compare this with Simpson’s rule, which also uses three sample points but is exact only up to degree three.

The above are examples of Gaussian quadrature: for each positive integer {n}, one can integrate polynomials of degree up to {2n-1} by taking {n} samples at the right places, and weighting them appropriately.
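The two-point and three-point rules described above can be tried on monomials. This is a minimal sketch with function names `gauss2` and `gauss3` that I made up; it uses the nodes and weights stated in the text and prints the error for {x^k} on {[0,1]}, where the exact integral is {1/(k+1)}.

```python
from math import sqrt

def gauss2(f, a, b):
    """Two-point rule: nodes at distance (b - a) / (2 sqrt 3) from the midpoint."""
    m, h = (a + b) / 2, b - a
    d = h / (2 * sqrt(3))
    return h * (f(m - d) + f(m + d)) / 2

def gauss3(f, a, b):
    """Three-point rule: weights 5/18, 4/9, 5/18 as fractions of b - a."""
    m, h = (a + b) / 2, b - a
    d = h * sqrt(3) / (2 * sqrt(5))
    return h * (5 * f(m - d) + 8 * f(m) + 5 * f(m + d)) / 18

for k in range(7):
    exact = 1 / (k + 1)
    err2 = gauss2(lambda x: x ** k, 0, 1) - exact
    err3 = gauss3(lambda x: x ** k, 0, 1) - exact
    print(k, err2, err3)
```

The errors of the two-point rule should be at rounding level up to {k=3} and visibly nonzero from {k=4} on; for the three-point rule the switch happens between {k=5} and {k=6}.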


Let’s move from the real line to the complex plane. If one accepts that the analog of an interval {(a,b)} is a disk in the plane, then quadrature becomes very simple: for any disk {D} and any complex polynomial {p},

\displaystyle   \iint_D p = \pi r^2 p(c)

where {c} is the center of the disk and {r} is its radius. One sample point is enough for all degrees! The proof is easy: rewrite {p} in terms of powers of {(z-c)} and integrate them in polar coordinates. The same works for any holomorphic function, as long as it is integrable in {D}.
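The polar-coordinate argument can be imitated numerically. The sketch below is an illustration under my own choices (the function name `disk_integral`, the grid sizes, and the sample polynomial are all made up): it integrates a polynomial over a disk with the midpoint rule in polar coordinates and compares the result with {\pi r^2 p(c)}. For a polynomial of degree less than the number of angular samples, the angular sum removes every power of {(z-c)} except the constant term, so the agreement should hold up to rounding.

```python
from cmath import exp
from math import pi

def disk_integral(p, c, r, n_r=50, n_t=16):
    """Midpoint rule in polar coordinates for the area integral of p over |z - c| < r."""
    total = 0
    for i in range(n_r):
        rho = r * (i + 0.5) / n_r
        for m in range(n_t):
            theta = 2 * pi * (m + 0.5) / n_t
            # The extra factor rho is the Jacobian of polar coordinates.
            total += p(c + rho * exp(1j * theta)) * rho
    return total * (r / n_r) * (2 * pi / n_t)

def p(z):
    # An arbitrary cubic polynomial, chosen only as a test case.
    return z ** 3 - 2j * z + 5

center, radius = 0.3 + 0.4j, 1.5
numeric = disk_integral(p, center, radius)
one_point = pi * radius ** 2 * p(center)
print(abs(numeric - one_point))
```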

Disk: a quadrature domain with one node

But maybe the disk is the only such shape? Not at all: there are other such quadrature domains. A simple family of examples is Neumann ovals (Neumann as in “boundary condition”). Geometrically, they are ellipses inverted in a concentric circle. Analytically, they are (up to linear transformations) images of the unit disk {\mathbb{D}} under

\displaystyle  \varphi(z)=\frac{z}{1-c^2 z^2}\quad (0\le c<1)

This image, denoted {\Omega } below, looks much like an ellipse when {c} is small:

Neumann oval with c=0.3

Then it becomes peanut-shaped:

Neumann oval with c=0.6

For {c\approx 1} it looks much like the union of two disks (but the boundary is smooth, contrary to what the plot suggests):

Neumann oval with c=0.95

In each of these images, the marked points are the quadrature nodes. Let’s find out what they are and how they work.

Suppose {f} is holomorphic and integrable in {\Omega}. By a change of variables,

\displaystyle  \iint_{\Omega } f = \iint_{\mathbb D} (f\circ \varphi)\, \varphi' \,\overline{\varphi'}

Here { (f\circ \varphi) \varphi'} is holomorphic, but { \overline{\varphi'} } is anti-holomorphic. We want to know what { \overline{\varphi'} } does when integrated against something holomorphic. Power series to the rescue:

\displaystyle    \varphi(z) = z\sum_{n=0}^\infty c^{2n} z^{2n} = \sum_{n=0}^\infty c^{2n} z^{2n+1}

hence

\displaystyle  \overline{\varphi'(z)} = \sum_{n=0}^\infty c^{2n} (2n+1) \bar z^{2n}

Multiply this by {z^{k}} and integrate over {\mathbb D} in polar coordinates: the result is {0} if {k} is odd and

\displaystyle    2\pi c^{k} (k+1) \int_0^1 r^{2k} r\,dr= \pi c^{k}

if {k} is even. So, integration of {\overline{\varphi'(z)}} against a power series {g(z) = \sum a_k z^k} produces the sum of {a_{k} c^{k}} over even powers {k} only (with the factor of {\pi}). The process of dropping odd powers amounts to taking the even part of {g}:

\displaystyle  \iint_{\mathbb D} g\, \overline{\varphi'} = \frac{\pi}{2} ( g(c)+ g(-c))

Yes, there’s something about { \overline{\varphi'} } that’s magic (key words: reproducing kernel, specifically Bergman kernel). Plug {g= (f\circ \varphi)\, \varphi' } to conclude

\displaystyle  \iint_{\Omega } f = \frac{\pi}{2} \Big\{f(\varphi(c))\, \varphi'(c) + f(\varphi(-c) ) \,\varphi'(-c)\Big\}

So, the nodes are {\varphi(\pm c) = \pm c/(1-c^4)}. They have equal weight, because {\varphi'(\pm c) = \dfrac{1+c^4}{(1-c^4)^2}}. Final result:

\displaystyle  \iint_{\Omega } f = \frac{\pi (1+c^4)}{2 (1-c^4)^2}\, \left\{f \left( \frac{c}{1-c^4} \right) + f \left(- \frac{c}{1-c^4} \right) \right\}

Again, this is a two-point quadrature formula that is exact for all complex polynomials.

As a bonus, put {f\equiv 1} to find that the area of {\Omega} is {\dfrac{\pi (1+c^4)}{ (1-c^4)^2}}.
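Here is a rough numerical sanity check of the two-point formula, assuming nothing beyond the change of variables used above: the integral over {\Omega} is computed as the integral of {(f\circ\varphi)\,|\varphi'|^2} over the unit disk, by the midpoint rule in polar coordinates. The function names, grid sizes and the test function {f(w)=1+w+w^2} are my own choices; the radial rule is only second-order accurate, so agreement to a few digits is all one should expect.

```python
from cmath import exp
from math import pi

def phi(z, c):
    return z / (1 - c * c * z * z)

def dphi(z, c):
    # Derivative of phi: (1 + c^2 z^2) / (1 - c^2 z^2)^2
    return (1 + c * c * z * z) / (1 - c * c * z * z) ** 2

def oval_integral(f, c, n_r=200, n_t=128):
    """Area integral of f over the Neumann oval, pulled back to the unit disk."""
    total = 0
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for m in range(n_t):
            z = r * exp(1j * 2 * pi * m / n_t)
            total += f(phi(z, c)) * abs(dphi(z, c)) ** 2 * r
    return total * (1 / n_r) * (2 * pi / n_t)

def two_point_rule(f, c):
    """The quadrature formula with nodes +-c / (1 - c^4)."""
    node = c / (1 - c ** 4)
    weight = pi * (1 + c ** 4) / (2 * (1 - c ** 4) ** 2)
    return weight * (f(node) + f(-node))

def f(w):
    return 1 + w + w * w

c = 0.6
print(oval_integral(f, c), two_point_rule(f, c))
print(oval_integral(lambda w: 1, c), pi * (1 + c ** 4) / (1 - c ** 4) ** 2)  # area
```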

Three-point test for being holomorphic

This is a marvelous exercise in complex analysis; I heard it from Steffen Rohde but don’t remember the original source.

Let {D=\{z\in \mathbb C\colon |z|<1\}}. Suppose that a function {f\colon D\rightarrow D} satisfies the following property: for every three points {z_1,z_2,z_3\in D} there exists a holomorphic function {g\colon D\rightarrow D} such that {f(z_k)=g(z_k)} for {k=1,2,3}. Prove that {f} is holomorphic.

No solution here, just some remarks.

  • The domain does not matter, because holomorphicity is a local property.
  • The codomain matters: {D} cannot be replaced by {\mathbb C}. Indeed, for any function {f\colon D\rightarrow\mathbb C} and any finite set {z_1,\dots, z_n\in D} there is a holomorphic function that agrees with {f} at {z_1, \dots, z_n} — namely, an interpolating polynomial.
  • Two points {z_1,z_2} would not be enough. For example, {f(z)=\mathrm{Re}\,z} passes the two-point test but is not holomorphic.

Perhaps the last item is not immediately obvious. Given two points {z_1,z_2\in D}, let {x_k=\mathrm{Re}\,z_k}. The hyperbolic distance {\rho} between {z_1} and {z_2} is the infimum of {\displaystyle \int_\gamma \frac{|dz|}{1-|z|^2}} taken over all curves {\gamma} connecting {z_1} to {z_2}. Projecting {\gamma} onto the real axis, we obtain a parametrized curve {\tilde \gamma} connecting {x_1} to {x_2}.

Projection does not increase the hyperbolic distance.

Since

\displaystyle  \int_{\tilde \gamma} \frac{|dz|}{1-|z|^2} =    \int_{\tilde \gamma} \frac{|dz|}{1-|\mathrm{Re}\,z|^2}\le  \int_{\gamma} \frac{|dz|}{1-|\mathrm{Re}\,z|^2}\le \int_\gamma \frac{|dz|}{1-|z|^2}

it follows that {\rho(x_1,x_2)\le \rho(z_1,z_2)}. That is, {f} is a nonexpanding map in the hyperbolic metric of the disk.
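The inequality can also be spot-checked on random pairs of points. The sketch below relies on a standard closed form that is not derived in this post: for the metric {|dz|/(1-|z|^2)}, the distance is {\rho(z,w)=\mathrm{artanh}\,\left|\dfrac{z-w}{1-\bar z w}\right|}. The sampling radius 0.9 and the tolerance are arbitrary choices made to keep rounding errors harmless.

```python
import random
from math import atanh

def rho(z, w):
    """Hyperbolic distance in the unit disk for the metric |dz| / (1 - |z|^2)."""
    z, w = complex(z), complex(w)
    return atanh(abs((z - w) / (1 - z.conjugate() * w)))

def random_point(radius=0.9):
    """Uniform random point of the disk |z| < radius, by rejection sampling."""
    while True:
        z = complex(random.uniform(-radius, radius), random.uniform(-radius, radius))
        if abs(z) < radius:
            return z

random.seed(0)
worst = 0.0
for _ in range(10000):
    z1, z2 = random_point(), random_point()
    # Positive values here would mean the projection increased the distance.
    worst = max(worst, rho(z1.real, z2.real) - rho(z1, z2))

print(worst <= 1e-9)
```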

We can assume that {x_1\le x_2}. There is a Möbius map {\phi} of {D} onto itself such that {\phi(z_1)=x_1}; moreover, we can arrange that {\phi(z_2)} is a real number greater than {x_1}, by applying a hyperbolic rotation about {x_1}. Since {\phi} is a hyperbolic isometry, {\rho(x_1,\phi(z_2)) = \rho(z_1,z_2)\ge \rho(x_1,x_2)}, which implies {\phi(z_2)\ge x_2}. Let {\lambda(z)=x_1+(z-x_1)\dfrac{x_2-x_1}{\phi(z_2)-x_1}}; this is a Euclidean homothety such that {\lambda(x_1)=x_1} and {\lambda(\phi(z_2))= x_2}. Its ratio is between 0 and 1, so by convexity of {D}, {\lambda(D)\subset D}. The map {g=\lambda\circ \phi} achieves {g(z_k)=x_k} for {k=1,2}.

The preceding can be immediately generalized: {f} passes the two-point test if and only if it is a nonexpanding map in the hyperbolic metric. Such maps need not be differentiable even in the real-variable sense.

However, the three-point test is a different story.