The polynomially convex hull of a compact set is defined as the set of all points such that the inequality (a form of the maximum principle) holds for every polynomial . For example, the polynomially convex hull of a simple closed curve is the union of that curve with its interior region. In general, this process fills up the holes in the set , resulting in the complement of the unbounded connected component of .
We can recover the usual convex hull from this construction by restricting to the polynomials of first degree. Indeed, when is linear, the set is a closed disk, and we end up with the intersection of all closed disks that contain . This is precisely the convex hull of .
What if we restrict to the polynomials of degree at most ? Let’s call the resulting set the degree- convex hull of , denoted . By definition, it is contained in the convex hull and contains the polynomially convex hull. To exactly compute for general appears to be difficult even when .
Consider finite sets. When has at most points, we have because there is a polynomial of degree whose zero set is precisely . So, the first nontrivial case is of having points. Let us write .
Depending on the location of the points, may be strictly larger than . For example, if consists of the vertices of a regular -gon, then also contains its center. Here is why. By a linear transformation, we can make sure that where . For we have . Hence, for any polynomial of degree at most , the sum is equal to . This implies , a kind of a discrete maximum principle.
A more systematic approach is to use the Lagrange basis polynomials, that is , which satisfy . Since for any polynomial of degree at most , it follows that if and only if holds for every choice of scalars . The latter is equivalent to the inequality .
This leads us to consider the function , the sum of the absolute values of the Lagrange basis polynomials. (Remark: S is called a Lebesgue function for this interpolation problem.) Since , it follows that everywhere. By construction, on . At a point , the equality holds if and only if is the same for all .
In the trivial case , the function is equal to precisely on the linear segment with endpoints . Of course, this only repeats what we already knew: the degree-1 convex hull is the ordinary convex hull.
If with real and , then . Indeed, if , then lies in the convex hull of , and therefore for some . The basis polynomial is positive at , since it is equal to at and does not vanish outside of . Since a polynomial changes its sign at every simple zero, it follows that . Well, there is no if , but in that case, the same reasoning applies to . In any case, the conclusion is that cannot be the same for all .
At this point one might guess that the vertex set of a regular polygon is the only kind of finite sets that admit a nontrivial discrete maximum principle for polynomials. But this is not so: the vertices of a rhombus work as well. Indeed, if with , then for all , hence .
The vertices of a non-square rectangle do not work: if is the set of these vertices, the associated function is strictly greater than 1 on the complement of .
Are there any other finite sets that support a discrete maximum principle for polynomials?
Suppose is a holomorphic function in the unit disk such that in the disk. How large can its Taylor polynomial be in the disk?
We should not expect to be bounded by 1 as well. Indeed, the Möbius transformation has Taylor expansion , so in this case. This turns out to be the worst case: in general is bounded by 5/4 in the disk.
For the second-degree polynomial the sharp bound is , attained when ; the image of the unit circle under the extremal is shown below. Clearly, there is something nontrivial going on.
Edmund Landau established the sharp bound for in his paper Abschätzung der Koeffizientensumme einer Potenzreihe, published in Archiv der Mathematik und Physik (3) 21 in 1913. Confusingly, there are two papers with the same title in the same issue of the journal: one on pages 42-50, the other on pages 250-255, and they appear in different volumes of Landau’s Collected Works. The sharp bound is in the second paper.
By rotation, it suffices to bound , which is . As is often done, we rescale a bit so that it’s holomorphic in a slightly larger disk, enabling the use of the Cauchy integral formula on the unit circle . The Cauchy formula says . Hence
It is natural to use now, which leads to
Here we can use the geometric sum formula and try to estimate the integral of on the unit circle. This is what Landau does in the first of two papers; the result is which is the correct rate of growth (this is essentially the Dirichlet kernel estimate from the theory of Fourier series). But there is a way to do better and get the sharp bound.
First idea: the factor could be replaced by any polynomial as long as the coefficients of powers up to stay the same. Higher powers contribute nothing to the integral that evaluates , but they might reduce the integral of .
Second idea: we should choose to be the square of some polynomial, , because can be computed exactly: it is just the sum of squares of the coefficients of , by Parseval’s formula.
Since is the -th degree Taylor polynomial of , it is natural to choose to be the -th degree Taylor polynomial of . Indeed, if , then as desired (asymptotics as ). The binomial formula tells us that
The coefficient of here can be written out as or rewritten as which shows that in lowest terms, its denominator is a power of 2. To summarize, is bounded by the sum of squares of the coefficients of . Such sums are referred to as the Landau constants,
A number of asymptotic and non-asymptotic formulas have been derived for , for example Brutman (1982) shows that is between 1 and 1.0663.
To demonstrate the sharpness of the bound , we want and on the unit circle. Both are arranged by taking which is a Blaschke product of degree . Note that the term can also be written as . Hence which is simply when . Equality holds in all the estimates above, so they are sharp.
Here are the images of the unit circle under extremal Taylor polynomials and .
These polynomials attain large values only on a short subarc of the circle; most of the time they oscillate at levels less than 1. Indeed, the mean value of cannot exceed the mean of which is at most 1. Here is the plot of the roots of extremal : they are nearly uniform around the circle, except for a gap near 1.
But we are not done…
Wait a moment. Does define a holomorphic function in the unit disk? We are dividing by here. Fortunately, has no zeros in the unit disk, because its coefficients are positive and decreasing as the exponent increases. Indeed, if with , then has constant term and other coefficients , , … , . Summing the absolute values of the coefficients of nonconstant terms we get . So, when these coefficients are attached to with , the sum of nonconstant terms is strictly less than in absolute value. This proves in the unit disk. Landau credits Adolf Hurwitz with this proof.
In fact, the zeros of (Taylor polynomials of ) lie just outside of the unit disk.
The zeros of the Blaschke products formed from are the reciprocals of the zeros of , so they lie just inside the unit circle, much like the zeros of (though they are different).
For every nonnegative integer there exists a (unique) polynomial of degree in and separately with the following reproducing property:
for every polynomial of degree at most , and for every . For example, ; other examples are found in the post Polynomial delta function.
This fact gives an explicit pointwise bound on a polynomial in terms of its integral on an interval:
where . For example, .
Although in principle could be any real or complex number, it makes sense to restrict attention to , where integration takes place. This leads to the search for extreme values of on the square . Here is how this function looks for :
The symmetries are evident here.
where is the Legendre polynomial of degree and the factor is included to make the polynomials an orthonormal set in . Since oscillates between and , it follows that
and this bound is attained at because and .
the minimum value of on the square ? Certainly not for even . Indeed, differentiating the sum
with respect to and using , we arrive at
which is negative if is even, ruling out this point as a minimum.
What about odd , then: is it true that on the square ?
: yes, is clear enough.
: the inequality is also true… at least numerically. It can be simplified to but I do not see a way forward from there. At least on the boundary of it can be shown without much work:
The quadratic term has no real roots, which is easy to check.
: similar story, the inequality appears to be true but I can only prove it on the boundary, using
The quartic term has no real roots, which is not so easy to check.
: surprisingly, which is about , disproving the conjectural bound . This fact is not at all obvious from the plot.
Is on the square with some universal constant ?
Is the minimum of on always attained on the boundary of ?
Can be expressed in closed form, at least for small degrees ?
The choice of piecewise polynomials of degree 3 for interpolation is justifiably popular: even-degree splines are algebraically awkward to construct, degree 1 is simply piecewise linear interpolation (not smooth), and degree 5, while feasible, entails juggling too many coefficients. Besides, a cubic polynomial minimizes the amount of wiggling (the integral of second derivative squared) for given values and slopes at the endpoints of an interval. (Recall Connecting dots naturally.)
But the derivative of a cubic spline is a quadratic spline. And one needs the derivative to find the critical points. This results in an awkward example in SciPy documentation, annotated with “(NB: sproot only works for order 3 splines, so we fit an order 4 spline)”.
Although not implemented in SciPy, the task of computing the roots of a quadratic spline is a simple one. Obtaining the roots from the internal representation of a quadratic spline in SciPy (as a linear combination of B-splines) would take some work and reading. But a quadratic polynomial is determined by three values, so sampling it at three points, such as two consecutive knots and their average, is enough.
Quadratic formula with values instead of coefficients
Suppose we know the values of a quadratic polynomial q at -1, 0, 1, and wish to find if it has roots between -1 and 1. Let’s normalize so that q(0)=1, and let x = q(-1), y = q(1). If either x or y is negative, there is definitely a root on the interval. If they are positive, there is still a chance: we need the parabola to be concave up, have a minimum within [-1, 1], and for the minimum to be negative. All of this is easily determined once we note that the coefficients of the polynomial are a = (x+y)/2 – 1, b = (y-x)/2, and c = 1.
The inequality ensures the suitable sign of the discriminant. It describes a parabola with vertex (1, 1) and focus (2, 2), contained in the first quadrant and tangent to the axes at (4, 0) and (0, 4). Within the orange region there are no real roots.
The line x+y=2, tangent to the parabola at its vertex, separates convex and concave parabolas. While concavity in conjunction with x, y being positive definitely precludes having roots in [-1, 1], slight convexity is not much better: it results in real roots outside of the interval. Here is the complete picture: green means there is a root in [-1, 1], orange means no real roots, red covers the rest.
Back to splines
Since the derivative of a spline is implemented in SciPy (B-splines have a nice formula for derivatives), all we need is a root-finding routine for quadratic splines. Here it is, based on the above observations but using built-in NumPy polynomial solver np.roots to avoid dealing with various special cases for the coefficients.
roots = 
knots = spl.get_knots()
for a, b in zip(knots[:-1], knots[1:]):
u, v, w = spl(a), spl((a+b)/2), spl(b)
t = np.roots([u+w-2*v, w-u, 2*v])
t = t[np.isreal(t) & (np.abs(t) <= 1)]
roots.extend(t*(b-a)/2 + (b+a)/2)
A demonstration, which plots the spline (blue), its critical points (red), and original data points (black) as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
x = np.arange(7)
y = np.array([3, 1, 1, 2, 2, 4, 3])
f = InterpolatedUnivariateSpline(x, y, k=3)
crit_pts = quadratic_spline_roots(f.derivative())
t = np.linspace(x, x[-1], 500)
plt.plot(x, y, 'kd')
plt.plot(crit_pts, f(crit_pts), 'ro')
In a 1946 paper Charles Pisot proved a theorem involving a curious constant . It can be defined as follows:
monic polynomial such that whenever
Equivalently, is determined by the requirement that the set have logarithmic capacity 1; this won’t be used here. The theorem is stated below, although this post is really about the constant.
Theorem: If an entire function takes integer values at nonnegative integers and is for some , then it is a finite linear combination of terms of the form , where each is an algebraic integer.
The value of is best possible; thus, in some sense Pisot’s theorem completed a line of investigation that began with a 1915 theorem by Pólya which had in place of , and where the conclusion was that is a polynomial. (Informally speaking, Pólya proved that is the “smallest” entire-function that is integer-valued on nonnegative integers.)
Although the constant was mentioned in later literature (here, here, and here), no further digits of it have been stated anywhere, as far as I know. So, let it be known that the decimal expansion of begins with 0.84383.
A lower bound on can be obtained by constructing a monic polynomial that is bounded by 1 on the set . Here is E(0.843):
It looks pretty round, except for that flat part on the left. In fact, E(0.82) is covered by a disk of unit radius centered at 1.3, which means that the choice shows .
How to get an upper bound on ? Turns out, it suffices to exhibit a monic polynomial that has all zeros in and satisfies on the boundary of . The existence of such shows . Indeed, suppose that is monic and on . Consider the function . By construction on the boundary of . Also, is subharmonic in its complement, including , where the singularities of both logarithms cancel out, leaving . This contradicts the maximum principle for subharmonic functions, according to which cannot exceed the maximum of on the boundary.
The choice of works for .
So we have boxed between 0.82 and 0.89; how to get more precise bounds? I don’t know how Pisot achieved the precision of 0.843… it’s possible that he strategically picked some linear and quadratic factors, raised them to variable integer powers and optimized the latter. Today it is too tempting to throw some optimization routine on the problem and let it run for a while.
But what to optimize? The straightforward approach is to minimize the maximum of on the circle , approximated by sampling the function at a sufficiently fine uniform grid and picking the maximal value. This works… unspectacularly. One problem is that the objective function is non-differentiable. Another is that taking maximum throws out a lot of information: we are not using the values at other sample points to better direct the search. After running optimization for days, trying different optimization methods, tolerance options, degrees of the polynomial, and starting values, I was not happy with the results…
Turns out, the optimization is much more effective if one minimizes the variance of the set . Now we are minimizing a polynomial function of , which pushes them toward having the same absolute value — the behavior that we want the polynomial to have. It took from seconds to minutes to produce the polynomials shown below, using BFGS method as implemented in SciPy.
As the arguments for optimization function I took the real and imaginary parts of the zeros of the polynomial. The symmetry about the real axis was enforced automatically: the polynomial was the product of quadratic terms . This eliminated the potentially useful option of having real zeros of odd order, but I did not feel like special-casing those.
Again, nearly the same polynomial works for upper and lower bounds. The fact that the absolute value of each of these polynomials is below 1 (for lower bounds) or greater than 1 (for upper bounds) can be ascertained by sampling them and using an upper estimate on the derivative; there is enough margin to trust computations with double precision.
Finally, the Python script I used. The function “obj” is getting minimized while function “values” returns the actual values of interest: the minimum and maximum of polynomial. The degree of polynomial is 2n, and the radius under consideration is r. The sample points are collected in array s. To begin with, the roots are chosen randomly. After minimization runs (inevitably, ending in a local minimum of which there are myriads), the new starting point is obtained by randomly perturbing the local minimum found. (The perturbation is smaller if minimization was particularly successful.)
import numpy as np
from scipy.optimize import minimize
rc = np.concatenate((r[:n]+1j*r[n:], r[:n]-1j*r[n:])).reshape(-1,1)
p = np.prod(np.abs(s-rc)**2, axis=0)
rc = np.concatenate((r[:n]+1j*r[n:], r[:n]-1j*r[n:])).reshape(-1,1)
p = np.prod(np.abs(s-rc), axis=0)
return [np.min(p), np.max(p)]
r = 0.84384
n = 10
record = 2
s = np.exp(r * np.exp(1j*np.arange(0, np.pi, 0.01)))
xr = np.random.uniform(0.8, 1.8, size=(n,))
xi = np.random.uniform(0, 0.7, size=(n,))
x0 = np.concatenate((xr, xi))
res = minimize(obj, x0, method = 'BFGS')
if res['fun'] < record:
record = res['fun']
x0 = res['x'] + np.random.uniform(-0.001, 0.001, size=x0.shape)
x0 = res['x'] + np.random.uniform(-0.05, 0.05, size=x0.shape)
The function has a curious property: for any linear function , and any point , the integral evaluates to . This is easy to check using the fact that odd powers of integrate to zero:
More generally, for any integer there exists a unique symmetric polynomial that has degree in and separately and satisfies for all polynomials of degree at most . For example, (obviously) and
The formula is not really intuitive, and a 3d plot would not help the matter much. To visualize , I plotted , , and below (green, red, blue respectively).
For and large , the function approaches the Dirac delta at , although the convergence is slow, especially when is close to . I don’t think there is anything good to be said about the case .
The existence and uniqueness of are a consequence of the Riesz representation of linear functionals on an inner product space. Indeed, polynomials of degree at most form such a space with inner product , and the functional is linear for any fixed . Hence, this functional can be written as for some . The function is a reproducing kernel for this space. Its symmetry is not immediately obvious.
The Legendre polynomials are an orthogonal basis of ; more precisely, form an orthonormal basis. It’s a general fact about reproducing kernels that
(which, incidentally, proves the symmetry ). Indeed, taking this sum as the definition of and writing , we find
This is the Sage code used for the above plots.
n = 20
k = sum([(j+1/2)*legendre_P(j,x)*legendre_P(j,y) for j in range(0,n+1)])
plot(k(x,y=-3/4),(x,-1,1),color='green') + plot(k(x,y=0),(x,-1,1),color='red') + plot(k(x,y=1/2),(x,-1,1),color='blue')
If one picks two real numbers from the interval (independent, uniformly distributed), their sum has the triangular distribution.
The sum of three such numbers has a differentiable probability density function:
And the density of is smoother still: the p.d.f. has two
As the number of summands increases, these distributions converge to normal if they are translated and scaled properly. But I am not going to do that. Let’s keep the number of summands to four at most.
The p.d.f. of is a piecewise polynomial of degree . Indeed, for the density is piecewise constant, and the formula
provides the inductive step.
For each , the translated copies of function form a partition of unity:
The integral recurrence relation gives an easy proof of this:
And here is the picture for the quadratic case:
A partition of unity can be used to approximate functions by piecewise polynomials: just multiply each partition element by the value of the function at the center of the corresponding interval, and add the results.
Doing this with amounts to piecewise linear interpolation: the original function is in blue, the weighted sum of hat functions in red.
With we get a smooth curve.
Unlike interpolating splines, this curve does not attempt to pass through the given points exactly. However, it has several advantages over interpolating splines:
Is easier to calculate; no linear system to solve;