For every nonnegative integer $n$ there exists a (unique) polynomial $K_n(x,y)$ of degree $n$ in $x$ and in $y$ separately with the following reproducing property:
$$p(y)=\int_{-1}^1 K_n(x,y)\,p(x)\,dx$$
for every polynomial $p$ of degree at most $n$, and for every $y$. For example, $K_1(x,y)=\frac12+\frac32\,xy$; other examples are found in the post Polynomial delta function.
This fact gives an explicit pointwise bound on a polynomial in terms of its integral on an interval:
$$|p(y)|\le M_n(y)\int_{-1}^1|p(x)|\,dx,\qquad M_n(y)=\max_{-1\le x\le 1}|K_n(x,y)|.$$
For example, $M_1(1)=2$ (since $K_1(x,1)=\frac{1+3x}{2}$), so $|p(1)|\le 2\int_{-1}^1|p(x)|\,dx$ for every polynomial $p$ of degree at most $1$.
Although in principle $y$ could be any real or complex number, it makes sense to restrict attention to $y\in[-1,1]$, where the integration takes place. This leads to the search for extreme values of $K_n$ on the square $[-1,1]\times[-1,1]$. Here is how this function looks for $n=1,2,3$:
Degree 1, Degree 2, Degree 3
The symmetries $K_n(x,y)=K_n(y,x)$ and $K_n(-x,-y)=K_n(x,y)$ are evident here.
Explicitly,
$$K_n(x,y)=\sum_{j=0}^n\left(j+\tfrac12\right)P_j(x)\,P_j(y),$$
where $P_j$ is the Legendre polynomial of degree $j$ and the factor $j+\tfrac12$ is included to make the polynomials $\sqrt{j+\tfrac12}\,P_j$ an orthonormal set in $L^2(-1,1)$. Since $P_j$ oscillates between $-1$ and $1$, it follows that
$$|K_n(x,y)|\le \sum_{j=0}^n\left(j+\tfrac12\right)=\frac{(n+1)^2}{2},$$
and this bound is attained at $(1,1)$ because $P_j(1)=1$ for every $j$ and therefore $K_n(1,1)=\frac{(n+1)^2}{2}$.
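Here is a quick numerical check of the formula and the bound; a minimal sketch using NumPy's Legendre class (the helper K below is mine, not from the original post).

from numpy.polynomial import legendre as L

def K(n, x, y):
    # K_n(x, y) = sum of (j + 1/2) P_j(x) P_j(y) for j = 0, ..., n
    return sum((j + 0.5) * L.Legendre.basis(j)(x) * L.Legendre.basis(j)(y)
               for j in range(n + 1))

for n in range(1, 6):
    print(n, K(n, 1.0, 1.0), (n + 1) ** 2 / 2)   # the two columns agree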
Is
$$K_n(1,-1)=\sum_{j=0}^n\left(j+\tfrac12\right)P_j(1)P_j(-1)=(-1)^n\,\frac{n+1}{2}$$
the minimum value of $K_n$ on the square $[-1,1]\times[-1,1]$? Certainly not for even $n$. Indeed, differentiating the sum
$$K_n(x,y)=\sum_{j=0}^n\left(j+\tfrac12\right)P_j(x)\,P_j(y)$$
with respect to $y$ at the point $(1,-1)$ and using $P_j'(-1)=(-1)^{j-1}\,\frac{j(j+1)}{2}$, we arrive at
$$\frac{\partial K_n}{\partial y}(1,-1)=(-1)^{n-1}\,\frac{n(n+1)(n+2)}{4},$$
which is negative if $n$ is even, ruling out this point as a minimum.
What about odd $n$, then: is it true that $K_n\ge -\frac{n+1}{2}$ on the square?
$n=1$: yes, $\frac12+\frac32\,xy\ge -1$ is clear enough.
$n=3$: the inequality $K_3\ge -2$ is also true… at least numerically. It can be simplified somewhat, but I do not see a way forward from there. At least on the boundary of the square it can be shown without much work:
$$K_3(1,y)+2=\frac54\,(y+1)\,(7y^2-4y+1).$$
The quadratic term has no real roots, which is easy to check.
$n=5$: similar story, the inequality $K_5\ge -3$ appears to be true but I can only prove it on the boundary, using
$$K_5(1,y)+3=\frac{21}{16}\,(y+1)\,(33y^4-18y^3-12y^2+2y+3).$$
The quartic term has no real roots, which is not so easy to check.
$n=7$: surprisingly, the minimum of $K_7$ on the square is strictly below $-\frac{7+1}{2}=-4$, disproving the conjectural bound $K_n\ge -\frac{n+1}{2}$. This fact is not at all obvious from the plot.
Degree 7 kernel
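For the record, here is one way to look for the minimum numerically (a sketch, not the computation behind the claim above): it simply restarts scipy.optimize.minimize from many random points of the square, with the kernel assembled from NumPy's Legendre class.

import numpy as np
from numpy.polynomial import legendre as L
from scipy.optimize import minimize

def K(n, x, y):
    # K_n(x, y) = sum of (j + 1/2) P_j(x) P_j(y) for j = 0, ..., n
    return sum((j + 0.5) * L.Legendre.basis(j)(x) * L.Legendre.basis(j)(y)
               for j in range(n + 1))

def min_on_square(n, tries=200, seed=0):
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(tries):
        res = minimize(lambda p: K(n, p[0], p[1]), rng.uniform(-1, 1, size=2),
                       bounds=[(-1, 1), (-1, 1)], method='L-BFGS-B')
        best = min(best, res.fun)
    return best

print(min_on_square(7), -(7 + 1) / 2)   # compare the search result with the corner value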
Questions:
Is $K_n\ge -Cn$ on the square $[-1,1]\times[-1,1]$ with some universal constant $C$?
Is the minimum of $K_n$ on the square always attained on the boundary of the square?
Can the minimum of $K_n$ on the square be expressed in closed form, at least for small degrees $n$?
The choice of piecewise polynomials of degree 3 for interpolation is justifiably popular: even-degree splines are algebraically awkward to construct, degree 1 is simply piecewise linear interpolation (not smooth), and degree 5, while feasible, entails juggling too many coefficients. Besides, a cubic polynomial minimizes the amount of wiggling (the integral of second derivative squared) for given values and slopes at the endpoints of an interval. (Recall Connecting dots naturally.)
But the derivative of a cubic spline is a quadratic spline. And one needs the derivative to find the critical points. This results in an awkward example in SciPy documentation, annotated with “(NB: sproot only works for order 3 splines, so we fit an order 4 spline)”.
Although not implemented in SciPy, the task of computing the roots of a quadratic spline is a simple one. Obtaining the roots from the internal representation of a quadratic spline in SciPy (as a linear combination of B-splines) would take some work and reading. But a quadratic polynomial is determined by three values, so sampling it at three points, such as two consecutive knots and their average, is enough.
Quadratic formula with values instead of coefficients
Suppose we know the values of a quadratic polynomial q at -1, 0, 1, and wish to find out whether it has any roots between -1 and 1. Let's normalize so that q(0)=1, and let x = q(-1), y = q(1). If either x or y is negative, there is definitely a root on the interval. If they are positive, there is still a chance: we need the parabola to be concave up, to have its minimum within [-1, 1], and the minimum value to be negative. All of this is easily determined once we note that the coefficients of the polynomial are a = (x+y)/2 - 1, b = (y-x)/2, and c = 1.
The inequality $(x-y)^2 < 8(x+y-2)$ ensures the suitable sign of the discriminant. It describes a parabola with vertex (1, 1) and focus (2, 2), contained in the first quadrant and tangent to the axes at (4, 0) and (0, 4). Within the orange region there are no real roots.
No real roots in the orange region
The line x+y=2, tangent to the parabola at its vertex, separates convex and concave parabolas. While concavity in conjunction with x, y being positive definitely precludes having roots in [-1, 1], slight convexity is not much better: it results in real roots outside of the interval. Here is the complete picture: green means there is a root in [-1, 1], orange means no real roots, red covers the rest.
Green = there is a root in the interval [-1, 1]
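Here is a sketch of this test in code (the function name and its structure are mine): given the normalized values x = q(-1), q(0) = 1, y = q(1), it reports whether q has a root in [-1, 1].

def has_root_in_interval(x, y):
    # q(t) = a t^2 + b t + 1 is the quadratic with q(-1) = x, q(0) = 1, q(1) = y
    a = (x + y) / 2 - 1
    b = (y - x) / 2
    if x <= 0 or y <= 0:
        return True               # sign change against q(0) = 1
    if a <= 0:
        return False              # concave (or linear) and positive at -1, 0, 1
    vertex = -b / (2 * a)         # concave up: need the minimum inside [-1, 1] ...
    return abs(vertex) <= 1 and 1 - b ** 2 / (4 * a) <= 0   # ... and non-positive there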
Back to splines
Since the derivative of a spline is implemented in SciPy (B-splines have a nice formula for derivatives), all we need is a root-finding routine for quadratic splines. Here it is, based on the above observations but using built-in NumPy polynomial solver np.roots to avoid dealing with various special cases for the coefficients.
def quadratic_spline_roots(spl):
    roots = []
    knots = spl.get_knots()
    for a, b in zip(knots[:-1], knots[1:]):
        # values of the quadratic piece at the knots and at their midpoint
        u, v, w = spl(a), spl((a + b) / 2), spl(b)
        # on the rescaled interval [-1, 1] the piece has coefficients
        # ((u + w)/2 - v, (w - u)/2, v); the list below is twice that
        t = np.roots([u + w - 2 * v, w - u, 2 * v])
        # keep the real roots that fall inside [-1, 1]
        t = t[np.isreal(t) & (np.abs(t) <= 1)].real
        # map them back to [a, b]
        roots.extend(t * (b - a) / 2 + (b + a) / 2)
    return np.array(roots)
A demonstration, which plots the spline (blue), its critical points (red), and the original data points (black):
There can be 0, 1, or 2 critical points between two knots
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
x = np.arange(7)
y = np.array([3, 1, 1, 2, 2, 4, 3])
f = InterpolatedUnivariateSpline(x, y, k=3)
crit_pts = quadratic_spline_roots(f.derivative())
t = np.linspace(x[0], x[-1], 500)
plt.plot(t, f(t))
plt.plot(x, y, 'kd')
plt.plot(crit_pts, f(crit_pts), 'ro')
plt.show()
In a 1946 paper Charles Pisot proved a theorem involving a curious constant $\theta_0=0.843\dots$. It can be defined as follows:
$$\theta_0=\sup\{r>0\colon \text{there exists a monic polynomial $p$ such that } |p(e^z)|\le 1 \text{ whenever } |z|\le r\}.$$
Equivalently, $\theta_0$ is determined by the requirement that the set $E(\theta_0)=\{e^z\colon |z|\le\theta_0\}$ have logarithmic capacity 1; this won't be used here. The theorem is stated below, although this post is really about the constant.
Theorem: If an entire function $f$ takes integer values at nonnegative integers and is $O(e^{c|z|})$ for some $c<\theta_0$, then it is a finite linear combination of terms of the form $z^n\alpha^z$, where each $\alpha$ is an algebraic integer.
The value of $\theta_0$ is best possible; thus, in some sense Pisot's theorem completed a line of investigation that began with a 1915 theorem by Pólya which had $\log 2\approx 0.693$ in place of $\theta_0$, and where the conclusion was that $f$ is a polynomial. (Informally speaking, Pólya proved that $2^z$ is the "smallest" transcendental entire function that is integer-valued on nonnegative integers.)
Although the constant $\theta_0$ was mentioned in later literature (here, here, and here), no further digits of it have been stated anywhere, as far as I know. So, let it be known that the decimal expansion of $\theta_0$ begins with 0.84383.
A lower bound on $\theta_0$ can be obtained by constructing a monic polynomial that is bounded by 1 on the set $E(r)=\{e^z\colon |z|\le r\}$. Here is $E(0.843)$:
It looks pretty round, except for that flat part on the left. In fact, $E(0.82)$ is covered by a disk of unit radius centered at $1.3$, which means that the choice $p(z)=z-1.3$ shows $\theta_0>0.82$.
p(z) = z-1.3 gives lower bound 0.82
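This lower bound is easy to sanity-check numerically; a small sketch (by the maximum principle it is enough to sample the boundary curve of E(0.82)):

import numpy as np

r = 0.82
theta = np.linspace(0, 2 * np.pi, 100000)
boundary = np.exp(r * np.exp(1j * theta))   # boundary of E(r) = {e^z : |z| <= r}
print(np.abs(boundary - 1.3).max())         # maximum of |p| for p(z) = z - 1.3;
                                            # a value below 1 supports theta_0 > 0.82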
How to get an upper bound on $\theta_0$? Turns out, it suffices to exhibit a monic polynomial $p$ that has all zeros in $E(r)$ and satisfies $|p|>1$ on the boundary of $E(r)$. The existence of such $p$ shows $\theta_0<r$. Indeed, suppose that $q$ is monic and $|q|\le 1$ on $E(r)$. Consider the function $u=\frac{1}{\deg q}\log|q|-\frac{1}{\deg p}\log|p|$. By construction $u<0$ on the boundary of $E(r)$. Also, $u$ is subharmonic in its complement, including $\infty$, where the singularities of both logarithms cancel out, leaving $u(\infty)=0$. This contradicts the maximum principle for subharmonic functions, according to which $u(\infty)$ cannot exceed the maximum of $u$ on the boundary.
A suitable explicit choice of such $p$ works for $r=0.89$.
So we have $\theta_0$ boxed between 0.82 and 0.89; how to get more precise bounds? I don't know how Pisot achieved the precision of 0.843… it's possible that he strategically picked some linear and quadratic factors, raised them to variable integer powers and optimized the latter. Today it is too tempting to throw some optimization routine at the problem and let it run for a while.
But what to optimize? The straightforward approach is to minimize the maximum of $|p(e^z)|$ on the circle $|z|=r$, approximated by sampling the function at a sufficiently fine uniform grid and picking the maximal value. This works… unspectacularly. One problem is that the objective function is non-differentiable. Another is that taking the maximum throws out a lot of information: we are not using the values at other sample points to better direct the search. After running the optimization for days, trying different optimization methods, tolerance options, degrees of the polynomial, and starting values, I was not happy with the results…
Turns out, the optimization is much more effective if one minimizes the variance of the sampled values $|p(e^z)|^2$, $|z|=r$. Now we are minimizing a polynomial function of the values of $p$ at the sample points, which pushes them toward having the same absolute value — the behavior that we want the polynomial to have. It took from seconds to minutes to produce the polynomials shown below, using the BFGS method as implemented in SciPy.
As the arguments for the optimization function I took the real and imaginary parts of the zeros of the polynomial. The symmetry about the real axis was enforced automatically: the polynomial was the product of quadratic terms $(z-\zeta_k)(z-\bar\zeta_k)$. This eliminated the potentially useful option of having real zeros of odd order, but I did not feel like special-casing those.
Again, nearly the same polynomial works for upper and lower bounds. The fact that the absolute value of each of these polynomials is below 1 (for lower bounds) or greater than 1 (for upper bounds) can be ascertained by sampling them and using an upper estimate on the derivative; there is enough margin to trust computations with double precision.
Finally, the Python script I used. The function "obj" is getting minimized, while the function "values" returns the actual values of interest: the minimum and maximum of the polynomial's absolute value over the sample points. The degree of the polynomial is 2n, and the radius under consideration is r. The sample points are collected in the array s. To begin with, the roots are chosen randomly. After a minimization run (inevitably ending in a local minimum, of which there are myriads), the new starting point is obtained by randomly perturbing the local minimum found. (The perturbation is smaller if the minimization was particularly successful.)
import numpy as np
from scipy.optimize import minimize
def obj(r):
    # here r is the 2n-vector of real and imaginary parts of the zeros
    # (it shadows the radius r defined below); the conjugate zeros are appended
    rc = np.concatenate((r[:n] + 1j*r[n:], r[:n] - 1j*r[n:])).reshape(-1, 1)
    # |p|^2 at every sample point; minimize the spread of these values
    p = np.prod(np.abs(s - rc)**2, axis=0)
    return np.var(p)

def values(r):
    rc = np.concatenate((r[:n] + 1j*r[n:], r[:n] - 1j*r[n:])).reshape(-1, 1)
    p = np.prod(np.abs(s - rc), axis=0)
    return [np.min(p), np.max(p)]

r = 0.84384          # radius under consideration
n = 10               # the polynomial has degree 2n
record = 2
# sample points on the boundary of E(r); the upper half suffices by symmetry
s = np.exp(r * np.exp(1j*np.arange(0, np.pi, 0.01)))
xr = np.random.uniform(0.8, 1.8, size=(n,))
xi = np.random.uniform(0, 0.7, size=(n,))
x0 = np.concatenate((xr, xi))
while True:
    res = minimize(obj, x0, method='BFGS')
    if res['fun'] < record:
        record = res['fun']
        print(repr(res['x']))
        print(values(res['x']))
        # small perturbation: stay near a particularly successful minimum
        x0 = res['x'] + np.random.uniform(-0.001, 0.001, size=x0.shape)
    else:
        # larger perturbation to escape an unremarkable local minimum
        x0 = res['x'] + np.random.uniform(-0.05, 0.05, size=x0.shape)
The function $K_1(x,y)=\frac{3xy+1}{2}$ has a curious property: for any linear function $\ell$, and any point $y$, the integral $\int_{-1}^1 K_1(x,y)\,\ell(x)\,dx$ evaluates to $\ell(y)$. This is easy to check using the fact that odd powers of $x$ integrate to zero:
$$\int_{-1}^1 \frac{3xy+1}{2}\,(ax+b)\,dx=\int_{-1}^1\left(\frac{3ay\,x^2}{2}+\frac{b}{2}\right)dx=ay+b.$$
More generally, for any integer $n\ge 0$ there exists a unique symmetric polynomial $K_n(x,y)$ that has degree $n$ in $x$ and $y$ separately and satisfies $\int_{-1}^1 K_n(x,y)\,p(x)\,dx=p(y)$ for all polynomials $p$ of degree at most $n$. For example, $K_0(x,y)=\frac12$ (obviously) and
$$K_2(x,y)=\frac12+\frac{3xy}{2}+\frac58\,(3x^2-1)(3y^2-1).$$
The formula is not really intuitive, and a 3d plot would not help the matter much. To visualize $K_n$, I plotted $K_n(x,-3/4)$, $K_n(x,0)$, and $K_n(x,1/2)$ below (green, red, blue respectively).
Degree 1, Degree 2, Degree 4
For fixed $y\in(-1,1)$ and large $n$, the function $K_n(\cdot,y)$ approaches the Dirac delta at $y$, although the convergence is slow, especially when $y$ is close to $\pm1$. I don't think there is anything good to be said about the case $|y|>1$.
Degree 10, Degree 20
The existence and uniqueness of $K_n$ are a consequence of the Riesz representation of linear functionals on an inner product space. Indeed, the polynomials of degree at most $n$ form such a space with the inner product $\langle p,q\rangle=\int_{-1}^1 p(x)q(x)\,dx$, and the functional $p\mapsto p(y)$ is linear for any fixed $y$. Hence, this functional can be written as $p\mapsto\langle p,K_n(\cdot,y)\rangle$ for some element $K_n(\cdot,y)$ of the space. The function $K_n$ is a reproducing kernel for this space. Its symmetry in $x$ and $y$ is not immediately obvious.
The Legendre polynomials $P_j$ are an orthogonal basis of this space; more precisely, the normalized polynomials $\sqrt{j+\tfrac12}\,P_j$, $j=0,\dots,n$, form an orthonormal basis. It's a general fact about reproducing kernels that
$$K_n(x,y)=\sum_{j=0}^n\left(j+\tfrac12\right)P_j(x)\,P_j(y)$$
(which, incidentally, proves the symmetry $K_n(x,y)=K_n(y,x)$). Indeed, taking this sum as the definition of $K_n$ and writing $p=\sum_{j=0}^n c_j\sqrt{j+\tfrac12}\,P_j$, we find
$$\int_{-1}^1 K_n(x,y)\,p(x)\,dx=\sum_{j=0}^n c_j\sqrt{j+\tfrac12}\,P_j(y)=p(y).$$
This is the Sage code used for the above plots.
n = 20
k = sum([(j+1/2)*legendre_P(j,x)*legendre_P(j,y) for j in range(0,n+1)])
plot(k(x,y=-3/4),(x,-1,1),color='green') + plot(k(x,y=0),(x,-1,1),color='red') + plot(k(x,y=1/2),(x,-1,1),color='blue')
If one picks two real numbers $X_1, X_2$ from the interval $[0,1]$ (independent, uniformly distributed), their sum $X_1+X_2$ has the triangular distribution.
Also known as the hat function
The sum of three such numbers has a differentiable probability density function:
Piecewise quadratic, C1 smooth
And the density of $X_1+X_2+X_3+X_4$ is smoother still: the p.d.f. has two continuous derivatives.
This begins to look normal
As the number of summands increases, these distributions converge to normal if they are translated and scaled properly. But I am not going to do that. Let’s keep the number of summands to four at most.
The p.d.f. of $X_1+\dots+X_n$ is a piecewise polynomial of degree $n-1$. Indeed, for $n=1$ the density $p_1$ is piecewise constant, and the formula
$$p_{n+1}(x)=\int_{x-1}^{x}p_n(t)\,dt$$
provides the inductive step.
For each $n$, the translated copies of the function $p_n$ form a partition of unity:
$$\sum_{k\in\mathbb Z}p_n(x-k)\equiv 1.$$
The integral recurrence relation gives an easy proof of this:
$$\sum_{k\in\mathbb Z}p_{n+1}(x-k)=\sum_{k\in\mathbb Z}\int_{x-k-1}^{x-k}p_n(t)\,dt=\int_{-\infty}^{\infty}p_n(t)\,dt=1.$$
And here is the picture for the quadratic case:
Piecewise quadratic partition of unity
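Here is a small numerical check of the partition-of-unity identity: it builds samples of $p_3$ by repeated convolution of the uniform density, following the recurrence above (the grid-based discretization is mine).

import numpy as np

N = 1000
h = 1.0 / N
box = np.ones(N)                                        # samples of p_1 on [0, 1)
p3 = np.convolve(np.convolve(box, box), box) * h * h    # samples of p_3 on [0, 3)
p3 = np.append(p3, np.zeros(3 * N - len(p3)))           # pad to exactly three periods

# p_3(t) + p_3(t + 1) + p_3(t + 2) should equal 1 for t in [0, 1)
total = p3.reshape(3, N).sum(axis=0)
print(total.min(), total.max())                         # both approximately 1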
A partition of unity can be used to approximate functions by piecewise polynomials: just multiply each partition element by the value of the function at the center of the corresponding interval, and add the results.
Doing this with the triangular density $p_2$ amounts to piecewise linear interpolation: the original function is in blue, the weighted sum of hat functions in red.
PL interpolation of exp(-x/2)
With the cubic density $p_4$ we get a smooth curve.
Approximating exp(-x/2) with a cubic B-spline
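A sketch of this construction in code: each translated copy of the cubic bell $p_4$ (obtained here from scipy.interpolate.BSpline.basis_element on the knots 0,...,4, which is the same function) is weighted by the value of exp(-x/2) at the center of its support; the window and the range of shifts are chosen just for the plot.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import BSpline

f = lambda x: np.exp(-x / 2)
bell = BSpline.basis_element(np.arange(5), extrapolate=False)   # p_4, supported on [0, 4]

x = np.linspace(0, 6, 500)
approx = np.zeros_like(x)
for k in range(-3, 7):                        # shifts whose supports meet [0, 6]
    values = np.nan_to_num(bell(x - k))       # copy of the bell supported on [k, k + 4]
    approx += f(k + 2) * values               # weight: f at the center of the support

plt.plot(x, f(x), 'b', x, approx, 'r')
plt.show()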
Unlike interpolating splines, this curve does not attempt to pass through the given points exactly. However, it has several advantages over interpolating splines:
It is easier to calculate: there is no linear system to solve;
For some reason I wanted to construct polynomials approximating this piecewise constant function :
So square
Of course the approximation cannot be uniform, since the function is not continuous. But it can be achieved in the sense of convergence of graphs in the Hausdorff metric: their limit should be the "graph" shown above, with the vertical line included. In concrete terms, this means that for every $\epsilon>0$ there is $N$ such that for $n\ge N$ the polynomial $p_n$ satisfies $|p_n-f|\le\epsilon$ at every point whose distance to the jump is at least $\epsilon$, and also stays within $[-\epsilon,1+\epsilon]$ near the jump.
How to get such $p_n$ explicitly? I started with the functions $\exp(-x^m)$ where $m$ is large. The idea is that as $m\to\infty$, the limit of $\exp(-x^m)$ is what is wanted: $1$ when $0\le x<1$, $0$ when $x>1$. Also, for each $m$ there is a Taylor polynomial that approximates $\exp(-x^m)$ uniformly on the interval of interest. Since the Taylor series is alternating, it is not hard to find a suitable degree of truncation. Let's pick a target accuracy for the Taylor remainder and see where this leads:
Degree polynomial for
Degree polynomial for
Degree polynomial for
Degree polynomial for
Degree polynomial for
The results are unimpressive, though:
Taylor polynomials of exp(-x^m) are not so square
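For reference, this kind of plot can be generated as follows; a sketch in which the exponent m and the number of Taylor terms are arbitrary illustrative choices, not the values behind the original figure.

import math
import numpy as np
import matplotlib.pyplot as plt

m, terms = 8, 20                      # assumed values, for illustration only
x = np.linspace(0, 1.2, 500)

# partial sum of exp(-x^m) = sum over k of (-1)^k x^(m k) / k!
taylor = sum((-1) ** k * x ** (m * k) / math.factorial(k) for k in range(terms))

plt.plot(x, np.exp(-x ** m), 'b', x, taylor, 'r--')
plt.show()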
To get within of the desired square-ness, we need . This means . Then, to have the Taylor remainder bounded by at , we need . Instead of messing with Stirling’s formula, just observe that does not even begin to decrease until exceeds , which is more than . That’s a … high degree polynomial. I would not try to ask a computer algebra system to plot it.
To avoid dealing with the value of $f$ at the jump (which gets sampled when the degree is even), it is better to use odd degrees. For comparison, I used the same or smaller degrees as above.
Squarish Bernstein polynomials
Looks good. But I don’t know of a way to estimate the degree of Bernstein polynomial required to obtain Hausdorff distance less than a given (say, ) from the square function.
The midpoint rule
$$\int_a^b f(x)\,dx\approx (b-a)\,f\!\left(\frac{a+b}{2}\right)$$
is approximate in general, but exact for linear functions (polynomials of degree at most one).
Midpoint Rule
With two sample points we can integrate any cubic polynomial exactly. The choice of sample points is not obvious: they are to be placed at distance $\frac{b-a}{2\sqrt3}$ from the midpoint of the interval $[a,b]$. In the example below, the areas of the yellow rectangles built on the sample points add up to the integral. One can say that each sample point has weight $\frac{b-a}{2}$ in this rule.
Two-point quadrature: areas of yellow rectangles add up to the integral
Three sample points are enough for polynomials of degrees up to and including five. This time, the weights attached to the sample points are not equal. The midpoint is used again, this time with the weight $\frac{4}{9}(b-a)$. Two other sample points are at distance $\frac{b-a}{2}\sqrt{\frac35}$ from the midpoint, and their weights are $\frac{5}{18}(b-a)$ each. This contraption exactly integrates polynomials of degrees up to five.
Three-point quadrature: rectangles of unequal width
Compare this with Simpson’s rule, which also uses three sample points but is exact only up to degree three.
The above are examples of Gaussian quadrature: for each positive integer $n$, one can integrate polynomials of degree up to $2n-1$ by taking $n$ samples at the right places, and weighing them appropriately.
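A quick check of the two- and three-point rules, using the Gauss–Legendre nodes and weights from NumPy (leggauss works on the interval [-1, 1]):

import numpy as np

rng = np.random.default_rng(1)
for n in (2, 3):
    nodes, weights = np.polynomial.legendre.leggauss(n)             # n-point rule on [-1, 1]
    poly = np.polynomial.Polynomial(rng.standard_normal(2 * n))     # random, degree 2n - 1
    exact = poly.integ()(1) - poly.integ()(-1)
    print(n, np.dot(weights, poly(nodes)) - exact)                  # essentially zero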
Let’s move from the real line to the complex plane. If one accepts that the analog of interval is a disk in the plane, then quadrature becomes very simple: for any disk and any complex polynomials ,
where is the center of the disk and is its radius. One sample point is enough for all degrees! The proof is easy: rewrite in terms of powers of and integrate them in polar coordinates. The same works for any holomorphic function, as long as it is integrable in .
Disk: a quadrature domain with one node
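A Monte Carlo sanity check of the one-point disk rule; the disk and the polynomial below are arbitrary.

import numpy as np

rng = np.random.default_rng(2)
c, radius = 1 + 2j, 1.5
p = np.polynomial.Polynomial(rng.standard_normal(6) + 1j * rng.standard_normal(6))

# uniform sample of the disk |z - c| <= radius
t = rng.uniform(0, 2 * np.pi, 10 ** 6)
r = radius * np.sqrt(rng.uniform(0, 1, 10 ** 6))
z = c + r * np.exp(1j * t)

area = np.pi * radius ** 2
print(area * p(z).mean())    # Monte Carlo value of the integral of p over the disk
print(area * p(c))           # the quadrature formula: area times the value at the center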
But maybe the disk is the unique such shape? Not at all: there are other such quadrature domains. A simple family of examples is Neumann ovals (Neumann as in "boundary condition"). Geometrically, they are ellipses inverted in a concentric circle. Analytically, they are (up to linear transformations) images of the unit disk $\mathbb D$ under the map
$$\varphi(\zeta)=\frac{\zeta}{1-c^2\zeta^2},\qquad 0<c<1.$$
This image, denoted $\Omega$ below, looks much like an ellipse when $c$ is small:
Neumann oval with c=0.3
Then it becomes peanut-shaped:
Neumann oval with c=0.6
For $c$ close to $1$ it looks much like the union of two disks (but the boundary is smooth, contrary to what the plot suggests):
Neumann oval with c=0.95
In each of these images, the marked points are the quadrature nodes. Let’s find out what they are and how they work.
Suppose $f$ is holomorphic and integrable in $\Omega$. By a change of variables,
$$\int_\Omega f\,dA=\int_{\mathbb D} f(\varphi(\zeta))\,\varphi'(\zeta)\,\overline{\varphi'(\zeta)}\,dA(\zeta).$$
Here $f(\varphi(\zeta))\,\varphi'(\zeta)$ is holomorphic, but $\overline{\varphi'(\zeta)}$ is anti-holomorphic. We want to know what $\overline{\varphi'}$ does when integrated against something holomorphic. Power series to the rescue:
$$\varphi'(\zeta)=\frac{1+c^2\zeta^2}{(1-c^2\zeta^2)^2}=\sum_{k=0}^\infty (2k+1)\,c^{2k}\zeta^{2k},$$
hence
$$\overline{\varphi'(\zeta)}=\sum_{k=0}^\infty (2k+1)\,c^{2k}\,\bar\zeta^{2k}.$$
Multiply this by $\zeta^n$ and integrate over $\mathbb D$ in polar coordinates: the result is $0$ if $n$ is odd and $\pi c^n$ if $n$ is even (because $\int_{\mathbb D}\zeta^n\bar\zeta^m\,dA$ vanishes unless $m=n$, when it equals $\frac{\pi}{n+1}$). So, integration of $\overline{\varphi'}$ against a power series $g(\zeta)=\sum a_n\zeta^n$ produces the sum of $a_nc^n$ over even powers only (with the factor of $\pi$). The process of dropping odd powers amounts to taking the even part of $g$:
$$\int_{\mathbb D} g(\zeta)\,\overline{\varphi'(\zeta)}\,dA(\zeta)=\pi\sum_{k}a_{2k}c^{2k}=\pi\,\frac{g(c)+g(-c)}{2}.$$
Yes, there’s something about that’s magic (key words: reproducing kernel, specifically Bergman kernel). Plug to conclude
So, the nodes are $\pm\varphi(c)=\pm\dfrac{c}{1-c^4}$. They have equal weight, because $\varphi'(-c)=\varphi'(c)$. Final result:
$$\int_\Omega f\,dA=\frac{\pi(1+c^4)}{2\,(1-c^4)^2}\left(f\!\left(\frac{c}{1-c^4}\right)+f\!\left(-\frac{c}{1-c^4}\right)\right).$$
Again, this is a two-point quadrature formula that is exact for all complex polynomials.
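Finally, a numerical check of the two-point formula; a sketch that relies on the parametrization $\varphi(\zeta)=\zeta/(1-c^2\zeta^2)$ assumed above. If the Neumann oval is normalized differently, the node and the weight change accordingly.

import numpy as np

rng = np.random.default_rng(3)
c = 0.6
phi = lambda z: z / (1 - c ** 2 * z ** 2)            # assumed map of the unit disk onto the oval
dphi = lambda z: (1 + c ** 2 * z ** 2) / (1 - c ** 2 * z ** 2) ** 2

f = np.polynomial.Polynomial(rng.standard_normal(5) + 1j * rng.standard_normal(5))

# Monte Carlo integral over the oval, via the change of variables to the unit disk
t = rng.uniform(0, 2 * np.pi, 10 ** 6)
r = np.sqrt(rng.uniform(0, 1, 10 ** 6))
z = r * np.exp(1j * t)
integral = np.pi * np.mean(f(phi(z)) * np.abs(dphi(z)) ** 2)

node = c / (1 - c ** 4)
weight = np.pi * (1 + c ** 4) / (2 * (1 - c ** 4) ** 2)
print(integral)                          # numerical value of the integral over the oval
print(weight * (f(node) + f(-node)))     # two-point quadrature value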