Extreme values of a reproducing kernel for polynomials

For every nonnegative integer {n} there exists a (unique) polynomial {K_n(x, y)} of degree {n} in {x} and {y} separately with the following reproducing property:

{\displaystyle p(x) = \int_{-1}^1 K_n(x, y)p(y)\,dy}

for every polynomial {p} of degree at most {n}, and for every {x}. For example, {K_1(x, y)= (3xy+1)/2}; other examples are found in the post Polynomial delta function.

This fact gives an explicit pointwise bound on a polynomial in terms of its integral on an interval:

{\displaystyle |p(x)| \le M_n(x) \int_{-1}^1 |p(y)|\,dy}

where {M_n(x) = \sup\{|K(x, y)| \colon y\in [-1, 1]\}}. For example, {M_1(x) = (3|x|+1)/2}.

Although in principle {x} could be any real or complex number, it makes sense to restrict attention to {x\in [-1, 1]}, where integration takes place. This leads to the search for extreme values of {K} on the square {Q=[-1, 1]\times [-1, 1]}. Here is how this function looks for {n=1, 2, 3}:

K1
Degree 1
K2
Degree 2
K3
Degree 3

The symmetries {K(x, y)=K(-x, -y) = K(y, x)} are evident here.

Explicitly,

{\displaystyle K_n(x, y) = \sum_{k=0}^n \frac{2k+1}{2} P_k(x)P_k(y)}

where {P_k} is the Legendre polynomial of degree {k} and the factor {(2k+1)/2} is included to make the polynomials an orthonormal set in {L^2(-1, 1)}. Since {P_k} oscillates between {-1} and {1}, it follows that

{\displaystyle |K_n(x, y)|\le \sum_{k=0}^n \frac{2k+1}{2} = \frac{(n+1)^2}{2}}

and this bound is attained at {K(1, 1)=K(-1,-1)=(n+1)^2/2} because {P_k(1)=1} and {P_k(-1)=(-1)^k}.

Is

{\displaystyle K_n(-1, 1) =\sum_{k=0}^n (-1)^k\frac{2k+1}{2} = (-1)^n \frac{n+1}{2}}

the minimum value of {K} on the square {Q}? Certainly not for even {n}. Indeed, differentiating the sum

{\displaystyle S_n(x) = K_n(x, 1) = \sum_{k=0}^n \frac{2k+1}{2} P_k(x)}

with respect to {x} and using {P_k'(-1) =(-1)^{k-1}k(k+1)/2}, we arrive at

{\displaystyle S_n'(-1) = (-1)^{n-1} \frac{n(n^2+3n+2)}{4}}

which is negative if {n} is even, ruling out this point as a minimum.

What about odd {n}, then: is it true that {K_n \ge -(n+1)/2} on the square {Q}?

{n=1}: yes, {K_1(x, y) = (3xy+1)/2 \ge (-3+1)/2 = -1} is clear enough.

{n=3}: the inequality {K_3\ge -2} is also true… at least numerically. It can be simplified to {35(xy)^3 + 9(xy)^2 + 15xy \ge (21x+21y+3)(x^2+y^2)} but I do not see a way forward from there. At least on the boundary of {Q} it can be shown without much work:

{\displaystyle K_3(x, 1) + 2 = \frac{5}{4}(x+1)(7x^2-4x+1)}

The quadratic term has no real roots, which is easy to check.

{n=5}: similar story, the inequality {K_5\ge -3} appears to be true but I can only prove it on the boundary, using

{\displaystyle K_5(x, 1)+3 = \frac{21}{16}(x + 1)(33 x^4 - 18x^3 - 12x^2 + 2x + 3)}

The quartic term has no real roots, which is not so easy to check.

{n=7}: surprisingly, {K_7(4/5, 1) = -2229959/500000} which is about {-4.46}, disproving the conjectural bound {K_7\ge -4}. This fact is not at all obvious from the plot.

K7
Degree 7 kernel

Questions:

  • Is {K_n \ge -Cn} on the square {Q = [-1, 1]\times [-1, 1]} with some universal constant {C}?
  • Is the minimum of {K_n} on {Q} always attained on the boundary of {Q}?
  • Can {M_n(x) = \sup\{|K(x, y)| \colon y\in [-1, 1]\}} be expressed in closed form, at least for small degrees {n}?

Critical points of a cubic spline

The choice of piecewise polynomials of degree 3 for interpolation is justifiably popular: even-degree splines are algebraically awkward to construct, degree 1 is simply piecewise linear interpolation (not smooth), and degree 5, while feasible, entails juggling too many coefficients. Besides, a cubic polynomial minimizes the amount of wiggling (the integral of second derivative squared) for given values and slopes at the endpoints of an interval. (Recall Connecting dots naturally.)

But the derivative of a cubic spline is a quadratic spline. And one needs the derivative to find the critical points. This results in an awkward example in SciPy documentation, annotated with “(NB: sproot only works for order 3 splines, so we fit an order 4 spline)”.

Although not implemented in SciPy, the task of computing the roots of a quadratic spline is a simple one. Obtaining the roots from the internal representation of a quadratic spline in SciPy (as a linear combination of B-splines) would take some work and reading. But a quadratic polynomial is determined by three values, so sampling it at three points, such as two consecutive knots and their average, is enough.

Quadratic formula with values instead of coefficients

Suppose we know the values of a quadratic polynomial q at -1, 0, 1, and wish to find if it has roots between -1 and 1. Let’s normalize so that q(0)=1, and let x = q(-1), y = q(1). If either x or y is negative, there is definitely a root on the interval. If they are positive, there is still a chance: we need the parabola to be concave up, have a minimum within [-1, 1], and for the minimum to be negative. All of this is easily determined once we note that the coefficients of the polynomial are a = (x+y)/2 – 1, b = (y-x)/2, and c = 1.

The inequality {(x-y)^2 \ge 8(x+y-2)} ensures the suitable sign of the discriminant. It describes a parabola with vertex (1, 1) and focus (2, 2), contained in the first quadrant and tangent to the axes at (4, 0) and (0, 4). Within the orange region there are no real roots.

orange
No real roots in the orange region

The line x+y=2, tangent to the parabola at its vertex, separates convex and concave parabolas. While concavity in conjunction with x, y being positive definitely precludes having roots in [-1, 1], slight convexity is not much better: it results in real roots outside of the interval. Here is the complete picture: green means there is a root in [-1, 1], orange means no real roots, red covers the rest.

all_colors
Green = there is a root in the interval [-1, 1]

Back to splines

Since the derivative of a spline is implemented in SciPy (B-splines have a nice formula for derivatives), all we need is a root-finding routine for quadratic splines. Here it is, based on the above observations but using built-in NumPy polynomial solver np.roots to avoid dealing with various special cases for the coefficients.

def quadratic_spline_roots(spl):
    roots = []
    knots = spl.get_knots()
    for a, b in zip(knots[:-1], knots[1:]):
        u, v, w = spl(a), spl((a+b)/2), spl(b)
        t = np.roots([u+w-2*v, w-u, 2*v])
        t = t[np.isreal(t) & (np.abs(t) <= 1)]
        roots.extend(t*(b-a)/2 + (b+a)/2)
    return np.array(roots)

A demonstration, which plots the spline (blue), its critical points (red), and original data points (black) as follows:

spline
There can be 0, 1, or 2 critical points between two knots
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.arange(7)
y = np.array([3, 1, 1, 2, 2, 4, 3])
f = InterpolatedUnivariateSpline(x, y, k=3)
crit_pts = quadratic_spline_roots(f.derivative())

t = np.linspace(x[0], x[-1], 500)
plt.plot(t, f(t))
plt.plot(x, y, 'kd')
plt.plot(crit_pts, f(crit_pts), 'ro')
plt.show()

 

Pisot constant beyond 0.843

In a 1946 paper Charles Pisot proved a theorem involving a curious constant {\gamma_0= 0.843\dots}. It can be defined as follows:

{\gamma_0= \sup\{r \colon \exists } monic polynomial {p} such that {|p(e^z)| \le 1} whenever {|z|\le r \}}

Equivalently, {\gamma_0} is determined by the requirement that the set {\{e^z\colon |z|\le \gamma_0\}} have logarithmic capacity 1; this won’t be used here. The theorem is stated below, although this post is really about the constant.

Theorem: If an entire function takes integer values at nonnegative integers and is {O(e^{\gamma |z|})} for some {\gamma < \gamma_0}, then it is a finite linear combination of terms of the form {z^n \alpha^z}, where each {\alpha } is an algebraic integer.

The value of {\gamma_0} is best possible; thus, in some sense Pisot’s theorem completed a line of investigation that began with a 1915 theorem by Pólya which had {\log 2} in place of {\gamma_0}, and where the conclusion was that {f} is a polynomial. (Informally speaking, Pólya proved that {2^z} is the “smallest” entire-function that is integer-valued on nonnegative integers.)

Although the constant {\gamma_0} was mentioned in later literature (here, here, and here), no further digits of it have been stated anywhere, as far as I know. So, let it be known that the decimal expansion of {\gamma_0} begins with 0.84383.


A lower bound on {\gamma_0} can be obtained by constructing a monic polynomial that is bounded by 1 on the set {E(r) = \{e^z \colon |z|\le r \}}. Here is E(0.843):

er

It looks pretty round, except for that flat part on the left. In fact, E(0.82) is covered by a disk of unit radius centered at 1.3, which means that the choice {p(z) = z-1.3} shows {\gamma_0 > 0.82}.

082disk
p(z) = z-1.3 gives lower bound 0.82

How to get an upper bound on {\gamma_0}? Turns out, it suffices to exhibit a monic polynomial {q} that has all zeros in {E(r)} and satisfies {|q|>1} on the boundary of {E(r)}. The existence of such {q} shows {\gamma_0 < r}. Indeed, suppose that {p} is monic and {|p|\le 1} on {E(r)}. Consider the function {\displaystyle u(z) = \frac{\log|p(z)|}{\deg p} - \frac{\log|q(z)|}{\deg q}}. By construction {u<0} on the boundary of {E(r)}. Also, {u} is subharmonic in its complement, including {\infty}, where the singularities of both logarithms cancel out, leaving {u(\infty)=0}. This contradicts the maximum principle for subharmonic functions, according to which {u(\infty)} cannot exceed the maximum of {u} on the boundary.

The choice of {q(z) = z-1.42} works for {r=0.89}.

089disk

So we have {\gamma_0} boxed between 0.82 and 0.89; how to get more precise bounds? I don’t know how Pisot achieved the precision of 0.843… it’s possible that he strategically picked some linear and quadratic factors, raised them to variable integer powers and optimized the latter. Today it is too tempting to throw some optimization routine on the problem and let it run for a while.

But what to optimize? The straightforward approach is to minimize the maximum of {|p(e^z)|} on the circle {|z|=r}, approximated by sampling the function at a sufficiently fine uniform grid {\{z_k\}} and picking the maximal value. This works… unspectacularly. One problem is that the objective function is non-differentiable. Another is that taking maximum throws out a lot of information: we are not using the values at other sample points to better direct the search. After running optimization for days, trying different optimization methods, tolerance options, degrees of the polynomial, and starting values, I was not happy with the results…

Turns out, the optimization is much more effective if one minimizes the variance of the set {\{|p(\exp(z_k))|^2\}}. Now we are minimizing a polynomial function of {p(\exp(z_k)}, which pushes them toward having the same absolute value — the behavior that we want the polynomial to have. It took from seconds to minutes to produce the polynomials shown below, using BFGS method as implemented in SciPy.

As the arguments for optimization function I took the real and imaginary parts of the zeros of the polynomial. The symmetry about the real axis was enforced automatically: the polynomial was the product of quadratic terms {(z-x_k-iy_k) (z-x_k+iy_k)}. This eliminated the potentially useful option of having real zeros of odd order, but I did not feel like special-casing those.

Three digits

843
Degree 8, lower bound 0.843

Real part: 0.916, 1.186, 1.54, 1.783
Imaginary part: 0.399, 0.572, 0.502, 0.199

Here and below, only the zeros with positive imaginary part are listed (in the left-to-right order), the others being their conjugates.

844
Degree 10, upper bound 0.844

Real part: 0.878, 1.0673, 1.3626, 1.6514, 1.8277
Imaginary part: 0.3661, 0.5602, 0.6005, 0.4584, 0.171

Four digits

8438
Degree 14, lower bound 0.8438

Real part: 0.8398, 0.9358, 1.1231, 1.357, 1.5899, 1.776, 1.8788
Imaginary part: 0.3135, 0.4999 ,0.6163, 0.637, 0.553, 0.3751, 0.1326

8439
Degree 14, upper bound 0.8439

Real part: 0.8397, 0.9358, 1.1231, 1.3571, 1.5901, 1.7762, 1.879
Imaginary part: 0.3136, 0.5, 0.6164, 0.6372, 0.5531, 0.3751, 0.1326

No, I didn’t post the same picture twice. The polynomials are just that similar. But as the list of zeros shows, there are tiny differences…

Five digits

84383
Degree 20, lower bound 0.84383

Real part: 0.81527, 0.8553, 0.96028, 1.1082, 1.28274, 1.46689, 1.63723, 1.76302, 1.82066, 1.86273
Imaginary part: 0.2686, 0.42952, 0.556, 0.63835, 0.66857, 0.63906, 0.54572, 0.39701, 0.23637, 0.08842

84384
Degree 20, upper bound 0.84384

Real part: 0.81798, 0.85803, 0.95788, 1.09239, 1.25897, 1.44255, 1.61962, 1.76883, 1.86547, 1.89069
Imaginary part: 0.26631, 0.4234, 0.54324, 0.62676, 0.66903, 0.65366, 0.57719, 0.44358, 0.26486, 0.07896

Again, nearly the same polynomial works for upper and lower bounds. The fact that the absolute value of each of these polynomials is below 1 (for lower bounds) or greater than 1 (for upper bounds) can be ascertained by sampling them and using an upper estimate on the derivative; there is enough margin to trust computations with double precision.

Finally, the Python script I used. The function “obj” is getting minimized while function “values” returns the actual values of interest: the minimum and maximum of polynomial. The degree of polynomial is 2n, and the radius under consideration is r. The sample points are collected in array s. To begin with, the roots are chosen randomly. After minimization runs (inevitably, ending in a local minimum of which there are myriads), the new starting point is obtained by randomly perturbing the local minimum found. (The perturbation is smaller if minimization was particularly successful.)

import numpy as np
from scipy.optimize import minimize

def obj(r):
    rc = np.concatenate((r[:n]+1j*r[n:], r[:n]-1j*r[n:])).reshape(-1,1)
    p = np.prod(np.abs(s-rc)**2, axis=0)
    return np.var(p)

def values(r):
    rc = np.concatenate((r[:n]+1j*r[n:], r[:n]-1j*r[n:])).reshape(-1,1)
    p = np.prod(np.abs(s-rc), axis=0)
    return [np.min(p), np.max(p)]

r = 0.84384
n = 10
record = 2 
s = np.exp(r * np.exp(1j*np.arange(0, np.pi, 0.01)))
xr = np.random.uniform(0.8, 1.8, size=(n,))
xi = np.random.uniform(0, 0.7, size=(n,))
x0 = np.concatenate((xr, xi))

while True:
    res = minimize(obj, x0, method = 'BFGS')
    if res['fun'] < record:
        record = res['fun']
        print(repr(res['x']))
        print(values(res['x']))
        x0 = res['x'] + np.random.uniform(-0.001, 0.001, size=x0.shape)
    else:
        x0 = res['x'] + np.random.uniform(-0.05, 0.05, size=x0.shape)

 

Polynomial delta function

The function {k_1(x,y) = (3xy+1)/2} has a curious property: for any linear function {\ell}, and any point {y\in \mathbb R}, the integral {\int_{-1}^1 \ell(x)k_1(x,y)\,dx} evaluates to {\ell(y)}. This is easy to check using the fact that odd powers of {x} integrate to zero:

\displaystyle \frac12 \int_{-1}^1 (ax+b)(3xy+1)\,dx = \frac12 \int_{-1}^1 (3ax^2y+b)\,dx = \frac12(2ay+2b) = ay+b

More generally, for any integer {n\ge 0} there exists a unique symmetric polynomial {k_n(x,y)} that has degree {n} in {x} and {y} separately and satisfies {\int_{-1}^1 p(x)k_n(x,y)\,dx = p(y)} for all polynomials {p} of degree at most {n}. For example, {k_0(x,y)=1/2} (obviously) and

\displaystyle k_2(x,y)=\frac98+\frac32xy+\frac{15}{8}(x^2+y^2)+\frac{45}{8}x^2y^2

The formula is not really intuitive, and a 3d plot would not help the matter much. To visualize {k_n}, I plotted {k_n(x,-3/4)}, {k_n(x,0)}, and {k_n(x,1/2)} below (green, red, blue respectively).

Degree 1
Degree 1
Degree 2
Degree 2
Degree 4
Degree 4

For {y\in [-1,1]} and large {n}, the function {k_n(\cdot, y)} approaches the Dirac delta at {y}, although the convergence is slow, especially when {|y|} is close to {1}. I don’t think there is anything good to be said about the case {|y|>1}.

Degree 10
Degree 10
Degree 20
Degree 20

The existence and uniqueness of {k_n} are a consequence of the Riesz representation of linear functionals on an inner product space. Indeed, polynomials of degree at most {n} form such a space {\mathbb P_n} with inner product {\langle p,q\rangle = \int_{-1}^1p(x)q(x)\,dx}, and the functional {p\mapsto p(y)} is linear for any fixed {y\in\mathbb R}. Hence, this functional can be written as {p\mapsto \langle p, k_y\rangle } for some {k_y}. The function {(x,y) \mapsto k_x(y)} is a reproducing kernel for this space. Its symmetry is not immediately obvious.

The Legendre polynomials {P_0,\dots,P_n} are an orthogonal basis of {\mathbb P_n}; more precisely, {\widetilde{P}_j = \sqrt{j+1/2}P_j} form an orthonormal basis. It’s a general fact about reproducing kernels that

\displaystyle k(x,y) = \sum_j \widetilde{P}_j(x)\widetilde{P}_j(y)

(which, incidentally, proves the symmetry {k(y,x)=k(x,y)}). Indeed, taking this sum as the definition of {k} and writing {p = \sum_{j=0}^n c_j \widetilde{P}_j}, we find

\displaystyle \langle p, k(\cdot, y)\rangle = \sum_j \widetilde{P}_j(y) \langle p, \widetilde{P}_j\rangle = \sum_j \widetilde{P}_j(y) c_j = p(y)


This is the Sage code used for the above plots.

n = 20
k = sum([(j+1/2)*legendre_P(j,x)*legendre_P(j,y) for j in range(0,n+1)])
plot(k(x,y=-3/4),(x,-1,1),color='green') + plot(k(x,y=0),(x,-1,1),color='red') +  plot(k(x,y=1/2),(x,-1,1),color='blue')

Higher degrees cause some numerical issues…

Degree 22
Degree 22

Post motivated by Math.SE question

B-splines and probability

If one picks two real numbers {X_1,X_2} from the interval {[0,1]} (independent, uniformly distributed), their sum {S_2=X_1+X_2} has the triangular distribution.

Also known as the hat function
Also known as the hat function

The sum {S_3} of three such numbers has a differentiable probability density function:

Piecewise quadratic, C1 smooth
Piecewise quadratic, C1 smooth

And the density of {S_4=X_1+X_2+X_3+X_4} is smoother still: the p.d.f. has two
continuous derivatives.

This begins to look normal
This begins to look normal

As the number of summands increases, these distributions converge to normal if they are translated and scaled properly. But I am not going to do that. Let’s keep the number of summands to four at most.

The p.d.f. of {S_n} is a piecewise polynomial of degree {n-1}. Indeed, for {S_1=X_1} the density is piecewise constant, and the formula

\displaystyle  S_n(x) = \int_{x-1}^x S_{n-1}(t)\,dt

provides the inductive step.

For each {n}, the translated copies of function {S_n} form a partition of unity:

\displaystyle  \sum_{k\in\mathbb Z}S_n(x-k)\equiv 1

The integral recurrence relation gives an easy proof of this:

\displaystyle  \sum_{k\in\mathbb Z}\int_{x-k-1}^{x-k} S_{n-1}(t)\,dt = \int_{\mathbb R} S_{n-1}(t)\,dt = 1

And here is the picture for the quadratic case:

Piecewise quadratic partition of unity
Piecewise quadratic partition of unity

A partition of unity can be used to approximate functions by piecewise polynomials: just multiply each partition element by the value of the function at the center of the corresponding interval, and add the results.

Doing this with {S_2} amounts to piecewise linear interpolation: the original function {f(x)=e^{-x/2}} is in blue, the weighted sum of hat functions in red.

Using the function exp(-x/2)
PL interpolation of exp(-x/2)

With {S_4} we get a smooth curve.

Approximating exp(-x/2) with a cubic B-spline
Approximating exp(-x/2) with a cubic B-spline

Unlike interpolating splines, this curve does not attempt to pass through the given points exactly. However, it has several advantages over interpolating splines:

  • Is easier to calculate; no linear system to solve;
  • Yields positive function for positive data;
  • Yields monotone function for monotone data

Squarish polynomials

For some reason I wanted to construct polynomials approximating this piecewise constant function {f}:

How to approximate this with polynomials?
So square

Of course approximation cannot be uniform, since the function is not continuous. But it can be achieved in the sense of convergence of graphs in the Hausdorff metric: their limit should be the “graph” shown above, with the vertical line included. In concrete terms, this means for every {\epsilon>0} there is {N} such that for {n\ge N} the polynomial {p_n} satisfies

\displaystyle  |p_n-f|\le \epsilon\quad \text{ on }\ [0,2]\setminus [1-\epsilon,1+\epsilon]

and also

\displaystyle  -\epsilon\le  p_n  \le 1+\epsilon\quad \text{ on }\ [1-\epsilon,1+\epsilon]

How to get such {p_n} explicitly? I started with the functions {f_m(x) = \exp(-x^m)} when {m} is large. The idea is that as {m\rightarrow\infty}, the limit of {\exp(-x^m)} is what is wanted: {1} when {x<1}, {0} when {x>1}. Also, for each {m} there is a Taylor polynomial {T_{m,n}} that approximates {f_m} uniformly on {[0,2]}. Since the Taylor series is alternating, it is not hard to find suitable {n}. Let’s shoot for {\epsilon=0.01} in the Taylor remainder and see where this leads:

  • Degree {7} polynomial for {\exp(-x)}
  • Degree {26} polynomial for {\exp(-x^2)}
  • Degree {69} polynomial for {\exp(-x^3)}
  • Degree {180} polynomial for {\exp(-x^4)}
  • Degree {440} polynomial for {\exp(-x^5)}

The results are unimpressive, though:

Taylor polynomials of exp(-x^m) are not so square
Taylor polynomials of exp(-x^m) are not so square

To get within {0.01} of the desired square-ness, we need {\exp(-1.01^m)<0.01}. This means {m\ge 463}. Then, to have the Taylor remainder bounded by {0.01} at {x=2}, we need {2^{463n}/n! < 0.01}. Instead of messing with Stirling’s formula, just observe that {2^{463n}/n!} does not even begin to decrease until {n} exceeds {2^{463}}, which is more than {10^{139}}. That’s a … high degree polynomial. I would not try to ask a computer algebra system to plot it.

Bernstein polynomials turn out to work better. On the interval {[0,2]} they are given by

\displaystyle    p_n(x) = 2^{-n} \sum_{k=0}^n f(2k/n) \binom{n}{k} x^k (2-x)^{n-k}

To avoid dealing with {f(1)}, it is better to use odd degrees. For comparison, I used the same or smaller degrees as above: {7, 25, 69, 179, 439}.

Squarish Bernstein polynomials
Squarish Bernstein polynomials

Looks good. But I don’t know of a way to estimate the degree of Bernstein polynomial required to obtain Hausdorff distance less than a given {\epsilon} (say, {0.01}) from the square function.

Quadrature rules and quadrature domains

The midpoint rule of numerical integration

\displaystyle \int_a^b f(x)\,dx \approx (b-a)f\left(\frac{a+b}{2}\right)

is approximate in general, but exact for linear functions (polynomials of degree at most one).

Midpoint Rule
Midpoint Rule

With two sample points we can integrate any cubic polynomial exactly. The choice of sample points is not obvious: they are to be placed at distance {\dfrac{1}{2\sqrt{3}}(b-a)} from the midpoint. On the example below, {a=0} and {b=1}, so the sample points are {x_{1,2} = \dfrac12 \pm \dfrac{1}{2\sqrt{3}}}. The integral is equal to {\dfrac12 f(x_1)+\dfrac12 f(x_2)}. One can say that each sample point has weight {1/2} in this rule.

Two-point quadrature: areas of yellow rectangles add up to the integral
Two-point quadrature: areas of yellow rectangles add up to the integral

Three sample points are enough for polynomials of degrees up to and including five. This time, the weights attached to the sample points are not equal. The midpoint is used again, this time with the weight of {4/9}. Two other sample points are at distance {\dfrac{\sqrt{3}}{2\sqrt{5}}(b-a)} from the midpoint, and their weights are {5/18} each. This contraption exactly integrates polynomials of degrees up to five.

Three-point quadrature
Three-point quadrature: rectangles of unequal width

Compare this with Simpson’s rule, which also uses three sample points but is exact only up to degree three.

The above are examples of Gaussian quadrature: for each positive integer {n}, one can integrate polynomials of degree up to {2n-1} by taking {n} samples at the right places, and weighing them appropriately.


Let’s move from the real line to the complex plane. If one accepts that the analog of interval {(a,b)} is a disk in the plane, then quadrature becomes very simple: for any disk {D} and any complex polynomials {p},

\displaystyle   \iint_D p = \pi r^2 p(c)

where {c} is the center of the disk and {r} is its radius. One sample point is enough for all degrees! The proof is easy: rewrite {p} in terms of powers of {(z-c)} and integrate them in polar coordinates. The same works for any holomorphic function, as long as it is integrable in {D}.

Disk: a quadrature domain with one node
Disk: a quadrature domain with one node

But maybe the disk is a unique such shape? Not at all: there are other such quadrature domains. A simple family of examples is Neumann ovals (Neumann as in “boundary condition”). Geometrically, they are ellipses inverted in a concentric circle. Analytically, they are (up to linear transformations) images of the unit disk {\mathbb{D}} under

\displaystyle  \varphi(z)=\frac{z}{1-c^2 z^2}\quad (0\le c<1)

This image, denoted {\Omega } below, looks much like an ellipse when {c} is small:

Neumann oval with c=0.3
Neumann oval with c=0.3

Then it becomes peanut-shaped:

Neumann oval with c=0.6
Neumann oval with c=0.6

For {c\approx 1} it looks much like the union of two disks (but the boundary is smooth, contrary to what the plot suggests):

Neumann oval with c=0.95
Neumann oval with c=0.95

In each of these images, the marked points are the quadrature nodes. Let’s find out what they are and how they work.

Suppose {f} is holomorphic and integrable in {\Omega}. By a change of variables,

\displaystyle  \iint_{\Omega } f = \iint_{\mathbb D} (f\circ \varphi)\, \varphi' \,\overline{\varphi'}

Here { (f\circ \varphi) \varphi'} is holomorphic, but { \overline{\varphi'} } is anti-holomorphic. We want to know what { \overline{\varphi'} } does when integrated against something holomorphic. Power series to the rescue:

\displaystyle    \varphi(z) = z\sum_{n=0}^\infty c^{2n} z^{2n} = \sum_{n=0}^\infty c^{2n} z^{2n+1}

hence

\displaystyle  \overline{\varphi'(z)} = \sum_{n=0}^\infty c^{2n} (2n+1) \bar z^{2n}

Multiply this by {z^{k}} and integrate over {\mathbb D} in polar coordinates: the result is {0} if {k} is odd and

\displaystyle    2\pi c^{k} (k+1) \int_0^1 r^{2k} r\,dr= \pi c^{k}

if {k} is even. So, integration of {\overline{\varphi'(z)}} against a power series {g(z) = \sum a_k z^k} produces the sum of {a_{k} c^{k}} over even powers {k} only (with the factor of {\pi}). The process of dropping odd powers amounts to taking the even part of {g}:

\displaystyle  \iint_{\mathbb D} g\, \overline{\varphi'} = \frac{\pi}{2} ( g(c)+ g(-c))

Yes, there’s something about { \overline{\varphi'} } that’s magic (key words: reproducing kernel, specifically Bergman kernel). Plug {g= (f\circ \varphi)\, \varphi' } to conclude

\displaystyle  \iint_{\Omega } f = \frac{\pi}{2} \Big\{f(\varphi(c))\, \varphi'(c) + f(\varphi(-c) ) \,\varphi'(-c)\Big\}

So, the nodes are {\varphi(\pm c) = \pm c/(1-c^4)}. They have equal weight, because {\varphi'(\pm c) = \dfrac{1+c^4}{(1-c^4)^2}}. Final result:

\displaystyle  \iint_{\Omega } f = \frac{\pi (1+c^4)}{2 (1-c^4)^2}\, \left\{f \left( \frac{c}{1-c^4} \right) + f \left(- \frac{c}{1-c^4} \right) \right\}

Again, this is a two-point quadrature formula that is exact for all complex polynomials.

As a bonus, put {f\equiv 1} to find that the area of {\Omega} is {\dfrac{\pi (1+c^4)}{ (1-c^4)^2}}.