Suppose you have a reasonable continuous function on some interval, say on , and you want to approximate it by a trigonometric polynomial. A straightforward approach is to write

which, frankly, is not a very good deal for the price.

Still using the standard Fourier expansion formulas, one can improve approximation by shifting the function to and expanding it into the cosine Fourier series.

where

Then replace with to shift the interval back. With , the partial sum is

which gives a much better approximation with fewer coefficients to calculate.

To see what is going on, one has to look beyond the interval on which $f$ is defined. The first series actually approximates the periodic extension of $f$, which is discontinuous because the endpoint values are not equal:

Cosines, being even, approximate the symmetric periodic extension of $f$, which is continuous whenever $f$ is.

Discontinuities hurt the quality of Fourier approximation more than the lack of smoothness does.
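Here is a minimal numerical sketch of this effect. The test function $e^x$ on $[0,1]$, the choice $N=4$, and the helper names `full_series` and `cosine_series` are all assumptions made for illustration. The code compares the largest deviation of the periodic Fourier partial sum (with $2N+1$ coefficients) against that of the cosine partial sum (with $N+1$ coefficients).

```python
import numpy as np

def f(x):
    # Assumed test function: continuous on [0, 1], unequal endpoint values.
    return np.exp(x)

M = 20000
x = (np.arange(M) + 0.5) / M          # midpoint-rule nodes on [0, 1]

def full_series(N):
    """Partial sum of the period-1 Fourier series, frequencies |k| <= N."""
    s = np.zeros(M, dtype=complex)
    for k in range(-N, N + 1):
        e = np.exp(2j * np.pi * k * x)
        c = np.mean(f(x) * np.conj(e))   # coefficient by the midpoint rule
        s += c * e
    return s.real

def cosine_series(N):
    """Partial sum of the cosine series on [0, 1], frequencies k = 0..N."""
    s = np.full(M, np.mean(f(x)))        # constant term: the mean of f
    for k in range(1, N + 1):
        c = np.cos(np.pi * k * x)
        s += 2 * np.mean(f(x) * c) * c
    return s

N = 4
err_full = np.max(np.abs(full_series(N) - f(x)))
err_cos = np.max(np.abs(cosine_series(N) - f(x)))
print("full series, max error:  ", err_full)
print("cosine series, max error:", err_cos)
```

The full series is stuck with an error of about half the jump of the periodic extension near the endpoints, while the cosine series has no jump to fight against.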

Just for laughs I included the pure sine approximation, also with .

The dreaded calculus torture device that works for exactly two integrals, $\int e^{ax}\sin bx\,dx$ and $\int e^{ax}\cos bx\,dx$.

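Actually, no. A version of it (with one integration by parts) works for $\int x^n\,dx$:

$$\int x^n\,dx = x\cdot x^n - \int x\,d(x^n) = x^{n+1} - n\int x^n\,dx,$$

hence (assuming $n\ne -1$)

$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C.$$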

Yes, this is more of a calculus joke. A more serious example comes from Fourier series.

The functions , , are orthogonal on , in the sense that

This is usually proved using a trigonometric identity that converts the product to a sum. But double integration by parts gives a nicer proof, because no obscure identities are needed. No boundary terms appear because the sines vanish at both endpoints:

All integrals here must vanish because . As a bonus, we get the orthogonality of cosines, , with no additional effort.
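To spell out the computation, assume (this is my reading of the setup) that the functions are $\sin nx$ on $[0,\pi]$ with positive integers $m\ne n$. Integrating by parts twice, with both boundary terms equal to zero because $\sin nx$ and $\sin mx$ vanish at $0$ and $\pi$, gives

$$\int_0^\pi \sin nx\,\sin mx\,dx = \frac{n}{m}\int_0^\pi \cos nx\,\cos mx\,dx = \frac{n^2}{m^2}\int_0^\pi \sin nx\,\sin mx\,dx,$$

so $\left(1-\frac{n^2}{m^2}\right)\int_0^\pi \sin nx\,\sin mx\,dx = 0$, and the integral vanishes because $n^2\ne m^2$. The middle integral is a multiple of the first, so it vanishes as well.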

The double integration by parts is also a more conceptual proof, because it gets to the heart of the matter: eigenvectors of a symmetric matrix (operator) that correspond to different eigenvalues are orthogonal. The trigonometric form is incidental, the eigenfunction property is essential. Let’s try this one more time, for the mixed boundary value problem , . Suppose that and satisfy the boundary conditions, , and . Since and vanish at both endpoints, we can pass the primes easily:
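A sketch of this computation in generic notation, which is an assumption since the specific problem may be set up differently: take $-u''=\lambda u$ and $-v''=\mu v$ on $[a,b]$ with $u(a)=v(a)=0$, $u'(b)=v'(b)=0$, and $\lambda\ne\mu$. Then the products $u'v$ and $uv'$ vanish at both endpoints, and

$$\lambda\int_a^b uv\,dx = -\int_a^b u''v\,dx = \int_a^b u'v'\,dx = -\int_a^b uv''\,dx = \mu\int_a^b uv\,dx,$$

so $(\lambda-\mu)\int_a^b uv\,dx = 0$, which forces $\int_a^b uv\,dx=0$.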

Convolution of a continuous function $f$ on the circle with the Fejér kernel $F_N$ is guaranteed to produce trigonometric polynomials that converge to $f$ uniformly as $N\to\infty$. For the Dirichlet kernel $D_N$ this is not the case: the sequence $f*D_N$ may fail to converge to $f$ even pointwise. The underlying reason is that $\|D_N\|_{L^1}\to\infty$, while the Fejér kernel, being positive, has constant $L^1$ norm. Does this mean that Fejér’s kernel is to be preferred for approximation purposes?
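The difference in norms is easy to observe numerically. The sketch below assumes the normalization $D_N(t)=\sum_{|k|\le N}e^{ikt}$, takes $F_N$ to be the average of $D_0,\dots,D_N$, and measures the $L^1$ norm with respect to $dt/(2\pi)$; the function names are mine.

```python
import numpy as np

def dirichlet(N, t):
    """D_N(t) = 1 + 2 * sum_{k=1}^{N} cos(k t)."""
    k = np.arange(1, N + 1)
    return 1 + 2 * np.cos(np.outer(t, k)).sum(axis=1)

def fejer(N, t):
    """F_N = average of D_0..D_N: triangular weights 1 - k/(N+1)."""
    k = np.arange(1, N + 1)
    w = 1 - k / (N + 1)
    return 1 + 2 * (np.cos(np.outer(t, k)) * w).sum(axis=1)

M = 1 << 14
t = -np.pi + 2 * np.pi * (np.arange(M) + 0.5) / M   # midpoint grid on [-pi, pi]

def l1_norm(values):
    # (1 / 2 pi) * integral of |K(t)| dt, by the midpoint rule
    return np.mean(np.abs(values))

for N in (4, 16, 64):
    print(N, l1_norm(dirichlet(N, t)), l1_norm(fejer(N, t)))
```

The first column of norms grows slowly (logarithmically in $N$), while the second stays at $1$.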

Let’s compare the performance of both kernels on the function , which is reasonably nice: . Convolution with yields . The trigonometric polynomial is in blue, the original function in red:

I’d say this is a very good approximation.

Now try the Fejér kernel, also with . The polynomial is

This is not good at all.

And even with terms the Fejér approximation is not as good as Dirichlet with merely .

The performance of is comparable to that of . Of course, a -term approximation is not what one normally wants to use. And it still has visible deviation near the origin, where the function is smooth:

In contrast, the Dirichlet kernel with gives a low-degree polynomial
that approximates $f$ to within the resolution of the plot:

What we have here is the trigonometric version of "Biased and unbiased mollification". Convolution with $D_N$ amounts to truncation of the Fourier series at index $N$. Therefore, it reproduces trigonometric polynomials of degree at most $N$ exactly. But $F_N$ performs soft thresholding: it multiplies the $n$th Fourier coefficient of $f$ by $(1-|n|/(N+1))^+$ (taking $F_N$ to be the average of $D_0,\dots,D_N$). In particular, it transforms $\cos t$ into $\frac{N}{N+1}\cos t$, introducing an error of order $1/N$, a pretty big one. Since this error is built into the kernel, it limits the rate of convergence no matter how smooth the function is. Such is the price that must be paid for positivity.
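Since both convolutions are Fourier multipliers, they can be imitated with the FFT. In the sketch below the test function $e^{\sin t}$, the grid size, and all function names are assumptions for illustration; the Fejér multiplier follows the convention that $F_N$ is the average of $D_0,\dots,D_N$.

```python
import numpy as np

M = 1024
t = 2 * np.pi * np.arange(M) / M          # uniform grid on the circle
freq = np.fft.fftfreq(M, d=1.0 / M)       # integer frequencies n

def apply_multiplier(values, mult):
    """Multiply the n-th discrete Fourier coefficient by mult[n]."""
    return np.fft.ifft(np.fft.fft(values) * mult).real

def dirichlet_multiplier(N):
    # hard truncation: keep |n| <= N, drop the rest
    return (np.abs(freq) <= N).astype(float)

def fejer_multiplier(N):
    # soft thresholding: (1 - |n| / (N + 1))^+
    return np.clip(1 - np.abs(freq) / (N + 1), 0.0, None)

def sup_error(values, mult):
    return np.max(np.abs(apply_multiplier(values, mult) - values))

f_vals = np.exp(np.sin(t))                # assumed smooth test function
err_dirichlet_4 = sup_error(f_vals, dirichlet_multiplier(4))
err_fejer_4 = sup_error(f_vals, fejer_multiplier(4))
err_fejer_50 = sup_error(f_vals, fejer_multiplier(50))
print(err_dirichlet_4, err_fejer_4, err_fejer_50)

# Value at t = 0 of the Fejer-smoothed cosine: the factor N / (N + 1) for N = 4.
cos_scale = apply_multiplier(np.cos(t), fejer_multiplier(4))[0]
print(cos_scale)
```

For this test function the truncation at index $4$ beats the soft threshold at index $50$, and the last line shows $\cos t$ being shrunk by the factor $4/5$.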

This reminds me of a parenthetical remark by G. B. Folland in Real Analysis (2nd ed., page 264):

if one wants to approximate a function uniformly by trigonometric polynomials, one should not count on partial sums to do the job; the Cesàro means work much better in general.

Right, for ugly “generic” elements of the Fejér kernel is a safer option. But for decently behaved functions the Dirichlet kernel wins by a landslide. The function above was -smooth; as a final example I take which is merely Lipschitz on . The original function is in red, is in blue, and is in green.

Added: the Jackson kernel $J_{2N}$ is the square of $F_N$, normalized. I use $2N$ as the index because squaring doubles the degree. Here is how it approximates $f$:

The Jackson kernel performs somewhat better than the Fejér kernel, because the coefficient of $\cos t$ is off by only $O(1/N^2)$. Still not nearly as good as the non-positive Dirichlet kernel.
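The size of that defect can be checked directly on Fourier coefficients, because squaring a kernel corresponds to convolving its coefficient sequence with itself. This sketch assumes $F_N$ has coefficients $1-|n|/(N+1)$ and that the square is normalized to have mean $1$; the function names are mine.

```python
import numpy as np

def fejer_coeffs(N):
    """Fourier coefficients of F_N for n = -N..N: 1 - |n| / (N + 1)."""
    n = np.arange(-N, N + 1)
    return 1 - np.abs(n) / (N + 1)

def jackson_coeffs(N):
    """Coefficients (n = -2N..2N) of F_N squared, scaled to have mean 1."""
    c = np.convolve(fejer_coeffs(N), fejer_coeffs(N))  # product <-> convolution
    return c / c[2 * N]                                # index 2N holds n = 0

def first_coefficient_defects(N):
    """How far the n = 1 multiplier is from 1, for F_{2N} and for J_{2N}."""
    fejer_defect = 1 - fejer_coeffs(2 * N)[2 * N + 1]
    jackson_defect = 1 - jackson_coeffs(N)[2 * N + 1]
    return fejer_defect, jackson_defect

for N in (5, 10, 20, 40):
    fd, jd = first_coefficient_defects(N)
    print(N, fd, jd, jd * N**2)
```

The last printed column stays bounded as $N$ grows, which is the $O(1/N^2)$ behavior, while the Fejér defect for the same degree decays only like $1/N$.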