Discrete Cosine Transforms: Trapezoidal vs Midpoint

The existence of multiple versions of Discrete Cosine Transform (DCT) can be confusing. Wikipedia explains that its 8 types are determined by how one reflects across the boundaries. E.g., one can reflect 1, 2, 3 across the left boundary as 3, 2, 1, 2, 3 or as 3, 2, 1, 1, 2, 3, and there are such choices for the other boundary too (also, the other reflection can be odd or even). Makes sense enough.

But there is another aspect to the two most used forms, DCT-I and DCT-II (types I and II): they can be expressed in terms of the Trapezoidal and Midpoint rules for integration. Here is how.

The cosines ${\cos kx}$, ${k=0,1,\dots}$ are orthogonal on the interval ${[0,\pi]}$ with respect to the Lebesgue measure. The basis of discrete transforms is that these cosines are also orthogonal with respect to some discrete measures, until certain frequency is reached. Indeed, if cosines are orthogonal with respect to measure ${\mu}$ whose support consists of ${n}$ points, then we can efficiently use them to represent any function defined on the support of ${\mu}$, and those are naturally identified with sequences of length n.

How to find such measures? It helps that some simple rules of numerical integration are exact for trigonometric polynomials up to some degree.

For example, the trapezoidal rule with n sample points exactly integrates the functions ${\cos kx}$ for ${k=0,\dots, 2n-3}$. This could be checked by converting to exponential form and summing geometric progressions, but here is a visual explanation with n=4, where the interval ${[0,\pi]}$ is represented as upper semi-circle. The radius of each red circle indicates the weight placed at that point; the endpoints get 1/2 of the weight of other sample points. To integrate ${\cos x}$ correctly, we must have the x-coordinate of the center of mass equal to zero, which is obviously the case.

Replacing ${\cos x}$ by ${\cos kx}$ means multiplying the polar angle of each sample point by ${k}$. This is what we get:

In all cases the x-coordinate of the center of mass is zero. With k=6 this breaks down, as all the weight gets placed in one point. And this is how it goes in general with integration of sines and cosines: equally spaced points work perfectly until they don’t work at all, which happens when the step size is equal to the period of the function. When ${k=2n-2}$, the period ${2\pi/k}$ is equal to ${\pi/(n-1)}$, the spacing of points in the trapezoidal rule.

The orthogonality of cosines has to do with the formula ${\cos kx \cos j x = \frac12(\cos (k-j)x) + \frac12(\cos (k+j)x)}$. Let ${\tau_n}$ be the measure expressing the trapezoidal rule on ${[0,\pi]}$ with ${n}$ sample points; so it’s the sum of point masses at ${0, \pi/(n-1), \dots , \pi}$. Then ${\{\cos k x \colon k=0,\dots, n-1\}}$ are orthogonal with respect to ${\tau_n}$ because any product ${\cos k x\cos j x }$ with ${k >j}$ taken from this range will have ${k-j, k+j \le 2n-3}$. Consequently, we can compute the coefficients of any function ${f}$ in the cosine basis as

${\displaystyle c_k = \int f(x)\cos kx\,d\tau_n(x) \bigg/ \int \cos^2 kx\,d\tau_n(x)}$

The above is what DCT-I (discrete cosine transform of type 1) does, up to normalization.

The DCT-II transform uses the Midpoint rule instead of the Trapezoidal rule. Let ${\mu_n}$ be the measure expressing the Midpoint rule on ${[0,\pi]}$ with ${n}$ sample points; it gives equal mass to the points ${(2j-1)\pi/(2n)}$ for ${j=1,\dots, n}$. These are spaced at ${\pi/n}$ and therefore the midpoint rule is exact for ${\cos kx}$ with ${k=0,\dots, 2n-1}$ which is better than what the trapezoidal rule does. Perhaps more significantly, by identifying the given data points with function values at the midpoints of subintervals we stay away from the endpoints ${0,\pi}$ where the cosines are somewhat restricted by having to have zero slope.

Let’s compare DCT-I and DCT-II on the same data set, ${y=(0, \sqrt{1}, \sqrt{2}, \dots, \sqrt{8})}$. There are 9 numbers here. Following DCT-I we place them at the sample points of the trapezoidal rule, and expand into cosines using the inner product with respect to ${\tau_n}$. Here is the plot of the resulting trigonometric polynomial: of course it interpolates the data.

But DCT-II does it better, despite having exactly the same cosine functions. The only change is that we use ${\mu_n}$ and so place the ${y}$-values along its support.

Less oscillation means the high-degree coefficients are smaller, and therefore easier to discard in order to compress information. For example, drop the last two coefficients in each expansion, keeping 6 numbers instead of 8. DCT-II clearly wins in accuracy then.

Okay, so the Midpoint rule is better, no surprise. After all, it’s in general about twice as accurate as the Trapezoidal rule. What about Simpson’s rule, would it lead to some super-efficient form of DCT? That is, why don’t we let ${\sigma_n}$ be the discrete measure that expresses Simpson’s rule and use the inner product ${\int fg\,d\sigma_n}$ for cosine expansion? Alas, Simpson’s rule on ${n}$ points is exact only for ${\cos kx}$ with ${k=0,\dots, n-2}$, which is substantially worse than either Trapezoidal or Midpoint rules. As a result, we don’t get enough orthogonal cosines with respect to ${\sigma_n}$ to have an orthogonal basis. Simpson’s rule has an advantage when dealing with algebraic polynomials, not with trigonometric ones.

Finally, the Python code used for the graphics; I did not use SciPy’s DCT method (which is of course more efficient) to keep the relation to numerical integration explicit in the code. The method trapz implements the trapezoidal rule, and the midpoint rule is just the summation of sampled values. In both cases there is no need to worry about factor dx, since it cancels out when we divide one numerical integral by the other.

import numpy as np
import matplotlib.pyplot as plt
#   Setup
y = np.sqrt(np.arange(9))
c = np.zeros_like(y)
n = y.size
#   DCT-I, trapezoidal
x = np.arange(n)*np.pi/(n-1)
for k in range(n):
c[k] = np.trapz(y*np.cos(k*x))/np.trapz(np.cos(k*x)**2)
t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(9)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)
plt.show()
#   DCT-II, midpoint
x = np.arange(n)*np.pi/n + np.pi/(2*n)
for k in range(n):
c[k] = np.sum(y*np.cos(k*x))/np.sum(np.cos(k*x)**2)
t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(9)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)
plt.show()

Peanut allergy and Polar interpolation

Draw a closed curve passing through the points ${(\pm 3,0)}$ and ${(0,\pm 1)}$: more specifically, from (3,0) to (0,1) to (-3,0) to (0,-1) and back to (3,0).

I’ll wait.

… Done?

This is an interpolation problem. The geometry of the problem does not allow for curves of the form ${y=f(x)}$. But polar graphs ${\rho=f(\theta)}$ should work. (Aside: in ${(x,y)}$ the variable usually taken to be independent is listed first; in ${(\rho,\theta)}$ it is listed second. Is consistent notation too much to ask for? Yes, I know the answer…) Since ${f}$ must be ${2\pi}$-periodic, it is natural to interpolate ${(3,0),(1,\pi/2),(3,\pi),(1,3\pi/2)}$ by a trigonometric polynomial. The polynomial ${f(\theta)=2+\cos 2\theta}$ does the job. But does it do it well?

This is not what I would draw. My picture would be more like the ellipse with the given points as vertices:

The unnecessary concavity of the peanut graph comes from the second derivative ${f''}$ being too large at the points of minimum (and too small at the points of maximum). We need a function with flat minima and sharp maxima. Here is the comparison between ${f(\theta)=2+\cos 2\theta}$ (red) and the function that describes the ellipse in polar coordinates (blue):

I thought of pre-processing: apply some monotone function ${\varphi\colon (0,\infty)\rightarrow \mathbb R}$ to the ${\rho}$-values, interpolate, and then compose the interpolant with ${\varphi^{-1}}$. The function ${\varphi}$ should have a large derivative near ${0}$, so that its inverse has a small derivative there, flattening the minima.

The first two candidates ${\varphi(r)=\log r}$ and ${\varphi(r)=1/r}$ did not improve the picture enough. But when I tried ${\varphi(r)=1/r^2}$, the effect surprised me. Interpolation between ${(1/9,0),(1,\pi/2),(1/9,\pi),(1,3\pi/2)}$ yields ${(5-4\cos 2\theta)/9}$, which after application of ${\varphi^{-1}}$ produces ${\rho = 3/\sqrt{5-4\cos 2\theta}}$exactly the ellipse pictured above.

Next, I tried a less symmetric example: curve through ${(5,0);(0,2);(-3,0);(0,-2)}$.

Again, the second approach produces a more natural looking curve.

Finally, a curve with five-fold symmetry, with ${\rho}$ ranging from ${2}$ to ${5}$.

The interpolation and plotting were done in Scilab, by invoking the functions given below with commands polar1([3,1,3,1]) or polar2([2,5,2,5,2,5,2,5,2,5]), etc. Because I considered equally spaced interpolation nodes, the coefficients of interpolated polynomials are nothing but the (inverse) discrete Fourier transform of the given values. First function interpolates the radius ${\rho}$ itself.

function polar1(R)
clf()
p = fft(R,1)
t = 0:.01:2*%pi
rho = zeros(t)
for i = 1:length(p)
rho = rho + real(p(i)*exp(-%i*(i-1)*t))
end
polarplot(t,rho,style=5)
t = resize_matrix(linspace(0,2*%pi,length(R)+1),1,length(R))
plot(R.*cos(t), R.*sin(t), 'o')
endfunction

The second function includes preprocessing: it interpolates ${1/\rho^2}$ and then raises the interpolant to power ${-1/2}$.

function polar2(R)
clf()
p = fft(R^(-2),1)
t = 0:.01:2*%pi
rho = zeros(t)
for i = 1:length(p)
rho = rho + real(p(i)*exp(-%i*(i-1)*t))
end
rho = max(rho,0.01*ones(rho))
polarplot(t,rho^(-1/2),style=5)
t = resize_matrix(linspace(0,2*%pi,length(R)+1),1,length(R))
plot(R.*cos(t), R.*sin(t), 'o')
endfunction

Added: one more comparison, ${\rho^{1}}$ vs ${\rho^{-2}}$ vs ${\rho^{-3}}$; the last one loses convexity again.

And then to ${\rho^{-7}}$: