Discrete Cosine Transforms: Trapezoidal vs Midpoint

The existence of multiple versions of Discrete Cosine Transform (DCT) can be confusing. Wikipedia explains that its 8 types are determined by how one reflects across the boundaries. E.g., one can reflect 1, 2, 3 across the left boundary as 3, 2, 1, 2, 3 or as 3, 2, 1, 1, 2, 3, and there are such choices for the other boundary too (also, the other reflection can be odd or even). Makes sense enough.

But there is another aspect to the two most used forms, DCT-I and DCT-II (types I and II): they can be expressed in terms of the Trapezoidal and Midpoint rules for integration. Here is how.

The cosines {\cos kx}, {k=0,1,\dots} are orthogonal on the interval {[0,\pi]} with respect to the Lebesgue measure. The basis of discrete transforms is that these cosines are also orthogonal with respect to some discrete measures, until certain frequency is reached. Indeed, if cosines are orthogonal with respect to measure {\mu} whose support consists of {n} points, then we can efficiently use them to represent any function defined on the support of {\mu}, and those are naturally identified with sequences of length n.

How to find such measures? It helps that some simple rules of numerical integration are exact for trigonometric polynomials up to some degree.

For example, the trapezoidal rule with n sample points exactly integrates the functions {\cos kx} for {k=0,\dots, 2n-3}. This could be checked by converting to exponential form and summing geometric progressions, but here is a visual explanation with n=4, where the interval {[0,\pi]} is represented as upper semi-circle. The radius of each red circle indicates the weight placed at that point; the endpoints get 1/2 of the weight of other sample points. To integrate {\cos x} correctly, we must have the x-coordinate of the center of mass equal to zero, which is obviously the case.

Trapezoidal rule with 4 points

Replacing {\cos x} by {\cos kx} means multiplying the polar angle of each sample point by {k}. This is what we get:

Argument of cosine multiplied by k=2 or k=4
Argument multiplied by k=3
Argument multiplied by k=5

In all cases the x-coordinate of the center of mass is zero. With k=6 this breaks down, as all the weight gets placed in one point. And this is how it goes in general with integration of sines and cosines: equally spaced points work perfectly until they don’t work at all, which happens when the step size is equal to the period of the function. When {k=2n-2}, the period {2\pi/k} is equal to {\pi/(n-1)}, the spacing of points in the trapezoidal rule.

The orthogonality of cosines has to do with the formula {\cos kx \cos j x = \frac12(\cos (k-j)x) + \frac12(\cos (k+j)x)}. Let {\tau_n} be the measure expressing the trapezoidal rule on {[0,\pi]} with {n} sample points; so it’s the sum of point masses at {0, \pi/(n-1), \dots , \pi}. Then {\{\cos k x \colon k=0,\dots, n-1\}} are orthogonal with respect to {\tau_n} because any product {\cos k x\cos j x } with {k >j} taken from this range will have {k-j, k+j \le 2n-3}. Consequently, we can compute the coefficients of any function {f} in the cosine basis as

{\displaystyle c_k = \int f(x)\cos kx\,d\tau_n(x) \bigg/ \int \cos^2 kx\,d\tau_n(x)}

The above is what DCT-I (discrete cosine transform of type 1) does, up to normalization.

The DCT-II transform uses the Midpoint rule instead of the Trapezoidal rule. Let {\mu_n} be the measure expressing the Midpoint rule on {[0,\pi]} with {n} sample points; it gives equal mass to the points {(2j-1)\pi/(2n)} for {j=1,\dots, n}. These are spaced at {\pi/n} and therefore the midpoint rule is exact for {\cos kx} with {k=0,\dots, 2n-1} which is better than what the trapezoidal rule does. Perhaps more significantly, by identifying the given data points with function values at the midpoints of subintervals we stay away from the endpoints {0,\pi} where the cosines are somewhat restricted by having to have zero slope.

Let’s compare DCT-I and DCT-II on the same data set, {y=(0, \sqrt{1}, \sqrt{2}, \dots, \sqrt{8})}. There are 9 numbers here. Following DCT-I we place them at the sample points of the trapezoidal rule, and expand into cosines using the inner product with respect to {\tau_n}. Here is the plot of the resulting trigonometric polynomial: of course it interpolates the data.

DCT-I interpolation

But DCT-II does it better, despite having exactly the same cosine functions. The only change is that we use {\mu_n} and so place the {y}-values along its support.

DCT-II interpolation

Less oscillation means the high-degree coefficients are smaller, and therefore easier to discard in order to compress information. For example, drop the last two coefficients in each expansion, keeping 6 numbers instead of 8. DCT-II clearly wins in accuracy then.

Truncated DCT-I
Truncated DCT-II

Okay, so the Midpoint rule is better, no surprise. After all, it’s in general about twice as accurate as the Trapezoidal rule. What about Simpson’s rule, would it lead to some super-efficient form of DCT? That is, why don’t we let {\sigma_n} be the discrete measure that expresses Simpson’s rule and use the inner product {\int fg\,d\sigma_n} for cosine expansion? Alas, Simpson’s rule on {n} points is exact only for {\cos kx} with {k=0,\dots, n-2}, which is substantially worse than either Trapezoidal or Midpoint rules. As a result, we don’t get enough orthogonal cosines with respect to {\sigma_n} to have an orthogonal basis. Simpson’s rule has an advantage when dealing with algebraic polynomials, not with trigonometric ones.

Finally, the Python code used for the graphics; I did not use SciPy’s DCT method (which is of course more efficient) to keep the relation to numerical integration explicit in the code. The method trapz implements the trapezoidal rule, and the midpoint rule is just the summation of sampled values. In both cases there is no need to worry about factor dx, since it cancels out when we divide one numerical integral by the other.

import numpy as np
import matplotlib.pyplot as plt
#   Setup 
y = np.sqrt(np.arange(9))
c = np.zeros_like(y)
n = y.size
#   DCT-I, trapezoidal
x = np.arange(n)*np.pi/(n-1)
for k in range(n):
  c[k] = np.trapz(y*np.cos(k*x))/np.trapz(np.cos(k*x)**2)
t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(9)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)
#   DCT-II, midpoint
x = np.arange(n)*np.pi/n + np.pi/(2*n)
for k in range(n):
  c[k] = np.sum(y*np.cos(k*x))/np.sum(np.cos(k*x)**2)
t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(9)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)