The existence of multiple versions of the Discrete Cosine Transform (DCT) can be confusing. Wikipedia explains that its 8 types are determined by how one reflects the data across the boundaries. E.g., one can reflect 1, 2, 3 across the left boundary as 3, 2, 1, 2, 3 or as 3, 2, 1, 1, 2, 3, and there are such choices for the other boundary too (also, the other reflection can be odd or even). Makes sense enough.
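The two reflections in the example can be reproduced with NumPy's padding modes. This is only an illustration of the two boundary conventions, not part of any DCT implementation; the variable names are made up for the example.

```python
import numpy as np

a = np.array([1, 2, 3])

# Reflection that does not repeat the boundary value 1
no_repeat = np.pad(a, (2, 0), mode="reflect")

# Reflection that repeats the boundary value 1
with_repeat = np.pad(a, (3, 0), mode="symmetric")

print(no_repeat)    # [3 2 1 2 3]
print(with_repeat)  # [3 2 1 1 2 3]
```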
But there is another aspect to the two most used forms, DCT-I and DCT-II (types I and II): they can be expressed in terms of the Trapezoidal and Midpoint rules for integration. Here is how.
The cosines $\cos kx$, $k = 0, 1, \dots$, are orthogonal on the interval $[0, \pi]$ with respect to the Lebesgue measure. The basis of discrete transforms is that these cosines are also orthogonal with respect to some discrete measures, until a certain frequency is reached. Indeed, if $n$ cosines are orthogonal with respect to a measure $\mu$ whose support consists of $n$ points, then we can efficiently use them to represent any function defined on the support of $\mu$, and those are naturally identified with sequences of length $n$.
How to find such measures? It helps that some simple rules of numerical integration are exact for trigonometric polynomials up to some degree.
For example, the trapezoidal rule with $n$ sample points exactly integrates the functions $\cos kx$ for $k = 0, \dots, 2n-3$. This could be checked by converting to exponential form and summing geometric progressions, but here is a visual explanation with $n=4$, where the interval $[0, \pi]$ is represented as the upper semicircle. The radius of each red circle indicates the weight placed at that point; the endpoints get 1/2 of the weight of other sample points. To integrate $\cos x$ correctly, we must have the x-coordinate of the center of mass equal to zero, which is obviously the case.
Replacing $\cos x$ by $\cos kx$ means multiplying the polar angle of each sample point by $k$. This is what we get:
In all cases the x-coordinate of the center of mass is zero. With $k=6$ this breaks down, as all the weight gets placed in one point. And this is how it goes in general with integration of sines and cosines: equally spaced points work perfectly until they don’t work at all, which happens when the step size is equal to the period of the function. When $k = 2n-2$, the period $2\pi/k$ is equal to $\pi/(n-1)$, the spacing of points in the trapezoidal rule.
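The same claim can be checked numerically. The sketch below assumes the trapezoidal rule on $[0, \pi]$ with $n$ equally spaced points and half weight at the endpoints; the helper names `trapezoid_cos` and `exact_cos` are made up for this example.

```python
import numpy as np

def trapezoid_cos(k, n):
    """Trapezoidal-rule value of the integral of cos(kx) over [0, pi], n points."""
    x = np.linspace(0.0, np.pi, n)
    w = np.full(n, np.pi / (n - 1))   # step size as the weight of each point
    w[0] /= 2                          # endpoints get half the weight
    w[-1] /= 2
    return float(np.sum(w * np.cos(k * x)))

def exact_cos(k):
    """Exact integral of cos(kx) over [0, pi] for integer k >= 0."""
    return np.pi if k == 0 else 0.0

n = 4
# Check k = 0, ..., 2n-2; the last one is where the rule is expected to fail
errors = [abs(trapezoid_cos(k, n) - exact_cos(k)) for k in range(2 * n - 1)]
for k, err in enumerate(errors):
    print(f"k={k}: error {err:.2e}")
```

With $n = 4$ the errors for $k = 0, \dots, 5$ should be at rounding level, while for $k = 6$ every sample of $\cos 6x$ equals 1 and the rule returns $\pi$ instead of 0.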
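The orthogonality of cosines has to do with the formula $\cos kx \cos jx = \frac12 \cos (k-j)x + \frac12 \cos (k+j)x$. Let $\tau_n$ be the measure expressing the trapezoidal rule on $[0, \pi]$ with $n$ sample points; so it’s the sum of point masses at $0, \pi/(n-1), \dots, \pi$, with half weight at the two endpoints. Then $\cos kx$, $k = 0, \dots, n-1$, are orthogonal with respect to $\tau_n$ because any product $\cos kx \cos jx$ with $k > j$ taken from this range will have $1 \le k-j \le k+j \le 2n-3$, so both terms on the right are integrated exactly, to zero. Consequently, we can compute the coefficients of any function $f$ in the cosine basis as

$$c_k = \frac{\int f(x) \cos kx \, d\tau_n(x)}{\int \cos^2 kx \, d\tau_n(x)}.$$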
The above is what DCT-I (discrete cosine transform of type 1) does, up to normalization.
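Here is a small sketch of that computation in matrix form, assuming the trapezoidal weights described above (with the common factor dx dropped, since it cancels in the ratio). The names `C`, `w`, and `gram` are illustrative. It checks that the matrix of inner products is diagonal and that the expansion reproduces the data at the sample points.

```python
import numpy as np

n = 9
x = np.arange(n) * np.pi / (n - 1)       # sample points of the trapezoidal rule
w = np.ones(n)
w[[0, -1]] = 0.5                          # half weight at the endpoints

C = np.cos(np.outer(np.arange(n), x))     # C[k, j] = cos(k * x_j)
gram = (C * w) @ C.T                      # inner products of cosines under the weights

off_diag = gram - np.diag(np.diag(gram))
print(np.max(np.abs(off_diag)))           # should be at rounding level

y = np.sqrt(np.arange(n))                 # sample data
c = (C * w) @ y / np.diag(gram)           # coefficients c_k from the ratio of sums
recon = c @ C                             # expansion evaluated at the sample points
print(np.max(np.abs(recon - y)))          # should also be at rounding level
```

If SciPy is installed, `scipy.fft.dct(y, type=1)` with its default normalization should match `2 * (C * w) @ y`, which is one way to see the "up to normalization" remark.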
The DCT-II transform uses the Midpoint rule instead of the Trapezoidal rule. Let $\mu_n$ be the measure expressing the Midpoint rule on $[0, \pi]$ with $n$ sample points; it gives equal mass to the points $(2j-1)\pi/(2n)$ for $j = 1, \dots, n$. These are spaced at $\pi/n$ and therefore the midpoint rule is exact for $\cos kx$ with $k = 0, \dots, 2n-1$, which is better than what the trapezoidal rule does. Perhaps more significantly, by identifying the given data points with function values at the midpoints of subintervals we stay away from the endpoints, where the cosines are somewhat restricted by having to have zero slope.
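The exactness range of the midpoint rule can be checked the same way. This sketch assumes $n$ equal subintervals of $[0, \pi]$ with one sample at the center of each; `midpoint_cos` is a made-up helper name.

```python
import numpy as np

def midpoint_cos(k, n):
    """Midpoint-rule value of the integral of cos(kx) over [0, pi], n points."""
    x = (np.arange(n) + 0.5) * np.pi / n   # midpoints of n equal subintervals
    return float(np.sum(np.cos(k * x)) * np.pi / n)

n = 4
# k = 0, ..., 2n; the exact integral is pi for k = 0 and 0 otherwise
values = [midpoint_cos(k, n) for k in range(2 * n + 1)]
for k, v in enumerate(values):
    print(f"k={k}: {v:+.6f}")
```

For $n = 4$ the rule should return $\pi$ at $k = 0$ and zero for $k = 1, \dots, 7$. At $k = 8$ every sample of $\cos 8x$ equals $-1$, so the rule returns $-\pi$ instead of 0.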
Let’s compare DCT-I and DCT-II on the same data set, $y = (0, \sqrt{1}, \sqrt{2}, \dots, \sqrt{8})$. There are 9 numbers here. Following DCT-I we place them at the sample points of the trapezoidal rule, and expand into cosines using the inner product with respect to $\tau_9$. Here is the plot of the resulting trigonometric polynomial: of course it interpolates the data.
But DCT-II does it better, despite having exactly the same cosine functions. The only change is that we use $\mu_9$ and so place the $y$-values along its support.
Less oscillation means the high-degree coefficients are smaller, and therefore easier to discard in order to compress information. For example, drop the last two coefficients in each expansion, keeping 7 numbers instead of 9. DCT-II clearly wins in accuracy then.
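The truncation experiment can be sketched as follows. It drops the last two of the nine coefficients in each expansion and measures the largest deviation from the data at the sample points. That error measure is my choice for illustration and is not necessarily how the comparison above was judged, and the helper `cosine_expansion` is a made-up name.

```python
import numpy as np

def cosine_expansion(y, x, w):
    """Return coefficients c and the matrix C[k, j] = cos(k * x_j)."""
    k = np.arange(y.size)
    C = np.cos(np.outer(k, x))
    c = (C * w) @ y / ((C ** 2) @ w)   # ratio of weighted sums; dx cancels
    return c, C

y = np.sqrt(np.arange(9))
n = y.size
keep = n - 2                            # drop the last two coefficients

# DCT-I: trapezoidal points, half weight at the endpoints
x_trap = np.arange(n) * np.pi / (n - 1)
w_trap = np.ones(n)
w_trap[[0, -1]] = 0.5

# DCT-II: midpoints, equal weights
x_mid = (np.arange(n) + 0.5) * np.pi / n
w_mid = np.ones(n)

errors = {}
for name, x, w in [("DCT-I", x_trap, w_trap), ("DCT-II", x_mid, w_mid)]:
    c, C = cosine_expansion(y, x, w)
    assert np.allclose(c @ C, y)        # all n coefficients reproduce the data
    truncated = c[:keep] @ C[:keep]     # expansion without the last two terms
    errors[name] = float(np.max(np.abs(truncated - y)))
    print(name, errors[name])
```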
Okay, so the Midpoint rule is better, no surprise. After all, it’s in general about twice as accurate as the Trapezoidal rule. What about Simpson’s rule, would it lead to some super-efficient form of DCT? That is, why don’t we let $\sigma_n$ be the discrete measure that expresses Simpson’s rule and use the inner product $\int fg \, d\sigma_n$ for cosine expansion? Alas, Simpson’s rule on $n$ points is exact only for $\cos kx$ with $k = 0, \dots, n-2$, which is substantially worse than either Trapezoidal or Midpoint rules. As a result, we don’t get enough orthogonal cosines with respect to $\sigma_n$ to have an orthogonal basis. Simpson’s rule has an advantage when dealing with algebraic polynomials, not with trigonometric ones.
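A quick numerical check of the Simpson claim, assuming the composite Simpson's rule with an odd number $n$ of equally spaced points on $[0, \pi]$ (weights 1, 4, 2, 4, ..., 4, 1 times a third of the step). The helper name `simpson_cos` is made up for this sketch.

```python
import numpy as np

def simpson_cos(k, n):
    """Composite Simpson value of the integral of cos(kx) over [0, pi], n odd."""
    if n < 3 or n % 2 == 0:
        raise ValueError("composite Simpson's rule needs an odd n >= 3")
    x = np.linspace(0.0, np.pi, n)
    w = np.ones(n)
    w[1:-1:2] = 4.0                     # odd-indexed interior points
    w[2:-1:2] = 2.0                     # even-indexed interior points
    h = np.pi / (n - 1)
    return float(h / 3 * np.sum(w * np.cos(k * x)))

n = 9
# k = 0, ..., n-1; the exact integral is pi for k = 0 and 0 otherwise
values = [simpson_cos(k, n) for k in range(n)]
for k, v in enumerate(values):
    print(f"k={k}: {v:+.6f}")
```

For $n = 9$ the rule should be exact up to $k = 7$. At $k = 8$ the samples of $\cos 8x$ alternate between $1$ and $-1$, the weighted sum is $-8$, and the rule returns $-\pi/3$ instead of 0.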
Finally, here is the Python code used for the graphics. I did not use SciPy’s DCT method (which is of course more efficient), in order to keep the relation to numerical integration explicit in the code. The function trapz (np.trapz, renamed np.trapezoid in NumPy 2.0) implements the trapezoidal rule, and the midpoint rule is just the summation of sampled values. In both cases there is no need to worry about the factor dx, since it cancels out when we divide one numerical integral by the other.
```python
import numpy as np
import matplotlib.pyplot as plt

# np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever is available
trapz = getattr(np, "trapezoid", None) or np.trapz

# Setup
y = np.sqrt(np.arange(9))
c = np.zeros_like(y)
n = y.size

# DCT-I, trapezoidal: sample points include both endpoints of [0, pi]
x = np.arange(n)*np.pi/(n-1)
for k in range(n):
    c[k] = trapz(y*np.cos(k*x))/trapz(np.cos(k*x)**2)

# Evaluate the cosine expansion on a fine grid and plot it with the data
t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(n)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)
plt.show()

# DCT-II, midpoint: sample points are the midpoints of n equal subintervals
x = np.arange(n)*np.pi/n + np.pi/(2*n)
for k in range(n):
    c[k] = np.sum(y*np.cos(k*x))/np.sum(np.cos(k*x)**2)

t = np.linspace(0, np.pi, 500)
yy = np.sum(c*np.cos(np.arange(n)*t.reshape(-1, 1)), axis=1)
plt.plot(x, y, 'ro')
plt.plot(t, yy)
plt.show()
```