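Having considered the SMBC version of the Fourier transform, it is time to take a look at the traditional one:

$$\hat f(\xi) = \int_{\mathbb R} f(x)\, e^{-2\pi i \xi x}\,dx$$

(I am not going to worry about the convergence of any integrals in this post.) It is obvious that for any $\xi\in\mathbb R$

$$|\hat f(\xi)| \le \int_{\mathbb R} |f(x)|\,dx,$$

which can be tersely stated as $\|\hat f\|_\infty \le \|f\|_1$ using the $L^p$ norm notation. A less obvious, but more important, relation is $\|\hat f\|_2 = \|f\|_2$. Interpolating between $p=1$ and $p=2$ we obtain the Hausdorff-Young inequality $\|\hat f\|_q \le \|f\|_p$ for $1\le p\le 2$. Here and in what follows $q = p/(p-1)$.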
Summarizing the above, the function $r(p) = \|\hat f\|_q/\|f\|_p$ does not exceed $1$ on the interval $[1,2]$ and attains the value $1$ at $p=2$. This brings back the memories of Calculus I and the fun we had finding the absolute maximum of a function on a closed interval. More specifically, it brings the realization that $r'(2)\ge 0$. (I do not worry about differentiability either.)
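To see the ratio $r(p)=\|\hat f\|_q/\|f\|_p$ in action, here is a minimal sketch for one test function. Everything specific in it is my assumption, not part of the argument: the convention $\hat f(\xi)=\int f(x)e^{-2\pi i\xi x}\,dx$, the test function $f(x)=e^{-|x|}$ with $\hat f(\xi)=2/(1+4\pi^2\xi^2)$, and the closed forms $\|f\|_p^p = 2/p$ and $\|\hat f\|_q^q = 2^{q-1}\pi^{-1/2}\,\Gamma(q-\tfrac12)/\Gamma(q)$ that follow from it.

```python
import math

def hy_ratio(p):
    """r(p) = ||f^||_q / ||f||_p for the test function f(x) = exp(-|x|), p > 1."""
    q = p / (p - 1.0)                    # conjugate exponent
    norm_f = (2.0 / p) ** (1.0 / p)      # ||f||_p, since int exp(-p|x|) dx = 2/p
    # log of int |f^|^q = 2^(q-1) / sqrt(pi) * Gamma(q - 1/2) / Gamma(q)
    log_int = ((q - 1.0) * math.log(2.0) - 0.5 * math.log(math.pi)
               + math.lgamma(q - 0.5) - math.lgamma(q))
    return math.exp(log_int / q) / norm_f

# r(p) should stay at or below 1 on (1, 2] and reach 1 at p = 2
for p in (1.1, 1.25, 1.5, 1.75, 2.0):
    print(f"p = {p:4.2f}   r(p) = {hy_ratio(p):.6f}")
```

For this particular $f$ the ratio dips below $1$ inside the interval and comes back up to $1$ at $p=2$, which is the behavior the endpoint-maximum argument relies on.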
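What does the inequality $r'(2)\ge 0$ tell us about $f$? When writing it out, it is better to work with $\log r$, avoiding another memory of Calculus I: the Quotient Rule.

$$\log r(p) = \frac1q\log\int|\hat f|^q - \frac1p\log\int|f|^p$$

To differentiate this, we have to recall $\frac{d}{dp}a^p = a^p\log a$, but nothing more unpleasant happens. Normalizing $f$ so that $\|f\|_2=1$ (which makes the terms with $\log\int|f|^2$ and $\log\int|\hat f|^2$ vanish), we get

$$\frac{d}{dp}\log r(p)\,\Big|_{p=2} = -\frac12\int|\hat f|^2\log|\hat f| - \frac12\int|f|^2\log|f|.$$

Here the integral with $\hat f$ gets the minus sign from the chain rule: $q'=-1$ at $p=2$. Since $r(2)=1$, the left side is just $r'(2)$. In terms of the Shannon entropy $H(\phi) = -\int\phi\log\phi$, the right side is $\frac14\left(H(|f|^2)+H(|\hat f|^2)\right)$, and the inequality $r'(2)\ge 0$ becomes simply

$$H(|f|^2)+H(|\hat f|^2)\ge 0. \tag{1}$$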
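The identity $H(|f|^2)+H(|\hat f|^2) = 4r'(2)$ for $\|f\|_2=1$ can be checked numerically. The sketch below reuses my test function $f(x)=e^{-|x|}$ (an assumption, along with the $e^{-2\pi i\xi x}$ convention); the entropies are computed with a plain midpoint rule, and $r'(2)$ with a central difference, which works here because the closed form of $r$ extends smoothly past $p=2$.

```python
import math

def hy_ratio(p):
    """r(p) for f(x) = exp(-|x|), using closed forms of both norms."""
    q = p / (p - 1.0)
    norm_f = (2.0 / p) ** (1.0 / p)
    log_int = ((q - 1.0) * math.log(2.0) - 0.5 * math.log(math.pi)
               + math.lgamma(q - 0.5) - math.lgamma(q))
    return math.exp(log_int / q) / norm_f

def midpoint(func, a, b, n=100_000):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def neg_t_log_t(t):
    """-t log t, with the usual convention 0 log 0 = 0."""
    return -t * math.log(t) if t > 0.0 else 0.0

# H(|f|^2) with |f(x)|^2 = exp(-2|x|); the tail beyond |x| = 40 is negligible
entropy_f = 2.0 * midpoint(lambda x: neg_t_log_t(math.exp(-2.0 * x)), 0.0, 40.0)

# H(|f^|^2) with |f^|^2 = 4 / (1 + 4 pi^2 xi^2)^2; substitute 2 pi xi = tan(theta),
# so that |f^|^2 = 4 cos^4(theta) and d(xi) = d(theta) / (2 pi cos^2(theta))
def fhat_entropy_integrand(theta):
    c2 = math.cos(theta) ** 2
    return neg_t_log_t(4.0 * c2 * c2) / (2.0 * math.pi * c2)

entropy_fhat = 2.0 * midpoint(fhat_entropy_integrand, 0.0, math.pi / 2.0)

# r'(2) by central difference
h = 1e-5
slope = (hy_ratio(2.0 + h) - hy_ratio(2.0 - h)) / (2.0 * h)

print("H(|f|^2) + H(|f^|^2) =", entropy_f + entropy_fhat)
print("4 r'(2)              =", 4.0 * slope)
```

The two printed numbers should agree to several digits, and both should be nonnegative as (1) requires.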
Inequality (1) was proved by I. Hirschman in 1957, and I followed his proof above. The left side of (1) is known as the entropic uncertainty (or Hirschman uncertainty) of $f$. As Hirschman himself conjectured, (1) is not sharp: it can be improved to

$$H(|f|^2)+H(|\hat f|^2)\ge 1-\log 2. \tag{2}$$

The reason is that the Hausdorff-Young inequality is itself not sharp for $1<p<2$. It took about twenty years until W. Beckner proved the sharp form of the Hausdorff-Young inequality in his Ph.D. thesis (1975):

$$r(p)\le\sqrt{p^{1/p}/q^{1/q}}. \tag{3}$$

Here is the plot of the upper bound in (3):

[Plot: the upper bound $\sqrt{p^{1/p}/q^{1/q}}$ as a function of $p$ on the interval $[1,2]$]

Since the graph of $r$ stays below this curve and touches it at $(2,1)$, the derivative $r'(2)$ is no less than the slope of the curve at $(2,1)$, which is $\frac14(1-\log 2)$. Recalling that $H(|f|^2)+H(|\hat f|^2) = 4r'(2)$, we arrive at (2).
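The slope computation is easy to double-check by machine. The sketch below assumes Beckner's bound in the form $\sqrt{p^{1/p}/q^{1/q}}$ with $q=p/(p-1)$, tabulates it on $(1,2]$, and estimates its slope at $p=2$ by a central difference. As an extra, it evaluates the entropy of $|f|^2$ for the Gaussian $f(x)=2^{1/4}e^{-\pi x^2}$, which is its own Fourier transform under the $e^{-2\pi i\xi x}$ convention (the choice of this test function is mine).

```python
import math

def beckner_bound(p):
    """Upper bound sqrt(p^(1/p) / q^(1/q)) with q = p/(p-1), for p > 1."""
    q = p / (p - 1.0)
    return math.sqrt(p ** (1.0 / p) / q ** (1.0 / q))

# a few values of the bound, in place of a plot
for p in (1.1, 1.25, 1.5, 1.75, 2.0):
    print(f"p = {p:4.2f}   bound = {beckner_bound(p):.6f}")

# slope at p = 2 versus (1 - log 2)/4
h = 1e-5
slope = (beckner_bound(2.0 + h) - beckner_bound(2.0 - h)) / (2.0 * h)
print("slope at p = 2:", slope, "  (1 - log 2)/4:", (1.0 - math.log(2.0)) / 4.0)

def gaussian_density(x):
    """|f(x)|^2 for f(x) = 2^(1/4) exp(-pi x^2); integrates to 1."""
    return math.sqrt(2.0) * math.exp(-2.0 * math.pi * x * x)

def entropy(density, a, b, n=20_000):
    """Midpoint-rule approximation of -int density * log(density) over [a, b]."""
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        value = density(a + (k + 0.5) * step)
        if value > 0.0:
            total -= value * math.log(value)
    return step * total

# f^ = f for this Gaussian, so the entropic uncertainty is twice H(|f|^2)
gaussian_uncertainty = 2.0 * entropy(gaussian_density, -6.0, 6.0)
print("Gaussian:", gaussian_uncertainty, "  1 - log 2:", 1.0 - math.log(2.0))
```

If the numbers match, the Gaussian turns (2) into an equality, which is consistent with (2) being the sharp form.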
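The best known form of the uncertainty principle is due to H. Weyl:

$$\int_{\mathbb R}x^2|f(x)|^2\,dx\cdot\int_{\mathbb R}\xi^2|\hat f(\xi)|^2\,d\xi \ge \frac{1}{16\pi^2}\,\|f\|_2^4. \tag{4}$$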
Although (4) can be derived from (2), this route is rather inefficient: Beckner’s theorem is hard, while a direct proof of (4) takes only a few lines: integration by parts $\int|f|^2 = -\int x\,\frac{d}{dx}|f|^2$, chain rule $\frac{d}{dx}|f|^2 = 2|f|\,\frac{d}{dx}|f|$, and the Cauchy-Schwarz inequality.
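Weyl's inequality is also easy to test numerically. The sketch below assumes it in the form $\int x^2|f|^2\cdot\int\xi^2|\hat f|^2\ge\|f\|_2^4/(16\pi^2)$ with the $e^{-2\pi i\xi x}$ convention, and tries two unit-norm test functions of my choosing: $f(x)=e^{-|x|}$ with $\hat f(\xi)=2/(1+4\pi^2\xi^2)$, and the self-dual Gaussian $f(x)=2^{1/4}e^{-\pi x^2}$. Integrals over the line are mapped to $(-\pi/2,\pi/2)$ by $x=\tan\theta$ and handled by the midpoint rule.

```python
import math

def integrate_line(func, n=100_000):
    """Approximate the integral of func over the real line via x = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        total += func(math.tan(theta)) / math.cos(theta) ** 2
    return h * total

def second_moment(density):
    """Integral of t^2 * density(t) over the real line."""
    return integrate_line(lambda t: t * t * density(t))

# |f|^2 and |f^|^2 for f(x) = exp(-|x|)
exp_sq = lambda x: math.exp(-2.0 * abs(x))
exp_hat_sq = lambda xi: (2.0 / (1.0 + 4.0 * math.pi ** 2 * xi * xi)) ** 2

# |f|^2 = |f^|^2 for the self-dual Gaussian f(x) = 2^(1/4) exp(-pi x^2)
gauss_sq = lambda x: math.sqrt(2.0) * math.exp(-2.0 * math.pi * x * x)

weyl_bound = 1.0 / (16.0 * math.pi ** 2)
product_exp = second_moment(exp_sq) * second_moment(exp_hat_sq)
product_gauss = second_moment(gauss_sq) ** 2

print("lower bound      :", weyl_bound)
print("f = exp(-|x|)    :", product_exp)
print("f = Gaussian     :", product_gauss)
```

The exponential should come out at about twice the bound, while the Gaussian should sit right on it.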
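But we can take another direction and use (1) (not the hard, sharp form (2)) to obtain the following inequality, also due to Hirschman: for every $\alpha>0$ there is $C_\alpha>0$ such that

$$\int_{\mathbb R}|x|^\alpha|f(x)|^2\,dx\cdot\int_{\mathbb R}|\xi|^\alpha|\hat f(\xi)|^2\,d\xi\ge C_\alpha\,\|f\|_2^4. \tag{5}$$

It is convenient to normalize $f$ so that $\|f\|_2=1$. This makes $|f|^2$ a probability distribution (and $|\hat f|^2$ as well). Our goal is to show that for any probability distribution $\phi$

$$H(\phi)\le\frac1\alpha\log\int_{\mathbb R}|x|^\alpha\phi(x)\,dx + B_\alpha, \tag{6}$$

where $B_\alpha$ depends only on $\alpha$. Clearly, (1) and (6) imply (5): applying (6) to $|f|^2$ and to $|\hat f|^2$ and adding the results, we get from (1) that the logarithm of the left side of (5) is at least $-2\alpha B_\alpha$, so $C_\alpha = e^{-2\alpha B_\alpha}$ works.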
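A peculiar feature of (6) is that $x$ appears in the integral on the right, but not on the left. This naturally makes one wonder how (6) behaves under scaling $\phi_\lambda(x) = \lambda\phi(\lambda x)$. Well, wonder no more:

$$H(\phi_\lambda) = -\int\lambda\phi(\lambda x)\log\big(\lambda\phi(\lambda x)\big)\,dx = H(\phi)-\log\lambda$$

and

$$\int|x|^\alpha\phi_\lambda(x)\,dx = \lambda^{-\alpha}\int|x|^\alpha\phi(x)\,dx.$$

Thus, both sides of (6) change by $-\log\lambda$. The inequality passed the scaling test, and now we turn scaling to our advantage by making $\int|x|^\alpha\phi(x)\,dx=1$. This reduces (6) to $H(\phi)\le B_\alpha$.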
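The scaling test can be replayed numerically. The sketch below assumes the scaling $\phi_\lambda(x)=\lambda\phi(\lambda x)$ and uses, purely as an illustration, the Laplace density $\phi(x)=\tfrac12e^{-|x|}$ with $\lambda=3$ and $\alpha=1.5$. It checks that both $H(\phi)$ and $\frac1\alpha\log\int|x|^\alpha\phi$ shift by $-\log\lambda$.

```python
import math

def integrate_line(func, n=100_000):
    """Approximate the integral of func over the real line via x = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        total += func(math.tan(theta)) / math.cos(theta) ** 2
    return h * total

def entropy(density):
    """Shannon entropy -int density * log(density), with 0 log 0 = 0."""
    def integrand(x):
        value = density(x)
        return -value * math.log(value) if value > 0.0 else 0.0
    return integrate_line(integrand)

def log_moment(density, alpha):
    """(1/alpha) * log of the integral of |x|^alpha * density(x)."""
    moment = integrate_line(lambda x: abs(x) ** alpha * density(x))
    return math.log(moment) / alpha

def laplace(x):
    """Illustrative test density (1/2) exp(-|x|)."""
    return 0.5 * math.exp(-abs(x))

scale, alpha = 3.0, 1.5

def scaled(x):
    """phi_lambda(x) = lambda * phi(lambda * x)."""
    return scale * laplace(scale * x)

entropy_shift = entropy(scaled) - entropy(laplace)
moment_shift = log_moment(scaled, alpha) - log_moment(laplace, alpha)
print("entropy shift:", entropy_shift)
print("moment shift :", moment_shift)
print("-log(lambda) :", -math.log(scale))
```

All three printed numbers should coincide up to quadrature error, so choosing $\lambda$ to make the moment equal to $1$ costs nothing.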
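Now comes a clever trick (due to Beckner): introduce another probability measure $d\gamma = A\exp(-|x|^\alpha)\,dx$ where $A$ is a normalizing factor. Let $g(x) = A^{-1}\exp(|x|^\alpha)\,\phi(x)$, so that $\int g\,d\gamma = \int\phi\,dx = 1$. By Jensen’s inequality (applied to the convex function $t\log t$),

$$\int g\log g\,d\gamma\ge\left(\int g\,d\gamma\right)\log\left(\int g\,d\gamma\right)=0.$$

On the other hand,

$$\int g\log g\,d\gamma = \int\phi\log g\,dx = -H(\phi)+\int|x|^\alpha\phi\,dx-\log A = -H(\phi)+1-\log A,$$

and we have the desired bound $H(\phi)\le 1-\log A$, that is, $B_\alpha = 1-\log A$.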
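To put numbers on the constants, note that $\int e^{-|x|^\alpha}dx = 2\Gamma(1+1/\alpha)$, so $A = 1/(2\Gamma(1+1/\alpha))$. The sketch below takes the bound in the form $H(\phi)\le\frac1\alpha\log\int|x|^\alpha\phi+B_\alpha$ with $B_\alpha=1-\log A$, and the resulting constant $C_\alpha=e^{-2\alpha B_\alpha}$ for the moment inequality. The two test densities (standard normal and Laplace) and their closed-form entropies and moments are my choice of illustration.

```python
import math

def b_alpha(alpha):
    """B_alpha = 1 - log A with A = 1 / (2 * Gamma(1 + 1/alpha))."""
    return 1.0 + math.log(2.0) + math.lgamma(1.0 + 1.0 / alpha)

def c_alpha(alpha):
    """Constant exp(-2 * alpha * B_alpha) obtained by combining the two bounds."""
    return math.exp(-2.0 * alpha * b_alpha(alpha))

def slack_normal(alpha):
    """Right side minus left side of the entropy bound for the standard normal."""
    entropy = 0.5 * math.log(2.0 * math.pi * math.e)
    # E|X|^alpha = 2^(alpha/2) * Gamma((alpha + 1)/2) / sqrt(pi)
    log_moment = (0.5 * alpha * math.log(2.0) + math.lgamma(0.5 * (alpha + 1.0))
                  - 0.5 * math.log(math.pi))
    return log_moment / alpha + b_alpha(alpha) - entropy

def slack_laplace(alpha):
    """Same slack for the Laplace density (1/2) exp(-|x|)."""
    entropy = 1.0 + math.log(2.0)
    log_moment = math.lgamma(alpha + 1.0)   # E|X|^alpha = Gamma(alpha + 1)
    return log_moment / alpha + b_alpha(alpha) - entropy

for alpha in (0.5, 1.0, 2.0, 4.0):
    print(f"alpha = {alpha:3.1f}   B = {b_alpha(alpha):.4f}   C = {c_alpha(alpha):.3e}   "
          f"slack(normal) = {slack_normal(alpha):.4f}   "
          f"slack(Laplace) = {slack_laplace(alpha):.4f}")
```

The slack should never be negative. For $\alpha=2$ the constant $C_2$ comes out smaller than the $1/(16\pi^2)$ in Weyl's inequality, which is expected: the argument starts from (1) rather than from the sharp form (2).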
Halmos photographed analyst Isidore Hirschman (1922-1990) in June of 1960. Hirschman earned his Ph.D. in 1947 from Harvard with the dissertation “Some Representation and Inversion Problems for the Laplace Transform,” written under David Widder. After writing ten papers together, Hirschman and Widder published the book The Convolution Transform in 1955 (Princeton University Press; now available from Dover Publications). Hirschman spent most of his career (1949-1978) at Washington University in St. Louis, Missouri, where he published mainly in harmonic analysis and operator theory.
- Beckner, William. “Inequalities in Fourier analysis.” Ann. of Math. (2) 102.1 (1975): 159-182.
- Cowling, Michael G., and John F. Price. “Bandwidth versus time concentration: the Heisenberg-Pauli-Weyl inequality.” SIAM Journal on Mathematical Analysis 15.1 (1984): 151-165.
- Folland, Gerald B., and Alladi Sitaram. “The uncertainty principle: a mathematical survey.” Journal of Fourier Analysis and Applications 3.3 (1997): 207-238.
- Hirschman Jr, I. I. “A note on entropy.” American Journal of Mathematics 79.1 (1957): 152-156.