Nonlinear Closed Graph Theorem

If a function {f\colon {\mathbb R}\rightarrow {\mathbb R}} is continuous, then its graph {G_f = \{(x,f(x))\colon x\in{\mathbb R}\}} is closed. The converse is false: a counterexample is given by any extension of {y=\tan x} to the real line.

Closed graph, not continuous

The Closed Graph Theorem of functional analysis states that a linear map between Banach spaces is continuous whenever its graph is closed. Although the literal extension to nonlinear maps fails, it’s worth noting that linear maps are either continuous or discontinuous everywhere. Hence, if one could show that a nonlinear map with a closed graph has at least one point of continuity, this would be a nonlinear version of the Closed Graph Theorem.

Here is an example of a function with a closed graph and an uncountable set of discontinuities. Let {C\subset {\mathbb R}} be a closed set with empty interior (a Cantor set, say, to make it uncountable), and define

\displaystyle    f(x) = \begin{cases} 0,\quad & x\in C \\ \textrm{dist}\,(x,C)^{-1},\quad & x\notin C \end{cases}
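As a quick sanity check, here is a sketch of this construction with the concrete (if only countable) choice {C=\mathbb Z}, which is closed with empty interior; the values below illustrate that {f} vanishes on {C} but blows up near each of its points, so it is discontinuous exactly on {C}.

```python
def dist_to_C(x):
    """Distance from x to C = Z (the integers), a closed set with empty interior."""
    return abs(x - round(x))

def f(x):
    """0 on C, reciprocal distance off C; such a function has a closed graph."""
    d = dist_to_C(x)
    return 0.0 if d == 0 else 1.0 / d

# f is zero on C yet unbounded in every neighborhood of a point of C:
values = [f(3), f(0.5), f(1.25), f(1 + 1e-9)]
```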

For a general function, the set of discontinuities is an {F_\sigma} set. When the graph is closed, we can say more: the set of discontinuities is closed. Indeed, suppose that a function {f} is bounded in a neighborhood of {a} but is not continuous at {a}. Then there are two sequences {x_n\rightarrow a} and {y_n\rightarrow a} such that both sequences {f(x_n)} and {f(y_n)} converge but have different limits. Since at least one of these limits must be different from {f(a)}, the graph of {f} is not closed. Conclusion: a function with a closed graph is continuous at {a} if and only if it is bounded in a neighborhood of {a}. In particular, the set of discontinuities is closed.

Furthermore, the set of discontinuities has empty interior. Indeed, suppose that {f} is discontinuous at every point of a nontrivial closed interval {[a,b]}. Let {A_n = G_f \cap ([a,b]\times [-n,n])}; this is a closed bounded set, hence compact. Its projection onto the {x}-axis is also compact, and this projection is exactly the set {B_n=\{x\in [a,b] : |f(x)|\le n\}}. Thus, {B_n} is closed. The set {B_n} has empty interior, since otherwise {f} would be continuous at its interior points. Finally, {\bigcup B_n=[a,b]}, contradicting the Baire Category theorem.

Summary: for closed-graph functions on {\mathbb R}, the sets of discontinuity are precisely the closed sets with empty interior. In particular, every such function has a point of continuity. The proof works just as well for maps from {\mathbb R^n} to any metric space.

However, the above result does not extend to the setting of Banach spaces. Here is an example of a map {F\colon X\rightarrow X} on a Banach space {X} such that {\|F(x)-F(y)\|=1} whenever {x\ne y}; this property implies that the graph is closed, despite {F} being discontinuous everywhere.

Let {X} be the space of all bounded functions {\phi \colon (0,1]\rightarrow\mathbb R} with the supremum norm. Let {(q_n)_{n=1}^\infty} be an enumeration of all rational numbers. Define the function {\psi =F(\phi )} separately on each subinterval {(2^{-n},2^{1-n}]}, {n=1,2,\dots}, as

{\displaystyle    \psi(t) = \begin{cases} 1 \quad &\text{if } \phi (2^nt-1) > q_n \\   0 \quad &\text{if } \phi (2^n t-1)\le q_n\end{cases}}

For any two distinct elements {\phi_1,\phi_2} of {X} there is a point {s\in (0,1]} and a number {n\in\mathbb N} such that {q_n} is strictly between {\phi_1(s)} and {\phi_2(s)}. According to the definition of {F} this implies that the functions {F(\phi_1)} and {F(\phi_2)} take on different values at the point {t=2^{-n}(s+1)}. Thus the norm of their difference is {1}.
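To make this concrete, here is a small Python sketch. Everything below (the finite prefix of the enumeration and the two sample elements) is only illustrative; the actual map needs the full enumeration of {\mathbb Q}.

```python
import math
from fractions import Fraction

# A finite prefix of an enumeration (q_n) of the rationals (illustrative only).
q = [Fraction(0), Fraction(1), Fraction(-1), Fraction(1, 2), Fraction(-1, 2)]

def F(phi):
    """psi = F(phi), defined piecewise on the subintervals (2^-n, 2^(1-n)]."""
    def psi(t):
        n = math.floor(-math.log2(t)) + 1   # t lies in (2^-n, 2^(1-n)]
        s = 2**n * t - 1                    # the corresponding point s in (0, 1]
        return 1 if phi(s) > q[n - 1] else 0
    return psi

phi1 = lambda s: 0.0    # two distinct elements of X...
phi2 = lambda s: 0.75   # ...with q_4 = 1/2 strictly between their values
s = 0.5
t = 2**-4 * (s + 1)     # the point where F(phi1) and F(phi2) must differ
```

Evaluating both images at {t=2^{-4}(s+1)} shows they differ by exactly {1} there, as the argument above predicts.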

So much for the Nonlinear Closed Graph Theorem. However, the space {X} in the above example is nonseparable. Is there a nowhere continuous map between separable Banach spaces whose graph is closed?

Institutions ranked by the number of AMS Fellows: 2015 update

Adding the 2015 Fellows changed a few things compared to the 2014 rankings. UCLA and Rutgers rise from 2nd and 3rd to tie Berkeley for first place. Syracuse loses nine places, dropping from #334 (tie) to #343 (tie).

Podium

  • 1. Rutgers The State University of New Jersey New Brunswick: 34
  • 1. University of California, Los Angeles: 34
  • 1. University of California, Berkeley: 34
  • 4. University of Michigan: 32
  • 5. Massachusetts Institute of Technology: 29
  • 6. University of Wisconsin, Madison: 23
  • 7. Cornell University: 22
  • 7. New York University, Courant Institute: 22
  • 7. University of Illinois, Urbana-Champaign: 22
  • 10. University of Texas at Austin: 21
  • 10. University of Chicago: 21
  • 10. Princeton University: 21
  • 13. University of Washington: 20
  • 14. University of California, San Diego: 19
  • 14. Stanford University: 19
  • 16. University of Pennsylvania: 17
  • 16. University of Minnesota-Twin Cities: 17
  • 18. University of California, Santa Barbara: 16
  • 18. Pennsylvania State University: 16
  • 18. Stony Brook University: 16
  • 18. Brown University: 16
  • 22. Purdue University: 15
  • 22. University of Maryland: 15
  • 24. University of California, Irvine: 14
  • 24. Duke University: 14
  • 26. Ohio State University, Columbus: 13
  • 26. Eidgenössische Technische Hochschule Zürich (ETH Zürich): 13
  • 26. University of Illinois at Chicago: 13
  • 29. Northwestern University: 12
  • 29. Georgia Institute of Technology: 12
  • 29. Johns Hopkins University, Baltimore: 12
  • 29. Texas A&M University: 12
  • 33. University of Toronto: 11
  • 33. Harvard University: 11
  • 33. Indiana University, Bloomington: 11
  • 36. University of North Carolina at Chapel Hill: 10
  • 36. University of British Columbia: 10
  • 36. The Graduate Center: 10
  • 36. Rice University: 10
  • 40. Vanderbilt University: 9
  • 41. University of Utah: 8
  • 41. Boston University: 8
  • 41. California Institute of Technology: 8
  • 41. University of Nebraska-Lincoln: 8
  • 41. Institute for Advanced Study: 8
  • 41. University of Notre Dame: 8
  • 47. University of Georgia: 7
  • 47. University of Southern California: 7
  • 47. University of Virginia: 7
  • 47. University of California, Davis: 7
  • 47. University of Oregon: 7
  • 47. Brandeis University: 7
  • 47. Microsoft Research: 7
  • 54. University of Oxford: 6
  • 54. Université Pierre et Marie Curie (Paris VI): 6
  • 54. University of Arizona: 6
  • 54. Carnegie Mellon University: 6
  • 54. Michigan State University: 6
  • 54. Columbia University: 6
  • 54. The Hebrew University of Jerusalem: 6
  • 54. Williams College: 6
  • 54. North Carolina State University: 6
  • 54. Tel Aviv University: 6
  • 64. University of Rochester: 5
  • 64. Lehman College: 5
  • 64. NYU Polytechnic School of Engineering: 5
  • 67. University of California, Riverside: 4
  • 67. Ecole Polytechnique Fédérale de Lausanne (EPFL): 4
  • 67. Université Paris-Diderot: 4
  • 67. Florida State University: 4
  • 67. Harvey Mudd College: 4
  • 67. University of Tennessee, Knoxville: 4
  • 67. Yale University: 4
  • 67. Louisiana State University, Baton Rouge: 4
  • 67. Northeastern University: 4
  • 67. Virginia Polytechnic Institute and State University: 4
  • 67. University of Colorado, Boulder: 4
  • 78. Tsinghua University: 3
  • 78. Norwegian University of Science and Technology: 3
  • 78. McGill University: 3
  • 78. University of Cambridge: 3
  • 78. University of Memphis: 3
  • 78. University of Melbourne: 3
  • 78. KU Leuven: 3
  • 78. Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences: 3
  • 78. KTH Royal Institute of Technology: 3
  • 78. Københavns Universitet: 3
  • 78. The City College: 3
  • 78. Università degli Studi di Milano: 3
  • 78. University of Warwick: 3
  • 78. University of Connecticut, Storrs: 3
  • 78. Barnard College, Columbia University: 3
  • 78. Australian National University: 3
  • 78. Université Paris-Sud (Paris XI): 3
  • 78. Washington University: 3
  • 78. Weizmann Institute of Science: 3
  • 78. Mathematical Institute, Oxford University: 3
  • 98. Bar-Ilan University: 2
  • 98. Rheinische Friedrich-Wilhelms-Universität Bonn: 2
  • 98. American Mathematical Society: 2
  • 98. Emory University: 2
  • 98. Rutgers The State University of New Jersey Newark: 2
  • 98. Sapienza – Università di Roma: 2
  • 98. Shanghai Jiao Tong University: 2
  • 98. Smith College: 2
  • 98. Lund University: 2
  • 98. University of Florida: 2
  • 98. University of Freiburg: 2
  • 98. Steklov Institute of Mathematics of the Russian Academy of Sciences: 2
  • 98. University of Heidelberg: 2
  • 98. Boston College: 2
  • 98. Tata Institute of Fundamental Research: 2
  • 98. University of Iowa: 2
  • 98. University of Kansas: 2
  • 98. University of Kentucky: 2
  • 98. University of Manchester: 2
  • 98. Technical University of Denmark: 2
  • 98. University of Massachusetts, Amherst: 2
  • 98. Case Western Reserve University: 2
  • 98. Temple University: 2
  • 98. Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV): 2
  • 98. Académie des sciences, Institut de France: 2
  • 98. University of Missouri-Columbia: 2
  • 98. CNRS, École Normale Supérieure de Lyon: 2
  • 98. University of New Hampshire: 2
  • 98. IDA Center for Communications Research: 2
  • 98. Imperial College: 2
  • 98. University of Oklahoma: 2
  • 98. The Fields Institute: 2
  • 98. University of Oslo: 2
  • 98. Indian Institute of Technology: 2
  • 98. Indiana University-Purdue University Indianapolis: 2
  • 98. Tulane University: 2
  • 98. Universidad Nacional Autónoma de México: 2
  • 98. Auburn University: 2
  • 98. Universität Wien: 2
  • 98. Université de Genève: 2
  • 98. Université de Montréal: 2
  • 98. Institut des Hautes Etudes Scientifiques (IHES): 2
  • 98. The City University of New York: 2
  • 98. Institute of Mathematics, University of Paderborn: 2
  • 98. University of Alabama at Birmingham: 2
  • 98. Oklahoma State University: 2
  • 98. Vienna University of Technology: 2
  • 98. University of Bristol: 2
  • 98. Oregon State University: 2
  • 98. Wayne State University: 2
  • 98. Dartmouth College: 2
  • 98. Wesleyan University: 2
  • 98. Brigham Young University: 2
  • 151. Silesian University of Technology: 1
  • 151. Pontifícia Universidade Católica do Rio de Janeiro: 1
  • 151. Carleton University: 1
  • 151. Grambling State University: 1
  • 151. Queen Mary, University of London: 1
  • 151. Queen’s University: 1
  • 151. Reed College: 1
  • 151. Rensselaer Polytechnic Institute: 1
  • 151. Research Institute for Mathematical Sciences, Johannes Kepler University Linz: 1
  • 151. Research Institute for Mathematical Sciences, Kyoto University: 1
  • 151. B.I.Verkin Institute for Low Temperature Physics and Engineering: 1
  • 151. American University: 1
  • 151. Roskilde University: 1
  • 151. Rutgers The State University of New Jersey Camden: 1
  • 151. Haverford College: 1
  • 151. Hillman University: 1
  • 151. Saint Petersburg State University: 1
  • 151. San Francisco State University: 1
  • 151. Hong Kong University of Science and Technology: 1
  • 151. Seoul National University: 1
  • 151. IBM Research: 1
  • 151. Aarhus University: 1
  • 151. Center for Communications Research, Princeton, New Jersey: 1
  • 151. Sobolev Institute of Mathematics of the Siberian Branch of the Russian Academy of Sciences: 1
  • 151. Society for Industrial and Applied Mathematics (SIAM): 1
  • 151. Spelman College: 1
  • 151. St Louis University: 1
  • 151. St Olaf College: 1
  • 151. St. Petersburg Department of Steklov Institute of Mathematics of Russian Academy of Sciences: 1
  • 151. IMFUFA, Roskilde University: 1
  • 151. Centre for Quantum Geometry of Moduli Spaces (QGM), Aarhus University: 1
  • 151. Stevens Institute of Technology: 1
  • 151. Centre of Mathematics for Applications, University of Oslo: 1
  • 151. SUNY, Maritime College: 1
  • 151. Amgen: 1
  • 151. Centrum Wiskunde & Informatica and University of Amsterdam: 1
  • 151. Technion – Israel Institute of Technology: 1
  • 151. Technische Universität Berlin: 1
  • 151. Technische Universität Darmstadt: 1
  • 151. Institució Catalana de Recerca i Estudis Avançats (ICREA): 1
  • 151. Institut de Mecanique Celeste et de Calcul des Ephemerides (IMCCE): 1
  • 151. Chalmers University of Technology: 1
  • 151. The Abdus Salam International Centre for Theoretical Physics: 1
  • 151. The Chinese University of Hong Kong: 1
  • 151. Institut mathématique de Jussieu: 1
  • 151. Institut Universitaire de France: 1
  • 151. Chennai Mathematical Institute: 1
  • 151. Institute for Information Transmission Problems: 1
  • 151. Institute for Quantum Computing, Waterloo: 1
  • 151. Institute of Mathematical Sciences, The Chinese University of Hong Kong: 1
  • 151. The Institute for System Research of the Russian Academy of Sciences: 1
  • 151. The Institute of Mathematical Sciences, Chennai, India: 1
  • 151. The OEIS Foundation Incorporated: 1
  • 151. The University of British Columbia: 1
  • 151. The University of Liverpool: 1
  • 151. The University of Western Australia: 1
  • 151. Tomakomai National College of Technology: 1
  • 151. Institute of Mathematics, Academia Sinica, ROC: 1
  • 151. TU Dortmund University: 1
  • 151. Tufts University: 1
  • 151. Chern Institute of Mathematics, Nankai University: 1
  • 151. University of Edinburgh: 1
  • 151. Universidad Autónoma de Madrid: 1
  • 151. Universidad de Concepción: 1
  • 151. Universidad de Valladolid: 1
  • 151. Universidad del País Vasco: 1
  • 151. Instituto de Matemáticas Universidad Nacional Autónoma de México: 1
  • 151. Instituto de Matematica Pura e Aplicada (IMPA): 1
  • 151. Universität Bielefeld: 1
  • 151. Universität des Saarlandes: 1
  • 151. Universität Konstanz, Germany: 1
  • 151. Universität Regensburg: 1
  • 151. Instituto de Matematicas, Universidad de Talca: 1
  • 151. Universität Zürich: 1
  • 151. Université Bordeaux 1: 1
  • 151. Université de Caen: 1
  • 151. Jacobs University: 1
  • 151. Chinese Academy of Sciences: 1
  • 151. Université du Québec à Montréal: 1
  • 151. City University of Hong Kong: 1
  • 151. Université Paris-Nord (Paris XIII): 1
  • 151. Kent State University, Kent: 1
  • 151. King Abdullah University of Science and Technology: 1
  • 151. Universitat Politecnica de Catalunya: 1
  • 151. Universiteit Gent: 1
  • 151. Universiteit Leiden: 1
  • 151. Universiteit Utrecht: 1
  • 151. Universiteit van Amsterdam: 1
  • 151. University at Buffalo: 1
  • 151. University College London: 1
  • 151. University of Aberdeen: 1
  • 151. King’s College London: 1
  • 151. University of Alaska Fairbanks: 1
  • 151. King Saud University: 1
  • 151. University of Auckland: 1
  • 151. University of Basel: 1
  • 151. Kobe University: 1
  • 151. Korea Institute for Advanced Study: 1
  • 151. Cleveland State University: 1
  • 151. CMLA, École Normale Supérieure de Cachan: 1
  • 151. Kyoto University: 1
  • 151. Lawrence Berkeley National Laboratory: 1
  • 151. Lehigh University: 1
  • 151. CNRS and Université Paris-Sud (Paris XII): 1
  • 151. Loyola University of Chicago: 1
  • 151. University of Central Florida: 1
  • 151. Ludwig-Maximilians-Universität München (LMU München): 1
  • 151. University of Cincinnati: 1
  • 151. Baylor University: 1
  • 151. Macalester College: 1
  • 151. University of Edinburgh: 1
  • 151. CNRS, Institut de Mathématiques de Toulouse: 1
  • 151. Massey University: 1
  • 151. Math for America: 1
  • 151. University of Hawaii at Manoa: 1
  • 151. Mathematical Institute, Leiden University: 1
  • 151. University of Helsinki: 1
  • 151. University of Houston: 1
  • 151. Mathematical Institute, Linkoeping University: 1
  • 151. Collège de France: 1
  • 151. Mathematics Institute, Freiburg University: 1
  • 151. Max Planck Institute for Gravitational Physics (Albert Einstein Institute): 1
  • 151. Max Planck Institute for Mathematics in the Sciences: 1
  • 151. University of Leeds: 1
  • 151. University of Liverpool: 1
  • 151. Beijing Normal University: 1
  • 151. McMaster University: 1
  • 151. University of Massachusetts Boston: 1
  • 151. Bielefeld University: 1
  • 151. Croatian Academy of Sciences and Arts: 1
  • 151. Montana State University: 1
  • 151. University of Miami: 1
  • 151. Moravian College: 1
  • 151. University of Michoacan: 1
  • 151. University of Minnesota Rochester: 1
  • 151. University of Minnesota-Duluth: 1
  • 151. Mount Holyoke College: 1
  • 151. Muenster University: 1
  • 151. Nagoya University: 1
  • 151. University of Newcastle: 1
  • 151. Nanyang Technological University: 1
  • 151. University of New Mexico: 1
  • 151. University of New South Wales: 1
  • 151. University of Nice: 1
  • 151. National Science Foundation: 1
  • 151. University of North Carolina at Charlotte: 1
  • 151. National Security Agency: 1
  • 151. National Taiwan University: 1
  • 151. National Tsing Hua University, Taiwan: 1
  • 151. New Jersey Institute of Technology: 1
  • 151. New Mexico State University, Las Cruces: 1
  • 151. Amherst College: 1
  • 151. University of Pittsburgh: 1
  • 151. Nihon University: 1
  • 151. University of San Francisco: 1
  • 151. University of South Carolina: 1
  • 151. Arizona State University: 1
  • 151. Eötvös Loránd University: 1
  • 151. Northern Illinois University: 1
  • 151. University of Texas at Dallas: 1
  • 151. University of Texas at San Antonio: 1
  • 151. University of Tokyo: 1
  • 151. University of Toledo: 1
  • 151. Northrop Grumman Corporation: 1
  • 151. Ecole des hautes études en sciences sociales (EHESS): 1
  • 151. University of Valencia: 1
  • 151. University of Vermont: 1
  • 151. University of Victoria: 1
  • 151. Ateneo de Manila University: 1
  • 151. University of Warsaw: 1
  • 151. Åbo Akademi University: 1
  • 151. El Colegio Nacional: 1
  • 151. Albert-Ludwigs-Universitat: 1
  • 151. University of Wisconsin, Milwaukee: 1
  • 151. Utah State University: 1
  • 151. Open University, U.K.: 1
  • 151. Victoria University of Wellington: 1
  • 151. Austrian Academy of Sciences: 1
  • 151. Osaka University: 1
  • 151. Wake Forest University: 1
  • 151. Waseda University: 1
  • 151. Freie Universität Berlin: 1
  • 151. Philipps-Universität Marburg: 1
  • 151. Pitzer College: 1
  • 151. Pohang University of Science and Technology (POSTECH): 1
  • 151. Western Washington University: 1
  • 151. Westfälische Wilhelms-Universität Münster: 1
  • 151. Polish Academy of Sciences: 1
  • 151. Wolfram Research: 1
  • 151. Pomona College: 1
  • 151. York University: 1
  • 343. Syracuse University: 0

To create this list, I parsed the AMS page with the following JavaScript code:

var b = document.getElementById('amsContentDiv').children;
var inst = [];
var j = -1;
for (var i = 10; i < b.length; i++) {
  if (b[i].tagName == 'SPAN') {
    if (b[i].style['font-weight'] == 'bold') {
      // bold span: a new institution name
      j++;
      inst[j] = {};
      inst[j].name = b[i].textContent;
      inst[j].count = 0;
    } else {
      // non-bold span: a Fellow listed under the current institution
      inst[j].count++;
    }
  }
}
inst.sort(function(a, b) {return b.count - a.count;});
var html = [];
var place = 1;
for (var i = 0; i < inst.length; i++) {
  html.push('<li><strong>' + place + '.</strong> ' + inst[i].name + ': <em>' + inst[i].count + '</em></li>');
  if (inst[i+1] && inst[i+1].count < inst[i].count) {
    place = i + 2;
  }
}

Completely monotone imitation of 1/x

I wanted an example of a function {f} that behaves mostly like {1/x} (the product {xf(x)} is bounded between two positive constants), but such that {xf(x)} does not have a limit as {x\rightarrow 0}.

The first thing that comes to mind is {(2+\sin(1/x))/x}, but this function does not look very much like {1/x}.

sin(1/x) makes it too wiggly

Then I tried {f(x)=(2+\sin\log x)/x}, recalling an example from Linear Approximation and Differentiability. It worked well:

I can’t believe it’s not a hyperbola!

In fact, it worked much better than I expected. Not only is {f'} of constant sign, but so are {f''} and {f'''}. Indeed,

\displaystyle    f'(x) = \frac{\cos \log x - \sin \log x - 2}{x^2}

is always negative,

\displaystyle    f''(x) = \frac{4 -3\cos \log x + \sin \log x}{x^3}

is always positive,

\displaystyle    f'''(x) = \frac{10\cos \log x -12}{x^4}

is always negative. The sign becomes less obvious with the fourth derivative,

\displaystyle    f^{(4)}(x) = \frac{48-40\cos\log x - 10\sin \log x}{x^5}

because the triangle inequality isn’t conclusive now. But the amplitude of {A\cos t+B\sin t} is {\sqrt{A^2+B^2}}, and {\sqrt{40^2+10^2}<48}.

So, it seems that {f} is completely monotone, meaning that {(-1)^n f^{(n)}(x)\ge 0} for all {x>0} and for all {n=0,1,2,\dots}. But we already saw that this sign pattern can break after many steps. So let’s check carefully.

Direct calculation yields the neat identity

\displaystyle    \left(\frac{1+a\cos \log x+b\sin\log x}{x^n}\right)' = -n\,\frac{1+(a-b/n)\cos\log x+(b+a/n) \sin\log x}{x^{n+1}}

With its help, the process of differentiating the function {f(x) = (1+a\cos \log x+b\sin\log x)/x} can be encoded as follows: {a_1=a}, {b_1=b}, then {a_{n+1}=a_n-b_n/n} and {b_{n+1} = b_n+a_n/n}. The presence of {1/n} is disconcerting because the harmonic series diverges. But orthogonality helps: the added vector {(-b_n/n, a_n/n)} is orthogonal to {(a_n,b_n)}.

The above example, rewritten as {f(x)=(1+\frac12\sin\log x)/x}, corresponds to starting with {(a,b) = (0,1/2)}. I calculated and plotted {10000} iterations: the points {(a_n,b_n)} are joined by a piecewise linear curve.

Harmonic spiral

The total length of this curve is infinite, since the harmonic series diverges. The question is, does it stay within the unit disk? Let’s find out. By the above recursion,

\displaystyle    a_{n+1}^2 + b_{n+1}^2 = \left(1+\frac{1}{n^2}\right) (a_n^2+b_n^2)

Hence, the squared magnitude of {(a_n,b_n)} will always be less than

\displaystyle    \frac14 \prod_{n=1}^\infty \left(1+\frac{1}{n^2}\right)

with {1/4} being {a^2+b^2}. The infinite product evaluates to {\frac{\sinh \pi}{\pi}\approx 3.7} (explained here), and thus the polygonal spiral stays within the disk of radius {\frac12 \sqrt{\frac{\sinh \pi}{\pi}}\approx 0.96}. In conclusion,

\displaystyle    (-1)^{n} \left(\frac{1+(1/2)\sin\log x}{x}\right)^{(n)} = n!\,\frac{1+a_{n+1}\cos\log x+b_{n+1} \sin\log x}{x^{n+1}}

where the trigonometric function {a_{n+1}\cos\log x+b_{n+1} \sin\log x} has amplitude strictly less than {1}. Since the expression on the right is positive, {f} is completely monotone.
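The recursion and the bound are easy to check numerically. Here is a plain-Python sketch (not the Sage code used for the plot) running the orthogonal-step iteration and comparing the final amplitude against {\frac12\sqrt{\sinh\pi/\pi}}:

```python
import math

a, b = 0.0, 0.5               # (a_1, b_1) for f(x) = (1 + (1/2) sin log x)/x
for n in range(1, 10000):
    a, b = a - b/n, b + a/n   # the orthogonal step giving (a_{n+1}, b_{n+1})

amplitude = math.hypot(a, b)  # sqrt(a_n^2 + b_n^2) after ~10^4 steps
bound = 0.5 * math.sqrt(math.sinh(math.pi) / math.pi)  # about 0.958
```

The amplitude creeps up toward the bound from below but never reaches it, which is exactly what complete monotonicity requires.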

The plot was generated in Sage using the code below.

a = 0
b = 1/2
l = [(a,b)]
for k in range(1,10000):
    c = a - b/k
    d = b + a/k
    a = c
    b = d
    l.append((a,b))
line(l)  # joins the points (a_n, b_n) by a piecewise linear curve

2014 syllabus

This is a sample of what you could have learned by taking Calculus VII in 2014. One post from each month.

Alternating lacunary series and 1-1+1-1+1-1+…

The series {1-1+1-1+1-1+\cdots} diverges. A common way to make it convergent is to replace each {1} with a power of {x}; the new series will converge when {|x|<1} and maybe its sum will have a limit as {x\rightarrow 1}. And indeed,

\displaystyle    x-x^2+x^3-x^4+x^5-x^6+\cdots = \frac{x}{1+x}

which tends to {1/2} as {x} approaches {1} from the left.

x/(1+x) has limit 1/2

Things get more interesting if instead of consecutive integers as exponents, we use consecutive powers of {2}:

\displaystyle    f(x) = x-x^2+x^4-x^8+x^{16} -x^{32} +\cdots

On most of the interval {(0,1)} it behaves just like the previous one:

Lacunary series on (0,1)

But there appears to be a little blip near {1}. Let’s zoom in:

Lacunary series near 1

And zoom in more:

… and very near 1

Still there.

This function was considered by Hardy in his 1907 paper Theorems concerning infinite series. On pages 92–93 he shows that it “oscillates between finite limits of indetermination for {x=1}”. There is also a footnote: “The simple proof given above was shown to me by Mr. J. H. Maclagan-Wedderburn. I had originally obtained the result by means of a contour integral.”

Okay, but what are these “finite limits of indetermination”? The Alternating Series Estimation shows {0<f(x)<1} for {x\in (0,1)}, but the above plots suggest that {f} oscillates between much tighter bounds. Let’s call them {A= \liminf_{x\rightarrow 1-} f(x)} and {B=\limsup_{x\rightarrow 1-} f(x)}.

Since {f(x)+f(x^2)\equiv x}, it follows that {f(x)+f(x^2)\rightarrow 1} as {x\rightarrow 1^-}. Hence, {B = \limsup_{x\rightarrow 1-}(1-f(x^2)) = 1-A}. In other words, {A} and {B} are symmetric about {1/2}. But what are they?

I don’t have an answer, but here is a simple estimate. Let {g(x)=x-x^2} and observe that

\displaystyle    f(x) = g(x) + g(x^4)+g(x^{16}) + g(x^{64})+\dots \qquad \qquad (1)

The function {g} is not hard to understand: its graph is a parabola.

Yes, a parabola

Since {g} is positive on {(0,1)}, any of the terms in the sum (1) gives a lower bound for {f}. Each individual term is useless for this purpose, since it vanishes at {1}. But we can pick {n} in {g(x^{4^n})} depending on {x}.

Let {x_0\in(0,1)} be the unique solution of the equation {x_0^4=1-x_0}. It could be written down explicitly, but this is not a pleasant experience; numerically {x_0\approx 0.7245}. For every {x>x_0} there is an integer {n\ge 1} such that {x^{4^n}\in [1-x_0,x_0]}, namely the smallest integer such that {x^{4^n} \le x_0}. Hence,

\displaystyle f(x)> g(x^{4^n})\ge \min_{[1-x_0,\,x_0]}g = x_0-x_0^2>0.1996 \qquad \qquad (2)

which gives a nontrivial lower bound {A>0.1996} and symmetrically {B<0.8004}. Frustratingly, this falls just short of neat {1/5} and {4/5}.

One can do better than (2) by using more terms of the series (1). For example, study the polynomial {g(t)+g(t^4)} and find a suitable interval {[t_0^4,t_0]} on which its minimum is large (such an interval will no longer be symmetric). Or use {3,4,5...} consecutive terms of the series… which quickly gets boring. This approach gives arbitrarily close approximations to {A} and {B}, but does not tell us what these values really are.
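The oscillation can also be probed numerically. Here is a small Python check (truncating the series once the terms drop below a tolerance) verifying the functional equation and the proven bounds at sample points near {1}:

```python
def f(x, tol=1e-15):
    """Truncated sum of x - x^2 + x^4 - x^8 + ... (exponents 2^k, alternating signs)."""
    total, exponent, sign = 0.0, 1, 1
    while True:
        term = x ** exponent
        if term < tol:
            return total
        total += sign * term
        sign = -sign
        exponent *= 2

# the functional equation f(x) + f(x^2) = x, and the bounds 0.1996 < f < 0.8004,
# checked at a few points x > x_0
check = [f(z) for z in (0.9, 0.99, 0.999, 0.9999)]
```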

Tossing a continuous coin

To generate a random number {X} uniformly distributed on the interval {[0,1]}, one can keep tossing a fair coin, record the outcomes as an infinite sequence {(d_k)} of 0s and 1s, and let {X=\sum_{k=1}^\infty 2^{-k} d_k}. Here is a histogram of {10^6} samples from the uniform distribution… nothing to see here, except maybe an incidental interference pattern.

Sampling the uniform distribution
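The binary-digit construction is easy to simulate. Below is a hedged sketch (the helper name is mine, and the series is truncated at 53 tosses, the precision of a double) checking that the samples look uniform: mean near {1/2} and variance near {1/12}.

```python
import random

def uniform_from_tosses(rng, k=53):
    """X = sum of d_j 2^-j over fair coin tosses d_j, truncated at k terms."""
    return sum(rng.randint(0, 1) * 2.0**-j for j in range(1, k + 1))

rng = random.Random(1)
xs = [uniform_from_tosses(rng) for _ in range(100000)]
mean = sum(xs) / len(xs)                          # should be close to 1/2
var = sum((x - mean)**2 for x in xs) / len(xs)    # should be close to 1/12
```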

Let’s note that {X=\frac12 (d_1+X_1)} where {X_1=\sum_{k=1}^\infty 2^{-k} d_{k+1}} has the same distribution as {X} itself, and {d_1} is independent of {X_1}. This has an implication for the (constant) probability density function of {X}:

\displaystyle    p(x) = p(2x) + p(2x-1)

because {2 p(2x)} is the p.d.f. of {\frac12 X_1} and {2p(2x-1)} is the p.d.f. of {\frac12(1+X_1)}. Simply put, {p} is equal to the convolution of the rescaled function {2p(2x)} with the discrete measure {\frac12(\delta_0+\delta_1)}.

Let’s iterate the above construction by letting each {d_k} be uniformly distributed on {[0,1]} instead of being constrained to the endpoints. This is like tossing a “continuous fair coin”. Here is a histogram of {10^6} samples of {X=\sum_{k=1}^\infty 2^{-k} d_k}; predictably, with more averaging the numbers gravitate toward the middle.

Sampling the Fabius distribution

This is not a normal distribution; the top is too flat. The plot was made with this Scilab code, with {n} samples put into {b} buckets:

n = 1e6
b = 200
z = zeros(1,n)
for i = 1:10
    z = z + rand(1,n)/2^i
end
c = histplot(b,z)

If this plot is too jagged, look at the cumulative distribution function instead:

Fabius function

It took just one more line of code: plot(linspace(0,1,b),cumsum(c)/sum(c))

Compare the two plots: the c.d.f. looks very similar to the left half of the p.d.f. It turns out, they are identical up to scaling.

Let’s see what is going on here. As before, {X=\frac12 (d_1+X_1)} where {X_1=\sum_{k=1}^\infty 2^{-k} d_{k+1}} has the same distribution as {X} itself, and the summands {d_1,X_1} are independent. But now that {d_1} is uniform, the implication for the p.d.f. of {X} is different:

\displaystyle    p(x) = \int_0^{1} 2p(2x-t)\,dt

This is a direct relation between {p} and its antiderivative. Incidentally, it shows that {p} is infinitely differentiable, because the right-hand side always has one more derivative than the left-hand side.

To state the self-similarity property of {X} in the cleanest way possible, one introduces the cumulative distribution function {F} (the Fabius function) and extends it beyond {[0,1]} by alternating even and odd reflections across the right endpoint. The resulting function satisfies the delay-differential equation {F\,'(x)=2F(2x)}: the derivative is a rescaled copy of the function itself.

Since {F} vanishes at the even integers, it follows that at every dyadic rational, all but finitely many derivatives of {F} are zero. The Taylor expansion at such points is a polynomial, while {F} itself is not. Thus, {F} is nowhere analytic despite being everywhere {C^\infty}.

This was, in fact, the motivation for J. Fabius to introduce this construction in his 1966 paper Probabilistic Example of a Nowhere Analytic {C^\infty}-Function.

Linear approximation and differentiability

If a function {f\colon \mathbb R\rightarrow \mathbb R} is differentiable at {a\in \mathbb R}, then it admits a good linear approximation at small scales. Precisely: for every {\epsilon>0} there is {\delta>0} and a linear function {\ell(x)} such that {|f(x)-\ell(x)|<\epsilon \,\delta} for all {|x-a|<\delta}. Having {\delta} multiplied by {\epsilon} means that the deviation from linearity is small compared to the (already small) scale {\delta} on which the function is considered.

For example, this is a linear approximation to {f(x)=e^x} near {0} at scale {\delta=0.1}.

Linear approximation to exponential function

As is done on this graph, we can always take {\ell} to be the secant line to the graph of {f} based on the endpoints of the interval of consideration. This is because if {L} is another line for which {|f(x)-L(x)|<\epsilon \,\delta} holds, then {|\ell-L|\le \epsilon \,\delta} at the endpoints, and therefore on all of the interval (the function {x\mapsto |\ell(x)-L(x)|} is convex).
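A quick numerical check of the secant-line approximation for {f(x)=e^x} at scale {\delta=0.1}: the maximal deviation from the secant is about {0.005}, so {\epsilon=0.1} already works at this scale.

```python
import math

delta = 0.1
a, b = -delta, delta
slope = (math.exp(b) - math.exp(a)) / (b - a)       # slope of the secant line
ell = lambda x: math.exp(a) + slope * (x - a)       # secant through the endpoints
xs = [a + (b - a) * i / 1000 for i in range(1001)]
err = max(abs(math.exp(x) - ell(x)) for x in xs)    # maximal deviation on [-delta, delta]
```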

Here is a non-differentiable function that obviously fails the linear approximation property at {0}.

Self-similar graph

(By the way, this post is mostly about me trying out SageMathCloud.) A nice thing about {f(x)=x\sin \log |x|} is self-similarity: {f(rx)=rf(x)} with the similarity factor {r=e^{2\pi}}. This implies that no matter how far we zoom in on the graph at {x=0}, the graph will not get any closer to linear.

I like {x\sin \log |x|} more than its famous, but not self-similar, cousin {x\sin(1/x)}, pictured below.

Standard example from intro to real analysis

Interestingly, linear approximation property does not imply differentiability. The function {f(x)=x\sin \sqrt{-\log|x|}} has this property at {0}, but it lacks derivative there since {f(x)/x} does not have a limit as {x\rightarrow 0}. Here is how it looks.

Now with the square root!

Let’s look at the scale {\delta=0.1}

scale 0.1

and compare to the scale {\delta=0.001}

scale 0.001

Well, that was disappointing. Let’s use math instead. Fix {\epsilon>0} and consider the function {\phi(\delta)=\sqrt{-\log \delta}-\sqrt{-\log (\epsilon \delta)}}. Rewriting it as

\displaystyle    \frac{\log \epsilon}{\sqrt{-\log \delta}+\sqrt{-\log (\epsilon \delta)}}

shows {\phi(\delta)\rightarrow 0} as {\delta\rightarrow 0}. Choose {\delta} so that {|\phi(\delta)|<\epsilon} and define {\ell(x)=x\sin\sqrt{-\log \delta}}. Then for {\epsilon \,\delta\le |x|< \delta} we have {|f(x)-\ell(x)|\le |x|\,|\phi(\delta)|\le \epsilon |x|<\epsilon\,\delta}, and for {|x|<\epsilon \delta} the trivial bound {|f(x)-\ell(x)|\le |f(x)|+|\ell(x)|} suffices.

Thus, {f} can be well approximated by linear functions near {0}; it’s just that the linear function has to depend on the scale on which approximation is made: its slope {\sin \sqrt{-\log \delta}} does not have a limit as {\delta\to0}.
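Here is a numerical sanity check of the estimate above, with {\epsilon=0.1} and {\delta=10^{-60}} (small enough that {|\phi(\delta)|<\epsilon}), taking the approximating line to have slope {\sin\sqrt{-\log\delta}}:

```python
import math

# f(x) = x*sin(sqrt(-log|x|)) is approximated by l(x) = x*sin(sqrt(-log(delta)))
# within eps*|x| on the annulus eps*delta <= |x| < delta, because the arguments
# of sin differ by at most |phi(delta)| < eps there.
def f(x):
    return x * math.sin(math.sqrt(-math.log(abs(x))))

eps, delta = 0.1, 1e-60
phi = math.sqrt(-math.log(delta)) - math.sqrt(-math.log(eps * delta))
assert abs(phi) < eps

slope = math.sin(math.sqrt(-math.log(delta)))
for k in range(200):
    # geometric samples of [eps*delta, delta]
    x = eps * delta * (1 / eps) ** (k / 199)
    assert abs(f(x) - slope * x) <= eps * abs(x)
```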

The linear approximation property does not become apparent until extremely small scales. Here is {\delta = 10^{-30}}.

Scale 1e-30
Scale 1e-30

Nodal lines

The Wikipedia article on nodes offers this one-dimensional illustration: a node is an interior point at which a standing wave does not move.

Standing wave and its nodes
Standing wave and its nodes

(At the endpoints the wave is forced to stay put, so I would not count them as nodes despite being marked on the plot.)

A standing wave in one dimension is described by the equation {f''+\omega^2 f=0}, where {\omega} is its (angular) frequency. The function {u(x,t) = f(x)\cos \omega t} solves the wave equation {u_{tt}=u_{xx}}: the wave vibrates without moving, hence the name. In mathematics, these are the (Dirichlet) eigenfunctions of the Laplacian.

Subject to boundary conditions {f(0)=0 = f(\pi)} (fixed ends), all standing waves on the interval {(0,\pi)} are of the form {\sin nx} for {n=1,2,3,\dots}. Their eigenvalues are exactly the perfect squares, and the nodes are equally spaced on the interval.

Things get more interesting in two dimensions. For simplicity, consider the square {Q=(0,\pi)\times (0,\pi)}. Eigenfunctions with zero value on the boundary are of the form {f(x,y) = \sin mx \sin ny} for positive integers {m,n}. The set of eigenvalues has richer structure: it consists of the integers that can be expressed as a sum of two positive squares: 2, 5, 8, 10, 13, 17,…
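This list is easy to generate; a short sketch:

```python
# Enumerate the Dirichlet eigenvalues m^2 + n^2 (m, n >= 1) of the square
# (0, pi) x (0, pi), in increasing order, up to a given limit.
def eigenvalues(limit):
    vals = set()
    m = 1
    while m * m + 1 <= limit:
        n = 1
        while m * m + n * n <= limit:
            vals.add(m * m + n * n)
            n += 1
        m += 1
    return sorted(vals)

print(eigenvalues(20))  # -> [2, 5, 8, 10, 13, 17, 18, 20]
```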

The zero sets of eigenfunctions in two dimensions are called nodal lines. At first glance it may appear that we get nothing interesting: the zero set of {\sin mx \sin ny} is the union of {n-1} equally spaced horizontal lines and {m-1} equally spaced vertical lines:

Boring nodal lines
This is a square, not a tall rectangle

But there is much more, because a sum of two eigenfunctions with the same eigenvalue is also an eigenfunction. To begin with, we can form linear combinations of {\sin mx \sin ny} and {\sin nx \sin my}. Here are two examples from Partial Differential Equations by Walter Strauss:

When {f(x,y) = \sin 12x \sin y+\sin x \sin 12y }, the square is divided by nodal lines into 12 nodal domains:

Eigenvalue 145, twelve nodal domains
Eigenvalue 145, twelve nodal domains

After slight perturbation {f(x,y) = \sin 12x \sin y+0.9\sin x \sin 12y } there is a single nodal line dividing the square into two regions of intricate geometry:

Also eigenvalue 145, but two nodal domains
Also eigenvalue 145, but two nodal domains
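Both pictured combinations are eigenfunctions with eigenvalue {145=12^2+1^2}; here is a finite-difference sanity check for the perturbed one:

```python
import math

# Verify via the 5-point finite-difference Laplacian that
# f(x,y) = sin(12x)sin(y) + 0.9 sin(x)sin(12y) satisfies lap(f) = -145 f:
# any linear combination of sin(12x)sin(y) and sin(x)sin(12y) is an
# eigenfunction with the same eigenvalue 145.
def f(x, y):
    return math.sin(12*x)*math.sin(y) + 0.9*math.sin(x)*math.sin(12*y)

h = 1e-4
for (x, y) in [(0.7, 1.1), (2.0, 0.4), (1.3, 2.9)]:
    lap = (f(x+h, y) + f(x-h, y) + f(x, y+h) + f(x, y-h) - 4*f(x, y)) / h**2
    assert abs(lap + 145 * f(x, y)) < 1e-3
```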

And then there are numbers that can be written as sums of squares in two different ways. The smallest is {50=1^2+7^2 = 5^2+5^2}, with eigenfunctions such as

\displaystyle    f(x,y) = \sin x\sin 7y +2\sin 5x \sin 5y+\sin 7x\sin y

pictured below.

Eigenvalue 50
Eigenvalue 50

This is too good not to replicate: the eigenfunctions naturally extend as doubly periodic functions with anti-period {\pi}.

Periodic extension
Periodic extension

Binary intersection property, and not fixing what isn’t broken

A metric space has the binary intersection property if every collection of closed balls has nonempty intersection unless there is a trivial obstruction: the distance between centers of two balls exceeds the sum of their radii. In other words, for every family of points {x_\alpha\in X} and numbers {r_\alpha>0} such that {d(x_\alpha,x_\beta)\le r_\alpha+r_\beta} for all {\alpha,\beta} there exists {x\in X} such that {d(x_\alpha,x)\le r_\alpha} for all {\alpha}.

For example, {\mathbb R} has this property: {x=\inf_\alpha (x_\alpha+r_\alpha)} works. But {\mathbb R^2} does not:

Failure of the binary intersection property
Failure of the binary intersection property
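The pictured failure can be realized (one standard configuration, assumed here) by three unit disks centered at the vertices of an equilateral triangle of side 2: each pair of balls meets, but the triple intersection is empty.

```python
import math, itertools

# Three unit balls in R^2 centered at the vertices of an equilateral triangle
# with side 2: each pair is tangent (d = r + r), yet no point lies in all three.
centers = [(0.0, 0.0), (2.0, 0.0), (1.0, math.sqrt(3))]
r = 1.0

def dist(p, q):
    return math.hypot(p[0]-q[0], p[1]-q[1])

# the pairwise condition d(x_a, x_b) <= r_a + r_b holds
assert all(dist(p, q) <= 2*r + 1e-12
           for p, q in itertools.combinations(centers, 2))

# but min over a grid of max_i (d(x_i, p) - r) stays bounded away from 0,
# so the triple intersection is empty (the true minimum, at the circumcenter,
# is 2/sqrt(3) - 1 ~ 0.155)
best = min(max(dist(c, (x/50, y/50)) - r for c in centers)
           for x in range(-60, 170) for y in range(-60, 150))
assert best > 0.1
```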

The space of bounded sequences {\ell^\infty} has the binary intersection property, and so does the space {B[0,1]} of all bounded functions {f:[0,1]\rightarrow\mathbb R} with the supremum norm. Indeed, the construction for {\mathbb R} generalizes: given a family of bounded functions {f_\alpha} and numbers {r_\alpha>0} as in the definition, let {f(x)=\inf_\alpha (f_\alpha(x)+r_\alpha)}.
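A discretized sanity check of this construction, with "functions" sampled on a finite grid (the random family below is an illustrative assumption, built so that the pairwise condition holds):

```python
import random

# Given functions f_a and radii r_a with sup|f_a - f_b| <= r_a + r_b,
# the pointwise infimum f = inf_a (f_a + r_a) satisfies sup|f - f_a| <= r_a.
random.seed(1)
grid = 100
g = [random.uniform(-1, 1) for _ in range(grid)]      # a common "center"
radii = [random.uniform(0.5, 2.0) for _ in range(8)]
# each f_a stays within r_a of g, which forces sup|f_a - f_b| <= r_a + r_b
funcs = [[g[i] + random.uniform(-ra, ra) for i in range(grid)] for ra in radii]

f = [min(fa[i] + ra for fa, ra in zip(funcs, radii)) for i in range(grid)]
for fa, ra in zip(funcs, radii):
    assert all(abs(f[i] - fa[i]) <= ra + 1e-12 for i in range(grid))
```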

The better-known space of continuous functions {C[0,1]} satisfies the finite version of the binary intersection property, because for a finite family the construction {\inf_\alpha (f_\alpha(x)+r_\alpha)} produces a continuous function. However, the property fails without finiteness, as the following example shows.

Example. Let {f_n\in C[0,1]} be a function such that {f_n(x)=-1} for {x\le \frac12-\frac1n}, {f_n(x)=1} for {x\ge \frac12+\frac1n}, and {f_n} is linear in between.

Since {\|f_n-f_m\| \le 1} for all {n,m}, we can choose {r_n=1/2} for all {n}. But if a function {f} is such that {\|f-f_n\|\le \frac12} for all {n}, then {f(x) \le -\frac12} for {x<\frac12} and {f(x) \ge \frac12} for {x>\frac12}. There is no continuous function that does that.

More precisely, for every {f\in C[0,1]} we have {\liminf_{n\to\infty} \|f-f_n\|\ge 1 } because {f(x)\approx f(1/2)} in a small neighborhood of {1/2}, while {f_n} changes from {-1} to {1} in the same neighborhood when {n} is large.
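The whole example is easy to verify on a grid:

```python
# The sequence f_n from the example, sampled on a fine grid:
# f_n = -1 left of 1/2 - 1/n, +1 right of 1/2 + 1/n, linear in between.
def f(n, x):
    return max(-1.0, min(1.0, n * (x - 0.5)))

xs = [k / 10000 for k in range(10001)]

# pairwise distances are at most 1, so r_n = 1/2 satisfies the ball condition
assert all(max(abs(f(n, x) - f(m, x)) for x in xs) <= 1 + 1e-9
           for n in (2, 5, 20) for m in (3, 7, 50))

# the discontinuous f(x) = sign(x - 1/2)/2 lies within 1/2 of every f_n ...
half_sign = [0.5 if x > 0.5 else (-0.5 if x < 0.5 else 0.0) for x in xs]
assert all(max(abs(h - f(n, x)) for x, h in zip(xs, half_sign)) <= 0.5 + 1e-9
           for n in (2, 5, 20, 100))

# ... but a continuous candidate drifts away: f_10 ends up at distance
# close to 1 from f_n when n is large
assert max(abs(f(10, x) - f(1000, x)) for x in xs) > 0.95
```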

Given a discontinuous function, one can approximate it with a continuous function in some way: typically, using a mollifier. But such approximations tend to change the function even if it was continuous to begin with. Let’s try not to fix what isn’t broken: look for a retraction of {B[0,1]} onto {C[0,1]}, that is, a map {\Phi:B[0,1]\rightarrow C[0,1]} such that {\Phi(f)=f} for all {f\in C[0,1]}.

The failure of binary intersection property, as demonstrated by the sequence {(f_n)} above, implies that {\Phi} cannot be a contraction. Indeed, let {f(x)= \frac12 \,\mathrm{sign}\,(x-1/2)}. This is a discontinuous function such that {\|f-f_n\|\le 1/2} for all {n}. Since {\liminf_{n\to\infty} \|\Phi(f)-f_n\|\ge 1}, it follows that {\Phi} cannot be {L}-Lipschitz with a constant {L<2}.

It is known that there is a retraction from {B[0,1]} onto {C[0,1]} with Lipschitz constant at most {20}: see Geometric Nonlinear Functional Analysis by Benyamini and Lindenstrauss. The gap between the bounds {2} and {20} appears to remain open; at least I don’t know the smallest Lipschitz constant required to retract bounded functions onto continuous ones.

Graphical embedding

This post continues the theme of operating with functions using their graphs. Given an integrable function {f} on the interval {[0,1]}, consider the region {R_f} bounded by the graph {y=f(x)}, the axis {y=0}, and the vertical lines {x=0}, {x=1}.

Total area under and over the graph is the L1 norm
Total area under and over the graph is the L1 norm

The area of {R_f} is exactly {\int_0^1 |f(x)|\,dx}, the {L^1} norm of {f}. On the other hand, the area of a set is the integral of its characteristic function,

\displaystyle    \chi_f = \begin{cases}1, \quad x\in R_f, \\ 0,\quad x\notin R_f \end{cases}

So, the correspondence {f\mapsto \chi_f } is a map from the space of integrable functions on {[0,1]}, denoted {L^1([0,1])}, to the space of integrable functions on the plane, denoted {L^1(\mathbb R^2)}. The above shows that this correspondence is norm-preserving. It also preserves the metric, because integration of {|\chi_f-\chi_g|} gives the area of the symmetric difference {R_f\triangle R_g}, which in turn is equal to {\int_0^1 |f-g| }. In symbols:

\displaystyle    \|\chi_f-\chi_g\|_{L^1} = \int |\chi_f-\chi_g| = \int |f-g| = \|f-g\|_{L^1}

Distance between two functions in terms of their graphs
Distance between two functions in terms of their graphs
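A grid-based sanity check of this isometry for two concrete functions, {f(x)=x} and {g(x)=x^2}, for which {\int_0^1|f-g| = 1/6}:

```python
# Compare the area of the symmetric difference R_f /\ R_g, computed by
# counting grid cells, with the L^1 distance between f(x) = x and g(x) = x^2.
N = 500
dx = 1.0 / N

def chi(fval, y):
    # indicator of the region between the graph and the axis y = 0
    return (0 <= y <= fval) or (fval <= y <= 0)

area = 0.0
l1 = 0.0
for i in range(N):
    x = (i + 0.5) / N
    fx, gx = x, x * x
    l1 += abs(fx - gx) * dx
    for j in range(N):
        y = (j + 0.5) / N          # both graphs live inside [0,1] x [0,1] here
        if chi(fx, y) != chi(gx, y):
            area += dx * dx

assert abs(l1 - 1/6) < 1e-3
assert abs(area - l1) < 5e-3       # the two quantities agree up to grid error
```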

The map {f\mapsto \chi_f} is nonlinear: for example {2f} is not mapped to {2 \chi_f} (the function that is equal to 2 on the same region) but rather to a function that is equal to 1 on a larger region.

So far, this nonlinear embedding did not really offer anything new: from one {L^1} space we got into another. It is more interesting (and more difficult) to embed things into a Hilbert space such as {L^2(\mathbb R^2)}. But for the functions that take only the values {0,1,-1}, the {L^2} norm is exactly the square root of the {L^1} norm. Therefore,

\displaystyle    \|\chi_f-\chi_g\|_{L^2} = \sqrt{\int |\chi_f-\chi_g|^2} =    \sqrt{\int |\chi_f-\chi_g|} = \sqrt{\|f-g\|_{L^1}}

In other words, raising the {L^1} metric to power {1/2} creates a metric space that is isometric to a subset of a Hilbert space. The exponent {1/2} is sharp: there is no such embedding for the metric {d(f,g)=\|f-g\|_{L^1}^{\alpha} } with {\alpha>1/2}. The reason is that {L^1}, having the Manhattan metric, contains geodesic squares: 4-cycles where the distances between adjacent vertices are 1 and the diagonal distances are equal to 2. Having such long diagonals is inconsistent with the parallelogram law in Hilbert spaces. Taking the square root reduces the diagonals to {\sqrt{2}}, which is the length they would have in a Hilbert space.
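The obstruction can be made concrete with the "short diagonals" inequality {\|a-c\|^2+\|b-d\|^2 \le \|a-b\|^2+\|b-c\|^2+\|c-d\|^2+\|d-a\|^2}, which holds for any four points of a Hilbert space:

```python
import math, random

# The short-diagonals inequality holds in any Hilbert space but fails for the
# geodesic square in the Manhattan (l^1) metric; taking square roots of the
# distances restores it (with equality, as for a unit square).
def l1(p, q):
    return sum(abs(u - v) for u, v in zip(p, q))

a, b, c, d = (0, 0), (1, 0), (1, 1), (0, 1)
sides = [l1(a, b), l1(b, c), l1(c, d), l1(d, a)]
diags = [l1(a, c), l1(b, d)]
assert sides == [1, 1, 1, 1] and diags == [2, 2]

# 8 > 4: the inequality fails, so no isometric Hilbert embedding of (L^1, d)
assert sum(x**2 for x in diags) > sum(x**2 for x in sides)
# after the square root, squared distances are the original ones: 4 <= 4
assert sum(diags) <= sum(sides)

# sanity check: the inequality always holds for points in Euclidean space
random.seed(0)
for _ in range(100):
    A, B, C, D = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
    e = math.dist
    assert (e(A, C)**2 + e(B, D)**2
            <= e(A, B)**2 + e(B, C)**2 + e(C, D)**2 + e(D, A)**2 + 1e-9)
```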

This embedding, and much more, can be found in the ICM 2010 talk by Assaf Naor.