Measuring nonlinearity and reducing it

How to measure the nonlinearity of a function {f\colon I\to \mathbb R} where {I\subset \mathbb R} is an interval? A natural way is to consider the smallest possible deviation from a line {y=kx+b}, that is {\inf_{k, b}\sup_{x\in I}|f(x)-kx-b|}. It turns out to be convenient to divide this by {|I|}, the length of the interval {I}. So, let {\displaystyle NL(f;I) = \frac{1}{|I|} \inf_{k, b}\sup_{x\in I}|f(x)-kx-b|}. (This is similar to β-numbers of Peter Jones, except the deviation from a line is measured only in the vertical direction.)

NL-def
NL(f; I) is the maximal vertical distance between red and black, divided by the length of I

Relation with derivatives

The definition of derivative immediately implies that if {f'(a)} exists, then {NL(f;I)\to 0} as {I} shrinks to {a} (that is, gets smaller while containing {a}). A typical construction of a nowhere differentiable continuous function is based on making {NL(f;I)} bounded from below; it is enough to do this for dyadic intervals, and that can be done by adding wiggly terms like {2^{-n}\mathrm{dist}\,(x, 2^{-n}\mathbb Z)}: see the blancmange curve.

The converse is false: if {NL(f; I)\to 0} as {I} shrinks to {a}, the function {f} may still fail to be differentiable at {a}. The reason is that the affine approximation may have different slopes at different scales. An example is {f(x)=x \sin \sqrt{-\log |x|}} in a neighborhood of {0}. Consider a small interval {[-\delta, \delta]}. The line {y = kx} with {k=\sin\sqrt{-\log \delta}} is a good approximation to {f} because {f(x)/x\approx k} on most of the interval except for a very small part near {0}, and on that part {f} is very close to {0} anyway.

Why the root of logarithm? Because {\sin \log |x|} has a fixed amount of change on a fixed proportion of  {[-\delta, \delta]}, independently of {\delta}. We need a function slower than the logarithm, so that as {\delta} decreases, there is a smaller amount of change on a larger part of the interval {[-\delta, \delta]}.

Nonlinearity of Lipschitz functions

Suppose {f} is a Lipschitz function, that is, there exists a constant {L} such that {|f(x)-f(y)|\le L|x-y|} for all {x, y\in I}. It’s easy to see that {NL(f;I)\le L/2}, by taking the mid-range approximation {y=\frac12 (\max_I f + \min_I f)}. But the sharp bound is {NL(f;I)\le L/4} whose proof is not as trivial. The sharpness is shown by {f(x)=|x|} with {I=[-1,1]}.

sharpness
With the maximum difference 1/2 and the length of interval 2, we get NL(f; [-1, 1]) = 1/4
Proof. Let {k} be the slope of the linear function that agrees with {f} at the endpoints of {I}. Subtracting this linear function from {f} gives us a Lipschitz function {g} such that {-L-k\le g'\le L-k} and {\int_I g'= 0}. Let {A = \int_I (g')^+ = \int_I (g')^-}. Chebyshev’s inequality gives lower bounds for the measures of the sets {g'>0} and {g'<0}: namely, {|g'>0|\ge A/(L-k)} and {|g'<0|\le A/(L+k)}. By adding these, we find that {|I| \ge 2LA/(L^2-k^2)\ge 2A/L}. Since {\max _I g - \min_I g \le A}, the mid-range approximation to {g} has error at most {A/2 \le |I|L/4}. Hence {NL(f; I) = NL(g; I) \le L/4}.

Reducing nonlinearity

Turns out, the graph of every Lipschitz function has relatively large almost-flat pieces.  That is, there are subintervals of nontrivial size where the measure of nonlinearity is much smaller than the Lipschitz constant. This result is a special (one-dimensional) case of Theorem 2.3 in Affine approximation of Lipschitz functions and nonlinear quotients by Bates, Johnson, Lindenstrauss, Preiss, and Schechtman.

Theorem AA (for “affine approximation”): For every {\epsilon>0} there exists {\delta>0} with the following property. If {f\colon I\to \mathbb R} is an {L}-Lipschitz function, then there exists an interval {J\subset I} with {|J|\ge \delta |I|} and {NL(f; J)\le \epsilon L}.

Theorem AA should not be confused with Rademacher’s theorem which says that a Lipschitz function is differentiable almost everywhere. The point here is a lower bound on the size of the interval {J}. Differentiability does not provide that. In fact, if we knew that {f} is smooth, or even a polynomial, the proof of Theorem AA would not become any easier.

Proof of Theorem AA

We may assume {I=[-1, 1]} and {L=1}. For {t\in (0, 2]} let {L(t) = \sup \{|f(x)-f(y)|/|x-y| \colon x, y\in I, \ |x-y|\ge t\}}. That is, {L(t)} is the restricted Lipschitz constant, one that applies for distances at least {t}. It is a decreasing function of {t}, and {L(0+)=1}.

Note that {|f(-1)-f(1)|\le 2L(1)} and that every value of {f} is within {2L(1)} of either {f(-1)} or {f(1)}. Hence, the oscillation of {f} on {I} is at most {6L(1)}. If {L(1) \le \epsilon/3}, then the constant mid-range approximation on {I} gives the desired conclusion, with {J=I}. From now on {L(1) > \epsilon/3}.

The sequence {L_k = L(4^{-k})} is increasing toward {L(0+)=1}, which implies {L_{k+1}\le (1+\epsilon) L_k} for some {k}. Pick an interval {[a, b]\subset I} that realizes {L_k}, that is {b-a\ge 4^{-k}} and {|f(b)-f(a)| = 4^{-k}L_k}. Without loss of generality {f(b)>f(a)} (otherwise consider {-f}). Let {J = [(3a+b)/4, (a+3b)/4]} be the middle half of {[a. b]}. Since each point of {J} is within distance {\ge 4^{-k-1}} of both {a} and {b}, it follows that {\displaystyle f(b) + L_{k+1}(x-b) \le f(x) \le f(a) + L_{k+1}(x-a) } for all {x \in J}.

proof
The green lines have slope L_{k+1} which is close to the slope L_k of the secant line through (a, f(a)) and (b, f(b)). The graph of f is pinched between these green lines, except near a or b.

So far we have pinched {f} between two affine functions of equal slope. Let us consider their difference:
{\displaystyle (f(a) + L_{k+1}(x-a)) - (f(b) + L_{k+1}(x-b)) = (L_{k+1}-L_k) (b-a)}. Recall that {L_{k+1}\le (1+\epsilon) L_k}, which gives a bound of {\epsilon L_k(b-a) \le 2\epsilon L |J|} for the difference. Approximating {f} by the average of the two affine functions we conclude that {NL(f;J)\le \epsilon L} as required.

It remains to consider the size of {J}, about which we only know {|J|\ge 4^{-k}/2} so far. Naturally, we want to take the smallest {k} such that {L_{k+1}\le (1+\epsilon) L_k} holds. Let {m} be this value; then {L_m > (1+\epsilon)^{m} L_0}. Here {L_m\le 1} and {L_0 = L(1)> \epsilon/3 }. The conclusion is that {(1+\epsilon)^m < 3/\epsilon}, hence {m< \log(3/\epsilon)/\log(1+\epsilon)}. This finally yields {\displaystyle \delta = 4^{-\log(3/\epsilon)/\log(1+\epsilon)}/2} as an acceptable choice, completing the proof of Theorem AA.

A large amount of work has been done on quantifying {\delta} in various contexts; for example Heat flow and quantitative differentiation by Hytönen and Naor.

2019 Formula One season

This is now a separate post from Graph theory in Formula 1 so that the evolution of the graph of 1-2 finishes can be tracked. The graphs are shown as they were after the race mentioned in the subheading.

Australia

year2019australia

Obviously, there is only one edge after the first race of the season, a Mercedes 1-2. This turned out to be the beginning of a series of five 1-2 for Mercedes, so the graph did not change again until Monaco.

Monaco year2019

At Monaco, Mercedes drivers took “only” the first and third place, as Vettel appeared in top 2.

Austria

year2019aus

It began with the youngest ever front row of the F1 grid: Leclerc and Verstappen. And ended with the youngest ever 1-2 finish (represented by an edge here) in Formula One: Verstappen and Leclerc. For the moment, the graph is disconnected.

Two predictions: (1) the components will get connected; (2) the graph will stay with 5 vertices, setting a new record for the fewest number of vertices. Which is a way of saying, I don’t expect either Gasly or anyone outside of top 3 teams to finish in top two for the rest of the season.

Graph theory in Formula 1

This post attempts to visualize Formula 1 championships (1985-2018) by way of graphs: the outcome of each race is represented by an edge between the drivers who finished #1 and #2. The graph is undirected (no distinction between the winner and 2nd place is made), and simple (no record of multiple edges is kept). This erases some of the information, but depending on how much you care about F1, the graphs may still be enough to bring back some memories.

All graph-theoretical “records” are based on 1985-2018 data only, 2019 season being the subject of a separate post. Some highlights:

  • Most vertices: 12 in 1997
  • Fewest vertices: 5 in 2000 and 2011
  • Most edges: 16 in 2012
  • Fewest edges: 6 in 1988, 2002, 2011, and 2015
  • Largest maximal degree: 6 in 1990, 1997, 2004, and 2012
  • Smallest maximal degree: 3 in 1996
  • Largest minimal degree: 2 in 1989, 2016, and 2018
  • Largest diameter: 6 in 2009
  • Smallest diameter: 2 in 1993, 2000, 2001, 2002, 2007, 2011, and 2016
  • Disconnected: 1985, 1991, 1996, 1998, 1999, 2006, and 2008
  • Isomorphic seasons: 1991 and 1998
  • Hamiltonian cycle: 2016 and 2018
  • Triangle-free: none (hence no trees and no bipartite graphs)

Appropriately, both Hamiltonian cycles include Hamilton.

1985 season

 

year1985

This was the year of Senna’s first race victory, but the championship went to Prost, who shared maximal vertex degree (4) with Rosberg (Keke Rosberg, of course, not his son Nico Rosberg). This is also one of the few seasons with a disconnected graph. A small connected component, such as Angelis-Boutsen here, likely indicates something weird… in this case, the 1985 San Marino Grand Prix at Imola where Senna ran out of fuel and Prost was disqualified.

1986 season

year1986

Prost won again, this time with vertex degree 5.

1987 season

year1987

The four-way battle between Mansell, Piquet, Prost, and Senna fell just short of creating a complete subgraph on four vertices. Their best chance of creating {K_4} was at Detroit, where Senna won and Prost was 3rd. Piquet won the championship.

1988 season

year1988

The graph is smaller than the previous ones, but is actually larger than one would expect, considering that Senna and Prost combined for 15 wins in 16 races. Berger extended this graph by his win at Monza, in the season otherwise dominated by McLaren. The graph also suggests that Prost should win the championship, and he would have if the champion was determined by the total of all points earned as it is now. But only the best 11 results counted then, and Senna won by that metric.

1989 season

year1989

Again just an edge short of {K_4} subgraph, but this time it was not a four-way battle at all. Berger only finished 3 races (but in top two every time). Senna and Mansell also had too many retirements to challenge Prost for the championship. This is the first time we see a graph with no vertices of degree 1. But there is no Hamiltonian cycle here.

1990 season

year1990

The first time we see a degree of vertex 6, and the second time Senna is the champion.

1991 season

year1991

Another disconnected graph, with Piquet scoring his last career victory in Canada under strange circumstances: Mansell’s car stopped on the last lap when he led by almost a minute and was already waving to the crowd.

If such a mishap also happened at Silverstone, where Mansell, Berger, and Prost finished 1-2-3, we would have {K_4} as a subgraph. Senna won the championship for the last time.

1992 season

year1992

Sorry about Schumacher’s name being cut off… this was the year of his first race win, at Spa-Francorchamps. Meanwhile, Mansell utterly dominated the championship.

1993 season

year1993

The first time we get a graph of diameter 2. It suggests Hill was the winner, but in reality he finished third in the championship, with Prost winning for the last time in his career.

1994 season

year1994

The year of Senna’s death; he does not appear on the graph. Hill has the vertex degree of 5, but Schumacher won the championship by 1 point after their controversial collision at Adelaide.

1995 season

year1995

That’s pretty close to the wheel graph on six vertices – the only missing edge is Häkkinen-Coulthard. They would score a lot of 1-2 finishes for McLaren in the years to come, but at this time they were not teammates yet. At the center of the incomplete wheel, Schumacher won the championship by a wide margin.

1996 season

year1996

Another small component, another highly unusual race: wet Monaco Grand Prix, where only three cars made it to the finish and Panis scored the only victory of his career.

Hill won the championship in which no driver had vertex degree greater than 3, the only such season in our record.

1997 season

year1997

This season holds the record for the number of vertices (12). Two vertices have degree 6 (Villeneuve and Schumacher) but surprisingly, there is no edge between them. Although one of them was on the podium in every race except Italy, they were never on the podium together. Their infamous collision in the season finale at Jerez led to Schumacher being disqualified from the championship.

Villeneuve became the last non-European F1 champion to date.

1998 season

year1998

The small component is due to Carmageddon on the first lap of very wet Belgian Grand Prix.

This is where my decision to include only driver’s last names backfires: Ralf Schumacher gets to keep his initial. In other news, Williams suddenly faded from the picture and McLaren re-emerged with Häkkinen and Coulthard finishing 1-2 in five races. Häkkinen won the championship.

The seasons 1991 and 1998 is the only pair of isomorphic graphs in this collection. An isomorphism maps Schumacher and Häkkinen to Senna and Mansell.

1999 season

year1999

The small component is contributed by the partially wet Nürburgring race, where multiple retirements among the leaders left Herbert to score his last Grand Prix victory.

Schumacher’s injury at Silverstone took him out of contention. Still, the second championship of Häkkinen was a lot closer than the first one: he won by 2 points over Irvine.

2000 season

year2000

Finally, we get a complete subgraph on four vertices: the Ferrari and McLaren drivers. The sole appearance of a driver outside of these two teams was at Brazilian Grand Prix, where Fisichella finished 3rd but was promoted to 2nd after Coulthard’s disqualification. If not for this incident, we would have a regular graph in this collection, a rather unlikely event. Even so, this season set the record for fewest vertices (5). A closely fought championship ended with Schumacher collecting his third title.

2001 season

year2001

This was not close at all: the driver at the center of this diameter 2 graph won with a lot of room to spare.

2002 season

year2002

Another season of diameter 2. Schumacher finished every race in top two, except for the Malaysian Grand Prix, narrowly missing an opportunity to create a tree (a star graph). This season ties the fewest edges record (6) which was set in 1998.

2003 season

year2003

More vertices and larger diameter indicates a more interesting season. Schumacher won again, but by mere 2 points over Räikkönen.

2004 season

year2004

The final season of Schumacher/Ferrari dominance, in which Schumacher won 13 races and achieved the vertex degree of 6.

2005 season

year2005

This looks like it was between Alonso and Räikkönen – and it was, with Alonso becoming the youngest F1 champion yet.

2006 season

year2006

Button’s first career win (wet Hungarian Grand Prix) created the small component.

The large component has diameter 2, with Alonso (the champion) in its center. This is also the last graph in which Schumacher appears.

2007 season

year2007

As in 2000, Ferrari and McLaren combine to form a complete subgraph on four vertices. But this championship fight was as close as one could imagine, with three drivers finishing within one point: Räikkönen 110, Hamilton 109, Alonso 109. And this was Hamilton’s first season in F1.

2008 season

year2008

For the first time, we have a small component with more than two vertices. Kovalainen’s only F1 victory came in Hungary, where Glock took second place. More notable was Vettel’s first victory, which came in Monza and made him the youngest driver to win a F1 race [up to that time]. Even more notably, Hamilton won the championship by one point, at the end of the final lap of the final race, and became the youngest F1 champion at that time. Here is the Glock’s view of the action, his car slip-sliding on dry-weather tyres.

On the graph, “Jr.” is Piquet Jr. who took second place in Germany but his brief stint in Formula 1 would be remembered for an entirely different reason.

 

2009 season

year2009

The graph of largest diameter (6) captures a strange season after major rule changes. It is so close to being a complex tree, but the 3-cycle was completed at Istanbul, where the polesitter Vettel lost the lead on the first lap and then fell behind his Red Bull teammate Webber as well, finishing just 0.7 seconds behind in the 3rd place. If Vettel was first or second in Turkey, we would have a tree. Button won the championship on the strength of the first half of the season.

2010 season

year2010

The third time we see a {K_4} subgraph, but the first time that it involves more than two teams: the vertices come from Red Bull (Vettel and Webber), McLaren (Hamilton), and Ferrari (Alonso). Although Vettel’s vertex degree is only 3, trailing Hamilton’s 4 and Alonso’s 5, he became the youngest F1 champion in history, a record he still holds.

2011 season

year2011

The season tied 2000 for the fewest vertices, with 5. The fewest edges record (6) is tied as well: it was McLaren in 1988 and Ferrari in 2002; this time it is Red Bull’s turn. Vettel won the championship by 122 points but it’s not all in the car; his teammate Webber finished only third.

2012 season

year2012

With 16 edges, this season beat the previous record set by 1997 season, even though there are fewer vertices here. The two degree-6 vertices led the way in the championship, with Vettel beating Alonso by 3 points. Was this the last great season to watch?

2013 season

year2013

Vettel over Alonso again, but by 155 points this time. This was the last season of V8 engines, and last season of Red Bull domination. Hamilton appears on the graph only because of his victory in Hungary, after which Vettel won the remaining 9 races. The season opener turned out to be the last race [at the time of writing] won by someone not driving Mercedes, Ferrari, or Red Bull:

2014 season

year2014

The beginning of a new era: V6 hybrid engine, Mercedes, and Hamilton. Also the last time we see a McLaren driver (Magnussen) on the graph: he appears because of the 2nd place in the dramatic season opener.

In a brief moment of Williams resurgence, Bottas took 2nd place in Britain and Germany, forming a cycle with the Mercedes drivers. If not for him, we would have a tree.

2015 season

year2015

Another 6-edge graph, another season without much competition. Vettel was the only driver to challenge Mercedes on occasions, thus contributing a cycle to the graph. The entire graph is formed by Mercedes, Ferrari, and Red Bull. Hamilton won the championship again.

2016 season

year2016

The first time we get a Hamiltonian cycle, for example: Hamilton, Vettel, Rosberg, Räikkönen, Verstappen, Ricciardo, and back to Hamilton. Another 6-vertex graph formed by Mercedes, Ferrari, and Red Bull exclusively. Among them, Mercedes and Red Bull drivers form a complete subgraph. With Ferrari fading to third, neither Vettel nor Räikkönen had enough success to extend {K_4} to {K_5} and thus create the first non-planar season. We would have {K_5} if (a) Räikkönen overtook Verstappen in Austria (he was 0.3s behind), after Hamilton and Rosberg collided on the last lap:

and (b) Räikkönen finished 2nd instead of the 4th in Malaysia, where Hamilton’s engine went up in smoke, costing him the championship.

As it happened, we did not get {K_5} and Hamilton did not get the championship, which went to Rosberg instead. But Verstappen got his first victory at Barcelona and still remains the youngest driver ever to win an F1 race.

2017 season

year2017

Once again, it is all about Mercedes, Ferrari, and Red Bull, with the Mercedes drivers enjoying higher vertex degree. But this time Ferrari drivers are connected by an edge. The last 1-2 finish of Ferrari to date was in Hungary, arguably their high point of the season.

It was all about Hamilton the rest of the season.

2018 season

year2018

Second time a Hamiltonian cycle appears, for example: Hamilton, Räikkönen, Verstappen, Vettel, Ricciardo, Bottas, and back to Hamilton. Fourth year in a row that only Mercedes, Ferrari, and Red Bull drivers appear on the graph. Second year in a row that Hamilton wins, and his fifth time overall.

 

Laplacian spectrum of small graphs

This is a collection of entirely unoriginal remarks about Laplacian spectrum of graphs. For an accessible overview of the subject I recommend the M.S. thesis The Laplacian Spectrum of Graphs by Michael William Newman. It also includes a large table of graphs with their spectra. Here I will avoid introducing matrices and enumerating vertices.

Let {V} be the vertex set of a graph. Write {u\sim v} if {u, v} are adjacent vertices. Given a function {f\colon V\to \mathbb R}, define {L f(v) = \sum_{u\colon u\sim v}(f(v)-f(u))}.
This is a linear operator (the graph Laplacian) on the Euclidean space {\ell^2(V)} of all functions {f\colon V\to \mathbb R} with the norm {\|f\|^2 = \sum_{v\in V} f(v)^2}. It is symmetric: {\langle L f, g\rangle = \langle f, L g\rangle } and positive semidefinite: {\langle L f, f\rangle = \frac12 \sum_{u\sim v}(f(u)-f(v))^2\ge 0}. Since equality is attained for constant {f}, 0 is always an eigenvalue of {L}.

This is the standard setup, but I prefer to change things a little and replace {\ell^2(V)} by the smaller space {\ell^2_0(V)} of functions with zero mean: {\sum_{v\in V}f(v)=0}. Indeed, {L} maps {\ell^2(V)} to {\ell^2_0(V)} anyway, and since it kills the constants, it makes sense to focus on {\ell^2_0(V)}. It is a vector space of dimension {n-1} where {n=|V|}.

One advantage is that the smallest eigenvalue is 0 if and only if the graph is disconnected: indeed, {\langle L f, f\rangle=0} is equivalent to {f} being constant on each connected component. We also gain better symmetry between {L} and the Laplacian of the graph complement, denoted {L'}. Indeed, since {L' f(v) = \sum_{u\colon u\not \sim v}(f(v)-f(u))}, it follows that {(L+L')f(v) = \sum_{u\colon u\ne v} (f(v)-f(u)) = n f(v)} for every {f\in \ell^2_0(V)}. So, the identity {L+L' = nI} holds on {\ell^2_0(V)} (it does not hold on {\ell^2(V)}). Hence the eigenvalues of {L'} are obtained by subtracting the eigenvalues of {L} from {n}. As a corollary, the largest eigenvalue of {L} is at most {n}, with equality if and only if the graph complement is disconnected. More precisely, the multiplicity of eigenvalue {n} is one less than the number of connected components of the graph complement.

Let {D} denote the diameter of the graph. Then the number of distinct Laplacian eigenvalues is at least {D}. Indeed, let {u, v} be two vertices at distance {D} from each other. Define {f_0(u) = 1} and {f_0=0} elsewhere. Also let {f_k=L^k f_0} for {k=1, 2, \dots}. Note that {f_k\in \ell_0^2(V)} for all {k\ge 1}. One can prove by induction that {f_k(w)=0} when the distance from {w} to {u} is greater than {k}, and {(-1)^k f_k(w) > 0} when the distance from {w} to {u} is equal to {k}. In particular, {f_k(v) = 0} when {k<D} and {f_D(v)\ne 0}. This shows that {f_D} is not a linear combination of {f_1, \dots, f_{D-1}}. Since {f_k = L^{k-1}f_1}, it follows that {L^{D-1}} is not a linear combination of {L^0, L^1, \dots, L^{D-2}}. Hence the minimal polynomial of {L} has degree at least {D}, which implies the claim.

Let’s consider a few examples of connected graphs.

3 vertices

There are two connected graphs: the 3-path (D=2) and the 3-cycle (D=1). In both cases we get D distinct eigenvalues. The spectra are [1, 3] and [3, 3], respectively.

4 vertices

  • One graph of diameter 3, the path. Its spectrum is {[2-\sqrt{2}, 2, 2+\sqrt{2}]}.
  • One graph of diameter 1, the complete graph. Its spectrum is {[4, 4, 4]}. This pattern continues for other complete graphs: since the complement is the empty graph ({n} components), all {n-1} eigenvalues are equal to {n}.
  • Four graphs of diameter 2, which are shown below, with each caption being the spectrum.
4-0
1, 1, 4
4-2
1, 3, 4
4-3
2, 2, 4
4-4
2, 4, 4

Remarks:

  • The graph [1, 3, 4] has more distinct eigenvalues than its diameter.
  • The graph [2, 2, 4] is regular (all vertices have the same degree).
  • The smallest eigenvalue of graphs [1, 1, 4] and [2, 2, 4] is multiple, due to the graphs having a large group of automorphisms (here rotations); applying some of these automorphisms to an eigenfunctions for the smallest eigenvalue yields another eigenfunction.
  • [1, 3, 4] and [2, 4, 4] also have automorphisms, but their automorphisms preserve the eigenfunction for the lowest eigenvalue, up to a constant factor.

5 vertices

  • One graph of diameter 4, the path. Its spectrum is related to the golden ratio: it consists of {(3\pm \sqrt{5})/2, (5\pm \sqrt{5})/2}.
  • One graph of diameter 1, the complete one: [5, 5, 5, 5]
  • Five graphs of diameter 3. All have connected complement, with the highest eigenvalue strictly between 4 and 5. None are regular. Each has 4 distinct eigenvalues.
  • 14 graphs of diameter 2. Some of these are noted below.

Two have connected complement, so their eigenvalues are less than 5 (spectrum shown on hover):

One has both integers and non-integers in its spectrum, the smallest such graph. Its eigenvalues are {3\pm \sqrt{2}, 3, 5}.

5-15
1.585786, 3, 4.414214, 5

Two have eigenvalues of multiplicity 3, indicating a high degree of symmetry (spectrum shown on hover).

Two have all eigenvalues integer and distinct:

 

The 5-cycle and the complete graph are the only regular graphs on 5 vertices.

6 vertices

This is where we first encounter isospectral graphs: the Laplacian spectrum cannot tell them apart.

Both of these have spectrum {3\pm \sqrt{5}, 2, 3, 3} but they are obviously non-isomorphic (consider the vertex degrees):

Both have these have spectrum {3\pm \sqrt{5}, 3, 3, 4} and are non-isomorphic.

Indeed, the second pair is obtained from the first by taking graph complement.

Also notable are regular graphs on 6 vertices, all of which have integer spectrum.

6-46
1, 1, 3, 3, 4
6-71
3, 3, 3, 3, 6
6-98
2, 3, 3, 5, 5
6-108
4, 4, 4, 6, 6
6-111
6, 6, 6, 6, 6

Here [3, 3, 3, 3, 6] (complete bipartite) and [2, 3, 3, 5, 5] (prism) are both regular of degree 3, but the spectrum allows us to tell them apart.

The prism is the smallest regular graph for which the first eigenvalue is a simple one. It has plenty of automorphisms, but the relevant eigenfunction (1 on one face of the prism, -1 on the other face) is compatible with all of them.

7 vertices

There are four regular graphs on 7 vertices. Two of them are by now familiar: 7-cycle and complete graph. Here are the other two, both regular of degree 4 but with different spectra.

7-720
3, 4, 4, 5, 5, 7
7-832
3.198, 3.198, 4.555, 4.555, 6.247, 6.247

There are lots of isospectral pairs of graphs on 7 vertices, so I will list only the isospectral triples, of which there are five.

Spectrum 0.676596, 2, 3, 3.642074, 5, 5.681331:

Spectrum 0.726927, 2, 3.140435, 4, 4, 6.132637:

Spectrum 0.867363, 3, 3, 3.859565, 5, 6.273073:

Spectrum 1.318669, 2, 3.357926, 4, 5, 6.323404:

All of the triples mentioned so far have connected complement: for example, taking the complement of the triple with the spectrum [0.676596, 2, 3, 3.642074, 5, 5.681331] turns it into the triple with the spectrum [1.318669, 2, 3.357926, 4, 5, 6.323404].

Last but not least, an isospectral triple with an integer spectrum: 3, 4, 4, 6, 6, 7. This one has no counterpart since the complement of each of these graphs is disconnected.

8 vertices

Regular graphs, excluding the cycle (spectrum 0.585786, 0.585786, 2, 2, 3.414214, 3.414214, 4) and the complete one.

Degree 3 regular:

8-4326
2, 2, 2, 4, 4, 4, 6
8-6409
0.763932, 2, 4, 4, 4, 4, 5.236068
8-6579
1.438447, 2.381966, 2.381966, 3, 4.618034, 4.618034
8-6716
1.267949, 2, 2.585786, 4, 4, 4.732051,
5.414214
8-8725
2, 2, 2.585786, 2.585786, 4, 5.414214,
5.414214

Degree 4 regular

8-4575
4, 4, 4, 4, 4, 4, 8
8-9570
2.763932, 4, 4, 4, 4, 6, 7.236068
8-10188
2.438447, 3.381966, 3.381966, 5, 5.618034, 5.618034, 6.561553
8-10202
2.585786, 3.267949, 4, 4, 5.414214, 6, 6.732051
8-10819
2.585786, 2.585786, 4, 5.414214, 5.414214, 6, 6
8-10888
2, 4, 4, 4, 6, 6, 6

Degree 5 regular

8-10481
4.381966, 4.381966, 5, 5, 6.618034, 6.618034, 8
8-10975
4, 4, 6, 6, 6, 6, 8
8-11082
4, 4.585786, 4.585786, 6, 6, 7.414214, 7.414214

Degree 6 regular

8-11112
6, 6, 6, 6, 8, 8, 8

Credits: the list of graphs by Brendan McKay and NetworkX library specifically the methods read_graph6 (to read the files provided by Prof. McKay), laplacian_spectrum, diameter, degree, and draw.

Extremal Taylor polynomials

Suppose {f(z)=a_0+a_1z+a_2z^2+\cdots} is a holomorphic function in the unit disk {|z|<1} such that {|f|\le 1} in the disk. How large can its Taylor polynomial {T_n(z)=a_0+a_1z+\cdots +a_n z^n} be in the disk?

We should not expect {T_n} to be bounded by 1 as well. Indeed, the Möbius transformation {f(z)=(z+1/2)/(1+z/2)} has Taylor expansion {(z+1/2)(1-z/2+O(z^2)) = 1/2 + (3/4)z + O(z^2)}, so {T_1(1)=5/4} in this case. This turns out to be the worst case: in general {T_1} is bounded by 5/4 in the disk.

For the second-degree polynomial {T_2} the sharp bound is {89/64}, attained when {f(z) = (8z^2 + 4z + 3)/(3z^2 + 4z + 8)}; the image of the unit circle under the extremal {T_2} is shown below. Clearly, there is something nontrivial going on.

T2
Extremal T_2 attains 89/64 > 1.39

Edmund Landau established the sharp bound for {|T_n|} in his paper Abschätzung der Koeffizientensumme einer Potenzreihe, published in Archiv der Mathematik und Physik (3) 21 in 1913. Confusingly, there are two papers with the same title in the same issue of the journal: one on pages 42-50, the other on pages 250-255, and they appear in different volumes of Landau’s Collected Works. The sharp bound is in the second paper.

First steps

By rotation, it suffices to bound {|T_n(1)|}, which is {|a_0+\cdots +a_n|}. As is often done, we rescale {f} a bit so that it’s holomorphic in a slightly larger disk, enabling the use of the Cauchy integral formula on the unit circle {\mathbb T}. The Cauchy formula says {2\pi i a_k = \int_{\mathbb T} z^{-k-1} f(z) \,dz}. Hence

{\displaystyle 2\pi |T_n(1)| = \left| \int_{\mathbb T} z^{-n-1}(1+z+\dots+z^n) f(z) \,dz \right|}

It is natural to use {|f(z)|\le 1} now, which leads to

{\displaystyle 2\pi |T_n(1)| \le \int_{\mathbb T} |1+z+\dots+z^n|\, |dz| }

Here we can use the geometric sum formula and try to estimate the integral of {|(1-z^{n+1})/(1-z)|} on the unit circle. This is what Landau does in the first of two papers; the result is {O(\log n)} which is the correct rate of growth (this is essentially the Dirichlet kernel estimate from the theory of Fourier series). But there is a way to do better and get the sharp bound.

Key ideas

First idea: the factor {1+z+\dots+z^n} could be replaced by any polynomial {Q} as long as the coefficients of powers up to {n} stay the same. Higher powers contribute nothing to the integral that evaluates {T_n(1)}, but they might reduce the integral of {|Q|}.

Second idea: we should choose {Q} to be the square of some polynomial, {Q=P^2}, because {(2\pi)^{-1}\int_{\mathbb T} |P(z)|^2\, |dz|} can be computed exactly: it is just the sum of squares of the coefficients of {P}, by Parseval’s formula.

Implementation

Since {1+z+\dots+z^n} is the {n}-th degree Taylor polynomial of {(1-z)^{-1}}, it is natural to choose {P} to be the {n}-th degree Taylor polynomial of {(1-z)^{-1/2}}. Indeed, if {P_n(z) = (1-z)^{-1/2} + O(z^{n+1})}, then {P_n(z)^2 = (1-z)^{-1} + O(z^{n+1}) = 1+z+\dots+z^n + O(z^{n+1})} as desired (asymptotics as {z\to 0}). The binomial formula tells us that
{\displaystyle P_n(z)=\sum_{k=0}^n (-1)^k\binom{-1/2}{k}z^k }

The coefficient of {z^k} here can be written out as {(2k-1)!!/(2k)!!} or rewritten as {4^{-k}\binom{2k}{k}} which shows that in lowest terms, its denominator is a power of 2. To summarize, {|T_n(1)|} is bounded by the sum of squares of the coefficients of {P_n}. Such sums are referred to as the Landau constants,

{\displaystyle G_n = 1+ \left(\frac{1}{2}\right)^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 + \cdots + \left(\frac{(2n-1)!!}{(2n)!!}\right)^2 }

A number of asymptotic and non-asymptotic formulas have been derived for {G_n}, for example Brutman (1982) shows that {G_n - (1/\pi)\log(n+1)} is between 1 and 1.0663.

Sharpness

To demonstrate the sharpness of the bound {|T_n|\le G_n}, we want {|f|\equiv 1} and {P_n(z)^2f(z)/z^n\ge 0} on the unit circle. Both are arranged by taking {f(z) = z^n P_n(1/z) / P_n(z)} which is a Blaschke product of degree {n}. Note that the term {P_n(1/z)} can also be written as {\overline{P_n(1/\bar z)}}. Hence {P_n(z)^2f(z)/z^n = P_n(z) \overline{P_n(1/\bar z)}} which is simply {|P_n(z)|^2} when {|z|=1}. Equality holds in all the estimates above, so they are sharp.

Here are the images of the unit circle under extremal Taylor polynomials {T_5} and {T_{20}}.

T5
Extremal Taylor polynomial of 5th degree
T20
Extremal Taylor polynomial of 20th degree

These polynomials attain large values only on a short subarc of the circle; most of the time they oscillate at levels less than 1. Indeed, the mean value of {|T_n|^2} cannot exceed the mean of {|f|^2} which is at most 1. Here is the plot of the roots of extremal {T_n}:  they are nearly uniform around the circle, except for a gap near 1.

Troots10
Roots of extremail T_10
Troots20
Roots of extremal T_20

But we are not done…

Wait a moment. Does {f(z) = z^n P_n(1/z) / P_n(z)} define a holomorphic function in the unit disk? We are dividing by {P_n} here. Fortunately, {P_n} has no zeros in the unit disk, because its coefficients are positive and decreasing as the exponent {k} increases. Indeed, if {p(z)=c_0+c_1z+\cdots + c_nz^n} with {c_0>c_1>\dots>c_n > 0}, then {(1-z)p(z)} has constant term {c_0} and other coefficients {c_1-c_0}, {c_2-c_1}, … {c_n-c_{n-1}}, {-c_n}. Summing the absolute values of the coefficients of nonconstant terms we get {c_0}. So, when these coefficients are attached to {z^k} with {|z|<1}, the sum of nonconstant terms is strictly less than {c_0} in absolute value. This proves {P_n\ne 0} in the unit disk. Landau credits Adolf Hurwitz with this proof.

In fact, the zeros of {P_n} (Taylor polynomials of {(1-z)^{-1/2}}) lie just outside of the unit disk.

roots20
Zeros of P_20
roots50
Zeros of P_50

The zeros of the Blaschke products formed from {P_n} are the reciprocals of the zeros of  {P_n}, so they lie just inside the unit circle, much like the zeros of {T_n} (though they are different).

The distribution of pow(2, n, n)

For a positive integer {n} let {f(n)} be the remainder of the division of {2^n} by {n}. This is conveniently computed in Python as pow(2, n, n). For example, the first 20 values are returned by

[pow(2, n, n) for n in range(1, 21)]

and they are:

[0, 0, 2, 0, 2, 4, 2, 0, 8, 4, 2, 4, 2, 4, 8, 0, 2, 10, 2, 16]

Obviously, {f(n)=0} if and only if {n} is a power of {2}. It also looks like {f} is always even, with powers of 2 dominating the list of its values. But {f} does take on odd values, although this does not happen often: only 6 times in the first 100 integers.

[(n, pow(2, n, n)) for n in range(1, 101) if pow(2, n, n) % 2]

returns

[(25, 7), (45, 17), (55, 43), (91, 37), (95, 13), (99, 17)]

It is conjectured that the range of {f} consists of all nonnegative integers except 1. Here is a proof that {f(n)\ne 1}, provided by Max Alexeyev on the OEIS page for A036236

Suppose {2^n \equiv 1 \bmod n} for some {n>1}. Let {p} be the smallest prime divisor of {n}. Then {2^n\equiv 1 \bmod p}. This means that the order of {2} in the multiplicative group {(\mathbb Z/ p \mathbb Z)^*} is a divisor of {n}. But this group has {p-1} elements, and the only divisor of {n} that is smaller than {p} is {1}. Thus, the order of {2} is {1}, which means {2\equiv 1 \bmod p}, which is absurd.

The fact that the smallest {n} with {f(n) = 3} is 4700063497 deserves to be mentioned here. But I would like to consider the most frequent values of {f} instead. Experimentally, they are found like this:

from collections import Counter
freq = Counter(pow(2, n, n) for n in range(1, 100000000 + 1))
print(freq.most_common(20))

which prints 20 pairs (value, frequency):

(2, 5763518),
(4, 3004047),
(8, 2054167), 
(16, 1569725), 
(64, 1076293), 
(256, 824438), 
(512, 737450), 
(1024, 668970), 
(32, 638063), 
(4096, 569705), 
(128, 466015), 
(65536, 435306), 
(262144, 389305), 
(1048576, 351723), 
(2097152, 326926), 
(16384, 246533), 
(16777216, 243841), 
(32768, 232037), 
(2048, 153537), 
(8192, 131614)

These are all powers of 2… but not exactly in the order one might expect. I repeated the experiment for the first {10^8k} integers with {k=1, 2, 3, 4, 5, 6}. The frequency order remained stable at the top. These are the most common 11 values, with their frequency (in %) listed for the aforementioned six ranges.

2, [5.8, 5.5, 5.4, 5.3, 5.3, 5.2]
4, [3.0, 2.9, 2.8, 2.8, 2.7, 2.7]
8, [2.1, 2.0, 1.9, 1.9, 1.9, 1.8]
16, [1.6, 1.5, 1.5, 1.4, 1.4, 1.4]
64, [1.1, 1.0, 1.0, 1.0, 1.0, 1.0]
256, [0.8, 0.8, 0.8, 0.8, 0.7, 0.7]
512, [0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
1024, [0.7, 0.6, 0.6, 0.6, 0.6, 0.6]
32, [0.6, 0.6, 0.6, 0.6, 0.6, 0.6]
4096, [0.6, 0.5, 0.5, 0.5, 0.5, 0.5]
128, [0.5, 0.4, 0.4, 0.4, 0.4, 0.4]

It seems that 32 and 128 are consistently less common than one might expect.

The most common value, 2, is contributed by primes and pseudoprimes to base 2. The value of 4 appears when {n = 2p} with {p} prime (but not only then). Still, the primes having zero density makes it difficult to explain the frequency pattern based on primality. A more convincing reason could be: when {n=2k} is even, we are computing {4^k \bmod 2k} and that is likely to be a power of {4}. This boosts the frequency of the even (generally, composite) powers of 2 in the sequence.

Magic angles

The function {u(x, y, z) = 2z^2 - x^2 - y^2} is notable for the following combination of properties:

  1. It is a harmonic function: {\Delta u = -2 - 2 + 4 = 0}
  2. It vanishes at the origin {(0, 0, 0)} together with its gradient.
  3. It is positive on the cone {C=\{(x, y, z) : z > \sqrt{(x^2+y^2)/2}\}}

The cone C has the opening angle {\theta_3 = \cos^{-1}(1/\sqrt{3}) \approx 57.4^\circ} which is known as the magic angle in the context of NMR spectroscopy. Let us consider the mathematical side of its magic.

If C is replaced by any larger cone, the properties 1-2-3 cannot be satisfied by a harmonic function in a neighborhood of the origin. That is, C is the largest cone such that a harmonic function can have a critical point at its vertex which is also a point of its extremum on the cone. Why is that?

Let {u} be a harmonic function in some neighborhood of {(0,0,0)} and suppose {u(0,0,0)=0}, {\nabla u(0,0,0) = 0}, and {u>0} on some cone {C = \{(x, y, z) \colon z>c \sqrt{x^2+y^2+z^2} \}}. Expand {u} into a sum of polynomials {p_2+p_3+\cdots } where each {p_k} is a harmonic homogeneous polynomial of degree {k}. Let {m} be the smallest integer such that {p_m} is not identically zero. Then {p_m} has the same properties as {u} itself, since it dominates the other terms near the origin. We may as well replace {u} by {p_m}: that is, {u} is a spherical harmonic from now on.

Rotating {u} around the {z}-axis preserves all the properties of interest: harmonic, positive on the cone, zero gradient at the origin. Averaging over all these rotations we get a rotationally symmetric function known as a zonal spherical harmonic. Up to a constant factor, such a function is given by {P_m(\cos \phi)} where {\phi} is a spherical coordinate (angle with the {z}-axis) and {P_m} is the Legendre polynomial of degree {m}.

The positivity condition requires {P_m(t) > 0} for {t>c}. In other words, the bound on {\theta} comes from the greatest zero of the Legendre polynomial. As is true for orthogonal polynomials in general, the zeros are interlaced: that is, the zeros of {P_m=0} appear strictly between any two consecutive zeros of {P_{m+1}}. It follows that the value of the greatest zero grows with {m}. Thus, it is smallest when {m=2}. Since {P_2(t) = (3t^2-1)/2}, the zero of interest is {1/\sqrt{3}}, and we conclude that {c \ge  1/\sqrt{3}}. Hence the magic angle.

cone
A piece of magic cone

The magic angle is easy to visualize: it is formed by an edge of a cube and its space diagonal. So, the magic cone with vertex at (0,0,0) is the one that passes through (1, 1, 1), as shown above.

In other dimensions the zonal harmonics are expressed in terms of Gegenbauer polynomials (which reduce to Chebyshev polynomials in dimensions 2 and 4). The above argument applies to them just as well. The relevant Gegenbauer polynomial of degree {2} is {nx^2-1} up to a constant. Thus, in {\mathbb R^n} the magic angle is {\cos^{-1}(1/\sqrt{n})}, illustrated by the harmonic function {u(x)=nx_n^2 - |x|^2}.

This analysis is reminiscent of the Hopf Lemma which asserts that if a positive harmonic function in a smooth domain has boundary value 0 at some point, the normal derivative cannot vanish at that point. The smoothness requirement is typically expressed as the interior ball condition: one can touch every boundary point by a ball contained in the domain. The consideration of magic angles shows that if the function is also harmonic in a larger domain, the interior ball condition can be replaced by the interior cone condition, provided that the opening angle of the cone is greater than {\cos^{-1}(1/\sqrt{n})}.