Zeros of Taylor polynomials of (1+z)^p

This post is related to Extremal Taylor polynomials where it was important to observe that the Taylor polynomials of the function {(1+z)^{-1/2}} do not have zeros in the unit disk. Let’s see how far this generalizes.

The function {f(z)=(1+z)^{-1}} has the rare property that all zeros of its Taylor polynomials have unit modulus. This is clear from

{\displaystyle T_n(z) = \sum_{k=0}^n (-z)^k = (1-(-z)^{n+1})/(1+z)}.

Figure: The Taylor zeros of (1+z)^(-1)

In this and subsequent illustrations, the zeros of the first 50 Taylor polynomials are shown as blue dots, with the unit circle in red for reference.
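
The closed form for {T_n} can be checked numerically. Below is a minimal sketch; the helper names `taylor_coeffs` and `evaluate` are made up for this illustration and are not the code used to produce the plots. It builds the coefficients {\binom{p}{k}} exactly and confirms that for {p=-1} the points {-e^{2\pi i k/(n+1)}}, {k=1,\dots,n}, are zeros of {T_n}.

```python
import cmath
from fractions import Fraction

def taylor_coeffs(p, n):
    """Coefficients binom(p, k), k = 0..n, of the Taylor polynomial of (1+z)^p."""
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        # binom(p, k) = binom(p, k-1) * (p - k + 1) / k
        coeffs.append(coeffs[-1] * (Fraction(p) - k + 1) / k)
    return coeffs

def evaluate(coeffs, z):
    """Evaluate the polynomial at z by Horner's scheme (floating point)."""
    value = 0
    for c in reversed(coeffs):
        value = value * z + float(c)
    return value

n = 6
coeffs = taylor_coeffs(-1, n)  # alternating 1, -1, 1, ...
# Candidate zeros: z = -w, where w runs over the (n+1)-th roots of unity other than 1.
zeros = [-cmath.exp(2j * cmath.pi * k / (n + 1)) for k in range(1, n + 1)]
residuals = [abs(evaluate(coeffs, z)) for z in zeros]
print(max(residuals))  # should be at the level of rounding error
```

The same `taylor_coeffs` accepts rational exponents such as `Fraction(-4, 3)`, so it could serve as a starting point for reproducing the other pictures with any polynomial root finder.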

When the exponent is less than -1, the zeros move inside the unit disk and begin forming nice patterns in there.

Figure: (1+z)^(-4/3)
Figure: (1+z)^(-2)
Figure: (1+z)^(-3)
Figure: (1+z)^(-4)

When the exponent is strictly between -1 and 1, the zeros are all outside of the unit disk. Some of them get quite large, forcing a change of scale in the image.

Figure: (1+z)^(-2/3)
Figure: (1+z)^(-1/3)
Figure: (1+z)^(1/3)
Figure: (1+z)^(2/3)

Why does this happen when the exponent approaches 1? The function {1+z} is its own Taylor polynomial, and has the only zero at -1.  So, when {p\approx 1}, the Taylor polynomials are small perturbations of {1+z}. These perturbations of coefficients have to create additional zeros, but being small, they require a large value of {z} to help them.

For a specific example, the quadratic Taylor polynomial of {(1+z)^p} is {1 + pz + p(p-1)z^2/2}, with roots {(1\pm \sqrt{(2-p)/p})/(1-p) }. When {p\approx 1}, one of these roots is near {-1} (as it has to be) and the other is large.
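
To see the two roots separate, here is a small sketch that evaluates the closed form for {p} close to 1 and plugs the result back into the quadratic. The function names are hypothetical, and the formula is used only for {0<p<2}, {p\ne 1}, where the square root is real.

```python
import math

def quadratic_taylor_roots(p):
    """Roots (1 -+ sqrt((2-p)/p)) / (1-p) of 1 + p z + p(p-1) z^2 / 2."""
    s = math.sqrt((2 - p) / p)
    return (1 - s) / (1 - p), (1 + s) / (1 - p)

def quadratic_taylor(p, z):
    """Value of the quadratic Taylor polynomial of (1+z)^p at z."""
    return 1 + p * z + p * (p - 1) * z * z / 2

p = 0.99
near, far = quadratic_taylor_roots(p)
print(near, far)  # one root close to -1, the other of size about 2/(1-p)
print(quadratic_taylor(p, near), quadratic_taylor(p, far))  # both close to 0
```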

Finally, when {p>1} and is not an integer, we get zeros on both sides of the unit circle. The majority of them are still outside. A prominent example of an interior zero is {-1/p} produced by the first-degree polynomial {1 + pz}.

Figure: (1+z)^(4/3)
Figure: (1+z)^(7/3)
Figure: (1+z)^(10/3)

Another related post: Real zeros of sine Taylor polynomials.

Measuring the regularity of a graph by its Laplacian eigenvalues

Let {G} be a graph with vertices {1, 2, \dots, n}. The degree of vertex {i} is denoted {d_i}. Let {L} be the Laplacian matrix of {G}, so that {L_{ii}=d_i}, {L_{ij}} is {-1} when the vertices {i, j} are adjacent, and is {0} otherwise. The eigenvalues of {L} are written as {\lambda_1\le \dots \le \lambda_n}.

The graph is regular if all vertices have the same degree: {d_1=\cdots = d_n}. How can this property be seen from its Laplacian eigenvalues {\lambda_1, \dots, \lambda_n}?

Since the sum of eigenvalues is equal to the trace, we have {\sum \lambda_i = \sum d_i}. Moreover, {\sum \lambda_i^2} is the trace of {L^2}, which is equal to the sum of the squares of all entries of {L}. This sum is {\sum d_i^2 + \sum d_i} because the {i}th row of {L} contains one entry equal to {d_i} and {d_i} entries equal to {-1}. In conclusion, {\sum d_i^2 = \sum \lambda_i^2 - \sum\lambda_i}.

The Cauchy-Schwarz inequality says that {n\sum d_i^2 \ge \left(\sum d_i \right)^2} with equality if and only if all numbers {d_i} are equal, i.e., the graph is regular. In terms of eigenvalues, this means that the difference
{\displaystyle D =n\sum d_i^2 - \left(\sum d_i \right)^2 = n\sum (\lambda_i^2 - \lambda_i) - \left( \sum\lambda_i \right)^2 }
is always nonnegative, and is equal to zero precisely when the graph is regular. This is how one can see the regularity of a graph from its Laplacian spectrum.

As an aside, {D } is an even integer. Indeed, the sum {\sum d_i} is even because it double-counts the edges. Hence the number of vertices of odd degree is even, which implies that {\sum d_i^k } is even for every positive integer  {k }.
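
The identity for {D} is easy to test on small graphs. The sketch below avoids an eigenvalue solver: it uses {\sum\lambda_i = \mathrm{tr}\, L} and {\sum\lambda_i^2 = \mathrm{tr}\, L^2} directly, which is exactly how the spectral formula was derived. The function names and the edge-list input format are choices made for this illustration.

```python
def laplacian(n, edges):
    """Laplacian matrix (list of lists) of a simple graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] = L[j][i] = -1
        L[i][i] += 1
        L[j][j] += 1
    return L

def irregularity_from_degrees(L):
    """D = n * sum(d_i^2) - (sum d_i)^2, reading the degrees off the diagonal."""
    n = len(L)
    degrees = [L[i][i] for i in range(n)]
    return n * sum(d * d for d in degrees) - sum(degrees) ** 2

def irregularity_from_traces(L):
    """The same D via sum(lambda) = tr L and sum(lambda^2) = tr L^2."""
    n = len(L)
    trace = sum(L[i][i] for i in range(n))
    # L is symmetric, so tr(L^2) is the sum of squares of all entries.
    trace_sq = sum(x * x for row in L for x in row)
    return n * (trace_sq - trace) - trace ** 2

cycle4 = laplacian(4, [(0, 1), (1, 2), (2, 3), (3, 0)])  # regular
path3 = laplacian(3, [(0, 1), (1, 2)])                   # degrees 1, 2, 1
print(irregularity_from_traces(cycle4), irregularity_from_traces(path3))
```

For the 4-cycle both functions give 0, and for the 3-path both give 2, an even number as expected.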

Up to a constant factor, {D} is simply the degree variance: the variance of the sequence {d_1, \dots, d_n}. What graph maximizes it for a given {n}? We want to have some very large degrees and some very small ones.

Let {G_{m, n}} be the union of the complete graph {K_m} on {m} vertices and {(n-m)} isolated vertices. The sum of degrees is {m(m-1)} and the sum of squares of degrees is {m(m-1)^2}. Hence,

{D = nm(m-1)^2 - (m(m-1))^2 = m(m-1)^2(n-m)}

For {n=3, 4, 5, 6} the maximum is attained by {m=n-1}, that is, there is one isolated vertex. For {n=7, 8, 9, 10} the maximum is attained by {m=n-2}. In general it is attained by {m^*=\lfloor (3n+2)/4 \rfloor}.
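
A brute-force search is enough to compare the maximizing {m} with the floor formula for small {n}. This is only a sanity check within the family {G_{m, n}}, not a proof; the function names are invented for the example.

```python
def degree_spread(m, n):
    """D for the union of K_m and n - m isolated vertices."""
    return m * (m - 1) ** 2 * (n - m)

def best_clique_size(n):
    """Brute-force maximizer of degree_spread(m, n) over 1 <= m <= n."""
    return max(range(1, n + 1), key=lambda m: degree_spread(m, n))

# Map n to the pair (maximizer found by search, value of the floor formula).
table = {n: (best_clique_size(n), (3 * n + 2) // 4) for n in range(3, 11)}
for n, (found, formula) in table.items():
    print(n, found, formula)
```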

The graph {G_{m, n}} is disconnected. But any graph has the same degree variance as its complement. And the complement {G^c(m, n)} is always connected: it consists of a “center”, a complete graph on {n-m} vertices, and “periphery”, a set of {m} vertices that are connected to each central vertex. Put another way, {G^c(m, n)} is obtained from the complete bipartite graph {K_{m, n-m}} by connecting all vertices of the {n-m} group together.

Tom A. B. Snijders (1981) proved that {G_{m^*, n}} and its complement {G^c(m^*, n)} are the only graphs maximizing the degree variance; in particular, {G^c(m^*, n)} is the unique maximizer among the connected graphs. It is pictured below for {n=4, \dots, 9}.

The displacement set of nonlinear maps in vector spaces

Given a vector space {V} and a map {f\colon V\to V} (linear or not), consider the displacement set of {f}, denoted {D(f) = \{f(x)-x\colon x\in V\}}. For linear maps this is simply the range of the operator {f-I} and therefore is a subspace.

The essentially nonlinear operations of taking the inverse or composition of maps become almost linear when the displacement set is considered. Specifically, if {f} has an inverse, then {D(f^{-1}) = -D(f)}, which is immediate from the definition. Also, {D(f\circ g)\subset D(f)+D(g)}.
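
Both claims take one line each to verify from the definition of {D(f)}. For the composition, split the displacement at the intermediate point {g(x)}:

{\displaystyle f(g(x)) - x = \big(f(g(x)) - g(x)\big) + \big(g(x) - x\big) \in D(f) + D(g)}

For the inverse, write {y=f(x)}; as {x} runs over {V}, so does {y}, and

{\displaystyle f^{-1}(y) - y = x - f(x) = -\big(f(x)-x\big) \in -D(f)}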

When {V} is a topological vector space, the maps for which {D(f)} has compact closure are of particular interest: these are compact perturbations of the identity, for which degree theory can be developed. The consideration of {D(f)} makes it very clear that if {f} is an invertible compact perturbation of the identity, then {f^{-1}} is in this class as well.

It is also of interest to consider the maps for which {D(f)} is either bounded, or is bounded away from {0}. Neither case can occur for linear operators, so this is essentially nonlinear analysis. In the nonlinear case, the boundedness assumption for linear operators is usually replaced by the Lipschitz condition. Let us say that {f} is {(L, \ell)}-bi-Lipschitz if {\ell\|x-y\|\le \|f(x)-f(y)\|\le L\|x-y\|} for all {x, y} in the domain of {f}.

Brouwer’s fixed point theorem fails in infinite-dimensional Hilbert spaces, but it is not yet clear how badly it can fail. The strongest possible counterexample would be a bi-Lipschitz automorphism of the unit ball with displacement bounded away from 0. The existence of such a map is unknown. If it does not exist, that would imply that the unit ball and the unit sphere in the Hilbert space are not bi-Lipschitz equivalent, because the unit sphere does have such an automorphism: {x\mapsto -x}.

Concerning the maps with bounded displacement, here is a theorem from Patrick Biermann’s thesis (Theorem 3.3.2): if {f} is an {(L, \ell)}-bi-Lipschitz map in a Hilbert space, {L/\ell < \pi/\sqrt{8}}, and {f} has bounded displacement, then {f} is onto. The importance of bounded displacement is illustrated by the forward shift map {S(x_1, x_2, \dots) = (0, x_1, x_2, \dots)} for which {L=\ell=1} but surjectivity nonetheless fails.

It would be nice to get rid of the assumption {L/\ell < \pi/\sqrt{8}} in the preceding paragraph. I guess any bi-Lipschitz map with bounded displacement should be surjective, at least in Hilbert spaces, but possibly in general Banach spaces as well.

Orthogonality in normed spaces

For a vector {x} in a normed space {X}, define the orthogonal complement {x^\perp} to be the set of all vectors {y} such that {\|x+ty\|\ge \|x\|} for all scalars {t}. In an inner product space (real or complex), this agrees with the normal definition of orthogonality because {\|x+ty\|^2 - \|x\|^2 = 2\,\mathrm{Re}\,\langle x, ty\rangle + o(t)} as {t\to 0}, and the right hand side can be nonnegative only if {\langle x, y\rangle=0}.

Let’s see what properties of orthogonal complement survive in a general normed space. For one thing, {x^\perp=X} if and only if {x=0}. Another trivial property is that {0\in x^\perp} for all {x}. More importantly, {x^\perp} is a closed set that contains some nonzero vectors.

  •  Closed because the complement is open: if {\|x+ty\| < \|x\|} for some {t}, the same will be true for vectors close to {y}.
  • Contains a nonzero vector because the Hahn-Banach theorem provides a norming functional for {x}, i.e., a unit-norm linear functional {f\in X^*} such that {f(x)=\|x\|}. Any {y\in \ker f} is orthogonal to {x}, because {\|x+ty\|\ge f(x+ty) = f(x) = \|x\|}.

In general, {x^\perp} is not a linear subspace; it need not even have empty interior. For example, consider the orthogonal complement of the first basis vector in the plane with {\ell_1} (taxicab) metric: it is {\{(x, y)\colon |y|\ge |x|\}}.

Figure: The orthogonal complement of a horizontal vector in the taxicab plane

This example also shows that orthogonality is not symmetric in general normed spaces: {(1,1)\in (1,0)^\perp} but {(1,0)\notin (1,1)^\perp}. This is why I avoid using notation {y \perp x} here.
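
The asymmetry can be observed with a crude numerical test of the definition. The sketch below samples {t} on a finite grid, so a `True` answer is only evidence, not a proof, while a `False` answer exhibits an actual violation of {\|x+ty\|\ge\|x\|}. The function names, grid size, and tolerance are arbitrary choices for this illustration.

```python
def taxicab_norm(v):
    """The l_1 norm of a vector in the plane."""
    return abs(v[0]) + abs(v[1])

def is_orthogonal(x, y, t_max=10.0, steps=2001, tol=1e-12):
    """Sampled test of y in x^perp: ||x + t y|| >= ||x|| for t on a grid in [-t_max, t_max]."""
    base = taxicab_norm(x)
    for i in range(steps):
        t = -t_max + 2 * t_max * i / (steps - 1)
        shifted = (x[0] + t * y[0], x[1] + t * y[1])
        if taxicab_norm(shifted) < base - tol:
            return False  # found a t that makes the norm smaller
    return True

print(is_orthogonal((1, 0), (1, 1)))  # (1,1) against (1,0)
print(is_orthogonal((1, 1), (1, 0)))  # (1,0) against (1,1)
```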

In fact, {x^\perp} is the union of kernels of all norming functionals of {x}, so it is only a linear subspace when the norming functional is unique. Containment in one direction was already proved. Conversely, suppose {y\in x^\perp} and define a linear functional {f} on the span of {x,y} so that {f(ax+by) = a\|x\|}. By construction, {f} has norm 1. Its Hahn-Banach extension is a norming functional for {x} that vanishes on {y}.

Consider {X=L^p[0,1]} as an example. A function {f} satisfies {1\in f^\perp} precisely when its {p}th moment is minimal among all translates {f+c}. This means, by definition, that its “{L^p}-estimator” is zero. In the special cases {p=1,2,\infty} the {L^p} estimator is known as the median, mean, and midrange, respectively. Increasing {p} gives more influence to outliers, so {1\le p\le 2} is the more useful range for it.

Measuring nonlinearity and reducing it

How to measure the nonlinearity of a function {f\colon I\to \mathbb R} where {I\subset \mathbb R} is an interval? A natural way is to consider the smallest possible deviation from a line {y=kx+b}, that is {\inf_{k, b}\sup_{x\in I}|f(x)-kx-b|}. It turns out to be convenient to divide this by {|I|}, the length of the interval {I}. So, let {\displaystyle NL(f;I) = \frac{1}{|I|} \inf_{k, b}\sup_{x\in I}|f(x)-kx-b|}. (This is similar to β-numbers of Peter Jones, except the deviation from a line is measured only in the vertical direction.)

Figure: NL(f; I) is the maximal vertical distance between red and black, divided by the length of I

Relation with derivatives

The definition of derivative immediately implies that if {f'(a)} exists, then {NL(f;I)\to 0} as {I} shrinks to {a} (that is, gets smaller while containing {a}). A typical construction of a nowhere differentiable continuous function is based on making {NL(f;I)} bounded from below; it is enough to do this for dyadic intervals, and that can be done by adding wiggly terms like {2^{-n}\mathrm{dist}\,(x, 2^{-n}\mathbb Z)}: see the blancmange curve.

The converse is false: if {NL(f; I)\to 0} as {I} shrinks to {a}, the function {f} may still fail to be differentiable at {a}. The reason is that the affine approximation may have different slopes at different scales. An example is {f(x)=x \sin \sqrt{-\log |x|}} in a neighborhood of {0}. Consider a small interval {[-\delta, \delta]}. The line {y = kx} with {k=\sin\sqrt{-\log \delta}} is a good approximation to {f} because {f(x)/x\approx k} on most of the interval except for a very small part near {0}, and on that part {f} is very close to {0} anyway.

Why the root of logarithm? Because {\sin \log |x|} has a fixed amount of change on a fixed proportion of  {[-\delta, \delta]}, independently of {\delta}. We need a function slower than the logarithm, so that as {\delta} decreases, there is a smaller amount of change on a larger part of the interval {[-\delta, \delta]}.

Nonlinearity of Lipschitz functions

Suppose {f} is a Lipschitz function, that is, there exists a constant {L} such that {|f(x)-f(y)|\le L|x-y|} for all {x, y\in I}. It’s easy to see that {NL(f;I)\le L/2}, by taking the mid-range approximation {y=\frac12 (\max_I f + \min_I f)}. But the sharp bound is {NL(f;I)\le L/4} whose proof is not as trivial. The sharpness is shown by {f(x)=|x|} with {I=[-1,1]}.

Figure: With the maximum difference 1/2 and the length of interval 2, we get NL(f; [-1, 1]) = 1/4

Proof. Let {k} be the slope of the linear function that agrees with {f} at the endpoints of {I}. Subtracting this linear function from {f} gives us a Lipschitz function {g} such that {-L-k\le g'\le L-k} and {\int_I g'= 0}. Let {A = \int_I (g')^+ = \int_I (g')^-}. Chebyshev’s inequality gives lower bounds for the measures of the sets {g'>0} and {g'<0}: namely, {|g'>0|\ge A/(L-k)} and {|g'<0|\ge A/(L+k)}. By adding these, we find that {|I| \ge 2LA/(L^2-k^2)\ge 2A/L}. Since {\max _I g - \min_I g \le A}, the mid-range approximation to {g} has error at most {A/2 \le |I|L/4}. Hence {NL(f; I) = NL(g; I) \le L/4}.
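
The sharpness example can be reproduced numerically. Here is a rough sketch of one way to approximate {NL(f;I)} from samples: for a fixed slope {k} the best intercept leaves an error equal to half the oscillation of {f(x)-kx}, and this oscillation is a convex function of {k}, so a ternary search over {k} suffices. The function name, the sampling density, and the requirement that the caller supplies a range `slope_bound` containing the optimal slope are all assumptions of this illustration.

```python
def nonlinearity(f, a, b, slope_bound, samples=401, iterations=100):
    """Approximate NL(f; [a, b]) from equally spaced samples of f."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    ys = [f(x) for x in xs]

    def half_oscillation(k):
        # Best uniform error over the intercept b, for the fixed slope k.
        residuals = [y - k * x for x, y in zip(xs, ys)]
        return (max(residuals) - min(residuals)) / 2

    lo, hi = -slope_bound, slope_bound
    for _ in range(iterations):
        # Ternary search: discard the third of [lo, hi] that cannot hold the minimum.
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if half_oscillation(m1) <= half_oscillation(m2):
            hi = m2
        else:
            lo = m1
    return half_oscillation((lo + hi) / 2) / (b - a)

nl_abs = nonlinearity(abs, -1.0, 1.0, slope_bound=1.0)
nl_line = nonlinearity(lambda x: 3 * x + 1, 0.0, 2.0, slope_bound=5.0)
print(nl_abs, nl_line)
```

For {f(x)=|x|} on {[-1,1]} the result agrees with {1/4} up to rounding, and for an affine function it is essentially zero.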

Reducing nonlinearity

Turns out, the graph of every Lipschitz function has relatively large almost-flat pieces.  That is, there are subintervals of nontrivial size where the measure of nonlinearity is much smaller than the Lipschitz constant. This result is a special (one-dimensional) case of Theorem 2.3 in Affine approximation of Lipschitz functions and nonlinear quotients by Bates, Johnson, Lindenstrauss, Preiss, and Schechtman.

Theorem AA (for “affine approximation”): For every {\epsilon>0} there exists {\delta>0} with the following property. If {f\colon I\to \mathbb R} is an {L}-Lipschitz function, then there exists an interval {J\subset I} with {|J|\ge \delta |I|} and {NL(f; J)\le \epsilon L}.

Theorem AA should not be confused with Rademacher’s theorem which says that a Lipschitz function is differentiable almost everywhere. The point here is a lower bound on the size of the interval {J}. Differentiability does not provide that. In fact, if we knew that {f} is smooth, or even a polynomial, the proof of Theorem AA would not become any easier.

Proof of Theorem AA

We may assume {I=[-1, 1]} and {L=1}. For {t\in (0, 2]} let {L(t) = \sup \{|f(x)-f(y)|/|x-y| \colon x, y\in I, \ |x-y|\ge t\}}. That is, {L(t)} is the restricted Lipschitz constant, one that applies for distances at least {t}. It is a decreasing function of {t}, and {L(0+)=1}.

Note that {|f(-1)-f(1)|\le 2L(1)} and that every value of {f} is within {2L(1)} of either {f(-1)} or {f(1)}. Hence, the oscillation of {f} on {I} is at most {6L(1)}. If {L(1) \le \epsilon/3}, then the constant mid-range approximation on {I} gives the desired conclusion, with {J=I}. From now on {L(1) > \epsilon/3}.

The sequence {L_k = L(4^{-k})} is increasing toward {L(0+)=1}, which implies {L_{k+1}\le (1+\epsilon) L_k} for some {k}. Pick an interval {[a, b]\subset I} that realizes {L_k}, that is {b-a\ge 4^{-k}} and {|f(b)-f(a)| = L_k(b-a)}. Without loss of generality {f(b)>f(a)} (otherwise consider {-f}). Let {J = [(3a+b)/4, (a+3b)/4]} be the middle half of {[a, b]}. Since each point of {J} is at distance at least {(b-a)/4\ge 4^{-k-1}} from both {a} and {b}, it follows that {\displaystyle f(b) + L_{k+1}(x-b) \le f(x) \le f(a) + L_{k+1}(x-a) } for all {x \in J}.

Figure: The green lines have slope L_{k+1} which is close to the slope L_k of the secant line through (a, f(a)) and (b, f(b)). The graph of f is pinched between these green lines, except near a or b.

So far we have pinched {f} between two affine functions of equal slope. Let us consider their difference:
{\displaystyle (f(a) + L_{k+1}(x-a)) - (f(b) + L_{k+1}(x-b)) = (L_{k+1}-L_k) (b-a)}. Recall that {L_{k+1}\le (1+\epsilon) L_k}, which gives a bound of {\epsilon L_k(b-a) \le 2\epsilon L |J|} for the difference. Approximating {f} by the average of the two affine functions we conclude that {NL(f;J)\le \epsilon L} as required.

It remains to consider the size of {J}, about which we only know {|J|\ge 4^{-k}/2} so far. Naturally, we want to take the smallest {k} such that {L_{k+1}\le (1+\epsilon) L_k} holds. Let {m} be this value; then {L_m \ge (1+\epsilon)^{m} L_0}. Here {L_m\le 1} and {L_0 = L(1)> \epsilon/3 }. The conclusion is that {(1+\epsilon)^m < 3/\epsilon}, hence {m< \log(3/\epsilon)/\log(1+\epsilon)}. Since {|I|=2}, the ratio {|J|/|I|} is at least {4^{-m}/4}. This finally yields {\displaystyle \delta = 4^{-\log(3/\epsilon)/\log(1+\epsilon)}/4} as an acceptable choice, completing the proof of Theorem AA.
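
To get a feeling for how small the interval guaranteed by this argument is, one can tabulate the bound on {m} for a few values of {\epsilon}. This is a quick sketch; the function name is made up, and only the exponent is shown, with {4^{-m}} reported through its base-10 logarithm because the number itself underflows for small {\epsilon}.

```python
import math

def scale_bound(eps):
    """The bound log(3/eps) / log(1 + eps) on the number m of scales used in the proof."""
    return math.log(3 / eps) / math.log(1 + eps)

for eps in (0.5, 0.1, 0.01):
    bound = scale_bound(eps)
    # log10 of 4^(-bound), the dominant factor in the size of J.
    print(f"eps = {eps}: m < {bound:.1f}, log10(4^-m) > {-bound * math.log10(4):.1f}")
```

Already for {\epsilon = 0.1} the exponent exceeds 35, so the guaranteed interval is tiny, which is one motivation for the quantitative work mentioned next.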

A large amount of work has been done on quantifying {\delta} in various contexts; for example Heat flow and quantitative differentiation by Hytönen and Naor.

Laplacian spectrum of small graphs

This is a collection of entirely unoriginal remarks about the Laplacian spectrum of graphs. For an accessible overview of the subject I recommend the M.S. thesis The Laplacian Spectrum of Graphs by Michael William Newman. It also includes a large table of graphs with their spectra. Here I will avoid introducing matrices and enumerating vertices.

Let {V} be the vertex set of a graph. Write {u\sim v} if {u, v} are adjacent vertices. Given a function {f\colon V\to \mathbb R}, define {L f(v) = \sum_{u\colon u\sim v}(f(v)-f(u))}.
This is a linear operator (the graph Laplacian) on the Euclidean space {\ell^2(V)} of all functions {f\colon V\to \mathbb R} with the norm {\|f\|^2 = \sum_{v\in V} f(v)^2}. It is symmetric: {\langle L f, g\rangle = \langle f, L g\rangle } and positive semidefinite: {\langle L f, f\rangle = \frac12 \sum_{u\sim v}(f(u)-f(v))^2\ge 0}. Since equality is attained for constant {f}, 0 is always an eigenvalue of {L}.

This is the standard setup, but I prefer to change things a little and replace {\ell^2(V)} by the smaller space {\ell^2_0(V)} of functions with zero mean: {\sum_{v\in V}f(v)=0}. Indeed, {L} maps {\ell^2(V)} to {\ell^2_0(V)} anyway, and since it kills the constants, it makes sense to focus on {\ell^2_0(V)}. It is a vector space of dimension {n-1} where {n=|V|}.

One advantage is that the smallest eigenvalue is 0 if and only if the graph is disconnected: indeed, {\langle L f, f\rangle=0} is equivalent to {f} being constant on each connected component. We also gain better symmetry between {L} and the Laplacian of the graph complement, denoted {L'}. Indeed, since {L' f(v) = \sum_{u\colon u\not \sim v}(f(v)-f(u))}, it follows that {(L+L')f(v) = \sum_{u\colon u\ne v} (f(v)-f(u)) = n f(v)} for every {f\in \ell^2_0(V)}. So, the identity {L+L' = nI} holds on {\ell^2_0(V)} (it does not hold on {\ell^2(V)}). Hence the eigenvalues of {L'} are obtained by subtracting the eigenvalues of {L} from {n}. As a corollary, the largest eigenvalue of {L} is at most {n}, with equality if and only if the graph complement is disconnected. More precisely, the multiplicity of eigenvalue {n} is one less than the number of connected components of the graph complement.
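
The identity {L+L'=nI} on {\ell^2_0(V)}, and its failure on {\ell^2(V)}, can be seen on a small example. The sketch below applies the Laplacian as an operator on functions, in the matrix-free spirit of this post; the helper names and the edge-list representation are choices made for this illustration.

```python
def laplacian_apply(n, edges, f):
    """(L f)(v) = sum over neighbours u of v of (f(v) - f(u))."""
    out = [0] * n
    for u, v in edges:
        out[u] += f[u] - f[v]
        out[v] += f[v] - f[u]
    return out

def complement_edges(n, edges):
    """Edges of the graph complement on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if frozenset((u, v)) not in present]

def sum_of_laplacians(n, edges, f):
    """(L + L') f, where L' is the Laplacian of the complement."""
    Lf = laplacian_apply(n, edges, f)
    Lcf = laplacian_apply(n, complement_edges(n, edges), f)
    return [a + b for a, b in zip(Lf, Lcf)]

n = 4
path = [(0, 1), (1, 2), (2, 3)]
f = [3, -1, 0, -2]   # zero mean
g = [1, 0, 0, 0]     # not zero mean
lhs_f = sum_of_laplacians(n, path, f)
lhs_g = sum_of_laplacians(n, path, g)
print(lhs_f, lhs_g)
```

For the zero-mean function the output is {4f}; for the indicator function it is {4g} minus the constant 1, which is how the identity fails on constants.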

Let {D} denote the diameter of the graph. Then the number of distinct Laplacian eigenvalues is at least {D}. Indeed, let {u, v} be two vertices at distance {D} from each other. Define {f_0(u) = 1} and {f_0=0} elsewhere. Also let {f_k=L^k f_0} for {k=1, 2, \dots}. Note that {f_k\in \ell_0^2(V)} for all {k\ge 1}. One can prove by induction that {f_k(w)=0} when the distance from {w} to {u} is greater than {k}, and {(-1)^k f_k(w) > 0} when the distance from {w} to {u} is equal to {k}. In particular, {f_k(v) = 0} when {k<D} and {f_D(v)\ne 0}. This shows that {f_D} is not a linear combination of {f_1, \dots, f_{D-1}}. Since {f_k = L^{k-1}f_1}, it follows that {L^{D-1}} is not a linear combination of {L^0, L^1, \dots, L^{D-2}}. Hence the minimal polynomial of {L} has degree at least {D}, which implies the claim.
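
The induction in this argument can be watched on the path with 4 vertices, where {D=3}, {u} is one endpoint and {v} is the other. A minimal sketch follows; the helper name and vertex labels are illustrative only.

```python
def laplacian_apply(n, edges, f):
    """(L f)(v) = sum over neighbours u of v of (f(v) - f(u))."""
    out = [0] * n
    for u, v in edges:
        out[u] += f[u] - f[v]
        out[v] += f[v] - f[u]
    return out

n, D = 4, 3
path = [(0, 1), (1, 2), (2, 3)]
f = [1, 0, 0, 0]                      # f_0: indicator of u = 0
iterates = []
for k in range(1, D + 1):
    f = laplacian_apply(n, path, f)   # f_k = L f_{k-1}
    iterates.append(f)
values_at_v = [fk[3] for fk in iterates]  # f_1(v), f_2(v), f_3(v) at v = 3
print(iterates, values_at_v)
```

The support of {f_k} grows by one vertex per step, so {f_k(v)} vanishes for {k<3} and {f_3(v)=-1}, with the sign {(-1)^3} predicted by the induction.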

Let’s consider a few examples of connected graphs.

3 vertices

There are two connected graphs: the 3-path (D=2) and the 3-cycle (D=1). In both cases we get D distinct eigenvalues. The spectra are [1, 3] and [3, 3], respectively.

4 vertices

  • One graph of diameter 3, the path. Its spectrum is {[2-\sqrt{2}, 2, 2+\sqrt{2}]}.
  • One graph of diameter 1, the complete graph. Its spectrum is {[4, 4, 4]}. This pattern continues for other complete graphs: since the complement is the empty graph ({n} components), all {n-1} eigenvalues are equal to {n}.
  • Four graphs of diameter 2, which are shown below, with each caption being the spectrum.
4-0
1, 1, 4
4-2
1, 3, 4
4-3
2, 2, 4
4-4
2, 4, 4

Remarks:

  • The graph [1, 3, 4] has more distinct eigenvalues than its diameter.
  • The graph [2, 2, 4] is regular (all vertices have the same degree).
  • The smallest eigenvalue of graphs [1, 1, 4] and [2, 2, 4] is multiple, due to the graphs having a large group of automorphisms (here rotations); applying some of these automorphisms to an eigenfunction for the smallest eigenvalue yields another eigenfunction.
  • [1, 3, 4] and [2, 4, 4] also have automorphisms, but their automorphisms preserve the eigenfunction for the lowest eigenvalue, up to a constant factor.

5 vertices

  • One graph of diameter 4, the path. Its spectrum is related to the golden ratio: it consists of {(3\pm \sqrt{5})/2, (5\pm \sqrt{5})/2}.
  • One graph of diameter 1, the complete one: [5, 5, 5, 5]
  • Five graphs of diameter 3. All have connected complement, with the highest eigenvalue strictly between 4 and 5. None are regular. Each has 4 distinct eigenvalues.
  • 14 graphs of diameter 2. Some of these are noted below.

Two have connected complement, so their eigenvalues are less than 5:

One has both integers and non-integers in its spectrum, the smallest such graph. Its eigenvalues are {3\pm \sqrt{2}, 3, 5}.

5-15
1.585786, 3, 4.414214, 5

Two have eigenvalues of multiplicity 3, indicating a high degree of symmetry.

Two have all eigenvalues integer and distinct:

 

The 5-cycle and the complete graph are the only regular graphs on 5 vertices.

6 vertices

This is where we first encounter isospectral graphs: the Laplacian spectrum cannot tell them apart.

Both of these have spectrum {3\pm \sqrt{5}, 2, 3, 3} but they are obviously non-isomorphic (consider the vertex degrees):

Both of these have spectrum {3\pm \sqrt{5}, 3, 3, 4} and are non-isomorphic.

Indeed, the second pair is obtained from the first by taking graph complement.

Also notable are regular graphs on 6 vertices, all of which have integer spectrum.

6-46
1, 1, 3, 3, 4
6-71
3, 3, 3, 3, 6
6-98
2, 3, 3, 5, 5
6-108
4, 4, 4, 6, 6
6-111
6, 6, 6, 6, 6

Here [3, 3, 3, 3, 6] (complete bipartite) and [2, 3, 3, 5, 5] (prism) are both regular of degree 3, but the spectrum allows us to tell them apart.

The prism is the smallest regular graph for which the first eigenvalue is a simple one. It has plenty of automorphisms, but the relevant eigenfunction (1 on one face of the prism, -1 on the other face) is compatible with all of them.

7 vertices

There are four regular graphs on 7 vertices. Two of them are by now familiar: 7-cycle and complete graph. Here are the other two, both regular of degree 4 but with different spectra.

7-720
3, 4, 4, 5, 5, 7
7-832
3.198, 3.198, 4.555, 4.555, 6.247, 6.247

There are lots of isospectral pairs of graphs on 7 vertices, so I will list only the isospectral triples, of which there are five.

Spectrum 0.676596, 2, 3, 3.642074, 5, 5.681331:

Spectrum 0.726927, 2, 3.140435, 4, 4, 6.132637:

Spectrum 0.867363, 3, 3, 3.859565, 5, 6.273073:

Spectrum 1.318669, 2, 3.357926, 4, 5, 6.323404:

All of the triples mentioned so far have connected complement: for example, taking the complement of the triple with the spectrum [0.676596, 2, 3, 3.642074, 5, 5.681331] turns it into the triple with the spectrum [1.318669, 2, 3.357926, 4, 5, 6.323404].

Last but not least, an isospectral triple with an integer spectrum: 3, 4, 4, 6, 6, 7. This one has no counterpart since the complement of each of these graphs is disconnected.

8 vertices

Regular graphs, excluding the cycle (spectrum 0.585786, 0.585786, 2, 2, 3.414214, 3.414214, 4) and the complete one.

Degree 3 regular:

8-4326
2, 2, 2, 4, 4, 4, 6
8-6409
0.763932, 2, 4, 4, 4, 4, 5.236068
8-6579
1.438447, 2.381966, 2.381966, 3, 4.618034, 4.618034, 5.561553
8-6716
1.267949, 2, 2.585786, 4, 4, 4.732051, 5.414214
8-8725
2, 2, 2.585786, 2.585786, 4, 5.414214, 5.414214

Degree 4 regular

8-4575
4, 4, 4, 4, 4, 4, 8
8-9570
2.763932, 4, 4, 4, 4, 6, 7.236068
8-10188
2.438447, 3.381966, 3.381966, 5, 5.618034, 5.618034, 6.561553
8-10202
2.585786, 3.267949, 4, 4, 5.414214, 6, 6.732051
8-10819
2.585786, 2.585786, 4, 5.414214, 5.414214, 6, 6
8-10888
2, 4, 4, 4, 6, 6, 6

Degree 5 regular

8-10481
4.381966, 4.381966, 5, 5, 6.618034, 6.618034, 8
8-10975
4, 4, 6, 6, 6, 6, 8
8-11082
4, 4.585786, 4.585786, 6, 6, 7.414214, 7.414214

Degree 6 regular

8-11112
6, 6, 6, 6, 8, 8, 8

Credits: the list of graphs by Brendan McKay, and the NetworkX library, specifically the methods read_graph6 (to read the files provided by Prof. McKay), laplacian_spectrum, diameter, degree, and draw.

Extremal Taylor polynomials

Suppose {f(z)=a_0+a_1z+a_2z^2+\cdots} is a holomorphic function in the unit disk {|z|<1} such that {|f|\le 1} in the disk. How large can its Taylor polynomial {T_n(z)=a_0+a_1z+\cdots +a_n z^n} be in the disk?

We should not expect {T_n} to be bounded by 1 as well. Indeed, the Möbius transformation {f(z)=(z+1/2)/(1+z/2)} has Taylor expansion {(z+1/2)(1-z/2+O(z^2)) = 1/2 + (3/4)z + O(z^2)}, so {T_1(1)=5/4} in this case. This turns out to be the worst case: in general {T_1} is bounded by 5/4 in the disk.

For the second-degree polynomial {T_2} the sharp bound is {89/64}, attained when {f(z) = (8z^2 + 4z + 3)/(3z^2 + 4z + 8)}; the image of the unit circle under the extremal {T_2} is shown below. Clearly, there is something nontrivial going on.

Figure: Extremal T_2 attains 89/64 > 1.39
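
The values {5/4} and {89/64} for the two rational functions above can be confirmed with exact arithmetic. Below is a minimal sketch that divides power series term by term; the function name `series_quotient` and its interface are invented for this illustration.

```python
from fractions import Fraction

def series_quotient(num, den, order):
    """Taylor coefficients c_0..c_order at 0 of num(z) / den(z), assuming den[0] != 0.

    Polynomials are given as coefficient lists, constant term first.
    """
    num = [Fraction(c) for c in num] + [Fraction(0)] * (order + 1)
    den = [Fraction(c) for c in den]
    coeffs = []
    for k in range(order + 1):
        # c_k is determined by sum_{j <= k} den[j] * c_{k-j} = num[k].
        known = sum(den[j] * coeffs[k - j]
                    for j in range(1, min(k, len(den) - 1) + 1))
        coeffs.append((num[k] - known) / den[0])
    return coeffs

half = Fraction(1, 2)
a = series_quotient([half, 1], [1, half], order=1)   # (z + 1/2) / (1 + z/2)
b = series_quotient([3, 4, 8], [8, 4, 3], order=2)   # (8z^2 + 4z + 3) / (3z^2 + 4z + 8)
T1_at_1 = sum(a)
T2_at_1 = sum(b)
print(a, T1_at_1)
print(b, T2_at_1)
```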

Edmund Landau established the sharp bound for {|T_n|} in his paper Abschätzung der Koeffizientensumme einer Potenzreihe, published in Archiv der Mathematik und Physik (3) 21 in 1913. Confusingly, there are two papers with the same title in the same issue of the journal: one on pages 42-50, the other on pages 250-255, and they appear in different volumes of Landau’s Collected Works. The sharp bound is in the second paper.

First steps

By rotation, it suffices to bound {|T_n(1)|}, which is {|a_0+\cdots +a_n|}. As is often done, we rescale {f} a bit so that it’s holomorphic in a slightly larger disk, enabling the use of the Cauchy integral formula on the unit circle {\mathbb T}. The Cauchy formula says {2\pi i a_k = \int_{\mathbb T} z^{-k-1} f(z) \,dz}. Hence

{\displaystyle 2\pi |T_n(1)| = \left| \int_{\mathbb T} z^{-n-1}(1+z+\dots+z^n) f(z) \,dz \right|}

It is natural to use {|f(z)|\le 1} now, which leads to

{\displaystyle 2\pi |T_n(1)| \le \int_{\mathbb T} |1+z+\dots+z^n|\, |dz| }

Here we can use the geometric sum formula and try to estimate the integral of {|(1-z^{n+1})/(1-z)|} on the unit circle. This is what Landau does in the first of two papers; the result is {O(\log n)} which is the correct rate of growth (this is essentially the Dirichlet kernel estimate from the theory of Fourier series). But there is a way to do better and get the sharp bound.

Key ideas

First idea: the factor {1+z+\dots+z^n} could be replaced by any polynomial {Q} as long as the coefficients of powers up to {n} stay the same. Higher powers contribute nothing to the integral that evaluates {T_n(1)}, but they might reduce the integral of {|Q|}.

Second idea: we should choose {Q} to be the square of some polynomial, {Q=P^2}, because {(2\pi)^{-1}\int_{\mathbb T} |P(z)|^2\, |dz|} can be computed exactly: it is just the sum of squares of the coefficients of {P}, by Parseval’s formula.

Implementation

Since {1+z+\dots+z^n} is the {n}-th degree Taylor polynomial of {(1-z)^{-1}}, it is natural to choose {P} to be the {n}-th degree Taylor polynomial of {(1-z)^{-1/2}}. Indeed, if {P_n(z) = (1-z)^{-1/2} + O(z^{n+1})}, then {P_n(z)^2 = (1-z)^{-1} + O(z^{n+1}) = 1+z+\dots+z^n + O(z^{n+1})} as desired (asymptotics as {z\to 0}). The binomial formula tells us that
{\displaystyle P_n(z)=\sum_{k=0}^n (-1)^k\binom{-1/2}{k}z^k }

The coefficient of {z^k} here can be written out as {(2k-1)!!/(2k)!!} or rewritten as {4^{-k}\binom{2k}{k}} which shows that in lowest terms, its denominator is a power of 2. To summarize, {|T_n(1)|} is bounded by the sum of squares of the coefficients of {P_n}. Such sums are referred to as the Landau constants,

{\displaystyle G_n = 1+ \left(\frac{1}{2}\right)^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 + \cdots + \left(\frac{(2n-1)!!}{(2n)!!}\right)^2 }

A number of asymptotic and non-asymptotic formulas have been derived for {G_n}, for example Brutman (1982) shows that {G_n - (1/\pi)\log(n+1)} is between 1 and 1.0663.
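
The construction is easy to replay with exact rational arithmetic. The following sketch, with function names made up for the purpose, builds the coefficients {4^{-k}\binom{2k}{k}} of {P_n}, checks that {P_n^2} agrees with {1+z+\dots+z^n} up to degree {n}, and evaluates a few Landau constants together with the quantity {G_n - (1/\pi)\log(n+1)} from Brutman's estimate.

```python
from fractions import Fraction
from math import comb, log, pi

def landau_poly(n):
    """Coefficients 4^(-k) * C(2k, k), k = 0..n, of P_n."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(n + 1)]

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    product = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

def landau_constant(n):
    """G_n, the sum of squares of the coefficients of P_n."""
    return sum(c * c for c in landau_poly(n))

n = 5
low_order = multiply(landau_poly(n), landau_poly(n))[: n + 1]  # should be all ones
G1, G2 = landau_constant(1), landau_constant(2)
gaps = [float(landau_constant(k)) - log(k + 1) / pi for k in range(60)]
print(low_order, G1, G2, min(gaps), max(gaps))
```

The values {G_1=5/4} and {G_2=89/64} match the bounds for {T_1} and {T_2} quoted at the beginning of the post.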

Sharpness

To demonstrate the sharpness of the bound {|T_n|\le G_n}, we want {|f|\equiv 1} and {P_n(z)^2f(z)/z^n\ge 0} on the unit circle. Both are arranged by taking {f(z) = z^n P_n(1/z) / P_n(z)} which is a Blaschke product of degree {n}. Note that the term {P_n(1/z)} can also be written as {\overline{P_n(1/\bar z)}}. Hence {P_n(z)^2f(z)/z^n = P_n(z) \overline{P_n(1/\bar z)}} which is simply {|P_n(z)|^2} when {|z|=1}. Equality holds in all the estimates above, so they are sharp.

Here are the images of the unit circle under extremal Taylor polynomials {T_5} and {T_{20}}.

Figure: Extremal Taylor polynomial of 5th degree
Figure: Extremal Taylor polynomial of 20th degree

These polynomials attain large values only on a short subarc of the circle; most of the time they oscillate at levels less than 1. Indeed, the mean value of {|T_n|^2} cannot exceed the mean of {|f|^2} which is at most 1. Here is the plot of the roots of extremal {T_n}:  they are nearly uniform around the circle, except for a gap near 1.

Figure: Roots of extremal T_10
Figure: Roots of extremal T_20

But we are not done…

Wait a moment. Does {f(z) = z^n P_n(1/z) / P_n(z)} define a holomorphic function in the unit disk? We are dividing by {P_n} here. Fortunately, {P_n} has no zeros in the unit disk, because its coefficients are positive and decreasing as the exponent {k} increases. Indeed, if {p(z)=c_0+c_1z+\cdots + c_nz^n} with {c_0>c_1>\dots>c_n > 0}, then {(1-z)p(z)} has constant term {c_0} and other coefficients {c_1-c_0}, {c_2-c_1}, … {c_n-c_{n-1}}, {-c_n}. Summing the absolute values of the coefficients of nonconstant terms we get {c_0}. So, when these coefficients are attached to {z^k} with {|z|<1}, the sum of nonconstant terms is strictly less than {c_0} in absolute value. This proves {P_n\ne 0} in the unit disk. Landau credits Adolf Hurwitz with this proof.
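
The two ingredients of this argument, decreasing positive coefficients and the total mass {c_0} of the nonconstant coefficients of {(1-z)p(z)}, can be checked exactly for {P_n}. This is a small sketch with {n=20} chosen arbitrarily; the helper name is hypothetical.

```python
from fractions import Fraction
from math import comb

def landau_poly(n):
    """Coefficients 4^(-k) * C(2k, k), k = 0..n, of P_n."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(n + 1)]

n = 20
c = landau_poly(n)
# Hypothesis of the argument: c_0 > c_1 > ... > c_n > 0.
strictly_decreasing = all(c[k] > c[k + 1] > 0 for k in range(n))
# Coefficients of (1 - z) P_n(z): c_0, c_1 - c_0, ..., c_n - c_{n-1}, -c_n.
product = [c[0]] + [c[k] - c[k - 1] for k in range(1, n + 1)] + [-c[n]]
nonconstant_mass = sum(abs(x) for x in product[1:])
print(strictly_decreasing, nonconstant_mass == c[0])
```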

In fact, the zeros of {P_n} (Taylor polynomials of {(1-z)^{-1/2}}) lie just outside of the unit disk.

Figure: Zeros of P_20
Figure: Zeros of P_50

The zeros of the Blaschke products formed from {P_n} are the reciprocals of the zeros of  {P_n}, so they lie just inside the unit circle, much like the zeros of {T_n} (though they are different).