Weak convergence in metric spaces

Weak convergence in a Hilbert space is defined as pointwise convergence of functionals associated to the elements of the space. Specifically, {x_n\rightarrow x} weakly if the associated functionals {f_n(y) = \langle x_n, y \rangle} converge to {f(y) = \langle x_n, y \rangle} pointwise.

How could this idea be extended to metric spaces without linear structure? To begin with, {f_n(y)} could be replaced with {\|x_n-y\|^2-\|x_n\|^2}, since this agrees with original {f_n} up to some constant terms. Also, {\|x_n\|^2} here could be {\|x_n-z\|^2} for any fixed {z}; to avoid introducing another variable here, let’s use {x_1} for the purpose of fixed reference point {z}. Now we have a metric space version of the weak convergence: the functions
{f_n(y) = d(x_n,y)^2 - d(x_n,x_1)^2}
must converge pointwise to
{f(y) = d(x,y)^2 - d(x,x_1)^2}

The triangle inequality shows that strong convergence {d(x_n,x)\rightarrow 0} implies weak convergence, as expected. And the converse is not necessarily true, as the example of a Hilbert space shows.


Aside: the above is not the only way to define weak convergence in metric spaces. Another approach is to think of {\langle x_n , y\rangle} in terms of projection onto a line through {y}. A metric space version of this concept is the nearest-point projection onto a geodesic curve. This is a useful approach, but it is only viable for metric spaces with additional properties (geodesic, nonpositive curvature).

Also, both of these approaches take the Hilbert space case as the point of departure, and do not necessarily capture weak convergence in other normed spaces.


Let’s try this out in {\ell^1} with the standard basis sequence {e_n}. Here {f_n(y) = \|e_n-y\|^2 - 4 \rightarrow (\|y\| + 1)^2 - 4}. Is there an element {x\in \ell^1} such that

{\|x-y\|^2 - \|x-e_1\|^2= (\|y\|+1)^2 - 4} for all {y}?

Considering both sides as functions of one variable {y_n}, for a fixed {n}, shows that {x_n=0} for {n}, because the left hand side is non-differentiable at {y_n=x_n} while the right hand side is non-differentiable at {y_n=0}. Then the desired identity simplifies to {\|y\|^2 - 1 = (\|y\|+1)^2 - 4} which is false. Oh well, that sequence wasn’t weakly convergent to begin with: by Schur’s theorem, every weakly convergent sequence in {\ell^1} also converges strongly.

This example also shows that not every bounded sequence in a metric space has a weakly convergent subsequence, unlike the way it works in Hilbert spaces.

Graphs with an invertible incidence matrix

Graphs are often encoded by their adjacency matrix: a symmetric matrix where {1} in the entry {(i,j)} means there is an edge between vertices labeled {i} and {j}. Another useful encoding is the incidence matrix, in which rows correspond to edges and columns to vertices (or the other way around). Having {1} in the entry {(i,j)} means that {i}th edge is incident to {j}th vertex. Simply put, each row contains exactly two nonzero entries, which indicate the endpoints of an edge.

An incidence matrix is generally rectangular. It is the square matrix when the graph has as many edges as vertices. For example, the 4-cycle has incidence matrix

{M = \displaystyle \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}}

(up to relabeling of vertices). This matrix is not invertible. Indeed, {M(1,-1,1,-1)^T=0}, which is a reflection of the fact that the graph is bipartite. Indeed, for any bipartite graph the {\pm 1} vector describing a 2-coloring of the vertices belongs to the kernel of the incidence matrix.

The incidence matrix of an odd cycle is invertible, however. What other such “invertible” graphs exist? Clearly, the disjoint union of “invertible” graphs is itself invertible, so we can focus on connected graphs.

There are no invertible graphs on 1 or 2 vertices. For 3 through 8 vertices, exhaustive search yields the following counts:

  • 1 graph on 3 vertices (3-cycle)
  • 1 graph on 4 vertices (3-cycle with an edge attached)
  • 4 graphs on 5 vertices
  • 8 graphs on 6 vertices
  • 23 graphs on 7 vertices
  • 55 graphs on 8 vertices

The numbers match the OEIS sequence A181688 but it’s not clear whether this is for real or a coincidence. The brute force search takes long for more than 8 vertices. It’s probably better to count such graphs by using their (folklore?) characterization as “trees planted on an odd cycle”, described by Chris Godsil on MathOverflow.

Some examples:

graph7_4

graph8_52

graph8_7

The examples were found and plotted by this Python script.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

def is_invertible(graph, n):
    matrix = []
    for i in range(n):
        row = [0]*n
        row[graph[i][0]] = 1
        row[graph[i][1]] = 1
        matrix.append(row)
    return abs(np.linalg.det(matrix)) > 0.5

n = 6   # number of vertices
graphs = combinations(combinations(range(n), 2), n)
inv = []
for g in graphs:
    if is_invertible(g, n):
        gr = nx.Graph(g)
        if nx.is_connected(gr) and not any([nx.is_isomorphic(gr, gr2) for gr2 in inv]):
            inv.append(gr)
            nx.draw_networkx(gr)
            plt.savefig("graph{}_{}.png".format(n, len(inv)))
            plt.clf()

print("There are {} connected invertible graphs with {} vertices".format(len(inv), n))

f(f(x)) = 4x

There are plenty of continuous functions {f} such that {f(f(x)) \equiv x}. Besides the trivial examples {f(x)=x} and {f(x)=-x}, one can take any equation {F(x,y)=0} that is symmetric in {x,y} and has a unique solution for one variable in terms of the other. For example: {x^3+y^3-1 =0 } leads to {f(x) = (1-x^3)^{1/3}}.

x3y3
x^3+y^3 = 1

I can’t think of an explicit example that is also differentiable, but implicitly one can be defined by {x^3+y^3+x+y=1}, for example. In principle, this can be made explicit by solving the cubic equation for {x}, but I’d rather not.

implicit
x^3+y^3+x+y = 1

I don’t know of any diffeomorphism {f\colon \mathbb R \rightarrow \mathbb R} such that both {f} and {f^{-1}} have a nice explicit form.


Let’s change the problem to {f(f(x))=4x}. There are still two trivial, linear solutions: {f(x)=2x} and {f(x)=-2x}. Any other? The new equation imposes stronger constraints on {f}: for example, it implies

\displaystyle f(4x) = f(f(f(x)) = 4f(x)

But here is a reasonably simple nonlinear continuous example: define

\displaystyle f(x) = \begin{cases} 2^x,\quad & 1\le x\le 2 \\ 4\log_2 x,\quad &2\le x\le 4 \end{cases}

and extend to all {x} by {f(\pm 4x) = \pm 4f(x)}. The result looks like this, with the line {y=2x} drawn in red for comparison.

ff4x
f(f(x)) = 4x

To check that this works, notice that {2^x} maps {[1,2]} to {[2,4]}, which the function {4\log_2 x} maps to {[4,8]}, and of course {4\log _2 2^x = 4x}.

From the plot, this function may appear to be differentiable for {x\ne 0}, but it is not. For example, at {x=2} the left derivative is {4\ln 2 \approx 2.8} while the right derivative is {2/\ln 2 \approx 2.9}.
This could be fixed by picking another building block instead of {2^x}, but not worth the effort. After all, the property {f(4x)=4f(x)} is inconsistent with differentiability at {0} as long as {f} is nonlinear.

The plots were made in Sage, with the function f define thus:

def f(x):
    if x == 0:
        return 0
    xa = abs(x)
    m = math.floor(math.log(xa, 2))
    if m % 2 == 0:
        return math.copysign(2**(m + xa/2**m), x)
    else:
        return math.copysign(2**(m+1) * (math.log(xa, 2)-m+1), x)

Random graphs with two-sided degree bounds

I wanted some random examples of graphs where every vertex has degree between two given parameters, minDegree and maxDegree. Like this one:

15 vertices, degree from 3 to 4
15 vertices, degree from 3 to 4

The approach I took was very simple (and not suitable for construction of very large or very regular graphs). Each edge appears with probability p. If the minimal degree is too small, this probability is multiplied by 1.1. If the maximal degree is too big, the probability is divided by 1.1. Either way, the process repeats until success.

g20-1-5
20 vertices, degree from 1 to 5

So far, this blog had too much Matlab/Scilab and not enough Python. I’ll try to restore the balance. Here, numpy generates random matrices and takes care of degree restrictions; networkx lays out the graph.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

vertices = 15
minDegree = 3
maxDegree = 4
p = 0.5
success = False

while not success:
    a = (np.random.random((vertices, vertices)) < p).astype(int)
    a = np.maximum(a, np.matrix.transpose(a))
    np.fill_diagonal(a, 0)
    s = a.sum(axis=1)
    success = True
    if min(s) < minDegree:
        success = False
        p = p * 1.1
    if max(s) > maxDegree:
        success = False
        p = p / 1.1
g = nx.Graph(a)
nx.draw_networkx(g)
plt.show()

One more time:

g25-2-4
25 vertices, degree from 2 to 4

Classifying words: a tiny example of SVD truncation

Given a bunch of words, specifically the names of divisions of plants and bacteria, I’m going to use a truncated Singular Value Decomposition to separate bacteria from plants. This isn’t a novel or challenging task, but I like the small size of the example. A similar type of examples is classifying a bunch of text fragments by keywords, but that requires a lot more setup.

Here are 33 words to classify: acidobacteria, actinobacteria, anthocerotophyta, aquificae, bacteroidetes, bryophyta, charophyta, chlamydiae, chloroflexi, chlorophyta, chrysiogenetes, cyanobacteria, cycadophyta, deferribacteres, deinococcus-thermus, dictyoglomi, firmicutes, fusobacteria, gemmatimonadetes, ginkgophyta, gnetophyta, lycopodiophyta, magnoliophyta, marchantiophyta, nitrospirae, pinophyta, proteobacteria, pteridophyta, spirochaetes, synergistetes, tenericutes, thermodesulfobacteria, thermotogae.

As is, the task is too easy: we can recognize the -phyta ending in the names of plant divisions. Let’s jumble the letters within each word:

aaabccdeiiort, aaabcceiinortt, aacehhnoooprttty, aacefiiqu, abcdeeeiorstt, abhoprtyy, aachhoprty, aacdehilmy, cefhilloorx, achhlooprty, ceeeghinorssty, aaabcceinorty, aaccdhoptyy, abcdeeeefirrrst, cccdeehimnoorsstuu, cdgiilmooty, cefiimrstu, aabcefiorstu, aadeeegimmmnostt, agghiknopty, aeghnoptty, acdhilooopptyy, aaghilmnoopty, aaachhimnoprtty, aeiinoprrst, ahinoppty, aabceeiooprrtt, adehiopprtty, aceehioprsst, eeeginrssstty, ceeeinrsttu, aabcdeeefhilmoorrsttu, aeeghmoortt

Not so obvious anymore, is it? Recalling the -phyta ending, we may want to focus on the presence of letter y, which is not so common otherwise. Indeed, the count of y letters is a decent prediction: on the following plot, green asterisks are plants and red are bacteria, the vertical axis is the count of letter Y in each word.

svd1
Count of Y in each word

However, the simple count fails to classify several words: having 1 letter Y may or may not mean a plant. Instead, let’s consider the entire matrix {A} of letter counts (here it is in a spreadsheet: 33 rows, one for each word; 26 columns, one for each letter.) So far, we looked at its 25th column in isolation from the rest of the matrix. Truncated SVD uncovers the relations between columns that are not obvious but express patterns such as the presence of letters p,h,t,a along with y. Specifically, write {A = UDV^T} with {U,V} unitary and {D} diagonal. Replace all entries of {D}, except the four largest ones, by zeros. The result is a rank-4 diagonal matrix {D_4}. The product {A_4 = UD_4V^T} is a rank-4 matrix, which keeps some of the essential patterns in {A} but de-emphasizes the accidental.

The entries of {D_4} are no longer integers. Here is a color-coded plot of its 25th column, which still somehow corresponds to letter Y but takes into account the other letters with which it appears.

svd2
The same column of the letter-count matrix, after truncation

Plants are now cleanly separated from bacteria. Plots made in MATLAB as follows:


A=[3,1,2,1,1,0,0,0,2,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0;3,1,2,0,1,0,0,0,2,0,0,0,0,1,1,0,0,1,0,2,0,0,0,0,0,0;2,0,1,0,1,0,0,2,0,0,0,0,0,1,3,1,0,1,0,3,0,0,0,0,1,0;2,0,1,0,1,1,0,0,2,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0;1,1,1,1,3,0,0,0,1,0,0,0,0,0,1,0,0,1,1,2,0,0,0,0,0,0;1,1,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,2,0;2,0,1,0,0,0,0,2,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,1,0;2,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0;0,0,1,0,1,1,0,1,1,0,0,2,0,0,2,0,0,1,0,0,0,0,0,1,0,0;1,0,1,0,0,0,0,2,0,0,0,1,0,0,2,1,0,1,0,1,0,0,0,0,1,0;0,0,1,0,3,0,1,1,1,0,0,0,0,1,1,0,0,1,2,1,0,0,0,0,1,0;3,1,2,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,0,1,0;2,0,2,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,2,0;1,1,1,1,4,1,0,0,1,0,0,0,0,0,0,0,0,3,1,1,0,0,0,0,0,0;0,0,3,1,2,0,0,1,1,0,0,0,1,1,2,0,0,1,2,1,2,0,0,0,0,0;0,0,1,1,0,0,1,0,2,0,0,1,1,0,2,0,0,0,0,1,0,0,0,0,1,0;0,0,1,0,1,1,0,0,2,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0;2,1,1,0,1,1,0,0,1,0,0,0,0,0,1,0,0,1,1,1,1,0,0,0,0,0;2,0,0,1,3,0,1,0,1,0,0,0,3,1,1,0,0,0,1,2,0,0,0,0,0,0;1,0,0,0,0,0,2,1,1,0,1,0,0,1,1,1,0,0,0,1,0,0,0,0,1,0;1,0,0,0,1,0,1,1,0,0,0,0,0,1,1,1,0,0,0,2,0,0,0,0,1,0;1,0,1,1,0,0,0,1,1,0,0,1,0,0,3,2,0,0,0,1,0,0,0,0,2,0;2,0,0,0,0,0,1,1,1,0,0,1,1,1,2,1,0,0,0,1,0,0,0,0,1,0;3,0,1,0,0,0,0,2,1,0,0,0,1,1,1,1,0,1,0,2,0,0,0,0,1,0;1,0,0,0,1,0,0,0,2,0,0,0,0,1,1,1,0,2,1,1,0,0,0,0,0,0;1,0,0,0,0,0,0,1,1,0,0,0,0,1,1,2,0,0,0,1,0,0,0,0,1,0;2,1,1,0,2,0,0,0,1,0,0,0,0,0,2,1,0,2,0,2,0,0,0,0,0,0;1,0,0,1,1,0,0,1,1,0,0,0,0,0,1,2,0,1,0,2,0,0,0,0,1,0;1,0,1,0,2,0,0,1,1,0,0,0,0,0,1,1,0,1,2,1,0,0,0,0,0,0;0,0,0,0,3,0,1,0,1,0,0,0,0,1,0,0,0,1,3,2,0,0,0,0,1,0;0,0,1,0,3,0,0,0,1,0,0,0,0,1,0,0,0,1,1,2,1,0,0,0,0,0;2,1,1,1,3,1,0,1,1,0,0,1,1,0,2,0,0,2,1,2,1,0,0,0,0,0;1,0,0,0,2,0,1,1,0,0,0,0,1,0,2,0,0,1,0,2,0,0,0,0,0,0];
[U, D, V] = svd(A);
D4 = D .* (D >= D(4, 4));
A4 = U * D4 * V';
plants = (A4(:, 25) > 0.8);
bacteria = (A4(:, 25) <= 0.8);
  % the rest is output
words = 1:33;
hold on
plot(words(plants), A4(plants, 25), 'g*');
plot(words(bacteria), A4(bacteria, 25), 'r*');

Bi-Lipschitz equivalence and fixed points

Two metric spaces {X,Y} are bi-Lipschitz equivalent if there is a bijection {f:X\rightarrow Y} and a constant {L} such that
{L^{-1}d_X(a,b) \le d_Y(f(a),f(b)) \le L\,d_X(a,b)} for all {a,b\in X}. In other words, {f} must be Lipschitz with a Lipschitz inverse; this is asking more than just a homeomorphism (continuous with continuous inverse).

For example, the graph {y=\sqrt{|x|}} is not bi-Lipschitz equivalent to a line. The cusp is an obstruction:

Curve with a cusp

Indeed, the points {A,B = (\pm \epsilon,\sqrt{\epsilon})} are separated by {C=(0,0)} and lie at distance {\sim \sqrt{\epsilon}} from it. So, the images {f(A), f(B)} would lie on opposite sides of {f(C)}, at distance {\sim \sqrt{\epsilon}} from it. But then the distance between {f(A)} and {f(B)} would be of order {\sqrt{\epsilon}}, contradicting {d(A,B)\sim \epsilon}.

There are other pairs of spaces which are homeomorphic but not bi-Lipschitz equivalent, like a circle and Koch snowflake.


What is the simplest example of two spaces that are not known to be bi-Lipschitz equivalent? Probably: the unit ball {B} and the unit sphere {S} in an infinite-dimensional Hilbert space.

In finite dimensions these are not even homeomorphic: e.g., the sphere has a self-homeomorphism without fixed points, namely {R(x) = -x }, while the ball has no such thing due to Brouwer’s fixed point theorem. But in the infinite-dimensional case {B} and {S} are homeomorphic. Moreover, there exists a Lipschitz map {F\colon B\rightarrow B} such that the displacement function {\|F(x)-x\|} is bounded below by a positive constant: no hope for anything like Brouwer’s fixed point theorem.

It’s hard to see what an obstruction to bi-Lipschitz equivalence could be: there are no cusps, nothing is fractal-like, dimensions sort of agree (both infinite), topologically they are the same… Here is a possible direction of attack (from Geometric Nonlinear Functional Analysis by Benyamini and Lindenstrauss). If a bi-Lipschitz map {f\colon B\rightarrow S} exists, then the composition {f^{-1}\circ R\circ f} is a Lipschitz involution of {B} with displacement bounded from below. So, if such a map could be ruled out, the problem would be answered in the negative. As far as I know, it remains open.

Irrational sunflowers

A neat way to visualize a real number {\alpha} is to make a sunflower out of it. This is an arrangement of points with polar angles {2 \pi \alpha k} and polar radii {\sqrt{k}} (so that the concentric disks around the origin get the number of points proportional to their area). The prototypical sunflower has {\alpha=(\sqrt{5}+1)/2}, the golden ratio. This is about the most uniform arrangement of points within a disk that one can get.

Golden ratio sunflower
Golden ratio sunflower

But nothing stops us from using other numbers. The square root of 5 is not nearly as uniform, forming distinct spirals.

Square root of 5
Square root of 5

The number {e} begins with spirals, but quickly turns into something more uniform.

Euler's sunflower
Euler’s sunflower

The number {\pi} has stronger spirals: seven of them, due to {\pi\approx 22/7} approximation.

pi sunflower
pi sunflower

Of course, if {\pi} was actually {22/7}, the arrangement would have rays instead of spirals:

Rational sunflower: 22/7
Rational sunflower: 22/7

What if we used more points? The previous pictures have 500 points; here is {\pi} with {3000}. The new pattern has 113 rays: {\pi\approx 355/113}.

pi with 3000 points
pi with 3000 points

Apéry’s constant, after beginning with five spirals, refuses to form rays or spirals again even with 3000 points.

Apéry's constant, 3000 points
Apéry’s constant, 3000 points

The images were made with Scilab as follows, with an offset by 1/2 in the polar radius to prevent the center from sticking out too much.

n = 500
alpha = (sqrt(5)+1)/2
r = sqrt([1:n]-1/2)  
theta = 2*%pi*alpha*[1:n]
plot(r.*cos(theta), r.*sin(theta), '*');
set(gca(), "isoview", "on")