Bipartite metric spaces provide simple examples of highly non-Euclidean metric spaces. Let and be two disjoint abstract sets, say, with and points respectively. Define a metric on the union as follows, where are some positive numbers:

The distance between distinct elements of is .

The distance between distinct elements of is .

The distance between two elements from different sets is .

The triangle inequality is satisfied provided that . When is large, the space is not very interesting: it is approximately the union of two simplexes placed at some large distance from each other. On the other hand, when we encounter a bizarre situation: every point of is a “midpoint” between any two distinct points of , and vice versa.

A function is called a positive kernel if the matrix is positive semidefinite for any choice of and . A classical theorem of Schoenberg says that a metric space is isometric to a subset of a Hilbert space if and only if is a positive kernel for all . One may ask: is there any function such that is a positive kernel on for every metric space ?

Suppose is such a function. Applying it to the bipartite metric defined above, we find that the following matrix must be positive semidefinite (the case , shown; in general the diagonal blocks have sizes and ):

Test the PSD property against vectors of the form , that is entries equal to some number followed by entries equal to some number . The result is the quadratic form where , , and .

For this quadratic form to be positive semidefinite, we need and to hold. The inequality amounts to and , because can be any positive integer. Simply put, must be nonnegative.

The inequality simplifies to

When and , we obtain ; so, is maximized at . Letting yields . What does this mean?

If , the inequality amounts to log-convexity of , and there are lots of log-convex functions. But crucially, our only constraint on in the construction of a bipartite metric space is . In the special case we find that whenever (and ). On one hand, must be nonincreasing because all are allowed. On the other hand, implies that the sequence is nondecreasing for . These conditions can be satisfied together only if is constant on .

And there you have it, a complete description of functions that transform every distance matrix into a positive semidefinite matrix: on , and . It is immediate that such indeed works for all metric spaces, not just bipartite ones.

Graphs are often encoded by their adjacency matrix: a symmetric matrix where in the entry means there is an edge between vertices labeled and . Another useful encoding is the incidence matrix, in which rows correspond to edges and columns to vertices (or the other way around). Having in the entry means that th edge is incident to th vertex. Simply put, each row contains exactly two nonzero entries, which indicate the endpoints of an edge.

An incidence matrix is generally rectangular. It is the square matrix when the graph has as many edges as vertices. For example, the 4-cycle has incidence matrix

(up to relabeling of vertices). This matrix is not invertible. Indeed, , which is a reflection of the fact that the graph is bipartite. Indeed, for any bipartite graph the vector describing a 2-coloring of the vertices belongs to the kernel of the incidence matrix.

The incidence matrix of an odd cycle is invertible, however. What other such “invertible” graphs exist? Clearly, the disjoint union of “invertible” graphs is itself invertible, so we can focus on connected graphs.

There are no invertible graphs on 1 or 2 vertices. For 3 through 8 vertices, exhaustive search yields the following counts:

1 graph on 3 vertices (3-cycle)

1 graph on 4 vertices (3-cycle with an edge attached)

4 graphs on 5 vertices

8 graphs on 6 vertices

23 graphs on 7 vertices

55 graphs on 8 vertices

The numbers match the OEIS sequence A181688 but it’s not clear whether this is for real or a coincidence. The brute force search takes long for more than 8 vertices. It’s probably better to count such graphs by using their (folklore?) characterization as “trees planted on an odd cycle”, described by Chris Godsil on MathOverflow.

Some examples:

The examples were found and plotted by this Python script.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations
def is_invertible(graph, n):
matrix = []
for i in range(n):
row = [0]*n
row[graph[i][0]] = 1
row[graph[i][1]] = 1
matrix.append(row)
return abs(np.linalg.det(matrix)) > 0.5
n = 6 # number of vertices
graphs = combinations(combinations(range(n), 2), n)
inv = []
for g in graphs:
if is_invertible(g, n):
gr = nx.Graph(g)
if nx.is_connected(gr) and not any([nx.is_isomorphic(gr, gr2) for gr2 in inv]):
inv.append(gr)
nx.draw_networkx(gr)
plt.savefig("graph{}_{}.png".format(n, len(inv)))
plt.clf()
print("There are {} connected invertible graphs with {} vertices".format(len(inv), n))

Given a bunch of words, specifically the names of divisions of plants and bacteria, I’m going to use a truncated Singular Value Decomposition to separate bacteria from plants. This isn’t a novel or challenging task, but I like the small size of the example. A similar type of examples is classifying a bunch of text fragments by keywords, but that requires a lot more setup.

Here are 33 words to classify: acidobacteria, actinobacteria, anthocerotophyta, aquificae, bacteroidetes, bryophyta, charophyta, chlamydiae, chloroflexi, chlorophyta, chrysiogenetes, cyanobacteria, cycadophyta, deferribacteres, deinococcus-thermus, dictyoglomi, firmicutes, fusobacteria, gemmatimonadetes, ginkgophyta, gnetophyta, lycopodiophyta, magnoliophyta, marchantiophyta, nitrospirae, pinophyta, proteobacteria, pteridophyta, spirochaetes, synergistetes, tenericutes, thermodesulfobacteria, thermotogae.

As is, the task is too easy: we can recognize the -phyta ending in the names of plant divisions. Let’s jumble the letters within each word:

Not so obvious anymore, is it? Recalling the -phyta ending, we may want to focus on the presence of letter y, which is not so common otherwise. Indeed, the count of y letters is a decent prediction: on the following plot, green asterisks are plants and red are bacteria, the vertical axis is the count of letter Y in each word.

However, the simple count fails to classify several words: having 1 letter Y may or may not mean a plant. Instead, let’s consider the entire matrix of letter counts (here it is in a spreadsheet: 33 rows, one for each word; 26 columns, one for each letter.) So far, we looked at its 25th column in isolation from the rest of the matrix. Truncated SVD uncovers the relations between columns that are not obvious but express patterns such as the presence of letters p,h,t,a along with y. Specifically, write with unitary and diagonal. Replace all entries of , except the four largest ones, by zeros. The result is a rank-4 diagonal matrix . The product is a rank-4 matrix, which keeps some of the essential patterns in but de-emphasizes the accidental.

The entries of are no longer integers. Here is a color-coded plot of its 25th column, which still somehow corresponds to letter Y but takes into account the other letters with which it appears.

Plants are now cleanly separated from bacteria. Plots made in MATLAB as follows:

A=[3,1,2,1,1,0,0,0,2,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0;3,1,2,0,1,0,0,0,2,0,0,0,0,1,1,0,0,1,0,2,0,0,0,0,0,0;2,0,1,0,1,0,0,2,0,0,0,0,0,1,3,1,0,1,0,3,0,0,0,0,1,0;2,0,1,0,1,1,0,0,2,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0;1,1,1,1,3,0,0,0,1,0,0,0,0,0,1,0,0,1,1,2,0,0,0,0,0,0;1,1,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,2,0;2,0,1,0,0,0,0,2,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,1,0;2,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0;0,0,1,0,1,1,0,1,1,0,0,2,0,0,2,0,0,1,0,0,0,0,0,1,0,0;1,0,1,0,0,0,0,2,0,0,0,1,0,0,2,1,0,1,0,1,0,0,0,0,1,0;0,0,1,0,3,0,1,1,1,0,0,0,0,1,1,0,0,1,2,1,0,0,0,0,1,0;3,1,2,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,0,1,0;2,0,2,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,2,0;1,1,1,1,4,1,0,0,1,0,0,0,0,0,0,0,0,3,1,1,0,0,0,0,0,0;0,0,3,1,2,0,0,1,1,0,0,0,1,1,2,0,0,1,2,1,2,0,0,0,0,0;0,0,1,1,0,0,1,0,2,0,0,1,1,0,2,0,0,0,0,1,0,0,0,0,1,0;0,0,1,0,1,1,0,0,2,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0;2,1,1,0,1,1,0,0,1,0,0,0,0,0,1,0,0,1,1,1,1,0,0,0,0,0;2,0,0,1,3,0,1,0,1,0,0,0,3,1,1,0,0,0,1,2,0,0,0,0,0,0;1,0,0,0,0,0,2,1,1,0,1,0,0,1,1,1,0,0,0,1,0,0,0,0,1,0;1,0,0,0,1,0,1,1,0,0,0,0,0,1,1,1,0,0,0,2,0,0,0,0,1,0;1,0,1,1,0,0,0,1,1,0,0,1,0,0,3,2,0,0,0,1,0,0,0,0,2,0;2,0,0,0,0,0,1,1,1,0,0,1,1,1,2,1,0,0,0,1,0,0,0,0,1,0;3,0,1,0,0,0,0,2,1,0,0,0,1,1,1,1,0,1,0,2,0,0,0,0,1,0;1,0,0,0,1,0,0,0,2,0,0,0,0,1,1,1,0,2,1,1,0,0,0,0,0,0;1,0,0,0,0,0,0,1,1,0,0,0,0,1,1,2,0,0,0,1,0,0,0,0,1,0;2,1,1,0,2,0,0,0,1,0,0,0,0,0,2,1,0,2,0,2,0,0,0,0,0,0;1,0,0,1,1,0,0,1,1,0,0,0,0,0,1,2,0,1,0,2,0,0,0,0,1,0;1,0,1,0,2,0,0,1,1,0,0,0,0,0,1,1,0,1,2,1,0,0,0,0,0,0;0,0,0,0,3,0,1,0,1,0,0,0,0,1,0,0,0,1,3,2,0,0,0,0,1,0;0,0,1,0,3,0,0,0,1,0,0,0,0,1,0,0,0,1,1,2,1,0,0,0,0,0;2,1,1,1,3,1,0,1,1,0,0,1,1,0,2,0,0,2,1,2,1,0,0,0,0,0;1,0,0,0,2,0,1,1,0,0,0,0,1,0,2,0,0,1,0,2,0,0,0,0,0,0];
[U, D, V] = svd(A);
D4 = D .* (D >= D(4, 4));
A4 = U * D4 * V';
plants = (A4(:, 25) > 0.8);
bacteria = (A4(:, 25) <= 0.8);
% the rest is output
words = 1:33;
hold on
plot(words(plants), A4(plants, 25), 'g*');
plot(words(bacteria), A4(bacteria, 25), 'r*');

A matrix with real entries has positive characteristic polynomial if for all real . For example,

has this property: . For brevity, let’s say that is a PCP matrix.

Clearly, any PCP matrix must be of even size. Incidentally, this implies that the algebraists’ characteristic polynomial and the analysts’ characteristic polynomial coincide.

In general, there is no reason for the PCP property to be preserved under either addition or multiplication of matrices. But there are some natural rings of PCP matrices, such as

complex numbers

quaternions

A matrix is PCP if and only if its eigenvalues are either complex or repeated. This is equivalent to . In general, a matrix of even size is PCP if and only if it has no real eigenvalues of odd algebraic multiplicity.

Let’s say that a set contains an singular matrix if there exists an matrix with determinant zero whose entries are distinct elements of . For example, the set of prime numbers does not contain any singular matrices; however, for every it contains infinitely many singular matrices.

I don’t know of an elementary proof of the latter fact. By a 1939 theorem of van der Corput, the set of primes contains infinitely many progressions of length 3. (Much more recently, Green and Tao replaced 3 with an arbitrary .) If every row of a matrix begins with a 3-term arithmetic progression, the matrix is singular.