## Graphs with an invertible incidence matrix

Graphs are often encoded by their adjacency matrix: a symmetric matrix where ${1}$ in the entry ${(i,j)}$ means there is an edge between vertices labeled ${i}$ and ${j}$. Another useful encoding is the incidence matrix, in which rows correspond to edges and columns to vertices (or the other way around). Having ${1}$ in the entry ${(i,j)}$ means that ${i}$th edge is incident to ${j}$th vertex. Simply put, each row contains exactly two nonzero entries, which indicate the endpoints of an edge.

An incidence matrix is generally rectangular. It is the square matrix when the graph has as many edges as vertices. For example, the 4-cycle has incidence matrix

${M = \displaystyle \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}}$

(up to relabeling of vertices). This matrix is not invertible. Indeed, ${M(1,-1,1,-1)^T=0}$, which is a reflection of the fact that the graph is bipartite. Indeed, for any bipartite graph the ${\pm 1}$ vector describing a 2-coloring of the vertices belongs to the kernel of the incidence matrix.

The incidence matrix of an odd cycle is invertible, however. What other such “invertible” graphs exist? Clearly, the disjoint union of “invertible” graphs is itself invertible, so we can focus on connected graphs.

There are no invertible graphs on 1 or 2 vertices. For 3 through 8 vertices, exhaustive search yields the following counts:

• 1 graph on 3 vertices (3-cycle)
• 1 graph on 4 vertices (3-cycle with an edge attached)
• 4 graphs on 5 vertices
• 8 graphs on 6 vertices
• 23 graphs on 7 vertices
• 55 graphs on 8 vertices

The numbers match the OEIS sequence A181688 but it’s not clear whether this is for real or a coincidence. The brute force search takes long for more than 8 vertices. It’s probably better to count such graphs by using their (folklore?) characterization as “trees planted on an odd cycle”, described by Chris Godsil on MathOverflow.

Some examples:

The examples were found and plotted by this Python script.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

def is_invertible(graph, n):
matrix = []
for i in range(n):
row = [0]*n
row[graph[i][0]] = 1
row[graph[i][1]] = 1
matrix.append(row)
return abs(np.linalg.det(matrix)) > 0.5

n = 6   # number of vertices
graphs = combinations(combinations(range(n), 2), n)
inv = []
for g in graphs:
if is_invertible(g, n):
gr = nx.Graph(g)
if nx.is_connected(gr) and not any([nx.is_isomorphic(gr, gr2) for gr2 in inv]):
inv.append(gr)
nx.draw_networkx(gr)
plt.savefig("graph{}_{}.png".format(n, len(inv)))
plt.clf()

print("There are {} connected invertible graphs with {} vertices".format(len(inv), n))

## Classifying words: a tiny example of SVD truncation

Given a bunch of words, specifically the names of divisions of plants and bacteria, I’m going to use a truncated Singular Value Decomposition to separate bacteria from plants. This isn’t a novel or challenging task, but I like the small size of the example. A similar type of examples is classifying a bunch of text fragments by keywords, but that requires a lot more setup.

Here are 33 words to classify: acidobacteria, actinobacteria, anthocerotophyta, aquificae, bacteroidetes, bryophyta, charophyta, chlamydiae, chloroflexi, chlorophyta, chrysiogenetes, cyanobacteria, cycadophyta, deferribacteres, deinococcus-thermus, dictyoglomi, firmicutes, fusobacteria, gemmatimonadetes, ginkgophyta, gnetophyta, lycopodiophyta, magnoliophyta, marchantiophyta, nitrospirae, pinophyta, proteobacteria, pteridophyta, spirochaetes, synergistetes, tenericutes, thermodesulfobacteria, thermotogae.

As is, the task is too easy: we can recognize the -phyta ending in the names of plant divisions. Let’s jumble the letters within each word:

Not so obvious anymore, is it? Recalling the -phyta ending, we may want to focus on the presence of letter y, which is not so common otherwise. Indeed, the count of y letters is a decent prediction: on the following plot, green asterisks are plants and red are bacteria, the vertical axis is the count of letter Y in each word.

However, the simple count fails to classify several words: having 1 letter Y may or may not mean a plant. Instead, let’s consider the entire matrix ${A}$ of letter counts (here it is in a spreadsheet: 33 rows, one for each word; 26 columns, one for each letter.) So far, we looked at its 25th column in isolation from the rest of the matrix. Truncated SVD uncovers the relations between columns that are not obvious but express patterns such as the presence of letters p,h,t,a along with y. Specifically, write ${A = UDV^T}$ with ${U,V}$ unitary and ${D}$ diagonal. Replace all entries of ${D}$, except the four largest ones, by zeros. The result is a rank-4 diagonal matrix ${D_4}$. The product ${A_4 = UD_4V^T}$ is a rank-4 matrix, which keeps some of the essential patterns in ${A}$ but de-emphasizes the accidental.

The entries of ${D_4}$ are no longer integers. Here is a color-coded plot of its 25th column, which still somehow corresponds to letter Y but takes into account the other letters with which it appears.

Plants are now cleanly separated from bacteria. Plots made in MATLAB as follows:


A=[3,1,2,1,1,0,0,0,2,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0;3,1,2,0,1,0,0,0,2,0,0,0,0,1,1,0,0,1,0,2,0,0,0,0,0,0;2,0,1,0,1,0,0,2,0,0,0,0,0,1,3,1,0,1,0,3,0,0,0,0,1,0;2,0,1,0,1,1,0,0,2,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0;1,1,1,1,3,0,0,0,1,0,0,0,0,0,1,0,0,1,1,2,0,0,0,0,0,0;1,1,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,2,0;2,0,1,0,0,0,0,2,0,0,0,0,0,0,1,1,0,1,0,1,0,0,0,0,1,0;2,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,0;0,0,1,0,1,1,0,1,1,0,0,2,0,0,2,0,0,1,0,0,0,0,0,1,0,0;1,0,1,0,0,0,0,2,0,0,0,1,0,0,2,1,0,1,0,1,0,0,0,0,1,0;0,0,1,0,3,0,1,1,1,0,0,0,0,1,1,0,0,1,2,1,0,0,0,0,1,0;3,1,2,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,0,0,1,0;2,0,2,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,2,0;1,1,1,1,4,1,0,0,1,0,0,0,0,0,0,0,0,3,1,1,0,0,0,0,0,0;0,0,3,1,2,0,0,1,1,0,0,0,1,1,2,0,0,1,2,1,2,0,0,0,0,0;0,0,1,1,0,0,1,0,2,0,0,1,1,0,2,0,0,0,0,1,0,0,0,0,1,0;0,0,1,0,1,1,0,0,2,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0;2,1,1,0,1,1,0,0,1,0,0,0,0,0,1,0,0,1,1,1,1,0,0,0,0,0;2,0,0,1,3,0,1,0,1,0,0,0,3,1,1,0,0,0,1,2,0,0,0,0,0,0;1,0,0,0,0,0,2,1,1,0,1,0,0,1,1,1,0,0,0,1,0,0,0,0,1,0;1,0,0,0,1,0,1,1,0,0,0,0,0,1,1,1,0,0,0,2,0,0,0,0,1,0;1,0,1,1,0,0,0,1,1,0,0,1,0,0,3,2,0,0,0,1,0,0,0,0,2,0;2,0,0,0,0,0,1,1,1,0,0,1,1,1,2,1,0,0,0,1,0,0,0,0,1,0;3,0,1,0,0,0,0,2,1,0,0,0,1,1,1,1,0,1,0,2,0,0,0,0,1,0;1,0,0,0,1,0,0,0,2,0,0,0,0,1,1,1,0,2,1,1,0,0,0,0,0,0;1,0,0,0,0,0,0,1,1,0,0,0,0,1,1,2,0,0,0,1,0,0,0,0,1,0;2,1,1,0,2,0,0,0,1,0,0,0,0,0,2,1,0,2,0,2,0,0,0,0,0,0;1,0,0,1,1,0,0,1,1,0,0,0,0,0,1,2,0,1,0,2,0,0,0,0,1,0;1,0,1,0,2,0,0,1,1,0,0,0,0,0,1,1,0,1,2,1,0,0,0,0,0,0;0,0,0,0,3,0,1,0,1,0,0,0,0,1,0,0,0,1,3,2,0,0,0,0,1,0;0,0,1,0,3,0,0,0,1,0,0,0,0,1,0,0,0,1,1,2,1,0,0,0,0,0;2,1,1,1,3,1,0,1,1,0,0,1,1,0,2,0,0,2,1,2,1,0,0,0,0,0;1,0,0,0,2,0,1,1,0,0,0,0,1,0,2,0,0,1,0,2,0,0,0,0,0,0];
[U, D, V] = svd(A);
D4 = D .* (D >= D(4, 4));
A4 = U * D4 * V';
plants = (A4(:, 25) > 0.8);
bacteria = (A4(:, 25) <= 0.8);
% the rest is output
words = 1:33;
hold on
plot(words(plants), A4(plants, 25), 'g*');
plot(words(bacteria), A4(bacteria, 25), 'r*');


## Angel dust of matrices

A matrix ${A}$ with real entries has positive characteristic polynomial if ${\det(tI-A)\ge 0}$ for all real ${t}$. For example,

$\displaystyle A=\begin{pmatrix} 1 & -5 \\ 3 & -2 \end{pmatrix}$

has this property: ${\det (tI-A)=(t-1)(t+2)+15=t^2+t+13}$. For brevity, let’s say that ${A}$ is a PCP matrix.

Clearly, any PCP matrix must be of even size. Incidentally, this implies that the algebraists’ characteristic polynomial ${\det (tI-A)}$ and the analysts’ characteristic polynomial ${\det (A-tI)}$ coincide.

In general, there is no reason for the PCP property to be preserved under either addition or multiplication of matrices. But there are some natural rings of PCP matrices, such as

• complex numbers ${\begin{pmatrix} a & b \\ -b & a \end{pmatrix} }$
• quaternions ${\begin{pmatrix} a & b & c & d \\ -b & a & -d & c \\ -c & d & a & -b \\ -d & -c & b & a \end{pmatrix}}$

A ${2\times 2}$ matrix is PCP if and only if its eigenvalues are either complex or repeated. This is equivalent to ${(\mathrm{tr}\, A)^2 \le 4\det A}$. In general, a matrix of even size is PCP if and only if it has no real eigenvalues of odd algebraic multiplicity.

Is this post merely an excuse for 1995 flashback: Ангельская Пыль by Ария?

## Integer sets without singular matrices

Let’s say that a set $A\subset \mathbb N$ contains an $n\times n$ singular matrix if there exists an $n\times n$ matrix with determinant zero whose entries are distinct elements of $A$. For example, the set of prime numbers does not contain any $2\times 2$ singular matrices; however, for every $n\ge 3$ it contains infinitely many $n\times n$ singular matrices.

I don’t know of an elementary proof of the latter fact. By a 1939 theorem of van der Corput, the set of primes contains infinitely many progressions of length 3. (Much more recently, Green and Tao replaced 3 with an arbitrary $k$.) If every row of a matrix begins with a 3-term arithmetic progression, the matrix is singular.

In the opposite direction, one may want to see an example of an infinite set $A\subset \mathbb N$ that contains no singular matrices of any size. Here is one:
Continue reading “Integer sets without singular matrices”