Let be a finite metric space, say of cardinality
. For
we cannot expect to have an isometric embedding of
into a Euclidean space; the 4-cycle with path metric cannot be so embedded. So we ask for an embedding
with reasonable distortion
. The distortion is defined as the product of the Lipschitz norms of
and of its inverse,
So, if and only if
scales all distances by the same factor (which we can then make equal to 1 by rescaling the target space).
It is easy to isometrically embed into the normed vector space
which consists of all functions
with the maximum norm. This (Kuratowski) embedding simply sends each point
into the function
. It is easy to see that
, attained when
. The identity map from
to
has distortion
, and this is the price
we have to pay.
In 1985 Bourgain achieved a dramatic improvement: , which is the best possible bound apart from the value of the absolute constant
. (It can be taken to be 128.) The starting point is remarkably simple: instead of
, which measures the distances of
to other points, Bourgain used
where
runs over all nonempty proper subsets of
. The fact that the target space now has
dimensions is of no theoretical consequence, since the image of
is always contained in an
dimensional hyperplane. (From the practical viewpoint, even dimension
is much too high, but the Johnson-Lindenstrauss flattening lemma reduces it greatly.) We can hope to improve upon the Kuratowski embedding because the geometry of
is now measured by more sophisticated probes than singletons.
As before, the map is easily seen to be an isometric embedding into
where
is the set of all nonempty proper subsets of
. But using the identity map onto
is out of question: its Lipschitz constant is almost
. Instead, Bourgain applies the diagonal operator
defined by
To understand this better, write as the direct sum of spaces
where
is the family of all
-point subsets of
. The dimension of
is exactly
, and the square root appears because we are after the
norm.
But it would be better to say that appears because
is halfway between
and
. Indeed, Bourgain applies
twice: first as a map
and then as
. The point is that
is easy to estimate: decomposing
by cardinality as above, we find that the
norm of
on
-subsets is at most
, due to the strategically placed factor
. Unfortunately, the harmonic series diverges, but fortunately it diverges slowly, yielding the estimate
. This is where the logarithm comes from.
The hard part of the proof is the lower bound with universal
. I will not say anything about that.
Let’s test this on the 4-cycle with the path metric. Split by subset cardinality
, as before. Let
denote the vertices. The image of the vertex
is:
- in
:
: this is the Kuratowski embedding. The singletons are ordered a,b,c,d.
- in
:
where the sets are ordered ab,ac,ad,bc,bd,cd.
- in
:
. Sets are ordered abc,abd,acd,bcd.
Now we apply , which means multiplication by
. The resulting vector
represents the vertex . In much the same way, the vertices b and c are mapped into
and
respectively. And so, while
. This is compared to
and
. The distortion
is only slightly worse than for the optimal embedding of
(into the vertices of a square), which has
. Well, we could hardly expect a general algorithm like this to find the absolutely best way to embed each finite metric space.
Apart from the constant, Bourgain’s theorem cannot be improved for general metric spaces. However, there are better embedding results for special classes of metric spaces (doubling, ultrametric etc.)