Let be a finite metric space, say of cardinality . For we cannot expect to have an isometric embedding of into a Euclidean space; the 4-cycle with path metric cannot be so embedded. So we ask for an embedding with reasonable distortion . The distortion is defined as the product of the Lipschitz norms of and of its inverse,

So, if and only if scales all distances by the same factor (which we can then make equal to 1 by rescaling the target space).

It is easy to isometrically embed into the normed vector space which consists of all functions with the maximum norm. This (Kuratowski) embedding simply sends each point into the function . It is easy to see that , attained when . The identity map from to has distortion , and this is the price we have to pay.

**In 1985 **Bourgain achieved a dramatic improvement: , which is the best possible bound apart from the value of the absolute constant . (It can be taken to be 128.) The starting point is remarkably simple: instead of , which measures the distances of to other points, Bourgain used where runs over all nonempty proper subsets of . The fact that the target space now has dimensions is of no theoretical consequence, since the image of is always contained in an dimensional hyperplane. (From the practical viewpoint, even dimension is much too high, but the Johnson-Lindenstrauss flattening lemma reduces it greatly.) We can hope to improve upon the Kuratowski embedding because the geometry of is now measured by more sophisticated probes than singletons.

As before, the map is easily seen to be an isometric embedding into where is the set of all nonempty proper subsets of . But using the identity map onto is out of question: its Lipschitz constant is almost . Instead, Bourgain applies the diagonal operator defined by

To understand this better, write as the direct sum of spaces where is the family of all -point subsets of . The dimension of is exactly , and the square root appears because we are after the norm.

But it would be better to say that appears because is halfway between and . Indeed, Bourgain applies **twice**: first as a map and then as . The point is that is easy to estimate: decomposing by cardinality as above, we find that the norm of on -subsets is at most , due to the strategically placed factor . Unfortunately, the harmonic series diverges, but fortunately it diverges slowly, yielding the estimate . This is where the logarithm comes from.

The hard part of the proof is the lower bound with universal . I will not say anything about that.

Let’s test this on the 4-cycle with the path metric. Split by subset cardinality , as before. Let denote the vertices. The image of the vertex is:

- in : : this is the Kuratowski embedding. The singletons are ordered a,b,c,d.
- in : where the sets are ordered ab,ac,ad,bc,bd,cd.
- in : . Sets are ordered abc,abd,acd,bcd.

Now we apply , which means multiplication by . The resulting vector

represents the vertex . In much the same way, the vertices b and c are mapped into

and

respectively. And so, while . This is compared to and . The distortion is only slightly worse than for the optimal embedding of (into the vertices of a square), which has . Well, we could hardly expect a general algorithm like this to find the absolutely best way to embed each finite metric space.

Apart from the constant, Bourgain’s theorem cannot be improved for general metric spaces. However, there are better embedding results for special classes of metric spaces (doubling, ultrametric etc.)