Measure-distanced subsets of an interval

Suppose {(X, d)} is a bounded metric space in which we want to find {n} points at safe distance {\delta} from one other: {d(x_j, x_k)\ge \delta} for all {j\ne k}. Let {\delta_n} be the greatest value of {\delta} for which this is possible (technically, the supremum which might not be attained). We have {\mathrm{diam}\,(X) = \delta_2 \ge \delta_3\ge \cdots}.

The sequence {\delta_n} tends to zero if and only if {X} is totally bounded. From now on, consider a specific metric space which is bounded but not totally bounded: the space {X} of all measurable subsets of {[0, 1]} with the metric {d(A, B)=|A\triangle B|}, the measure of the symmetric difference. More specifically, the elements of {X} are equivalence classes of sets that differ by a null subset. Another way to think of it is the subset of {L^1([0, 1])} which consists of functions that attain only the values {0} and {1}. In this form we identify a set {A} with its characteristic function {\chi_A}. The measure space {[0, 1]} could be replaced by any isomorphic probability space, of which there are many.

The space {X} contains an infinite subset {\{E_1, E_2, \dots \}} with {\delta=1/2} separation: namely, let {E_k} be the set of all numbers {x\in [0, 1]} such that the fractional part of {2^k x} is less than {1/2}; or equivalently, the {k}th binary digit of {x} is zero. Each symmetric difference {E_j\triangle E_k} has measure exactly {1/2}. Therefore, {\delta_n\ge 1/2} for every {n}.

Upper bound for distances

To get an upper bound for {\delta_n}, suppose {\{E_1, \dots, E_n\}} are {\delta}-separated. Let {\chi_k} be the characteristic function of {E_k}. By assumption, {\int_0^1 |\chi_k-\chi_j|\ge \delta} whenever {1\le k < j \le n}. Summing over all such pairs {(k, j)} shows that {\int_0^1 \sum_{k<j} |\chi_k-\chi_j|\ge \delta n(n-1)/2}. Fix a point {x\in [0, 1]} and let {p} be the number of sets among {E_1, \dots, E_n} that contain {x}. An easy counting argument shows {\sum_{k<j} |\chi_k(x)-\chi_j(x)| = p(n-p)}. The latter quantity is bounded by {r(n-r)} where {r = \lceil n/2\rceil}. Since this bound holds for every {x}, it follows that {\delta n(n-1)/2 \le r(n-r)}, which simplifies to

{\displaystyle \delta \le \frac{r}{2r-1}}, where {r = \lceil n/2\rceil}.

For example, {\delta_4\le \delta_3\le 2/3}, {\delta_6\le \delta_5\le 3/5}, {\delta_8\le \delta_7\le 4/7}, and so on. Are these estimates sharp? For an affirmative answer it would be enough to demonstrate that {\delta_{2r}\ge r/(2r-1)} by finding a suitable collection of {2r} subsets. Note that sharpness requires that almost every point of {[0, 1]} be covered by exactly {r} of these subsets, and that all pairwise distances between sets must be equal to {r/(2r-1)}.

Sharpness of the upper bound

Four subsets with {2/3} separation are easy to find: {[0,1/3]}, {[1/3, 2/3]}, {[2/3, 1]}, and {[0, 1]}. Thus, {\delta_4=2/3}.

To find six subsets with {3/5} separation, some preparation is helpful.

For any set {A}, the map {E\mapsto E\triangle A} is an isometry of {X} onto {X}. So, there is no loss of generality in assuming that one of our {2r} sets is {[0, 1]}. Then we are to find {E_1,\dots, E_{2r-1}} of measure {1-r/(2r-1) = (r-1)/(2r-1)}. In order to have the specified distances, the measure of intersection of any two sets must be {(r-2)/(4r-2)}.

For example, if {r=2}, the above says we need three disjoint sets of measure {1/3}, which is a trivial thing to find. For {r>2} the sets will no longer be disjoint.

Discretization of the problem

A natural way to proceed is to partition {[0, 1]} into {4r-2} equal subintervals (or other sets of equal measure). Then we look for {2r-1} sets, each of which consists of {(4r-2)(r-1)/(2r-1) = 2r-2} subintervals. Their pairwise intersections must consist of {(4r-2)(r-2)/(4r-2) = r-2} subintervals.

The case {r=2} was done above. If {r=3}, we want to use {10} subintervals to form {5} sets with {4} subintervals in each. Pairwise intersections must consist of precisely {1} subinterval. Encoding subintervals by their indices 0-9, we can describe such sets as 0123, 3456, 6780, 1479, 2589. And since 10 is a triangular number, there is a neat illustration of these five sets: three sides of the triangle, and two “stars” connecting its center to the sides.

I admit it could be neater

There is no need to translate this back into sets formed by subintervals: as long as the combinatorial problem has a solution, we have the desired conclusion. In this case, the conclusion is {\delta_6=3/5}.

Next up, {r=4}. Following the above method, we partition {[0, 1]} into {4r-2 = 14} equal subintervals and look for {7} sets, each of which consists of {6} subintervals. Pairwise intersections must consist of {2} subintervals each. Can this be done?

Think of a cube. It has {8} vertices and {6} faces, and {8+6=14}. In the set {faces and vertices}, we choose {7} subsets of cardinality {6} as follows:

  • {face, all its vertices, and the opposite face}. There are 6 such sets. Any two of them have exactly 2 elements in common.
  • {all faces} is our 7th set, it has 2 elements in common with each of the above.

Thus, {\delta_8 = 4/7}. But it is concerning that {\delta_6} and {\delta_8} required different combinatorial constructions. What happens next?

For {r=5}, we partition {[0, 1]} into {18} equal subintervals and look for {9} sets, with {8} subintervals in each. Pairwise intersections must consist of {3} subintervals. I don’t see any neat construction here.

Summary so far

The results obtained so far for {r=2, 3, 4} can be summarized as follows: there exists a 0-1 matrix {B_r} of size {(2r-1)\times (4r-2)} in which the dot product of two rows is {r-2} when the rows are distinct and {2r-2} otherwise. Specifically,

{\displaystyle B_2 = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix} }

{\displaystyle B_3 = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ \end{pmatrix}}

The matrix {B_4} is best written in block form {[F | V]} according to the faces-vertices interpretation. Here

{\displaystyle F = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}} and {\displaystyle V = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ \end{pmatrix}}

Does {B_5} exist? Perhaps not.

Combinatorics of finite sets

In combinatorial terms, we are looking for a {k}-uniform {\ell}-intersecting family of {m} subsets of {[n]}, with specific values

{k = 2r-2}, {\ell=r-2}, {m=2r-1}, {n=4r-2}.

Explanation of terms: {[n]=\{1, 2, \dots, n\}}, being {k}-uniform means every set has {k} elements, and being {\ell}-intersecting means all pairwise intersections have {\ell} elements.

Thus began my fishing expedition into the literature on combinatorics of finite sets. Surprisingly, it ended with a pivot back to where it started: measurable subsets of an interval.

Fractional intersecting uniform families

A recent (2018) preprint Uniform sets in a family with restricted intersections (Yandong Bai, Binlong Li, Jiuqiang Liu, Shenggui Zhang) relates the combinatorial problem back to subsets of an interval. I quote the relevant result (Theorem 2.5) with slight changes of notation.


Let {X=[0,n]} be the real interval and let {\lambda} be the Lebesgue measure on {X}. If {\mathcal{F}=\{F_1,\ldots,F_m\}} is a family of subsets of {X} such that

  1. {\lambda(F_i)=k} for all {i\in [m]}, and
  2. {\lambda(F_i\cap F_j)=\ell} for all distinct {i,j\in[m]},

then we call {\mathcal{F}} a fractional {\ell}-intersecting {k}-uniform {m}-family of {X}. For given real numbers {n,\ell} and integer {m}, let {\kappa_{\ell}^{\text{frac}}(n,m)} be the largest real number {k} such that there exists a fractional {\ell}-intersecting {k}-uniform {m}-family of {[0,n]}.

Theorem 2.5 Let {n>\ell\ge 0} be two real numbers and let {m\ge 1} be an integer. Then

{\displaystyle \kappa_{\ell}^{\text{frac}}(n,m)={m-1\choose s-1}\alpha+{m-1\choose t-1}\beta, }

where {\displaystyle s=\left\lfloor\frac{1+\sqrt{1+4m(m-1)\ell/n}}{2}\right\rfloor}, {\displaystyle t=\left\lceil\frac{1+\sqrt{1+4m(m-1)\ell/n}}{2}\right\rceil}

and {(\alpha,\beta)} is the solution of the system

{\displaystyle \binom{m}{s}\alpha+\binom{m}{t}\beta=n}

{\displaystyle \binom{m-2}{s-2}\alpha+\binom{m-2}{t-2}\beta=\ell}.


This looks complicated, but with our values {\ell=r-2}, {m=2r-1}, {n=4r-2} both {s} and {t} turn out to be equal to {r-1}. This is an exceptional case when the linear system is degenerate: it amounts to {\alpha+\beta = (4r-2)/\binom{2r-1}{r-1}}. The final result is {\kappa_{\ell}^{\text{frac}}(n,m)=2r-2} as we wanted. Yay.

So, the upper bounds for distances are indeed sharp, meaning that {n} subsets of the unit interval can be separated by measure-distance of {r/(2r-1)}, where {r = \lceil n/2\rceil}. But dividing the intervals into {4r-2} equal parts may not be enough to construct such subsets: in general one has to use a larger number of parts, some integer multiple of {4r-2}.

The authors remark that if {\alpha, \beta} are integers, then a partition with {k=\kappa^{\text{frac}}} exists in the combinatorial realm of {\{1, \dots, n\}}. However, this is not “if and only if”. Indeed, {\alpha+\beta = (4r-2)/\binom{2r-1}{r-1}} which takes integer values when {r=1, 2, 3}, and is fractional (less than 1) when {r\ge 4}. But we do have a combinatorial solution when {r=4}. So it is not quite clear for what values of {r} there exists a {(2r-2)}-uniform {(r-2)}-intersecting family of {2r-1} subsets of {[4r-2]}. Perhaps an answer could be found by following the construction for this specific case, when the formulas magically simplify.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.