## Measure-distanced subsets of an interval

Suppose ${(X, d)}$ is a bounded metric space in which we want to find ${n}$ points at safe distance ${\delta}$ from one other: ${d(x_j, x_k)\ge \delta}$ for all ${j\ne k}$. Let ${\delta_n}$ be the greatest value of ${\delta}$ for which this is possible (technically, the supremum which might not be attained). We have ${\mathrm{diam}\,(X) = \delta_2 \ge \delta_3\ge \cdots}$.

The sequence ${\delta_n}$ tends to zero if and only if ${X}$ is totally bounded. From now on, consider a specific metric space which is bounded but not totally bounded: the space ${X}$ of all measurable subsets of ${[0, 1]}$ with the metric ${d(A, B)=|A\triangle B|}$, the measure of the symmetric difference. More specifically, the elements of ${X}$ are equivalence classes of sets that differ by a null subset. Another way to think of it is the subset of ${L^1([0, 1])}$ which consists of functions that attain only the values ${0}$ and ${1}$. In this form we identify a set ${A}$ with its characteristic function ${\chi_A}$. The measure space ${[0, 1]}$ could be replaced by any isomorphic probability space, of which there are many.

The space ${X}$ contains an infinite subset ${\{E_1, E_2, \dots \}}$ with ${\delta=1/2}$ separation: namely, let ${E_k}$ be the set of all numbers ${x\in [0, 1]}$ such that the fractional part of ${2^k x}$ is less than ${1/2}$; or equivalently, the ${k}$th binary digit of ${x}$ is zero. Each symmetric difference ${E_j\triangle E_k}$ has measure exactly ${1/2}$. Therefore, ${\delta_n\ge 1/2}$ for every ${n}$.

## Upper bound for distances

To get an upper bound for ${\delta_n}$, suppose ${\{E_1, \dots, E_n\}}$ are ${\delta}$-separated. Let ${\chi_k}$ be the characteristic function of ${E_k}$. By assumption, ${\int_0^1 |\chi_k-\chi_j|\ge \delta}$ whenever ${1\le k < j \le n}$. Summing over all such pairs ${(k, j)}$ shows that ${\int_0^1 \sum_{k. Fix a point ${x\in [0, 1]}$ and let ${p}$ be the number of sets among ${E_1, \dots, E_n}$ that contain ${x}$. An easy counting argument shows ${\sum_{k. The latter quantity is bounded by ${r(n-r)}$ where ${r = \lceil n/2\rceil}$. Since this bound holds for every ${x}$, it follows that ${\delta n(n-1)/2 \le r(n-r)}$, which simplifies to

${\displaystyle \delta \le \frac{r}{2r-1}}$, where ${r = \lceil n/2\rceil}$.

For example, ${\delta_4\le \delta_3\le 2/3}$, ${\delta_6\le \delta_5\le 3/5}$, ${\delta_8\le \delta_7\le 4/7}$, and so on. Are these estimates sharp? For an affirmative answer it would be enough to demonstrate that ${\delta_{2r}\ge r/(2r-1)}$ by finding a suitable collection of ${2r}$ subsets. Note that sharpness requires that almost every point of ${[0, 1]}$ be covered by exactly ${r}$ of these subsets, and that all pairwise distances between sets must be equal to ${r/(2r-1)}$.

## Sharpness of the upper bound

Four subsets with ${2/3}$ separation are easy to find: ${[0,1/3]}$, ${[1/3, 2/3]}$, ${[2/3, 1]}$, and ${[0, 1]}$. Thus, ${\delta_4=2/3}$.

To find six subsets with ${3/5}$ separation, some preparation is helpful.

For any set ${A}$, the map ${E\mapsto E\triangle A}$ is an isometry of ${X}$ onto ${X}$. So, there is no loss of generality in assuming that one of our ${2r}$ sets is ${[0, 1]}$. Then we are to find ${E_1,\dots, E_{2r-1}}$ of measure ${1-r/(2r-1) = (r-1)/(2r-1)}$. In order to have the specified distances, the measure of intersection of any two sets must be ${(r-2)/(4r-2)}$.

For example, if ${r=2}$, the above says we need three disjoint sets of measure ${1/3}$, which is a trivial thing to find. For ${r>2}$ the sets will no longer be disjoint.

## Discretization of the problem

A natural way to proceed is to partition ${[0, 1]}$ into ${4r-2}$ equal subintervals (or other sets of equal measure). Then we look for ${2r-1}$ sets, each of which consists of ${(4r-2)(r-1)/(2r-1) = 2r-2}$ subintervals. Their pairwise intersections must consist of ${(4r-2)(r-2)/(4r-2) = r-2}$ subintervals.

The case ${r=2}$ was done above. If ${r=3}$, we want to use ${10}$ subintervals to form ${5}$ sets with ${4}$ subintervals in each. Pairwise intersections must consist of precisely ${1}$ subinterval. Encoding subintervals by their indices 0-9, we can describe such sets as 0123, 3456, 6780, 1479, 2589. And since 10 is a triangular number, there is a neat illustration of these five sets: three sides of the triangle, and two “stars” connecting its center to the sides.

There is no need to translate this back into sets formed by subintervals: as long as the combinatorial problem has a solution, we have the desired conclusion. In this case, the conclusion is ${\delta_6=3/5}$.

Next up, ${r=4}$. Following the above method, we partition ${[0, 1]}$ into ${4r-2 = 14}$ equal subintervals and look for ${7}$ sets, each of which consists of ${6}$ subintervals. Pairwise intersections must consist of ${2}$ subintervals each. Can this be done?

Think of a cube. It has ${8}$ vertices and ${6}$ faces, and ${8+6=14}$. In the set {faces and vertices}, we choose ${7}$ subsets of cardinality ${6}$ as follows:

• {face, all its vertices, and the opposite face}. There are 6 such sets. Any two of them have exactly 2 elements in common.
• {all faces} is our 7th set, it has 2 elements in common with each of the above.

Thus, ${\delta_8 = 4/7}$. But it is concerning that ${\delta_6}$ and ${\delta_8}$ required different combinatorial constructions. What happens next?

For ${r=5}$, we partition ${[0, 1]}$ into ${18}$ equal subintervals and look for ${9}$ sets, with ${8}$ subintervals in each. Pairwise intersections must consist of ${3}$ subintervals. I don’t see any neat construction here.

## Summary so far

The results obtained so far for ${r=2, 3, 4}$ can be summarized as follows: there exists a 0-1 matrix ${B_r}$ of size ${(2r-1)\times (4r-2)}$ in which the dot product of two rows is ${r-2}$ when the rows are distinct and ${2r-2}$ otherwise. Specifically,

${\displaystyle B_2 = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix} }$

${\displaystyle B_3 = \begin{pmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ \end{pmatrix}}$

The matrix ${B_4}$ is best written in block form ${[F | V]}$ according to the faces-vertices interpretation. Here

${\displaystyle F = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}}$ and ${\displaystyle V = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ \end{pmatrix}}$

Does ${B_5}$ exist? Perhaps not.

## Combinatorics of finite sets

In combinatorial terms, we are looking for a ${k}$-uniform ${\ell}$-intersecting family of ${m}$ subsets of ${[n]}$, with specific values

${k = 2r-2}$, ${\ell=r-2}$, ${m=2r-1}$, ${n=4r-2}$.

Explanation of terms: ${[n]=\{1, 2, \dots, n\}}$, being ${k}$-uniform means every set has ${k}$ elements, and being ${\ell}$-intersecting means all pairwise intersections have ${\ell}$ elements.

Thus began my fishing expedition into the literature on combinatorics of finite sets. Surprisingly, it ended with a pivot back to where it started: measurable subsets of an interval.

## Fractional intersecting uniform families

A recent (2018) preprint Uniform sets in a family with restricted intersections (Yandong Bai, Binlong Li, Jiuqiang Liu, Shenggui Zhang) relates the combinatorial problem back to subsets of an interval. I quote the relevant result (Theorem 2.5) with slight changes of notation.

Let ${X=[0,n]}$ be the real interval and let ${\lambda}$ be the Lebesgue measure on ${X}$. If ${\mathcal{F}=\{F_1,\ldots,F_m\}}$ is a family of subsets of ${X}$ such that

1. ${\lambda(F_i)=k}$ for all ${i\in [m]}$, and
2. ${\lambda(F_i\cap F_j)=\ell}$ for all distinct ${i,j\in[m]}$,

then we call ${\mathcal{F}}$ a fractional ${\ell}$-intersecting ${k}$-uniform ${m}$-family of ${X}$. For given real numbers ${n,\ell}$ and integer ${m}$, let ${\kappa_{\ell}^{\text{frac}}(n,m)}$ be the largest real number ${k}$ such that there exists a fractional ${\ell}$-intersecting ${k}$-uniform ${m}$-family of ${[0,n]}$.

Theorem 2.5 Let ${n>\ell\ge 0}$ be two real numbers and let ${m\ge 1}$ be an integer. Then

${\displaystyle \kappa_{\ell}^{\text{frac}}(n,m)={m-1\choose s-1}\alpha+{m-1\choose t-1}\beta, }$

where ${\displaystyle s=\left\lfloor\frac{1+\sqrt{1+4m(m-1)\ell/n}}{2}\right\rfloor}$, ${\displaystyle t=\left\lceil\frac{1+\sqrt{1+4m(m-1)\ell/n}}{2}\right\rceil}$

and ${(\alpha,\beta)}$ is the solution of the system

${\displaystyle \binom{m}{s}\alpha+\binom{m}{t}\beta=n}$

${\displaystyle \binom{m-2}{s-2}\alpha+\binom{m-2}{t-2}\beta=\ell}$.

This looks complicated, but with our values ${\ell=r-2}$, ${m=2r-1}$, ${n=4r-2}$ both ${s}$ and ${t}$ turn out to be equal to ${r-1}$. This is an exceptional case when the linear system is degenerate: it amounts to ${\alpha+\beta = (4r-2)/\binom{2r-1}{r-1}}$. The final result is ${\kappa_{\ell}^{\text{frac}}(n,m)=2r-2}$ as we wanted. Yay.

So, the upper bounds for distances are indeed sharp, meaning that ${n}$ subsets of the unit interval can be separated by measure-distance of ${r/(2r-1)}$, where ${r = \lceil n/2\rceil}$. But dividing the intervals into ${4r-2}$ equal parts may not be enough to construct such subsets: in general one has to use a larger number of parts, some integer multiple of ${4r-2}$.

The authors remark that if ${\alpha, \beta}$ are integers, then a partition with ${k=\kappa^{\text{frac}}}$ exists in the combinatorial realm of ${\{1, \dots, n\}}$. However, this is not “if and only if”. Indeed, ${\alpha+\beta = (4r-2)/\binom{2r-1}{r-1}}$ which takes integer values when ${r=1, 2, 3}$, and is fractional (less than 1) when ${r\ge 4}$. But we do have a combinatorial solution when ${r=4}$. So it is not quite clear for what values of ${r}$ there exists a ${(2r-2)}$-uniform ${(r-2)}$-intersecting family of ${2r-1}$ subsets of ${[4r-2]}$. Perhaps an answer could be found by following the construction for this specific case, when the formulas magically simplify.

## Measurability of Banach space valued functions

There is only an indirect proof of the existence of a function ${f\colon [0, 1]\to \mathbb R}$ that is not Lebesgue measurable. But it’s easy to give an explicit example when the codomain of ${f}$ is a Banach space: just let ${b(t)}$ be the sequence of the binary digits of ${t}$, considered as an element of the sequence space ${\ell_\infty}$.

Why is ${b}$ not measurable? Recall that a Banach space-valued function ${f}$ is (Bochner) measurable iff there is a sequence of simple functions ${\sum v_k \chi_{A_k}}$ (finite sum, measurable ${A_k}$, arbitrary vectors ${v_k}$) that converges to ${f}$ almost everywhere. This property implies that, with an exception of a null set, the range of ${f}$ lies in the separable subspace spanned by all the vectors ${v_k}$ used in the sequence of simple functions. But ${b}$ has the property ${\|b(t)-b(s)\|=1}$ whenever ${t\ne s}$, so the image of any uncountable set under ${b}$ is nonseparable.

Another way to look at this: on the interval [0, 1) the function ${b}$ is injective and its range has discrete topology, which implies that every subset of [0, 1) is the preimage of some open subset of ${\ell_\infty}$ under ${b}$.

The binary-digits functions can also be used to illustrate an issue with the duality of Lebesgue-Bochner spaces ${L_p(0, 1; X)}$ where ${X}$ is a Banach space and ${1\le p<\infty}$. (So, ${f}$ belongs to this space iff it is Bochner measurable and the ${L^p}$ norm of ${\|f\|\colon [0, 1]\to [0, \infty)}$ is finite.) In general we do not have the expected relation ${L_p(0, 1; X)^* = L_q(0, 1; X^*)}$ with ${1/p+1/q=1}$. The natural isometric embedding of ${L_q(0, 1; X^*)}$ into ${L_p(0, 1; X)^*}$ is still there: any ${g\in L_q(0, 1; X^*)}$ acts on ${L_p(0, 1; X)}$ by ${f\mapsto \int \langle f(t), g(t) \rangle\, dt}$. But unless ${X^*}$ has the Radonâ€“Nikodym property, these are more bounded linear functionals on ${L_p(0, 1; X)}$.

To construct such a functional, let ${b_n(t)}$ be the ${n}$-th binary digit of ${t}$. Given ${f\in L_1(0, 1; \ell_1)}$, write it in coordinates as ${(f_1, f_2, \dots)}$ and define ${\varphi(f) = \sum_n \int_0^1 f_n b_n}$. This is a bounded linear functional, since ${|\varphi(f)|\le \sum_n \int_0^1 |f_n| = \|f\|}$. But there is no function ${g\in L_\infty(0, 1; \ell_\infty)}$ that represents it, i.e., ${\varphi(f) = \int_0^1 \langle f(t), g(t)\rangle \,dt = \sum_n \int_0^1 f_n g_n }$. Indeed, if such ${g}$ existed then by considering ${f}$ with only one nonzero coordinate, we find that ${g_n}$ must be ${b_n}$, using the duality ${L_1^* = L_\infty}$ in the scalar case. But the function ${[0, 1]\to \ell_\infty}$ with the components ${(b_1, b_2, \dots)}$ is not measurable, as shown above.

This example, which applies to all ${1\le p<\infty}$, also serves as a reminder that the duality relation ${L_p(0, 1; X)^* = L_q(0, 1; X^*)}$ depends on the dual space ${X^*}$ having the Radon-Nikodym property (RNP), not ${X}$ itself. Indeed, ${X=\ell_1}$ has the RNP; its dual does not.

The importance of ${X^*}$ having the RNP becomes clear once one tries to follow the usual proof of ${L_p^*=L_q}$. Given ${\varphi\in L_p(0,1;X)^*}$, we can define an ${X^*}$-valued measure ${\tau}$ on ${[0, 1]}$ by ${\tau(A)(v) = \varphi( v\chi_A)}$ where ${A\subset [0, 1]}$ is Lebesgue measurable and ${v\in X}$. This measure has reasonable finiteness and continuity properties coming from ${\varphi}$ being a bounded functional. Still, the existence of a density ${g\colon [0, 1]\to X^*}$ of the measure ${\tau}$ depends on the structure of the Banach space ${X^*}$.

## Diagonal

The diagonal ${D=\{(x,x):0\le x\le 1\}}$ of the unit square ${Q=[0,1]\times [0,1]}$ is a mysterious thing. For one thing, its length is an irrational number. But this is not what I’m writing about.

The diagonal has zero area. Lebesgue integrable functions on ${Q}$ form the normed space ${L^1(Q)}$ which, upon closer inspection, consists not of functions but of their equivalence classes. Two functions ${f,g}$ are equivalent if the set ${\{f\ne g\}}$ has zero area. In particular, changing all the value of ${f}$ on the diagonal ${D}$ does not change ${f}$ as an element of ${L^1(Q)}$. The logical conclusion is that given an element ${f\in L^1(Q)}$, we have no way to give a meaning to the integral of ${f}$ over ${D}$.

But wait a moment. If I take two square integrable function ${u,v\in L^2[0,1]}$, then the expression ${\int_0^1 u(x)v(x)\,dx}$ makes perfect sense. On the other hand, it represents the integral of the function ${f(x,y)=u(x)v(y)}$ over the diagonal ${D}$.

Pushing this further, if an element ${f\in L^1(Q)}$ can be represented as ${f(x,y)=\sum_{k=1}^n u_k(x)v_k(y)}$ for some ${u_k,v_k\in L^2[0,1]}$, then ${\int_D f}$ is naturally defined as ${\sum_{k=1}^n \int_0^1 u_kv_k}$. I can even take infinite sums, assuming that everything converges.

This is confusing. If I pick up an element of ${L^1(Q)}$ off the sidewalk, how will I know if it’s safe to integrate it over the diagonal? The existence or nonexistence of the decomposition into sum of products is not obvious.

I guess a satisfactory answer is given by the notion of a Lebesgue point. Given ${f\in L^1(Q)}$ and a point ${p}$ of ${Q}$, consider the following statement:

$\displaystyle \exists y_0\in \mathbb R \text{ such that } \lim_{r\rightarrow 0} \frac{1}{r^2} \int_{|x-p|

The validity of (1) and the value of ${y_0}$ are not affected by the choice of a representative of ${f}$. If (1) holds, ${p}$ is called a Lebesgue point of ${f}$, and we can think of ${y_0}$ as the “true” value of ${f(p)}$ (whether or not our representative agrees with that value). It’s a theorem that almost every point of the domain of ${f}$ is a Lebesgue point. The meaning of “almost every” corresponds to the measure under consideration: on the plane it’s the area, on a line it’s the length.

The product ${f(x,y)=u(x)v(y)}$ has a special feature. Since almost every ${x\in [0,1]}$ is a Lebesgue point of ${u}$, and a.e. ${y\in[0,1]}$ is a Lebesgue point of ${v}$, it follows that almost every point of the diagonal (in the sense of linear measure) is a Lebesgue point of the product ${f}$. (It helps to integrate over small squares instead of disks in (1), which does not change anything.) This makes it possible to define ${\int_D f}$ unambigiously.

The sums of products also have the property that almost every point of ${D}$ is a Lebesgue point. And other elements of ${L^1(Q)}$ may also have this property: they are safe to integrate diagonally, too.