Complex Cantor sets

Every real number in the interval [0,1] can be written in binary as {\sum_{k=1}^\infty c_k(1/2)^k} where each coefficient {c_k} is either 0 or 1. Another way to put this: the set of all possible sums {\sum_{k=1}^\infty c_kb^k} for b = 1/2 is a line segment.


What is this set for other values of “base” b, then? Let’s stick to |b| < 1 for now, so that the series converges. Nothing interesting happens for real b between 1/2 and 1; the segment grows longer, to length b/(1-b). When b is between 0 and 1, we get Cantor sets, with the classical middle-third set being the case b = 1/3.


There is no need to consider negative b, because of a symmetry between b and -b. Indeed, up to scaling and translation, the coefficients can be taken from {-1, 1} instead of {0, 1}. Then it’s obvious that changing the sign of b is the same as flipping half of coefficients the other way — does not change the set of possible sums.

Let’s look at purely imaginary b, then. Here is b = 0.6i


Why so rectangular? The real part is the sum of {c_kb^k} over even k, and the imaginary part is the sum over odd k. Each of these yields a Cantor type set as long as {|b|^2 < 1/2}. Since the odd- and even-numbered coefficients are independent of each other, we get the product of two Cantor sets. Which changes into a rectangle when  {|b| \ge \sqrt{1/2}}:


(I didn’t think a full-size picture of a solid rectangle was necessary here.)

This is already interesting: the phase transition from dust to solid (connected, and even with interior) happens at different values in the real and imaginary directions: 1/2 versus {\sqrt{1/2}}. What will happen for other complex values? Using complex conjugation and the symmetry between b and -b, we reduce the problem to the quarter-disk in the first quadrant. Which still leaves a room for a lot of things to happen…

b = 0.6 + 0.3i
b = 0.7 + 0.2i
b = 0.4 + 0.3i
b = 0.2 + 0.7i

It’s clear that for |b| < 1/2 we get a totally disconnected set — it is covered by 2 copies of itself scaled by the factor of |b|, so its Hausdorff dimension is less than 1 when |b| is less than 1/2. Also, the argument of b is responsible for rotation of the scaled copies, and it looks like rotation favors disconnectivity… but then again, the pieces may link up again after being scaled-rotated a few times, so the story is not a simple one.

The set of bases b for which the complex Cantor set is connected is a Manderbrot-like set introduced by Barnsley and Harrington in 1985. It has the symmetries of a rectangle, and features a prominent hole centered at 0 (discussed above). But it actually has infinitely many holes, with “exotic” holes being tiny islands of disconnectedness, surrounded by connected sets. This was proved in 2014 by Calegari, Koch, Walker, so I refer to Danny Calegari’s post for an explanation and more pictures (much better looking than mine).

Besides “disconnected to connected”, there is another phase transition: empty interior to nonempty interior. Hare and Sidorov proved that the complex Cantor set has nonempty interior when  {|b| > 2^{-1/4}}; their path to the proof involved a MathOverflow question The Minkowski sum of two curves which is of its own interest.

The pictures were made with a straightforward Python script, using expansions of length 20:

import matplotlib.pyplot as plt
import numpy as np
import itertools
n = 20
b = 0.6 + 0.3j
c = np.array(list(itertools.product([0, 1], repeat=n)))
w = np.array([b**k for k in range(n)]).reshape(1, -1)
z = np.sum(c*w, axis=1)
plt.plot(np.real(z), np.imag(z), '.', ms=4)

Since we are looking at partial sums anyway, it’s not necessary to limit ourselves to |b| being less than 1. Replacing b by 1/b only scales the picture, so the place to look for new kinds of pictures is the unit circle. Let’s try a 7th root of unity:

b = exp(pi i / 7)

The set above looks sparse because many points overlap. Let’s change b to something non-algebraic:

b = exp(i)

What’s with the cusps along the perimeter?

Iterating the logistic map: limsup of nonperiodic orbits

Last time we found that when a sequence with {x_1\in (0,1)} and {x_{n+1} = 4x_n(1-x_n)} does not become periodic, its upper limit {\limsup x_n} must be at least {\approx 0.925}. This time we’ll see that {\limsup x_n} can be as low as {(2+\sqrt{3})/4\approx 0.933} and determine for which {x_1} it is equal to 1.

The quadratic polynomial {f(x)=4x(1-x)} maps the interval {[0,1]} onto itself. Since the linear function {g(x) = 1-2x} maps {[0,1]} onto {[-1,1]}, it follows that the composition {h=g\circ f\circ g^{-1}} maps {[-1,1]} onto {[-1,1]}. This composition is easy to compute: {h(x) = 2x^2-1 }.

We want to know whether the iteration of {f}, starting from {x_1}, produces numbers arbitrarily close to {1}. Since {f\circ f \circ \cdots \circ f = g^{-1}\circ h \circ h \circ \cdots \circ h\circ g} the goal is equivalent to finding whether the iteration of {h}, starting from {g(x_1)}, produces numbers arbitrarily close to {g(1) = -1}. To shorten formulas, let’s write {h_n} for the {n}th iterate of {h}, for example, {h_3 = h\circ h\circ h}.

So far we traded one quadratic polynomial {f} for another, {h}. But {h} satisfies a nice identity: {h(\cos t)=2\cos^2 t-1 = \cos(2t)}, hence {h_n(\cos t) = \cos (2^n t)} for all {n\in\mathbb N}. It’s convenient to introduce {\alpha = \frac{1}{\pi}\cos^{-1}(1-2x_1)}, so that { h_n(g(x_1)) = h_n(\cos 2\pi \alpha ) = \cos(2^n\cdot 2\pi \alpha) }.

The problem becomes to determine whether the numbers {2^n\cdot 2\pi \alpha} come arbitrarily close to {\pi}, modulo an integer multiple of {2\pi}. Dividing by {2\pi} rephrases this as: does the fractional part of {2^n \alpha} come arbitrarily close to {1/2}?

A number that is close to {1/2} has the binary expansion beginning either with {0.01111111\dots} or with {0.10000000\dots}. Since the binary expansion of {2^n\alpha} is just the binary expansion of {\alpha} shifted {n} digits to the left, the property {\limsup x_n=1} is equivalent to the following: for every {k\in\mathbb N} the binary expansion of {\alpha} has infinitely many groups of the form “1 followed by k zeros” or “0 followed by k ones”.

A periodic expansion cannot have the above property; this, {\alpha} must be irrational. The property described above can then be simplified to “irrational and has arbitrarily long runs of the same digit”, since a long run of {0}s will be preceded by a {1}, and vice versa.

For example, combining the pairs 01 and 10 in some non-periodic way, we get an irrational number {\alpha} such that the fractional part of {2^n\alpha} does not get any closer to 1/2 than {0.01\overline{10}_2 = 5/12} or {0.10\overline{01}_2 = 7/12}. Hence, {\cos 2^n 2\pi \alpha \ge -\sqrt{3}/2}, which leads to the upper bound {x_n\le (2+\sqrt{3})/4\approx 0.933} for the sequence with the starting value {x_1=(1-\cos\pi\alpha)/2}.

Let us summarize the above observations about {\limsup x_n}.

Theorem: {\limsup x_n=1} if and only if (A) the number {\alpha = \frac{1}{\pi}\cos^{-1}(1-2x_1)} is irrational, and (B) the binary expansion of {\alpha} has arbitrarily long runs of the same digit.

Intuitively, one expects that a number that satisfies (A) will also satisfy (B) unless it was constructed specifically to fail (B). But to verify that (B) holds for a given number is not an easy task.

As a bonus, let’s prove that for every rational number {y\in (-1,1)}, except 0, 1/2 and -1/2, the number {\alpha = \frac{1}{\pi}\cos^{-1}y} is irrational. This will imply, in particular, that {x_1=1/3} yields a non-periodic sequence. The proof follows a post by Robert Israel and requires a lemma (which could be replaced with an appeal to Chebyshev polynomials, but the lemma keeps things self-contained).

Lemma. For every {n\in \mathbb N} there exists a monic polynomial {P_n} with integer coefficients such that {P_n(2 \cos t) = 2\cos nt } for all {t}.

Proof. Induction, the base case {n=1} being {P_1(x)=x}. Assuming the result for integers {\le n}, we have {2 \cos (n+1)t = e^{i(n+1)t} + e^{-i(n+1)t} } {  = (e^{int} + e^{-int})(e^{it} + e^{-it}) - (e^{i(n-1)t} + e^{-i(n-1)t}) } { = P_n(2 \cos t) (2\cos t) - P_{n-1}(2\cos t) }
which is a monic polynomial of {2\cos t}. {\Box}

Suppose that there exists {n} such that {n\alpha \in\mathbb Z}. Then {2\cos(\pi n\alpha)=\pm 2}. By the lemma, this implies {P_n(2\cos(\pi \alpha)) =\pm 2}, that is {P_n(2y)=\pm 2}. Since {2y} is a rational root of a monic polynomial with integer coefficients, the Rational Root Theorem implies that it is an integer. {\Box}

A limsup exercise: iterating the logistic map

Define the sequence {\{x_n\}} as follows: {x_1=1/3} and {x_{n+1} = 4x_n(1-x_n)} for {n=1,2,\dots}. What can we say about its behavior as {n\rightarrow\infty}?

The logistic map {f(x)=4x(1-x)} leaves the interval [0,1] invariant (as a set), so {0\le x_n\le 1} for all {n}. There are two fixed points: 0 and 3/4.

Two fixed points of the logistic map: 0 and 3/4

Can {x_n} ever be 0? If {n} is the first index this happens, then {x_{n-1}} must be {1}. Working backwards, we find {x_{n-2}=1/2}, and {x_{n-3}\in \{1/2 \pm \sqrt{2}/4\}}. But this is impossible since all elements of the sequence are rational. Similarly, if {n} is the first index when {x_n = 3/4}, then {x_{n-1}=1/4} and {x_{n-2}\in \{1/2\pm \sqrt{3}/4\}}, a contradiction again. Thus, the sequence never stabilizes.

If {x_n} had a limit, it would have to be one of the two fixed points. But both are repelling: {f'(x) = 4 - 8x}, so {|f'(0)|=4>1 } and {|f'(3/4)| = 2 > 1}. This means that a small nonzero distance to a fixed point will increase under iteration. The only way to converge to a repelling fixed point is to hit it directly, but we already know this does not happen. So the sequence {\{x_n\}} does not converge.

But we can still consider its upper and lower limits. Let’s try to estimate {S = \limsup x_n} from below. Since {f(x)\ge x} for {x\in [0,3/4]}, the sequence {\{x_n\}} increases as long as {x_n\le 3/4}. Since we know it doesn’t have a limit, it must eventually break this pattern, and therefore exceed 3/4. Thus, {S\ge 3/4}.

This can be improved. The second iterate {f_2(x)=f(f(x))} satisfies {f_2(x)\ge x} for {x} between {3/4} and {a = (5+\sqrt{5})/8 \approx 0.9}. So, once {x_n>3/4} (which, by above, happens infinitely often), the subsequence {x_n, x_{n+2}, x_{n+4},\dots} increases until it reaches {a}. Hence {S\ge a}.

The bound {\limsup x_n\ge a} is best possible if the only information about {x_1} is that the sequence {x_n} does not converge. Indeed, {a} is a periodic point of {f}, with the corresponding iteration sequence {\{(5+ (-1)^n\sqrt{5})/8\}}.

Further improvement is possible if we recall that our sequence is rational and hence cannot hit {a} exactly. By doubling the number of iterations (so that the iterate also fixes {a} but also has positive derivative there) we arrive at the fourth iterate {f_4}. Then {f_4(x)\ge x} for {a\le x\le b}, where {b } is the next root of {f_4(x)-x} after {a}, approximately {0.925}. Hence {S\ge b}.

Stacking the iterates

This is a colorful illustration of the estimation process (made with Sage): we are covering the line {y=x} with the iterates of {f}, so that each subsequent one rises above the line the moment the previous one falls. This improves the lower bound on {S} from 0.75 to 0.9 to 0.92.

Although this process can be continued, the gains diminish so rapidly that it seems unlikely one can get to 1 in this way. In fact, one cannot because we are not using any properties of {x_1} other than “the sequence {x_n} is not periodic.” And it’s not true that {\limsup x_n = 1} for every non-periodic orbit of {f}. Let’s return to this later.

Vectorization of integration

A frequently asked question about numerical integration is: how to speed up the process of computing {\int_a^b f(x,p)\,dx} for many values of parameter {p}? Running an explicit for over the values of {p} seems slow… is there a way to vectorize this, performing integration on an array of functions at once (which in reality means, pushing the loop to a lower level language)?

Usually, the answer is some combination of “no” and “you are worrying about the wrong thing”. Integration routines are typically adaptive, meaning they pick evaluation points based on the function being integrated. Different values of the parameter will require different evaluation points to achieve the desired precision, so vectorization is out of question. This applies, for example, to quad method of SciPy integration module.

Let’s suppose we insist on vectorization and are willing to accept a non-adaptive method. What are our options, considering SciPy/NumPy? I will compare

The test case is {\int_0^1 (p+1)x^p\,dx} with {p=0,0.01,\dots,99.99}. Theoretically all these are equal to {1}. But most of these functions are not analytic at {0}, and some aren’t even differentiable there, so their integration is not a piece of cake.

Keeping in mind that Romberg’s integration requires {2^k} subintervals, let’s use {1024} equal subintervals, hence {1025} equally spaced sample points. Here is the vectorized evaluation of all our functions at these points, resulting in a 2D array “values”:

import numpy as np
import scipy.integrate as si
x = np.linspace(0, 1, 1025).reshape(1, -1)
dx = x[0,1] - x[0,0]
p = np.arange(0, 100, 0.01).reshape(-1, 1)
values = (p+1)*x**p

This computation took 0.799 seconds on my AWS instance (t2.nano). Putting the results of evaluation together takes less time:

  • Trapezoidal rule np.trapz(values, dx=dx) took 0.144 seconds and returned values ranging from 0.99955 to 1.00080.
  • Simpson’s rule si.simps(values, dx=dx) took 0.113 seconds and returned values ranging from 0.99970 to 1.0000005.
  • Romberg quadrature si.romb(values, dx=dx) was fastest at 0.0414 seconds, and also most accurate, with output between 0.99973 and 1.000000005.

Conclusions so far:

  • SciPy’s implementation of Romberg quadrature is surprisingly fast, considering that this algorithm is the result of repeated Richardson extrapolation of the trapezoidal rule (and Simpson’s rule is just the result of the first extrapolation step). It’d be nice to backport whatever makes Romberg so effective back to Simpson’s rule.
  • The underestimation errors 0.999… are greatest when {p} is near zero, so the integrand is very nonsmooth at {0}. The lack of smoothness renders Richardson extrapolation ineffective, hence all three rules have about the same error here.
  • The overestimation errors are greatest when {p} is large. The function is pretty smooth then, so upgrading from trapezoidal to Simpson to Romberg brings substantial improvements.

All of this is nice, but the fact remains that non-vectorized adaptive integration is both faster and much more accurate. The following loop with quad, which uses adaptive Clenshaw-Curtis quadrature, runs in 0.616 seconds (less than it took to evaluate our functions on a grid) and returns values between 0.99999999995 and 1.00000000007. Better to use exponential notation here: {1-5\cdot 10^{-11} } and {1+7\cdot 10^{-11}}.

funcs = [lambda x, p=p_: (p+1)*x**p for p_ in np.arange(0, 100, 0.01)]
result = [si.quad(fun, 0, 1)[0] for fun in funcs]

An adaptive algorithm shines when {p} is small, by placing more evaluation points near zero. It controls both over- and under- estimates equally well, continuing to add points until the required precision is reached.

The last hope of non-adaptive methods is Gaussian quadrature, which is implemented in SciPy as fixed_quad (“fixed” referring to the number of evaluation points used). With 300 evaluation points, the integration takes 0.263 seconds (excluding the computation of nodes and weights, which can be cached) and the results are between {1-2\cdot 10^{-12}} and {1+2\cdot 10^{-7}}. This is twice as fast as a loop with quad, and more accurate for large {p} — but sadly, not so accurate for small {p}. As said before, {x^p} with {p} near zero is really a showcase for adaptive methods.

Kolakoski turtle curve

Let’s take another look at the Kolakoski sequence (part 1, part 2) which, by definition, is the sequence of 1s and 2s in which the nth term is equal to the length of the nth run of consecutive equal numbers in the same sequence. When a sequence has only two distinct entries, it can be visualized with the help of a turtle that turns left (when the entry is 1) or right (when the entry is 2). This visualization method seems particularly appropriate for the  Kolakoski sequence since there are no runs of 3 equal entries, meaning the turtle will never move around a square of sidelength equal to its step. In particular, this leaves open the possibility of getting a simple curve… Here are the first 300 terms; the turtle makes its first move down and then goes left-right-right-left-left-right-left-… according to the terms 1,2,2,1,1,2,1,…

300 terms of the sequence
300 terms of the sequence

No self-intersections yet… alas, at the 366th term it finally happens.

First self-intersection: 366 terms
First self-intersection: 366 terms

Self-intersections keep occurring after that:

1000 terms
1000 terms

again and again…

5000 terms
5000 terms

Okay, the curve obviously doesn’t mind intersecting self. But it can’t be periodic since the Kolakoski sequence isn’t. This leaves two questions unresolved:

  • Does the turtle ever get above its initial position? Probably… I haven’t tried more than 5000 terms
  • Is the curve bounded? Unlikely, but I’ve no idea how one would dis/prove that. For example, there cannot be a long diagonal run (left-right-left-right-left) because having 1,2,1,2,1 in the sequence implies that elsewhere, there are three consecutive 1s, and that doesn’t happen.

Here’s the Python code used for the above. I represented the sequence as a Boolean array with 1 = False, 2 = True.

import numpy as np
import turtle
n = 500                   # number of terms to compute
a = np.zeros(n, dtype=np.bool_)
j = 0                     # the index to look back at 
same = False   # will next term be same as current one?
for i in range(1, n):
    if same:
        a[i] = a[i-1]     # current run continues
        same = False
        a[i] = not a[i-1] # the run is over
        j += 1            # another run begins
        same = a[j]       # a[j] determines its length

for i in range(n):
    turtle.forward(10)    # used steps of 10 or 5 pixels 
    if a[i]:

Wild power pie

Many people are aware of {\pi} being a number between 3 and 4, and some also know that {e} is between 2 and 3. Although the difference {\pi-e} is less than 1/2, it’s enough to place the two constants in separate buckets on the number line, separated by an integer.

When dealing with powers of {e}, using {e>2} is frequently wasteful, so it helps to know that {e^2>7}. Similarly, {\pi^2<10} is way more precise than {\pi<4}. To summarize: {e^2} is between 7 and 8, while {\pi^2} is between 9 and 10.

Do any two powers of {\pi} and {e} have the same integer part? That is, does the equation {\lfloor \pi^n \rfloor = \lfloor e^m \rfloor} have a solution in positive integers {m,n}?

Probably not. Chances are that the only pairs {(m,n)} for which {|\pi^n - e^m|<10} are {m,n\in \{1,2\}}, the smallest difference attained by {m=n=1}.

Indeed, having {|\pi^n - e^m|<1} implies that {|n\log \pi - m|<e^{-m}}, or put differently, {\left|\log \pi - \dfrac{m}{n}\right| < \dfrac{1}{n \,\pi^n}}. This would be an extraordinary rational approximation… for example, with {n=100} it would mean that {\log \pi = 1.14\ldots} with the following {50} digits all being {0}. This isn’t happening.

Looking at the continued fraction expansion of {\log \pi} shows the denominators of modest size {[1; 6, 1, 10, 24, \dots]}, indicating the lack of extraordinarily nice rational approximations. Of course, can use them to get good approximations, {\left|\log \pi - \dfrac{m}{n}\right| < \dfrac{1}{n^2}}, which leads to {\pi^n\approx e^m} with small relative error. For example, dropping {24} and subsequent terms yields the convergent {87/76}, and one can check that {\pi^{76} = 6.0728... \cdot 10^{37}} while {e^{87} = 6.0760...\cdot 10^{37}}.

Trying a few not-too-obscure constants with the help of mpmath library, the best coincidence of integer parts that I found is the following: the 13th power of the golden ratio {\varphi = (\sqrt{5}+1)/2} and the 34th power of Apèry’s constant {\zeta(3) = 1^{-3}+2^{-3}+3^{-3}+4^{-4}+\dots} both have integer part 521.

A necklace of tears

The problem is: given a set of objects drawn from several groups, put all of them in a row in a “uniform” way. Whatever that means.

For example, suppose we have 21 gemstones: 9 red, 5 blue, 3 green, 2 cyan, 1 magenta and 1 yellow. How to place them on a string to make a reasonably looking necklace? The criterion is subjective, but we can probably agree that

Arrangement #1
Arrangement #1

looks better than

Arrangement #2
Arrangement #2

(referring to the uniformity of distributions, not the aesthetics of color.)

The approach I took was to repeatedly add the “most underrepresented” gem, defined by maximal difference between projected frequency of appearance (e.g., 5/21 for blue) and the frequency of appearance so far in the string. (Taking the convention 0/0=0 here.) Here is this algorithm in Python — not in Ruby, which would be more topical.

count = {'r': 9, 'b': 5, 'g': 3, 'c': 2, 'm': 1, 'y': 1}
length = sum(count.values())
str = ''
while len(str) < length:
    deficit = {}
    for char in count:
        deficit[char] = count[char]/length - (str.count(char)/len(str) if str else 0)
    str += max(deficit, key=deficit.get)

The output, “rbgcryrbrmgrbrcbrgrbr”, is what the first image in this post represents. The second image is the result of an ill-fated attempt to replace difference by ratio when determining the under-representation.

I initially thought that two unique gems (yellow and magenta) would end up together, but this hasn’t happened: after one is added, the frequency of more common gems drops, allowing them to come back into play for a while. Still, the left half of the string is noticeably more diverse than the right half. It’d be better if two unique gems were in roughly symmetrical position, and generally there would be no preference between left-right and right-left directions.

Perhaps the new character should be added to the string either on the right or on the left, in alternating fashion. That should make things nice and symmetric, right?


Alternating concatenation
Alternating concatenation

The search continues…

Update: Rahul suggested in a comment to adjust the deficit computation to

deficit[char] = count[char]/length - str.count(char)/(len(str) + 1)

This has some advantages but on the other hand, two unique gems (magenta and yellow) are placed next to each other, which is something I wanted to avoid.

Dividing by len(str) + 1