## Asymptotics of 3n+1 stopping time

It is a well-known open problem whether the following process terminates for every positive integer: halve ${n}$ if it is even, replace it by ${3n+1}$ if it is odd, and stop upon reaching ${1}$.
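A minimal Python sketch of the iteration (the function name is mine):

```python
def collatz_steps(n):
    """Count steps of the 3n+1 process until n reaches 1."""
    steps = 0
    while n > 1:
        if n % 2 == 0:
            n = n // 2      # even: halve
        else:
            n = 3 * n + 1   # odd: triple and add one
        steps += 1
    return steps
```

For example, starting from 6 the trajectory is 6, 3, 10, 5, 16, 8, 4, 2, 1, taking 8 steps.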

Experiments suggest that it does, possibly with unintended side effects.

Since for any odd integer ${n}$ the number ${3n+1}$ is even, it is convenient to replace the step ${n:=3n+1}$ with ${n:=(3n+1)/2}$, as shown below:
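A Python sketch of the accelerated process with the combined odd step (function name mine):

```python
def shortcut_steps(n):
    """Stopping time with the combined odd step n -> (3n+1)/2."""
    steps = 0
    while n > 1:
        if n % 2 == 0:
            n //= 2             # even: halve
        else:
            n = (3 * n + 1) // 2  # odd: combined step, always an integer
        steps += 1
    return steps
```

Each odd step of the original process is followed by a halving, so the accelerated stopping time is smaller: for instance, 6 now takes 6 steps instead of 8.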

As a function of ${n}$, the stopping time has a nice patterned graph:

An odd integer ${n}$ is of the form ${4k+1}$ or ${4k+3}$. In the first case, ${(3n+1)/2 = 6k+2}$ is even, while in the second case ${(3n+1)/2 = 6k+5}$ is odd. So, if ${n}$ is picked randomly from all odd integers of certain size, the probability of ${(3n+1)/2}$ being even is ${1/2}$. Similarly, for an even ${n}$, the number ${n/2}$ can be even or odd with probability ${1/2}$.
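A quick empirical confirmation of the parity claim (the helper function is mine):

```python
def parity_fractions(limit):
    """Among odd n < limit, count how often (3n+1)/2 is even."""
    odds = range(1, limit, 2)
    even_count = sum(1 for n in odds if ((3 * n + 1) // 2) % 2 == 0)
    return even_count, len(odds)
```

Exactly the odd numbers of the form ${4k+1}$ contribute, so the count comes out to half of the total.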

This leads to a stochastic model of the process:
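In the model, the current value is halved or sent to ${(3x+1)/2}$ with probability ${1/2}$ each, regardless of parity. A Python sketch (names mine; the Scilab version appears further down):

```python
import random

def random_stopping_time(x, rng):
    """One run of the stochastic model: x -> x/2 or x -> (3x+1)/2,
    each with probability 1/2, until x drops to 1 or below."""
    steps = 0
    while x > 1:
        if rng.random() < 0.5:
            x = x / 2
        else:
            x = (3 * x + 1) / 2
        steps += 1
    return steps
```

Note that ${x}$ is no longer an integer along the way; only its size matters for the model.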

The graph of stopping time in the stochastic model is, of course, random. It looks nothing like the nice pattern of the deterministic process.

However, smoothing out both graphs with a moving window of width ${200}$ or so, we see the similarity:

The stochastic process is much easier to analyze. Focusing on the logarithm ${\log x}$, we see that it changes either by ${\log(1/2)}$ or by approximately ${\log (3/2)}$. The expected value of the change is ${\displaystyle \frac12\log (3/4)}$. This suggests that we can expect the logarithm to drop down to ${\log 1=0}$ in about ${\displaystyle \frac{2}{\log (4/3)}\log x}$ steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)
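A numeric check of the drift computation (Python; `predicted_steps` is my name for the estimate):

```python
import math

# Each step changes log x by log(1/2) or (approximately) log(3/2),
# with probability 1/2 each.
drift = 0.5 * math.log(1 / 2) + 0.5 * math.log(3 / 2)

def predicted_steps(x):
    """Estimated stopping time: log x divided by the drift magnitude."""
    return 2 / math.log(4 / 3) * math.log(x)
```

The drift is negative, about ${-0.144}$ per step, which is why the logarithm reliably descends to ${0}$.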

The curve ${\displaystyle \frac{2}{\log (4/3)}\log x}$ fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

For an in-depth investigation, see *Lower bounds for the total stopping time of 3X+1 iterates* by Applegate and Lagarias.

For the computations, I used Scilab. The function `hail(n,m)` calculates the stopping times for all starting values up to ${n}$ and takes a moving average with window size ${m}$ (which can be set to ${1}$ for no averaging).

```scilab
function hail(n, m)
    // stopping times of the accelerated 3n+1 process for 2..n,
    // smoothed by a moving average of window m
    steps = zeros(1:n);
    steps(1) = 0;
    for i = 2:n
        k = i;
        s = 0;
        while k >= i    // once k drops below i, steps(k) is already known
            if modulo(k, 2) == 0 then
                k = k/2;
                s = s+1;
            else
                k = (3*k+1)/2;
                s = s+1;
            end
        end
        steps(i) = s + steps(k);
    end
    total = cumsum(steps);
    for i = 1:n-m
        average(i) = (total(i+m) - total(i))/m;
    end
    plot(average, '+');
endfunction
```

As soon as the result of computations drops below the starting value, the number of remaining steps is fetched from the array that is already computed. This speeds up the process a bit.

The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once ${x}$ drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.

```scilab
function randomhail(n, m)
    // stochastic model: halve or apply (3k+1)/2, each with probability 1/2
    rsteps = zeros(1:n);
    rsteps(1) = 0;
    for i = 2:n
        k = i;
        s = 0;
        while k > 1
            if grand(1, 1, "bin", 1, 1/2) == 0 then
                k = k/2;
                s = s+1;
            else
                k = (3*k+1)/2;
                s = s+1;
            end
        end
        rsteps(i) = s;
    end
    rtotal = cumsum(rsteps);
    for i = 1:n-m
        raverage(i) = (rtotal(i+m) - rtotal(i))/m;
    end
    plot(raverage, 'r+');
endfunction
```

## An exercise in walking randomly

Suppose I randomly and blindly walk back and forth along this hallway, which may or may not happen in reality. The safest place to start is in the middle, next to the department’s main office. How long will it take before I bump into a wall at one of the ends?

Formally, let $X_1,X_2,\dots$ be independent random variables which take values $\pm 1$ with equal probability $1/2$. These are the steps of the symmetric random walk. Starting from $S_0=0$, after $n$ steps I end up at the point $S_n=X_1+\dots +X_n$. The sequence $(S_n)$ is a martingale, being the sum of independent random variables with zero mean. Less obviously, for every number $\theta\in\mathbb R$ the sequence $Z_n=\exp(\theta S_n-n\log \cosh \theta)$ is also a martingale, even though the increments $d_n=Z_n-Z_{n-1}$ are not independent of each other. The reason for $\log \cosh\theta$ is that $E(e^{\theta X_n})=\cosh\theta$. It is convenient to write $s=\log\cosh\theta$ in what follows. We check the martingale property of $Z_n=e^{\theta S_n-ns}$ by conditioning on a value $S_{n-1}=\alpha$ and computing $E(Z_n \mid S_{n-1}=\alpha)=e^{\theta\alpha-(n-1)s}=Z_{n-1}$.
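Spelling out the conditioning computation, using $E(e^{\theta X_n})=\cosh\theta$ and $s=\log\cosh\theta$:

```latex
E(Z_n \mid S_{n-1}=\alpha)
  = e^{\theta\alpha - ns}\, E(e^{\theta X_n})
  = e^{\theta\alpha - ns}\cosh\theta
  = e^{\theta\alpha - ns + s}
  = e^{\theta\alpha - (n-1)s}
  = Z_{n-1}.
```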

If our hallway is the interval $[-N,N]$, we should investigate the stopping time $T=\min\{n\ge 1\colon |S_n|\ge N\}$. The optional stopping time theorem applies to $Z_T$ because $|Z_n|\le e^{|\theta| N}$ whenever $n\le T$. Thus, $E(Z_T)=E(Z_0)=1$. In terms of $S_n$ this means $E e^{\theta S_T-sT}=1$.

Now, $S_T$ is either $N$ or $-N$; by symmetry both happen with probability $1/2$, and in both cases $T$ has the same distribution. Therefore, $1=E e^{\theta S_T-sT}=\frac{1}{2}(e^{N\theta}+e^{-N\theta}) E e^{-sT}$. We end up with the Laplace transform of $T$,

$\displaystyle E e^{-sT}=\frac{1}{\cosh N\theta} = \frac{1}{\cosh N\, \mathrm{arccosh}\, e^{s}}, \qquad s\ge 0$

(I would not want to write $\cosh^{-1}$ in such a formula.) Since $N$ is an integer and $e^{s}\ge 1$, a magical identity applies:

$\displaystyle \cosh N \,\mathrm{arccosh}\,x = \mathcal T_N(x)\qquad x\ge 1$,

$\mathcal T_N$ being the $N$th Чебышёв polynomial of the first kind. (Compare to $\displaystyle \cos N \,\mathrm{arccos}\,x = \mathcal T_N(x)$ which holds for $|x|\le 1$.) Thus,

$\displaystyle E e^{-sT}=\frac{1}{\mathcal T_N (e^{s})}, \qquad s\ge 0$
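The identity behind this can be spot-checked numerically (Python, with $T_4$ written out by hand):

```python
import math

def T4(x):
    """Chebyshev polynomial of the first kind, T_4(x) = 8x^4 - 8x^2 + 1."""
    return 8 * x**4 - 8 * x**2 + 1

# For x >= 1, cosh(4 * arccosh(x)) should equal T_4(x);
# e.g. T_4(1.5) = 40.5 - 18 + 1 = 23.5.
x = 1.5
lhs = math.cosh(4 * math.acosh(x))
rhs = T4(x)
```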

One more change of notation: let $e^{-s}=y$, so that $E e^{-sT}=E(y^T)=\sum_{n=1}^{\infty} P(T=n)\,y^n$. The formula

$\displaystyle \sum_{n=1}^{\infty} P(T=n)\,y^n=\frac{1}{\mathcal T_N (1/y)}$

gives us the probabilities $P(T=n)$ for each $n$, assuming we can expand the reciprocal of the Чебышёв polynomial into a power series. Let’s check two simple cases:

- if $N=1$, then $\mathcal T_1(x)=x$ and $\frac{1}{\mathcal T_1 (1/y)}=y$. Hence $T\equiv 1$, which is correct: the walk immediately hits one of the walls $\pm 1$.
- if $N=2$, then $\mathcal T_2(x)=2x^2-1$, hence $\displaystyle \frac{1}{\mathcal T_2 (1/y)}=\frac{y^2}{2}\cdot\frac{1}{1-y^2/2}=\sum_{k=1}^\infty \frac{1}{2^k} y^{2k}$. This means $T$ can never be odd (which makes sense, since it must always have the parity of $N$), and the probability of hitting a wall in exactly $2k$ steps is $1/2^k$. A moment's reflection confirms that the answer is right.
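The $N=2$ answer can also be checked by direct simulation (illustrative Python; names mine):

```python
import random

def exit_time(N, rng):
    """Steps for a symmetric +/-1 walk from 0 until |position| reaches N."""
    position, n = 0, 0
    while abs(position) < N:
        position += 1 if rng.random() < 0.5 else -1
        n += 1
    return n

rng = random.Random(0)
times = [exit_time(2, rng) for _ in range(10000)]
# Every exit time has the parity of N, and the fraction of walks
# with T = 2 should be close to 1/2.
```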

For a serious example, with $N=10$, I used Maple:

```
with(plots):
n:=150:
ser:=series(1/ChebyshevT(10, 1/x),x=0,2*n+1):
for i from 1 to 2*n do a[i]:=coeff(ser, x^i) end do:
listplot([seq(a[i],i=1..2*n)], color=blue, thickness=2);
```

By default listplot connects the dots, and since every other term is zero, the plot has a trippy pattern.

Obviously, it’s impossible to hit a wall in fewer than $N$ steps. The formula confirms this: writing $1/\mathcal T_N (1/y)$ as a rational function with numerator $y^N$, we see that no terms $y^n$ with $n<N$ can appear in the expansion.
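This can be confirmed symbolically; a sketch using Python's sympy (checking the case $N=4$):

```python
from sympy import symbols, chebyshevt, series, Rational

y = symbols('y')
N = 4
# Expand 1/T_N(1/y) as a power series in y.
expansion = series(1 / chebyshevt(N, 1 / y), y, 0, N + 3).removeO()
coeffs = [expansion.coeff(y, k) for k in range(N + 3)]
# Coefficients below y^N vanish, and the y^N coefficient is
# P(T = N) = 2/2^N, the probability of a monotone run to either wall.
```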

(This post is an exercise which I should have done long ago but never did. Thanks are due to Ted Cox for a walk-through.)