## Structure of the 3n+1 stopping time

Returning to the stopping time of the ${3n+1}$ process (first part here):

Here is the plot of the stopping time embellished with a few logarithmic functions.

This structure is explained by looking at the number of ${3x+1}$ operations that a number experiences before reaching ${1}$.

No ${3x+1}$ operations. Such number are obviously of the form ${2^m}$, with stopping time ${m}$. These creates the points ${(2^m,m)}$ which lie on the curve ${y=\log_2 x}$.

One ${3x+1}$ operation. To find such numbers, follow their orbit backward: a series of multiplication by ${2}$, then ${(x-1)/3}$ operation, then more multiplications by ${2}$. This leads to numbers of the form ${2^n \dfrac{2^{m}-1}{3}}$ where ${m}$ must be even in order for ${2^m-1}$ to be divisible by ${3}$. The stopping time is ${n+m+1}$. Since ${2^n \dfrac{2^{m}-1}{3} \approx \dfrac{2^{n+m}}{3}}$, the corresponding points lie close to the curve ${y=1+\log_2(3x)}$. Also notice that unlike the preceding case, clusters appear: there may be multiple pairs ${(n,m)}$ with even ${m}$ and the same ${n+m}$. The larger the sum ${n+2m}$ is, the more such pairs occur.

Two ${3x+1}$ operations. Tracing the orbit backwards again, we find that these are numbers of the form ${2^p\dfrac{2^n \dfrac{2^{m}-1}{3} -1}{3}}$

It is straightforward to work out the conditions on ${(m,n)}$ which allow both divisions by ${3}$ to proceed. They are: either ${n}$ is odd and ${m \equiv 4 \mod 6}$, or ${n}$ is even and ${m\equiv 2 \mod 6}$. In any event, the stopping time is ${m+n+p+2}$ and the number itself is approximately ${2^{m+n+p}/9}$. On the above chart, these points lie near the curve ${y=2+\log_2(9x)}$. Clustering will be more prominent than in the previous case, because we now deal with triples ${(n,m,p)}$ that will be nearby each other if ${n+m+p}$ is the same.

k) ${k}$ operations of ${3x+1}$ kind. These yield the points near the curve ${y=k+\log_2(3^k x)}$, or, to put it another way, ${y=k\log_2 6+\log_2(x)}$. The plot above shows such curves for ${k=0,1,\dots,11}$.

## Asymptotics of 3n+1 stopping time

It is a well-known open problem whether the following process terminates for every positive integer:

Experiments suggest that it does, possibly with unintended side effects.

Since for any odd integer ${n}$ the number ${3n+1}$ is even, it is convenient to replace the step ${n:=3n+1}$ with ${n:=(3n+1)/2}$, as show below:

As a function of ${n}$, the stopping time has a nice patterned graph:

An odd integer ${n}$ is of the form ${4k+1}$ or ${4k+3}$. In the first case, ${(3n+1)/2 = 6k+2}$ is even, while in the second case ${(3n+1)/2 = 6k+5}$ is odd. So, if ${n}$ is picked randomly from all odd integers of certain size, the probability of ${(3n+1)/2}$ being even is ${1/2}$. Similarly, for an even ${n}$, the number ${n/2}$ can be even or odd with probability ${1/2}$.

This leads to a stochastic model of the process:

The graph of stopping time in the stochastic model is, of course random. It looks nothing like the nice pattern of the deterministic process.

However, smoothing out both graphs by moving window of width ${200}$ or so, we see the similarity:

The stochastic process is much easier to analyze. Focusing on the logarithm ${\log x}$, we see that it changes either by ${\log(1/2)}$ or by approximately ${\log (3/2)}$. The expected value of the change is $\frac12\log (3/4)$. This suggests that we can expect the logarithm to drop down to ${\log 1=0}$ in about $\frac{2}{\log (4/3)}\log x$ steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)

The curve $\frac{2}{\log (4/3)}\log x$ fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

For the computations, I used Scilab. The function hail(n,m) calculates the stopping times up to given value of n, and takes moving average with window size m (which can be set to 1 for no averaging).
 function hail(n,m) steps=zeros(1:n); steps(1)=0 for i=2:n k=i; s=0; while k>=i if modulo(k,2)==0 then k=k/2; s=s+1; else k=(3*k+1)/2; s=s+1; end end steps(i)=s+steps(k); end total = cumsum(steps) for i=1:n-m average(i)=(total(i+m)-total(i))/m; end plot(average,'+'); endfunction 
The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once ${x}$ drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.
 function randomhail(n,m) rsteps=zeros(1:n); rsteps(1)=0 for i=2:n k=i; s=0; while k>1 if grand(1,1,"bin",1,1/2)==0 then k=k/2; s=s+1; else k=(3*k+1)/2; s=s+1; end end rsteps(i)=s; end rtotal = cumsum(rsteps) for i=1:n-m raverage(i)=(rtotal(i+m)-rtotal(i))/m; end plot(raverage,'r+'); endfunction