Structure of the 3n+1 stopping time

Returning to the stopping time of the {3n+1} process (first part here):

3n+1 flow chart
3n+1 flow chart

Here is the plot of the stopping time embellished with a few logarithmic functions.

Order in chaos
Order in chaos

This structure is explained by looking at the number of {3x+1} operations that a number experiences before reaching {1}.

No {3x+1} operations. Such number are obviously of the form {2^m}, with stopping time {m}. These creates the points {(2^m,m)} which lie on the curve {y=\log_2 x}.

One {3x+1} operation. To find such numbers, follow their orbit backward: a series of multiplication by {2}, then {(x-1)/3} operation, then more multiplications by {2}. This leads to numbers of the form {2^n \dfrac{2^{m}-1}{3}} where {m} must be even in order for {2^m-1} to be divisible by {3}. The stopping time is {n+m+1}. Since {2^n \dfrac{2^{m}-1}{3} \approx \dfrac{2^{n+m}}{3}}, the corresponding points lie close to the curve {y=1+\log_2(3x)}. Also notice that unlike the preceding case, clusters appear: there may be multiple pairs {(n,m)} with even {m} and the same {n+m}. The larger the sum {n+2m} is, the more such pairs occur.

Two {3x+1} operations. Tracing the orbit backwards again, we find that these are numbers of the form

{2^p\dfrac{2^n \dfrac{2^{m}-1}{3} -1}{3}}

It is straightforward to work out the conditions on {(m,n)} which allow both divisions by {3} to proceed. They are: either {n} is odd and {m \equiv 4 \mod 6}, or {n} is even and {m\equiv 2 \mod 6}. In any event, the stopping time is {m+n+p+2} and the number itself is approximately {2^{m+n+p}/9}. On the above chart, these points lie near the curve {y=2+\log_2(9x)}. Clustering will be more prominent than in the previous case, because we now deal with triples {(n,m,p)} that will be nearby each other if {n+m+p} is the same.

k) {k} operations of {3x+1} kind. These yield the points near the curve {y=k+\log_2(3^k x)}, or, to put it another way, {y=k\log_2 6+\log_2(x)}. The plot above shows such curves for {k=0,1,\dots,11}.

Asymptotics of 3n+1 stopping time

It is a well-known open problem whether the following process terminates for every positive integer:

3n+1 flow chart
3n+1 flow chart

Experiments suggest that it does, possibly with unintended side effects.

Since for any odd integer {n} the number {3n+1} is even, it is convenient to replace the step {n:=3n+1} with {n:=(3n+1)/2}, as show below:

Optimized flow
(3n+1)/2 flow chart

As a function of {n}, the stopping time has a nice patterned graph:

Stopping time
Stopping time

An odd integer {n} is of the form {4k+1} or {4k+3}. In the first case, {(3n+1)/2 = 6k+2} is even, while in the second case {(3n+1)/2 = 6k+5} is odd. So, if {n} is picked randomly from all odd integers of certain size, the probability of {(3n+1)/2} being even is {1/2}. Similarly, for an even {n}, the number {n/2} can be even or odd with probability {1/2}.

This leads to a stochastic model of the process:

Stochastic flow
Stochastic flow

The graph of stopping time in the stochastic model is, of course random. It looks nothing like the nice pattern of the deterministic process.

Stopping time, stochastic version
Stopping time, stochastic version

However, smoothing out both graphs by moving window of width {200} or so, we see the similarity:

Moving averages, deterministic and stochastic
Moving averages, deterministic and stochastic

The stochastic process is much easier to analyze. Focusing on the logarithm {\log x}, we see that it changes either by {\log(1/2)} or by approximately {\log (3/2)}. The expected value of the change is {\displaystyle \frac12\log (3/4)}. This suggests that we can expect the logarithm to drop down to {\log 1=0} in about {\displaystyle \frac{2}{\log (4/3)}\log x} steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)

The curve {\displaystyle \frac{2}{\log (4/3)}\log x} fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

Logarithmic growth
Logarithmic growth

For an in-depth investigation, see Lower bounds for the total stopping time of 3X+1 iterates by Applegate and Lagarias.

For the computations, I used Scilab. The function hail(n,m) calculates the stopping times up to given value of n, and takes moving average with window size m (which can be set to 1 for no averaging).

function hail(n,m)
    steps=zeros(1:n);
    steps(1)=0
    for i=2:n 
        k=i;
        s=0;
        while k>=i 
            if modulo(k,2)==0 then 
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        steps(i)=s+steps(k);
    end
    total = cumsum(steps) 
    for i=1:n-m
        average(i)=(total(i+m)-total(i))/m;
    end
    plot(average,'+');
endfunction 

As soon as the result of computations drops below the starting value, the number of remaining steps is fetched from the array that is already computed. This speeds up the process a bit.

The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once {x} drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.

function randomhail(n,m)
    rsteps=zeros(1:n);
    rsteps(1)=0
    for i=2:n 
        k=i;
        s=0;
        while k>1 
            if grand(1,1,"bin",1,1/2)==0 then 
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        rsteps(i)=s;
    end
    rtotal = cumsum(rsteps) 
    for i=1:n-m
        raverage(i)=(rtotal(i+m)-rtotal(i))/m;
    end
    plot(raverage,'r+');
endfunction