Asymptotics of 3n+1 stopping time

It is a well-known open problem whether the following process terminates for every positive integer:

3n+1 flow chart

Experiments suggest that it does, possibly with unintended side effects.
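
In code form, here is a minimal Scilab sketch of this process; the helper name collatz_steps is just for illustration, and it returns the number of steps needed to reach 1.

// Minimal sketch of the flow chart above: halve even numbers,
// apply n -> 3n+1 to odd ones, count the steps until reaching 1.
function s = collatz_steps(n)
    s = 0;
    while n > 1
        if modulo(n, 2) == 0 then
            n = n / 2;
        else
            n = 3*n + 1;
        end
        s = s + 1;
    end
endfunction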

Since for any odd integer {n} the number {3n+1} is even, it is convenient to replace the step {n:=3n+1} with {n:=(3n+1)/2}, as shown below:

(3n+1)/2 flow chart

As a function of {n}, the stopping time has a nice patterned graph:

Stopping time

An odd integer {n} is of the form {4k+1} or {4k+3}. In the first case {(3n+1)/2 = 6k+2} is even, while in the second case {(3n+1)/2 = 6k+5} is odd. So, if {n} is picked randomly from all odd integers of a given size, the probability of {(3n+1)/2} being even is {1/2}. Similarly, for a randomly picked even {n}, the number {n/2} is even or odd with probability {1/2} each.
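
As a quick sanity check of this heuristic, one can count parities directly in Scilab; the helper checkparity below is just for illustration.

// Fraction of odd n up to a bound for which (3n+1)/2 is even;
// the heuristic above predicts a value close to 1/2.
function p = checkparity(n)
    odds = 3:2:n;               // odd integers from 3 to n
    nexts = (3*odds + 1)/2;     // one accelerated step
    p = sum(modulo(nexts, 2) == 0) / length(odds);
endfunction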

This leads to a stochastic model of the process:

Stochastic flow

The graph of the stopping time in the stochastic model is, of course, random. It looks nothing like the nice pattern of the deterministic process.

Stopping time, stochastic version

However, after smoothing both graphs with a moving window of width {200} or so, we see the similarity:

Moving averages, deterministic and stochastic

The stochastic process is much easier to analyze. Focusing on the logarithm {\log x}, we see that it changes either by {\log(1/2)} or by approximately {\log (3/2)}. The expected value of the change is {\displaystyle \frac12\log (3/4)}. This suggests that we can expect the logarithm to drop down to {\log 1=0} in about {\displaystyle \frac{2}{\log (4/3)}\log x} steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)
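
In more detail, the expected change of {\log x} in one step and the resulting estimate for the number of steps are

{\displaystyle E[\Delta \log x]=\frac12\log\frac12+\frac12\log\frac32=\frac12\log\frac34=-\frac12\log\frac43,}

{\displaystyle \text{number of steps}\approx \frac{\log x}{\frac12\log(4/3)}=\frac{2}{\log (4/3)}\log x.}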

The curve {\displaystyle \frac{2}{\log (4/3)}\log x} fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

Logarithmic growth

For an in-depth investigation, see Lower bounds for the total stopping time of 3X+1 iterates by Applegate and Lagarias.

For the computations, I used Scilab. The function hail(n,m) calculates the stopping times up to a given value of n and takes a moving average with window size m (which can be set to 1 for no averaging).

function hail(n,m)
    // stopping times of the accelerated process for 1..n,
    // followed by a moving average with window width m
    steps=zeros(1:n);
    steps(1)=0;
    for i=2:n 
        k=i;
        s=0;
        while k>=i    // iterate until the value drops below the starting point
            if modulo(k,2)==0 then 
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        steps(i)=s+steps(k);    // reuse the stopping time already computed for k
    end
    total = cumsum(steps);
    for i=1:n-m
        average(i)=(total(i+m)-total(i))/m;    // moving average via cumulative sums
    end
    plot(average,'+');
endfunction 

As soon as the iterate drops below the starting value, the number of remaining steps is fetched from the already computed part of the array. This speeds up the computation a bit.

The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once {x} drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.

function randomhail(n,m)
    // stochastic model: each step halves the value or maps it to (3x+1)/2,
    // each with probability 1/2, regardless of parity
    rsteps=zeros(1:n);
    rsteps(1)=0;
    for i=2:n 
        k=i;
        s=0;
        while k>1    // no reuse of earlier results in this version
            if grand(1,1,"bin",1,1/2)==0 then    // fair coin flip
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        rsteps(i)=s;
    end
    rtotal = cumsum(rsteps);
    for i=1:n-m
        raverage(i)=(rtotal(i+m)-rtotal(i))/m;
    end
    plot(raverage,'r+');
endfunction
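
For example, a run along the following lines (the parameter values are only illustrative) overlays the two moving averages on one plot, since both functions draw into the current figure.

// Illustrative usage: deterministic ('+') and stochastic ('r+', red)
// moving averages with window width 200, as in the graphs above.
n = 100000; m = 200;
hail(n, m);
randomhail(n, m);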

An exercise in walking randomly

Suppose I randomly and blindly walk back and forth along this hallway (which may or may not happen in reality). The safest place to start is in the middle, next to the department’s main office. How long will it take before I bump into a wall at one of the ends?

Carnegie Building

Formally, let X_1,X_2,\dots be independent random variables which take values \pm 1 with equal probability 1/2. These are the steps of the symmetric random walk. Starting from S_0=0, after n steps I end up at the point S_n=X_1+\dots +X_n. The sequence (S_n) is a martingale, being the sum of independent random variables with zero mean. Less obviously, for every number \theta\in\mathbb R the sequence Z_n=\exp(\theta S_n-n\log \cosh \theta) is also a martingale, even though the increments d_n=Z_n-Z_{n-1} are not independent of each other. The reason for \log \cosh\theta is that E(e^{\theta X_n})=\cosh\theta. It is convenient to write s=\log\cosh\theta in what follows. We check the martingale property of Z_n=e^{\theta S_n-ns} by conditioning on a value S_{n-1}=\alpha and computing E(Z_n \mid S_{n-1}=\alpha)=e^{\theta\alpha-(n-1)s}=Z_{n-1}.
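
Spelled out, the computation uses the independence of X_n from S_{n-1} together with E(e^{\theta X_n})=\cosh\theta=e^{s}:

\displaystyle E(Z_n \mid S_{n-1}=\alpha)=e^{\theta\alpha-ns}\,E\left(e^{\theta X_n}\right)=e^{\theta\alpha-ns}\,e^{s}=e^{\theta\alpha-(n-1)s}.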

If our hallway is the interval [-N,N], we should investigate the stopping time T=\min\{n\ge 1\colon |S_n|\ge N\}. The optional stopping theorem applies to Z_T because |Z_n|\le e^{|\theta| N} whenever n\le T. Thus, E(Z_T)=E(Z_0)=1. In terms of S_n this means E e^{\theta S_T-sT}=1.

Now, S_T is either N or -N; both happen with probability 1/2, and by symmetry T has the same conditional distribution in both cases. Therefore, 1=E e^{\theta S_T-sT}=\frac{1}{2}(e^{N\theta}+e^{-N\theta})\, E e^{-sT}. We end up with the Laplace transform of T,

\displaystyle E e^{-sT}=\frac{1}{\cosh N\theta} = \frac{1}{\cosh N\, \mathrm{arccosh}\, e^{s}}, \qquad s\ge 0

(I would not want to write \cosh^{-1} in such a formula.) Since N is an integer and e^{s}\ge 1, a magical identity applies:

\displaystyle  \cosh N \,\mathrm{arccosh}\,x =  \mathcal T_N(x)\qquad x\ge 1,

\mathcal T_N being the Nth Чебышёв polynomial of the first kind. (Compare to \displaystyle  \cos N \,\mathrm{arccos}\,x =  \mathcal T_N(x) which holds for |x|\le 1.) Thus,

\displaystyle E e^{-sT}=\frac{1}{\mathcal T_N (e^{s})}, \qquad s\ge 0

One more change of notation: let e^{-s}=y, so that E e^{-sT}=E(y^T)=\sum_{n=1}^{\infty} P(T=n)\,y^n. The formula

\displaystyle \sum_{n=1}^{\infty} P(T=n)\,y^n=\frac{1}{\mathcal T_N (1/y)}

gives us the probabilities P(T=n) for each n, assuming we can expand the reciprocal of the Чебышёв polynomial into a power series. Let’s check two simple cases:

  • if N=1, then \mathcal T_1(x)=x and \frac{1}{\mathcal T_1 (1/y)}=y. Hence T\equiv 1, which is correct: the walk immediately hits one of the walls \pm 1.
  • if N=2, then \mathcal T_2(x)=2x^2-1, hence \displaystyle \frac{1}{\mathcal T_2 (1/y)}=\frac{y^2}{2}\frac{1}{1-y^2/2}=\sum_{k=1}^\infty \frac{1}{2^k} y^{2k}. This means T can never be odd (which makes sense, since it must always have the parity of N), and the probability of hitting a wall in exactly 2k steps is 1/2^k. A moment’s reflection confirms that the answer is right; the simulation sketch after this list provides another check.
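
For a numerical cross-check, here is a minimal Monte Carlo sketch in Scilab; the helper exit_time_probs is just for illustration. For N=2 the estimated probabilities of T=2,4,6,\dots should approach 1/2, 1/4, 1/8,\dots

// Estimate P(T = n) for the walk on [-N, N] by direct simulation.
function p = exit_time_probs(N, trials, nmax)
    counts = zeros(1, nmax);
    for j = 1:trials
        S = 0; n = 0;
        while abs(S) < N
            S = S + 2*grand(1, 1, "bin", 1, 0.5) - 1;   // step +1 or -1
            n = n + 1;
        end
        if n <= nmax then
            counts(n) = counts(n) + 1;
        end
    end
    p = counts / trials;    // p(n) estimates P(T = n)
endfunction

// e.g. exit_time_probs(2, 10000, 8) should be close to [0 0.5 0 0.25 0 0.125 0 0.0625]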

For a serious example, with N=10, I used Maple:

with(plots):                                     # listplot lives in the plots package
n:=150: 
ser:=series(1/ChebyshevT(10, 1/x),x=0,2*n+1):    # generating function for N=10
for i from 1 to 2*n do a[i]:=coeff(ser, x^i) end do:   # a[i] = P(T=i)
listplot([seq(a[i],i=1..2*n)], color=blue, thickness=2);

By default listplot connects the dots, and since every other term is zero, the plot has a trippy pattern.

Exit time distribution

Obviously, it’s impossible to hit a wall in fewer than N steps. The formula confirms this: writing 1/\mathcal T_N (1/y) as a rational function with numerator y^N, we see that no terms y^n with n<N can appear in the expansion.
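
Indeed, \mathcal T_N(x)=2^{N-1}x^N+\cdots contains only powers of x with the same parity as N, so

\displaystyle \frac{1}{\mathcal T_N(1/y)}=\frac{y^N}{2^{N-1}+c_2\,y^2+c_4\,y^4+\cdots},

where c_{2j} is the coefficient of x^{N-2j} in \mathcal T_N. The expansion therefore begins with the term y^N/2^{N-1}; that is, P(T=N)=2^{-(N-1)}, matching the two paths of N consecutive equal steps.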

(This post is an exercise which I should have done long ago but never did. Thanks are due to Ted Cox for a walk-through.)