It is a well-known open problem whether the following process terminates for every positive integer:

Experiments suggest that it does, possibly with unintended side effects.

Since for any odd integer the number is even, it is convenient to replace the step with , as show below:

As a function of , the stopping time has a nice patterned graph:

An odd integer is of the form or . In the first case, is even, while in the second case is odd. So, if is picked randomly from all odd integers of certain size, the probability of being even is . Similarly, for an even , the number can be even or odd with probability .

This leads to a stochastic model of the process:

The graph of stopping time in the stochastic model is, of course random. It looks nothing like the nice pattern of the deterministic process.

However, smoothing out both graphs by moving window of width or so, we see the similarity:

The stochastic process is much easier to analyze. Focusing on the logarithm , we see that it changes either by or by approximately . The expected value of the change is . This suggests that we can expect the logarithm to drop down to in about steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)

The curve fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

For an in-depth investigation, see Lower bounds for the total stopping time of 3X+1 iterates by Applegate and Lagarias.

For the computations, I used Scilab. The function `hail(n,m)`

calculates the stopping times up to given value of n, and takes moving average with window size m (which can be set to 1 for no averaging).

```
```function hail(n,m)
steps=zeros(1:n);
steps(1)=0
for i=2:n
k=i;
s=0;
while k>=i
if modulo(k,2)==0 then
k=k/2;
s=s+1;
else
k=(3*k+1)/2;
s=s+1;
end
end
steps(i)=s+steps(k);
end
total = cumsum(steps)
for i=1:n-m
average(i)=(total(i+m)-total(i))/m;
end
plot(average,'+');
endfunction

As soon as the result of computations drops below the starting value, the number of remaining steps is fetched from the array that is already computed. This speeds up the process a bit.

The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.

```
```function randomhail(n,m)
rsteps=zeros(1:n);
rsteps(1)=0
for i=2:n
k=i;
s=0;
while k>1
if grand(1,1,"bin",1,1/2)==0 then
k=k/2;
s=s+1;
else
k=(3*k+1)/2;
s=s+1;
end
end
rsteps(i)=s;
end
rtotal = cumsum(rsteps)
for i=1:n-m
raverage(i)=(rtotal(i+m)-rtotal(i))/m;
end
plot(raverage,'r+');
endfunction