The distribution of pow(2, n, n)

For a positive integer {n} let {f(n)} be the remainder of the division of {2^n} by {n}. This is conveniently computed in Python as pow(2, n, n). For example, the first 20 values are returned by

[pow(2, n, n) for n in range(1, 21)]

and they are:

[0, 0, 2, 0, 2, 4, 2, 0, 8, 4, 2, 4, 2, 4, 8, 0, 2, 10, 2, 16]

Obviously, {f(n)=0} if and only if {n} is a power of {2}. It also looks like {f} is always even, with powers of 2 dominating the list of its values. But {f} does take on odd values, although this does not happen often: only 6 times in the first 100 integers.

[(n, pow(2, n, n)) for n in range(1, 101) if pow(2, n, n) % 2]

returns

[(25, 7), (45, 17), (55, 43), (91, 37), (95, 13), (99, 17)]

It is conjectured that the range of {f} consists of all nonnegative integers except 1. Here is a proof that {f(n)\ne 1}, provided by Max Alexeyev on the OEIS page for A036236

Suppose {2^n \equiv 1 \bmod n} for some {n>1}. Let {p} be the smallest prime divisor of {n}. Then {2^n\equiv 1 \bmod p}. This means that the order of {2} in the multiplicative group {(\mathbb Z/ p \mathbb Z)^*} is a divisor of {n}. But this group has {p-1} elements, and the only divisor of {n} that is smaller than {p} is {1}. Thus, the order of {2} is {1}, which means {2\equiv 1 \bmod p}, which is absurd.

The fact that the smallest {n} with {f(n) = 3} is 4700063497 deserves to be mentioned here. But I would like to consider the most frequent values of {f} instead. Experimentally, they are found like this:

from collections import Counter
freq = Counter(pow(2, n, n) for n in range(1, 100000000 + 1))
print(freq.most_common(20))

which prints 20 pairs (value, frequency):

(2, 5763518),
(4, 3004047),
(8, 2054167), 
(16, 1569725), 
(64, 1076293), 
(256, 824438), 
(512, 737450), 
(1024, 668970), 
(32, 638063), 
(4096, 569705), 
(128, 466015), 
(65536, 435306), 
(262144, 389305), 
(1048576, 351723), 
(2097152, 326926), 
(16384, 246533), 
(16777216, 243841), 
(32768, 232037), 
(2048, 153537), 
(8192, 131614)

These are all powers of 2… but not exactly in the order one might expect. I repeated the experiment for the first {10^8k} integers with {k=1, 2, 3, 4, 5, 6}. The frequency order remained stable at the top. These are the most common 11 values, with their frequency (in %) listed for the aforementioned six ranges.

2, [5.8, 5.5, 5.4, 5.3, 5.3, 5.2]
4, [3.0, 2.9, 2.8, 2.8, 2.7, 2.7]
8, [2.1, 2.0, 1.9, 1.9, 1.9, 1.8]
16, [1.6, 1.5, 1.5, 1.4, 1.4, 1.4]
64, [1.1, 1.0, 1.0, 1.0, 1.0, 1.0]
256, [0.8, 0.8, 0.8, 0.8, 0.7, 0.7]
512, [0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
1024, [0.7, 0.6, 0.6, 0.6, 0.6, 0.6]
32, [0.6, 0.6, 0.6, 0.6, 0.6, 0.6]
4096, [0.6, 0.5, 0.5, 0.5, 0.5, 0.5]
128, [0.5, 0.4, 0.4, 0.4, 0.4, 0.4]

It seems that 32 and 128 are consistently less common than one might expect.

The most common value, 2, is contributed by primes and pseudoprimes to base 2. The value of 4 appears when {n = 2p} with {p} prime (but not only then). Still, the primes having zero density makes it difficult to explain the frequency pattern based on primality. A more convincing reason could be: when {n=2k} is even, we are computing {4^k \bmod 2k} and that is likely to be a power of {4}. This boosts the frequency of the even (generally, composite) powers of 2 in the sequence.

Magic angles

The function {u(x, y, z) = 2z^2 - x^2 - y^2} is notable for the following combination of properties:

  1. It is a harmonic function: {\Delta u = -2 - 2 + 4 = 0}
  2. It vanishes at the origin {(0, 0, 0)} together with its gradient.
  3. It is positive on the cone {C=\{(x, y, z) : z > \sqrt{(x^2+y^2)/2}\}}

The cone C has the opening angle {\theta_3 = \cos^{-1}(1/\sqrt{3}) \approx 57.4^\circ} which is known as the magic angle in the context of NMR spectroscopy. Let us consider the mathematical side of its magic.

If C is replaced by any larger cone, the properties 1-2-3 cannot be satisfied by a harmonic function in a neighborhood of the origin. That is, C is the largest cone such that a harmonic function can have a critical point at its vertex which is also a point of its extremum on the cone. Why is that?

Let {u} be a harmonic function in some neighborhood of {(0,0,0)} and suppose {u(0,0,0)=0}, {\nabla u(0,0,0) = 0}, and {u>0} on some cone {C = \{(x, y, z) \colon z>c \sqrt{x^2+y^2+z^2} \}}. Expand {u} into a sum of polynomials {p_2+p_3+\cdots } where each {p_k} is a harmonic homogeneous polynomial of degree {k}. Let {m} be the smallest integer such that {p_m} is not identically zero. Then {p_m} has the same properties as {u} itself, since it dominates the other terms near the origin. We may as well replace {u} by {p_m}: that is, {u} is a spherical harmonic from now on.

Rotating {u} around the {z}-axis preserves all the properties of interest: harmonic, positive on the cone, zero gradient at the origin. Averaging over all these rotations we get a rotationally symmetric function known as a zonal spherical harmonic. Up to a constant factor, such a function is given by {P_m(\cos \phi)} where {\phi} is a spherical coordinate (angle with the {z}-axis) and {P_m} is the Legendre polynomial of degree {m}.

The positivity condition requires {P_m(t) > 0} for {t>c}. In other words, the bound on {\theta} comes from the greatest zero of the Legendre polynomial. As is true for orthogonal polynomials in general, the zeros are interlaced: that is, the zeros of {P_m=0} appear strictly between any two consecutive zeros of {P_{m+1}}. It follows that the value of the greatest zero grows with {m}. Thus, it is smallest when {m=2}. Since {P_2(t) = (3t^2-1)/2}, the zero of interest is {1/\sqrt{3}}, and we conclude that {c \ge  1/\sqrt{3}}. Hence the magic angle.

cone
A piece of magic cone

The magic angle is easy to visualize: it is formed by an edge of a cube and its space diagonal. So, the magic cone with vertex at (0,0,0) is the one that passes through (1, 1, 1), as shown above.

In other dimensions the zonal harmonics are expressed in terms of Gegenbauer polynomials (which reduce to Chebyshev polynomials in dimensions 2 and 4). The above argument applies to them just as well. The relevant Gegenbauer polynomial of degree {2} is {nx^2-1} up to a constant. Thus, in {\mathbb R^n} the magic angle is {\cos^{-1}(1/\sqrt{n})}, illustrated by the harmonic function {u(x)=nx_n^2 - |x|^2}.

This analysis is reminiscent of the Hopf Lemma which asserts that if a positive harmonic function in a smooth domain has boundary value 0 at some point, the normal derivative cannot vanish at that point. The smoothness requirement is typically expressed as the interior ball condition: one can touch every boundary point by a ball contained in the domain. The consideration of magic angles shows that if the function is also harmonic in a larger domain, the interior ball condition can be replaced by the interior cone condition, provided that the opening angle of the cone is greater than {\cos^{-1}(1/\sqrt{n})}.

Measurability of Banach space valued functions

There is only an indirect proof of the existence of a function {f\colon [0, 1]\to \mathbb R} that is not Lebesgue measurable. But it’s easy to give an explicit example when the codomain of {f} is a Banach space: just let {b(t)} be the sequence of the binary digits of {t}, considered as an element of the sequence space {\ell_\infty}.

Why is {b} not measurable? Recall that a Banach space-valued function {f} is (Bochner) measurable iff there is a sequence of simple functions {\sum v_k \chi_{A_k}} (finite sum, measurable {A_k}, arbitrary vectors {v_k}) that converges to {f} almost everywhere. This property implies that, with an exception of a null set, the range of {f} lies in the separable subspace spanned by all the vectors {v_k} used in the sequence of simple functions. But {b} has the property {\|b(t)-b(s)\|=1} whenever {t\ne s}, so the image of any uncountable set under {b} is nonseparable.

Another way to look at this: on the interval [0, 1) the function {b} is injective and its range has discrete topology, which implies that every subset of [0, 1) is the preimage of some open subset of {\ell_\infty} under {b}.

The binary-digits functions can also be used to illustrate an issue with the duality of Lebesgue-Bochner spaces {L_p(0, 1; X)} where {X} is a Banach space and {1\le p<\infty}. (So, {f} belongs to this space iff it is Bochner measurable and the {L^p} norm of {\|f\|\colon [0, 1]\to [0, \infty)} is finite.) In general we do not have the expected relation {L_p(0, 1; X)^* = L_q(0, 1; X^*)} with {1/p+1/q=1}. The natural isometric embedding of {L_q(0, 1; X^*)} into {L_p(0, 1; X)^*} is still there: any {g\in L_q(0, 1; X^*)} acts on {L_p(0, 1; X)} by {f\mapsto \int \langle f(t), g(t) \rangle\, dt}. But unless {X^*} has the Radon–Nikodym property, these are more bounded linear functionals on {L_p(0, 1; X)}.

To construct such a functional, let {b_n(t)} be the {n}-th binary digit of {t}. Given {f\in L_1(0, 1; \ell_1)}, write it in coordinates as {(f_1, f_2, \dots)} and define {\varphi(f) = \sum_n \int_0^1 f_n b_n}. This is a bounded linear functional, since {|\varphi(f)|\le \sum_n \int_0^1 |f_n| = \|f\|}. But there is no function {g\in L_\infty(0, 1; \ell_\infty)} that represents it, i.e., {\varphi(f) = \int_0^1 \langle f(t), g(t)\rangle \,dt = \sum_n \int_0^1 f_n g_n }. Indeed, if such {g} existed then by considering {f} with only one nonzero coordinate, we find that {g_n} must be {b_n}, using the duality {L_1^* =  L_\infty} in the scalar case. But the function {[0, 1]\to \ell_\infty} with the components {(b_1, b_2, \dots)} is not measurable, as shown above.

This example, which applies to all {1\le p<\infty}, also serves as a reminder that the duality relation {L_p(0, 1; X)^* = L_q(0, 1; X^*)} depends on the dual space {X^*} having the Radon-Nikodym property (RNP), not {X} itself. Indeed, {X=\ell_1} has the RNP; its dual does not.

The importance of {X^*} having the RNP becomes clear once one tries to follow the usual proof of {L_p^*=L_q}. Given {\varphi\in L_p(0,1;X)^*}, we can define an {X^*}-valued measure {\tau} on {[0, 1]} by {\tau(A)(v) = \varphi( v\chi_A)} where {A\subset [0, 1]} is Lebesgue measurable and {v\in X}. This measure has reasonable finiteness and continuity properties coming from {\varphi} being a bounded functional. Still, the existence of a density {g\colon [0, 1]\to X^*} of the measure {\tau} depends on the structure of the Banach space {X^*}.

Immaculate perfection

— I am so lucky. I was born on April 4th 1944, that’s 4/4/44. If you add that up, it comes to 16, one six. One plus six is seven: luckiest number of all.
— You know your math.
— It’s more than math, Mike, it’s… immaculate perfection.

The value of 7 for the the digital root indeed seems preferable, at least on this blog. But looking at the dates formed by repeated digits is too anthropocentric (not to mention xkcd1179). Besides, mathematics already has a concept of perfect numbers: being equal to the sum of proper divisors.

So, are there any perfect numbers with digital root equal to 7, equivalently {n\equiv 7\bmod 9}? For even perfect numbers, one can expect that the Euclid-Euler theorem {n = 2^{p-1} (2^p-1)} will help with the answer, but even so, the solution turns out to be surprisingly simple.

A lucky coincidence is that {2^6\equiv 1\bmod 9} ({64 = 1 + 7\cdot 9}). Since all primes greater than 3 have remainder 1 or 5 mod 6, there are only a few cases to check:

  • {p\equiv 1\bmod 6}, hence {2^{p-1}\equiv 2^{1-1} \equiv 1 \bmod 9} and {2^p-1\equiv 2^1-1 \equiv 1\bmod 9}. In this case {n\equiv 1\cdot 1 \equiv 1 \bmod 9}
  • {p\equiv 5\bmod 6}, hence {2^{p-1}\equiv 2^{5-1} \equiv 7 \bmod 9} and {2^p-1\equiv 2^5-1 \equiv 4\bmod 9}. In this case {n\equiv 7\cdot 4 \equiv 1 \bmod 9}
  • Exceptional cases p=2, 3 correspond to n = 6, 28. The latter is congruent to 1 mod 9, matching the preceding cases.

So, every even perfect number except 6 is congruent to 1 mod 9. Should a perfect number be congruent to 7 mod 9, it has to be odd. An odd perfect number, should it exist, must be either divisible by 9 (not good for us) or congruent to 1 mod 12. (Reference: On the form of an odd perfect number). The condition {n\equiv 1 \bmod 12} is consistent with {n\equiv 7 \bmod 9}: for example, 25 satisfies both. Come to think of it, the sum of proper divisors of 25 is… not 25. But there might still be a 7 mod 9 perfect number out there.

What numbers k have the property that the sequence {n mod k : n is perfect} stabilizes? Assuming odd perfect numbers do not exist, every power of 2 has this property. So do 3 and 9 by the above (and therefore, their products with powers of 2). I don’t know any other such k, in particular n mod 27 does not appear to stabilize. Of course, it is impossible to prove any negative result here since we don’t know if there are infinitely many perfect numbers.

Bipartite metric spaces and positive kernels

Bipartite metric spaces provide simple examples of highly non-Euclidean metric spaces. Let {A} and {B} be two disjoint abstract sets, say, with {m} and {n} points respectively. Define a metric on the union {A\cup B} as follows, where {a, b, c} are some positive numbers:

  • The distance between distinct elements of {A} is {a}.
  • The distance between distinct elements of {B} is {b}.
  • The distance between two elements from different sets is {c}.

The triangle inequality is satisfied provided that {a, b\le 2c}. When {c} is large, the space is not very interesting: it is approximately the union of two simplexes placed at some large distance from each other. On the other hand, when {a=b=2c} we encounter a bizarre situation: every point of {B} is a “midpoint” between any two distinct points of {A}, and vice versa.


A function {K\colon X\times X \to \mathbb R} is called a positive kernel if the matrix {K(x_i, x_j)_{i, j=1}^n} is positive semidefinite for any choice of {n} and {x_1, \dots, x_n\in X}. A classical theorem of Schoenberg says that a metric space {(X, d)} is isometric to a subset of a Hilbert space if and only if {\exp(-\gamma  d^2)} is a positive kernel for all {\gamma >0}. One may ask: is there any function {f\colon [0, \infty)\to\mathbb R} such that {f\circ d} is a positive kernel on {X} for every metric space {(X, d)}?

Suppose {f} is such a function. Applying it to the bipartite metric defined above, we find that the following matrix must be positive semidefinite (the case {m=3}, {n=2} shown; in general the diagonal blocks have sizes {m\times m} and {n\times n}):

{\displaystyle \begin{pmatrix} f(0) & f(a) & f(a)  & f(c) & f(c) \\   f(a) & f(0) & f(a)  & f(c) & f(c) \\   f(a) & f(a) & f(0)  & f(c) & f(c) \\   f(c) & f(c) & f(c)  & f(0) & f(b) \\   f(c) & f(c) & f(c)  & f(b) & f(0)  \end{pmatrix} }

Test the PSD property against vectors of the form {(s, s, s, t, t)}, that is {m} entries equal to some number {s} followed by {n} entries equal to some number {t}. The result is the quadratic form {As^2 + Bt^2 + 2C^2st } where {A = (m^2-m) f(a)+mf(0)}, {B = (n^2-n)f(b)+nf(0)}, and {C = mnf(c)}.

For this quadratic form to be positive semidefinite, we need {A\ge 0} and {C^2\le AB} to hold. The inequality {A\ge 0} amounts to {f(a)\ge 0} and {f(0)\ge 0}, because {m} can be any positive integer. Simply put, {f} must be nonnegative.

The inequality {C^2\le AB} simplifies to

{ mn f(c)^2 \le ( (m-1) f(a)+f(0)) \cdot ((n-1)f(b)+ f(0)) }

When {m=n=1} and {a=b}, we obtain {f(c)\le f(0)}; so, {f} is maximized at {0}. Letting {m, n\to \infty} yields {f(c)^2 \le f(a)f(b)}. What does this mean?

If {c=(a+b)/2}, the inequality {f(c)^2 \le f(a)f(b)} amounts to log-convexity of {f}, and there are lots of log-convex functions. But crucially, our only constraint on {c} in the construction of a bipartite metric space is {c\ge \max(a, b)/2}. In the special case {a=b} we find that {f(c)\le f(a)} whenever {c\ge a/2} (and {a>0}). On one hand, {f} must be nonincreasing because all {c\ge a} are allowed. On the other hand, {f(a/2)\le f(a)} implies that the sequence {k\mapsto f(2^k)} is nondecreasing for {k\in\mathbb Z}. These conditions can be satisfied together only if {f} is constant on {(0, \infty)}.

And there you have it, a complete description of functions {f\colon [0, \infty)\to\mathbb R} that transform every distance matrix into a positive semidefinite matrix: {f\equiv c\ge 0} on {(0, \infty)}, and {f(0)\ge c}. It is immediate that such {f} indeed works for all metric spaces, not just bipartite ones.

Noncontracting Jordan curves

A simple closed curve {\Gamma} on the plane can be parameterized by a homeomorphism {f\colon \mathbb T\to \Gamma} in infinitely many ways. It is natural to look for “nice” parameterizations, like nonexpanding ones in the previous post. This time, let us look for noncontracting parameterizations: {|f(a)-f(b)|\ge |a-b|} for all {a, b\in \mathbb T}. Note that Euclidean distance is used here, not arclength. And we still want {f} to be a homeomorphism. Its noncontracting property simply means that the inverse {f^{-1}} is nonexpanding aka 1-Lipschitz.

What are some necessary conditions for the existence of a noncontracting parameterization? We can mimic the three from the earlier post Nonexpanding Jordan curves, with similar proofs:

  1. The curve must have length at least {2\pi}.
  2. The curve must have diameter at least 2.
  3. The curve must enclose some disk of radius 1. (Apply Kirszbraun’s theorem to {f^{-1}} and note that the resulting Lipschitz map G of the plane will attain 0 somewhere, by a topological degree / winding number argument. Any point where G = 0 works as the center of such a disk.)

This time, the 3rd item supersedes both 1 and 2. Yet, the condition it presents is not sufficient for the existence of a noncontracting parameterization. A counterexample is a disk with a “comb-over”:

combover

Indeed, suppose that {g=f^{-1}} is a nonexpanding map from this curve onto the unit circle. Let {1 = A < B < C} be the three points at which the curve meets the positive x-axis. Since every point of the curve is within small distance from its arc AB, it follows that {g(AB)} is a large subarc of {\mathbb T} that covers almost all of the circle. But the same argument applies to the arcs {g(BC)} and {g(CA)}, a contradiction.

No matter how large the enclosed disk is, a tight combover around it can force arbitrarily large values of the Lipschitz constant of {f^{-1}}. Sad!

What about sufficient conditions?

  1. It is sufficient for the curve to be star-shaped with respect to the origin, in addition to enclosing the unit disk. (Equivalent statement: {\Gamma} has a polar equation {r = r(\theta)} in which {r \ge 1}.) Indeed, the nearest-point projection onto a convex set is a nonexpanding map, and projecting {\Gamma} onto the unit circle in this way gives the desired parameterization.
  2. It is sufficient for the curve to have curvature bounded by 1. I am not going into this further, because second derivatives should not be needed to control a Lipschitz constant.

Recall that for nonexpanding parameterizations, the length of the curve was a single quantity that mostly answered the existence question (length at most 4 — yes; length greater than {2\pi} — no). I do not know of such a quantity for the noncontracting case.

Extreme values of a reproducing kernel for polynomials

For every nonnegative integer {n} there exists a (unique) polynomial {K_n(x, y)} of degree {n} in {x} and {y} separately with the following reproducing property:

{\displaystyle p(x) = \int_{-1}^1 K_n(x, y)p(y)\,dy}

for every polynomial {p} of degree at most {n}, and for every {x}. For example, {K_1(x, y)= (3xy+1)/2}; other examples are found in the post Polynomial delta function.

This fact gives an explicit pointwise bound on a polynomial in terms of its integral on an interval:

{\displaystyle |p(x)| \le M_n(x) \int_{-1}^1 |p(y)|\,dy}

where {M_n(x) = \sup\{|K(x, y)| \colon y\in [-1, 1]\}}. For example, {M_1(x) = (3|x|+1)/2}.

Although in principle {x} could be any real or complex number, it makes sense to restrict attention to {x\in [-1, 1]}, where integration takes place. This leads to the search for extreme values of {K} on the square {Q=[-1, 1]\times [-1, 1]}. Here is how this function looks for {n=1, 2, 3}:

K1
Degree 1
K2
Degree 2
K3
Degree 3

The symmetries {K(x, y)=K(-x, -y) = K(y, x)} are evident here.

Explicitly,

{\displaystyle K_n(x, y) = \sum_{k=0}^n \frac{2k+1}{2} P_k(x)P_k(y)}

where {P_k} is the Legendre polynomial of degree {k} and the factor {(2k+1)/2} is included to make the polynomials an orthonormal set in {L^2(-1, 1)}. Since {P_k} oscillates between {-1} and {1}, it follows that

{\displaystyle |K_n(x, y)|\le \sum_{k=0}^n \frac{2k+1}{2} = \frac{(n+1)^2}{2}}

and this bound is attained at {K(1, 1)=K(-1,-1)=(n+1)^2/2} because {P_k(1)=1} and {P_k(-1)=(-1)^k}.

Is

{\displaystyle K_n(-1, 1) =\sum_{k=0}^n (-1)^k\frac{2k+1}{2} = (-1)^n \frac{n+1}{2}}

the minimum value of {K} on the square {Q}? Certainly not for even {n}. Indeed, differentiating the sum

{\displaystyle S_n(x) = K_n(x, 1) = \sum_{k=0}^n \frac{2k+1}{2} P_k(x)}

with respect to {x} and using {P_k'(-1) =(-1)^{k-1}k(k+1)/2}, we arrive at

{\displaystyle S_n'(-1) = (-1)^{n-1} \frac{n(n^2+3n+2)}{4}}

which is negative if {n} is even, ruling out this point as a minimum.

What about odd {n}, then: is it true that {K_n \ge -(n+1)/2} on the square {Q}?

{n=1}: yes, {K_1(x, y) = (3xy+1)/2 \ge (-3+1)/2 = -1} is clear enough.

{n=3}: the inequality {K_3\ge -2} is also true… at least numerically. It can be simplified to {35(xy)^3 + 9(xy)^2 + 15xy \ge (21x+21y+3)(x^2+y^2)} but I do not see a way forward from there. At least on the boundary of {Q} it can be shown without much work:

{\displaystyle K_3(x, 1) + 2 = \frac{5}{4}(x+1)(7x^2-4x+1)}

The quadratic term has no real roots, which is easy to check.

{n=5}: similar story, the inequality {K_5\ge -3} appears to be true but I can only prove it on the boundary, using

{\displaystyle K_5(x, 1)+3 = \frac{21}{16}(x + 1)(33 x^4 - 18x^3 - 12x^2 + 2x + 3)}

The quartic term has no real roots, which is not so easy to check.

{n=7}: surprisingly, {K_7(4/5, 1) = -2229959/500000} which is about {-4.46}, disproving the conjectural bound {K_7\ge -4}. This fact is not at all obvious from the plot.

K7
Degree 7 kernel

Questions:

  • Is {K_n \ge -Cn} on the square {Q = [-1, 1]\times [-1, 1]} with some universal constant {C}?
  • Is the minimum of {K_n} on {Q} always attained on the boundary of {Q}?
  • Can {M_n(x) = \sup\{|K(x, y)| \colon y\in [-1, 1]\}} be expressed in closed form, at least for small degrees {n}?