Autogenerated numbers 2

Although this is a follow up to an earlier post, it can be read independently.

Repeatedly squaring the number 5, we get the following.

5^1 =                        5
5^2 =                       25
5^4 =                      625
5^8 =                   390625
5^16 =            152587890625
5^32 = 23283064365386962890625

There is a pattern here, which becomes more evident if the table is truncated to a triangle:

5^1 =                        5
5^2 =                       25
5^4 =                      625
5^8 =                  ...0625
5^16 =                ...90625
5^32 =               ...890625
5^64 =              ...2890625
5^128 =            ...12890625
5^256 =           ...212890625

What other digits, besides 5, have this property? None in the decimal system (ignoring the trivial 0 and 1). But the same pattern is shown by 3 in base 6 (below, the exponent is still in base 10, for clarity).

3^1 =                        3
3^2 =                       13
3^4 =                      213
3^8 =                  ...0213
3^16 =                ...50213
3^32 =               ...350213
3^64 =              ...1350213

and so on.

Here is a proof that the pattern appears when the base is 2 mod 4 (except 2), and the starting digit is half the base.

Let {x} be the starting digit, and {b=2x} the base. The claim is that {x^{2^k}\equiv x^{2^{k-1}} \bmod b^k} for all {k=1,2,\dots}.

Base of induction: {x^2-x = x(x-1)} is divisible by {x}, and also by {2}: since {b\equiv 2\bmod 4}, the digit {x=b/2} is odd, so {x-1} is even. Hence {x^2-x} is divisible by {2x=b}.

Step of induction: Suppose {x^{2^k} - x^{2^{k-1}}} is divisible by {b^k}. Then
{x^{2^{k+1}} - x^{2^{k}} = (x^{2^k} - x^{2^{k-1}})(x^{2^k} + x^{2^{k-1}})}
The second factor is divisible by {x} and also by {2}, being the sum of two odd numbers. So, it is divisible by {b}. Since the first factor is divisible by {b^k}, the claim follows: {x^{2^{k+1}} - x^{2^{k}} } is divisible by {b^{k+1}}.
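The congruence just proved is easy to check numerically; here is a quick sketch (the range of bases tested is my choice):

```python
# Verify x^(2^k) ≡ x^(2^(k-1)) (mod b^k) for bases b ≡ 2 (mod 4), x = b/2
for b in range(6, 43, 4):              # bases 6, 10, 14, ..., 42
    x = b // 2
    for k in range(1, 9):
        assert pow(x, 2**k, b**k) == pow(x, 2**(k-1), b**k)
print("congruence holds for all tested bases")
```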

So that was easy. The part I can’t prove is the converse. Numerical evidence strongly suggests that {b\equiv 2\bmod 4}, {x=b/2}
is also the necessary condition for the triangular pattern to appear.

Autogenerated numbers

An integer is automorphic if its square ends with that integer, like 76^2 = 5776. This notion is clearly base-dependent. Ignoring the trivial 0 and 1, in base 10 there are two such numbers for any number of digits; they comprise two infinite sequences that can be thought of informally as “infinite” solutions of x^2 = x, and formally as solutions of this equation in the ring of 10-adic numbers. They are …56259918212890625 and …740081787109376, both recorded in the OEIS, as Wolfram MathWorld points out.

There is a difference between these two sequences. The former naturally grows from the single digit 5, by repeatedly squaring and keeping one more digit than we had: 5^2 = 25, 25^2 = 625, 625^2 = 390625, 0625^2 = 390625, 90625^2 = 8212890625, … (One should not get greedy and keep all the digits after squaring: for example, the leading 3 in 390625 is not the digit we want.) The process described above does not work for 6, because 6^2 = 36 rather than 76. For lack of a better term, I’ll call infinite numbers such as …56259918212890625 autogenerated.

According to Wikipedia, the number of b-adic solutions of x^2 = x is 2^d, where d is the number of distinct prime factors of b. (Another way to put it: there are as many solutions as square-free divisors of the base.) Two of the solutions are trivial: 0 and 1. So, infinite automorphic numbers exist in every base that is not a prime power, and only in those.
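In base 10 the two nontrivial solutions can be computed digit-by-digit with the Chinese remainder theorem: one is ≡ 1 (mod 2^k) and ≡ 0 (mod 5^k), the other the reverse. A sketch (the helper name is mine; the modular inverse via `pow` needs Python 3.8+):

```python
def idempotents(k):
    """The two nontrivial solutions of x^2 = x modulo 10^k."""
    m2, m5 = 2**k, 5**k
    # e1 ≡ 0 (mod 5^k) and ≡ 1 (mod 2^k): the tail of ...890625
    e1 = m5 * pow(m5, -1, m2) % 10**k
    # the solutions pair up: e1 + e2 ≡ 1 (mod 10^k)
    e2 = (10**k + 1 - e1) % 10**k
    return e1, e2

print(idempotents(6))   # (890625, 109376)
```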

Autogenerated numbers are rarer. For example, there are none in base 12. Indeed, the two viable seed digits are 4 and 9: in base 12, 4^2 is 14 and 9^2 is 69. But 14 is not automorphic: 14^2 = 194. With 69 we get a little further: 69^2 = 3969. But then 969 is not automorphic, and the process stops.
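The base-12 failure (and the success of 5 in base 10) can be replayed with a small sketch; `autogen_tail` is my name for the helper:

```python
def autogen_tail(seed, b, digits):
    # Grow an automorphic tail from a seed digit, one base-b digit at a
    # time; stop as soon as the current tail x fails x^2 ≡ x (mod b^k).
    x, mod = seed, b
    for k in range(1, digits):
        if (x * x - x) % mod != 0:
            return None, k      # the k-digit candidate is not automorphic
        mod *= b
        x = x * x % mod
    return x, digits

print(autogen_tail(5, 10, 5))   # (90625, 5): succeeds
print(autogen_tail(4, 12, 5))   # (None, 2): 14 is not automorphic
print(autogen_tail(9, 12, 5))   # (None, 3): 969 is not automorphic
```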

Computer search suggests that autogenerated numbers exist if and only if the base is 2 mod 4 (and is not 2). Furthermore, there is exactly one autogenerated number for each such base, and its seed is half the base. Some examples, with 20 digits shown in each base:

  • … 21314 15515 22213 50213 in base 6
  • … 92256 25991 82128 90625 in base 10
  • … 8676a 8cba5 7337a a0c37 in base 14
  • … aea80 1g4c9 68da4 e1249 in base 18
  • … 179aa 1f0e7 igdi8 d185b in base 22
  • … b9ofn odpbn 31mm3 h1g6d in base 26
  • … f46rg 1jirj r6f3f e1q7f in base 30
  • … g2khh vlas5 k7h4h i248h in base 34

And this is how they were computed with Python: brute force search works fine, because unsuccessful candidates drop out very quickly.

def baseN(num, b, numerals="0123456789abcdefghijklmnopqrstuvwxyz"): 
    return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])
    # from http://stackoverflow.com/a/2267428
 
target = 100    # number of digits to compute
for b in range(3, 36):    # base
    for x in range(2, b):    # seed digit
        mod = b
        while x < b**(target-1) and (x**2 - x) % mod == 0:
            mod = mod*b
            x = x*x % mod
        if x >= b**(target-1):
            print('Autogenerated number {} in base {}'.format(baseN(x, b), b)) 

I don’t have a proof of the “2 mod 4” claim, but it may well have a proof somewhere already… According to Dave Richeson, the autogenerating property of 5 in base 10 was discussed in a blog post by Foxmaths!, but that blog is private now. It is also stated in the OEIS entry as “a(n+1) = a(n)^2 mod 10^(n+1). – Eric M. Schmidt, Jul 28 2012”.

Convex up to a reparametrization

Given a function {f\colon \mathbb R\rightarrow \mathbb R}, we have several ways of checking if it is convex. But how to recognize the functions that are convex up to a reparametrization, meaning that there is a homeomorphism {h} of the real line such that the composition {g = f\circ h} is convex?
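As a toy example (mine, not from the original question): {f(x)=\sqrt{|x|}} is not convex, but composing with the homeomorphism {h(x)=x|x|} gives {f(h(x))=|x|}, which is. A numerical sketch:

```python
import numpy as np

f = lambda x: np.sqrt(np.abs(x))    # quasiconvex but not convex
h = lambda x: x * np.abs(x)         # homeomorphism of R: h(x) = sign(x) x^2
g = lambda x: f(h(x))               # equals |x|, which is convex

# f fails midpoint convexity between 0 and 1 ...
assert f(0.5) > (f(0.0) + f(1.0)) / 2
# ... while g satisfies it on random pairs
rng = np.random.default_rng(1)
a, b = rng.uniform(-5, 5, 1000), rng.uniform(-5, 5, 1000)
assert np.all(g((a + b) / 2) <= (g(a) + g(b)) / 2 + 1e-12)
```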

Let’s start by listing some necessary conditions for {f} to have this property.

1. Since convex functions are continuous, {f} must be continuous.

2. Convex functions are quasiconvex, meaning that each sublevel set {U_t = \{x\colon g(x) < t\}} is convex (in this case, an open interval). This property is preserved under reparametrization. So, {f} must be quasiconvex as well.

3. Nonconstant convex functions are unbounded above. So, {f} must have this property too.

Is this all? No. For example, let {f(x) = -x} if {x< 0} and {f(x)=\arctan x} if {x\ge 0}. This is a continuous quasiconvex function that is unbounded above. But it can’t be made convex by a reparametrization, because as {x\rightarrow +\infty}, it increases while staying bounded. This leads to another condition:

4. If {f} is not monotone, then its limits as {x\rightarrow+\infty} and {x\rightarrow -\infty} must both be {+\infty}.

And still we are not done. The function

{f(x) = \begin{cases} x+1, \quad &x<-1 \\ 0, \quad & -1 \le x\le 1 \\ x-1, \quad & x>1 \end{cases} }

satisfies all of 1–4. But it has a flat spot (the interval {[-1,1]} where it is constant) which does not achieve its minimum. This can’t happen to convex functions: if a convex function is constant on an interval, then (by virtue of being above its tangent lines) it has to attain its minimum there. This leads to the 5th condition:

5. {f} is either strictly monotone, or attains its minimum on some closed interval {I_{\min}(f)} (possibly unbounded or a single point). In the latter case, it is strictly monotone on each component of {\mathbb R\setminus I_{\min}(f)}.

Finally, we have enough: conditions 1–5 together are necessary and sufficient for {f} to be convex up to a reparametrization. The proof of sufficiency turns out to be easy. A continuous strictly monotone function is a homeomorphism between its domain and range. The strictly monotone (and unbounded) parts of {f} give us homeomorphisms under which the corresponding pieces of {f} become affine. Consequently, there is a homeomorphism {h} of the real line such that {f\circ h} is one of the following five functions:

  1. {g(x) \equiv 1} if {I_{\min}(f) = \mathbb R}
  2. {g(x) = \max(x,0)} if {I_{\min}(f)} is unbounded, but not all of {\mathbb R}
  3. {g(x) = \max(|x|-1, 0)} if {I_{\min}(f)} is a bounded nondegenerate interval
  4. {g(x) =|x|} if {I_{\min}(f)} is a single point
  5. {g(x) =x} if {I_{\min}(f)} is empty

Incidentally, we also see that there are five equivalence classes of convex functions on {\mathbb R}, the equivalence being a homeomorphic change of variables.


What functions on {\mathbb R^n} are convex up to reparametrization? Unclear. (Related Math SE question)

 

String matching in Google Sheets/Docs/Apps Script

Here I summarize various ways of matching string patterns in selected Google products: Sheets, Docs, and Apps Script. They are ordered from least to most powerful.

  1. String comparison in Sheets: =if(A1 = "cat", "is", "is not"). Perhaps surprisingly, this is not a literal comparison but a case-insensitive match: the condition also holds if A1 contains “Cat” or “CAT”. This makes one wonder how to actually test strings for equality; I’ll return to this later.
  2. Wildcards in Sheets functions like countif and sumif: ? matches any single character, * matches any sequence of characters (possibly empty).
    Example: =countif(A1:A9, "a?b*")
  3. like comparison in query, the Sheets function using the Google Visualization API Query Language. This is similar to the previous: the underscore _ matches any single character, the percent sign % any sequence of characters.
    Example: =query(A1:B9, "select A where B like 'a_b%'").
  4. findText method of objects of class Text in Apps Script, extending Google Docs. The documentation says “A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.” In particular, it does not support lookarounds.
    Example:

    var body = DocumentApp.getActiveDocument().getBody().editAsText();
    var found = body.findText("ar{3,}gh").getElement().asText().getText();

    Yeah… I find the Google Docs API clumsy, verbose and generally frustrating, unlike the Google Sheets API.
  5. regexmatch function in Sheets and its relatives regexextract and regexreplace. These use the RE2 regex library, which is performance-oriented but somewhat limited: for example, it does not support lookarounds. It does support capture groups.
    Example: =regexmatch(A1, "^h[ao]t+\b")
  6. matches comparison in query, the Sheets function using the Google Visualization API Query Language. Supports regular expressions (matching an entire string), including lookaheads but apparently not lookbehinds; the documentation does not spell out exactly what is supported.
    Example: =query(A1:B9, "select A where B matches 'a.b(?!c).*'").
  7. JavaScript regular expressions, which are fully supported in Apps Script.

So what about the literal, case-sensitive string comparison in Google Sheets? Apparently, the way to do it is to use regexmatch… Example: =regexmatch(A1, "^cat$") returns True when A1 contains “cat” in lower case.
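Python’s re module uses the same lookahead syntax, so the matches pattern shown above can be sanity-checked outside Sheets (my illustration):

```python
import re

# fullmatch mimics 'matches' in query: the pattern must cover the whole string
pat = re.compile(r"a.b(?!c).*")
assert pat.fullmatch("axbd")        # 'b' is not followed by 'c'
assert not pat.fullmatch("axbc")    # the lookahead rejects this one
print("lookahead behaves as expected")
```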

Weak convergence in metric spaces

Weak convergence in a Hilbert space is defined as pointwise convergence of functionals associated to the elements of the space. Specifically, {x_n\rightarrow x} weakly if the associated functionals {f_n(y) = \langle x_n, y \rangle} converge to {f(y) = \langle x, y \rangle} pointwise.

How could this idea be extended to metric spaces without linear structure? To begin with, {f_n(y)} could be replaced with {\|x_n-y\|^2-\|x_n\|^2}: this equals {\|y\|^2 - 2f_n(y)}, so it differs from {-2f_n(y)} only by a term that does not depend on {n}, and its pointwise convergence is equivalent to that of {f_n}. Also, {\|x_n\|^2} here could be {\|x_n-z\|^2} for any fixed {z}; to avoid introducing another variable, let’s use {x_1} as the fixed reference point {z}. Now we have a metric space version of weak convergence: the functions
{f_n(y) = d(x_n,y)^2 - d(x_n,x_1)^2}
must converge pointwise to
{f(y) = d(x,y)^2 - d(x,x_1)^2}

The triangle inequality shows that strong convergence {d(x_n,x)\rightarrow 0} implies weak convergence, as expected. And the converse is not necessarily true, as the example of a Hilbert space shows.


Aside: the above is not the only way to define weak convergence in metric spaces. Another approach is to think of {\langle x_n , y\rangle} in terms of projection onto a line through {y}. A metric space version of this concept is the nearest-point projection onto a geodesic curve. This is a useful approach, but it is only viable for metric spaces with additional properties (geodesic, nonpositive curvature).

Also, both of these approaches take the Hilbert space case as the point of departure, and do not necessarily capture weak convergence in other normed spaces.


Let’s try this out in {\ell^1} with the standard basis sequence {e_n}. Here {f_n(y) = \|e_n-y\|^2 - 4 \rightarrow (\|y\| + 1)^2 - 4}. Is there an element {x\in \ell^1} such that

{\|x-y\|^2 - \|x-e_1\|^2= (\|y\|+1)^2 - 4} for all {y}?

Considering both sides as functions of the single variable {y_n}, for fixed {n}, shows that {x_n=0} for every {n}: the left-hand side is non-differentiable at {y_n=x_n} while the right-hand side is non-differentiable at {y_n=0}. Then the desired identity simplifies to {\|y\|^2 - 1 = (\|y\|+1)^2 - 4}, which is false. Oh well, that sequence wasn’t weakly convergent to begin with: by Schur’s theorem, every weakly convergent sequence in {\ell^1} also converges strongly.
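The pointwise limit {f_n(y)\rightarrow (\|y\|+1)^2-4} used above is easy to confirm for finitely supported {y}; here is a sketch in {\mathbb R^N} as a stand-in for {\ell^1}:

```python
import numpy as np

N = 100
def f_n(n, y):
    # f_n(y) = ||e_n - y||_1^2 - ||e_n - e_1||_1^2  (e_1 has index 0 here)
    e = np.zeros(N); e[n] = 1.0
    e1 = np.zeros(N); e1[0] = 1.0
    return np.sum(np.abs(e - y))**2 - np.sum(np.abs(e - e1))**2

y = np.zeros(N); y[:3] = [0.5, -2.0, 1.0]     # a finitely supported y
limit = (np.sum(np.abs(y)) + 1)**2 - 4
assert abs(f_n(50, y) - limit) < 1e-12        # n = 50 is beyond the support of y
```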

This example also shows that not every bounded sequence in a metric space has a weakly convergent subsequence, unlike the way it works in Hilbert spaces.

Graphs with an invertible incidence matrix

Graphs are often encoded by their adjacency matrix: a symmetric matrix where {1} in the entry {(i,j)} means there is an edge between vertices labeled {i} and {j}. Another useful encoding is the incidence matrix, in which rows correspond to edges and columns to vertices (or the other way around). Having {1} in the entry {(i,j)} means that the {i}th edge is incident to the {j}th vertex. Simply put, each row contains exactly two nonzero entries, which indicate the endpoints of an edge.

An incidence matrix is generally rectangular. It is square when the graph has as many edges as vertices. For example, the 4-cycle has the incidence matrix

{M = \displaystyle \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}}

(up to relabeling of vertices). This matrix is not invertible: indeed, {M(1,-1,1,-1)^T=0}, which reflects the fact that the graph is bipartite. For any bipartite graph, the {\pm 1} vector describing a 2-coloring of the vertices belongs to the kernel of the incidence matrix.

The incidence matrix of an odd cycle is invertible, however. What other such “invertible” graphs exist? Clearly, the disjoint union of “invertible” graphs is itself invertible, so we can focus on connected graphs.
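The parity claim can be seen directly on cycles: the determinant of the incidence matrix of an n-cycle is 2 for odd n and 0 for even n. A quick sketch:

```python
import numpy as np

def cycle_incidence(n):
    # n-cycle: edge i joins vertex i and vertex (i+1) mod n
    M = np.zeros((n, n))
    for i in range(n):
        M[i, i] = 1
        M[i, (i + 1) % n] = 1
    return M

dets = [round(np.linalg.det(cycle_incidence(n))) for n in range(3, 9)]
print(dets)   # [2, 0, 2, 0, 2, 0]
```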

There are no invertible graphs on 1 or 2 vertices. For 3 through 8 vertices, exhaustive search yields the following counts:

  • 1 graph on 3 vertices (3-cycle)
  • 1 graph on 4 vertices (3-cycle with an edge attached)
  • 4 graphs on 5 vertices
  • 8 graphs on 6 vertices
  • 23 graphs on 7 vertices
  • 55 graphs on 8 vertices

The numbers match the OEIS sequence A181688, but it’s not clear whether this is a real match or a coincidence. The brute force search takes too long for more than 8 vertices. It’s probably better to count such graphs using their (folklore?) characterization as “trees planted on an odd cycle”, described by Chris Godsil on MathOverflow.

Some examples:

[Images: graph7_4, graph8_52, graph8_7]

The examples were found and plotted by this Python script.

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from itertools import combinations

def is_invertible(graph, n):
    # Build the n-by-n incidence matrix: row i has 1s at the two
    # endpoints of the i-th edge.
    matrix = []
    for i in range(n):
        row = [0]*n
        row[graph[i][0]] = 1
        row[graph[i][1]] = 1
        matrix.append(row)
    # The determinant is an integer, so 0.5 is a safe threshold
    # against floating-point error.
    return abs(np.linalg.det(matrix)) > 0.5

n = 6   # number of vertices
# all graphs with exactly n edges on n vertices
graphs = combinations(combinations(range(n), 2), n)
inv = []
for g in graphs:
    if is_invertible(g, n):
        gr = nx.Graph(g)
        if nx.is_connected(gr) and not any([nx.is_isomorphic(gr, gr2) for gr2 in inv]):
            inv.append(gr)
            nx.draw_networkx(gr)
            plt.savefig("graph{}_{}.png".format(n, len(inv)))
            plt.clf()

print("There are {} connected invertible graphs with {} vertices".format(len(inv), n))

f(f(x)) = 4x

There are plenty of continuous functions {f} such that {f(f(x)) \equiv x}. Besides the trivial examples {f(x)=x} and {f(x)=-x}, one can take any equation {F(x,y)=0} that is symmetric in {x,y} and has a unique solution for one variable in terms of the other. For example: {x^3+y^3-1 =0 } leads to {f(x) = (1-x^3)^{1/3}}.

[Plot: x^3+y^3 = 1]

I can’t think of an explicit example that is also differentiable, but implicitly one can be defined by {x^3+y^3+x+y=1}, for example. In principle, this can be made explicit by solving the cubic equation for {y}, but I’d rather not.
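The implicitly defined involution can still be evaluated numerically: given {x}, solve {y^3+y = 1-x-x^3} for its unique real root (the cubic is strictly increasing). A bisection sketch:

```python
def f(x):
    # unique real solution y of y^3 + y = 1 - x - x^3
    t = 1 - x - x**3
    lo, hi = -20.0, 20.0          # bracket; y^3 + y is increasing
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid**3 + mid < t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for x in [-1.5, 0.0, 0.3, 2.0]:
    assert abs(f(f(x)) - x) < 1e-9   # f is an involution
```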

[Plot: x^3+y^3+x+y = 1]

I don’t know of any diffeomorphism {f\colon \mathbb R \rightarrow \mathbb R} such that both {f} and {f^{-1}} have a nice explicit form.


Let’s change the problem to {f(f(x))=4x}. There are still two trivial, linear solutions: {f(x)=2x} and {f(x)=-2x}. Any others? The new equation imposes stronger constraints on {f}: for example, it implies

\displaystyle f(4x) = f(f(f(x))) = 4f(x)

But here is a reasonably simple nonlinear continuous example: define

\displaystyle f(x) = \begin{cases} 2^x,\quad & 1\le x\le 2 \\ 4\log_2 x,\quad &2\le x\le 4 \end{cases}

and extend to all {x} by {f(\pm 4x) = \pm 4f(x)}. The result looks like this, with the line {y=2x} drawn in red for comparison.

[Plot: f(f(x)) = 4x]

To check that this works, notice that {2^x} maps {[1,2]} to {[2,4]}, which the function {4\log_2 x} maps to {[4,8]}, and of course {4\log _2 2^x = 4x}.
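This identity on the core interval can also be confirmed numerically (a minimal sketch):

```python
import math

# on [1,2] we have f(x) = 2**x, landing in [2,4] where f is 4*log2
for i in range(11):
    x = 1 + i / 10
    assert abs(4 * math.log2(2**x) - 4 * x) < 1e-12   # f(f(x)) = 4x
print("f(f(x)) = 4x on [1,2]")
```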

From the plot, this function may appear to be differentiable for {x\ne 0}, but it is not. For example, at {x=2} the left derivative is {4\ln 2 \approx 2.8} while the right derivative is {2/\ln 2 \approx 2.9}.
This could be fixed by picking another building block instead of {2^x}, but not worth the effort. After all, the property {f(4x)=4f(x)} is inconsistent with differentiability at {0} as long as {f} is nonlinear.

The plots were made in Sage, with the function f defined thus:

import math

def f(x):
    if x == 0:
        return 0
    xa = abs(x)
    m = math.floor(math.log(xa, 2))
    if m % 2 == 0:
        # |x| in [4^j, 2*4^j): the (rescaled) 2^x piece
        return math.copysign(2**(m + xa/2**m), x)
    else:
        # |x| in [2*4^j, 4^(j+1)): the (rescaled) 4*log2(x) piece
        return math.copysign(2**(m+1) * (math.log(xa, 2)-m+1), x)