Inequalities from inequations

An inequality is a statement of the form $A\le B$ or $A < B$. (What's up with vertical alignment of formulas in WP?) An inequation is $A\ne B$. I can’t think of any adequate Russian word for “inequation”, but that’s besides the point. The literal analog “неравенство” is already used for “inequality”.

Suppose we want to prove that a map $f\colon \mathbb C \to \mathbb C$ is Lipschitz, that is, $\exists\ L$ such that $|f(z)-f(w)|\le L|z-w|$ for all $z,w\in\mathbb C$. All we know is that the map $f_{\lambda}(z):=f(x)+\lambda z$ is injective for all $\lambda\in \mathbb C$ with a large modulus, i.e., for $|\lambda|\ge C$. In the following we consider only such values of $\lambda$.

Fix distinct $z,w\in \mathbb C$ and record the inequation: $f(z)+\lambda z\ne f(w)+\lambda w$. Rearrange as $f(z)-f(w)\ne -\lambda (z-w)$. Note that we can multiply $\lambda$ by any unimodular complex number, since the inequation holds whenever $|\lambda|\ge C$. Thus, we have a stronger inequation: $|f(z)-f(w)|\ne |\lambda| |z-w|$. Yes, putting absolute values in an inequation makes it stronger, opposite to what happens with equations.

Still keeping $z$ and $w$ fixed, increase the modulus of $\lambda$ until the inequality $|f(z)-f(w)| < |\lambda| |z-w|$ holds. Continuity with respect to $\lambda$ tells us that $\ne$ was indeed $<$ all the way. So we bring $|\lambda|$ back down to $|\lambda|=C$, and happily record the inequality $|f(z)-f(w)| < C |z-w|$, the desired Lipschitz continuity.

Self-promotion: a paper in which the above trick was used.

Integer sets without singular matrices

Let’s say that a set $A\subset \mathbb N$ contains an $n\times n$ singular matrix if there exists an $n\times n$ matrix with determinant zero whose entries are distinct elements of $A$. For example, the set of prime numbers does not contain any $2\times 2$ singular matrices; however, for every $n\ge 3$ it contains infinitely many $n\times n$ singular matrices.

I don’t know of an elementary proof of the latter fact. By a 1939 theorem of van der Corput, the set of primes contains infinitely many progressions of length 3. (Much more recently, Green and Tao replaced 3 with an arbitrary $k$.) If every row of a matrix begins with a 3-term arithmetic progression, the matrix is singular.

In the opposite direction, one may want to see an example of an infinite set $A\subset \mathbb N$ that contains no singular matrices of any size. Here is one:
Continue reading “Integer sets without singular matrices”

Playing with numbers: 1/998001 and others

A post in response to Colin’s “Patterns in numbers”:

1/998001 = 0.000 001 002 003 004 005 006 007…

goes through all integers 000 through 999, skipping only 998. Maple confirms this:

This fraction made rounds on the internet a while ago.

I begin with two general claims:

• Integer numbers are more complicated than polynomials
• Decimal fractions are more complicated than power series

To illustrate the first one: multiplying 748 by 367 takes more effort than multiplying the corresponding polynomials,

$(7x^2+4x+8)(3x^2+6x+7) = 21x^4+54x^3+97x^2+76x+56$

The reason is that there’s no way for the product of $4 x$ and $3x^2$ to “roll over” into $x^4$. It’s going to stay in the $x^3$ group. Mathematically speaking, polynomials form a graded ring and integers don’t. At the same time, one can recover the integers from polynomials by setting $x=10$ in $21x^4+54x^3+97x^2+76x+56$. The result is $274516$.

Moving on to the second claim, consider the power series

$\displaystyle (*) \qquad \frac{1}{1-t}=1+t+t^2+t^3+\dots = \sum_{n=0}^{\infty} t^n$

I prefer to use $t$ instead of $x$ here, because we often want to replace $t$ with an expression in $x$. For example, setting $t=2x$ gives us

$\displaystyle \frac{1}{1-2x}=1+2x+4x^2+8x^3+\dots = \sum_{n=0}^{\infty} 2^n x^n$

which is nothing surprising. But if we now set $x=0.001$ (and, for neatness, divide both sides by 1000), the result is

$\displaystyle \frac{1}{998}=0.001\ 002\ 004\ 008\ 016\ 032 \dots$

which looks like a magical fraction producing powers of 2. But in reality, setting $x=0.001$ did nothing but mess things up. Now we have a complicated decimal number, in which powers of 2 break down starting with “513” because of the extra digits rolled over from 1024. In contrast, the neat power series keeps generating powers of 2 forever.

By the way, $\displaystyle \frac{1}{1-2x}$ is the generating function for the numbers 1,2,4,8,16…, i.e., the powers of 2.

So, if you want to cook up a ‘magical’ fraction, all you need to do is find the generating function for the numbers you want, and set the variable to be some negative power of 10. E.g., the choice $x=0.001$ avoids digits rolling over until the desired numbers reach 1000. But we could take $x=10^{-6}$ and get many more numbers at the cost of a more complicated fraction.

For example, how would one come up with 1/998001? We need a generating function for 1,2,3,4,…, that is, we need a formula for the power series $\sum nt^n$. No big deal: just take the derivative of (*):

$\displaystyle \frac{1}{(1-t)^2}=\sum_{n=0}^{\infty} nt^{n-1}$

and multiply both sides by $t$ to restore the exponent:

$\displaystyle (**) \qquad \frac{t}{(1-t)^2}=\sum_{n=0}^{\infty} nt^{n}$

Now set $t=0.001$, which is easiest if you expand the denominator as $1-2t+t^2$ and multiply both the numerator and denominator by $1000000$. The result is

$\displaystyle \frac{1000}{998001} = \sum_{n=0}^{\infty} n\,0.001^{n} = 0.001002003\dots$

Dropping 1000 in the numerator is a matter of taste (cf. xkcd 163).

Let’s cook up something else. For example, again take the derivative in (**) and multiply by $t$:

$\displaystyle \frac{t(1+t)}{(1-t)^3}=\sum_{n=0}^{\infty} n^2t^{n}$

Now set $t=0.001$ to get

$\displaystyle \frac{1001}{997002999} = 0.000\ 001\ 004\ 009\ 016\ 025\ 036\ 049\dots$

Admittedly, this fraction is less likely to propagate around the web than 1/998001.

For the last example, take the Fibonacci numbers 1,1,2,3,5,8,13,… The recurrence relation $F_{n+2}=F_n+F_{n+1}$ can be used to find the generating function, $\displaystyle \frac{1}{1-t-t^2} = \sum F_n t^n$. Setting $t=0.001$ yields

$\displaystyle \frac{1}{998999} = 0.000\ 001\ 002\ 003\ 005\ 008\ 013\ 021\ 034\ 055\dots$

Three weddings and a funeral

This is a continuation of the series on submetries. First, an example: the distance function to a closed convex set $K\subset \mathbb R^n$ is a submetry from $\mathbb R^n\setminus K$ onto $(0,\infty)$. Note that the best regularity we can expect from such distance function is $C^{1,1}$, the Lipschitzness of first derivatives. The second derivative does not exist at the points where level curves make the transition from straight to circular.

Now back to an attempt to introduce duality between immetries (=isometric embeddings) and submetries. Recall that the Lipschitz dual $X^\sharp$ of a pointed metric space $X\ni 0$ is the set of all Lipschitz functions $\varphi\colon X\to\mathbb R$ such that $f(0)=0$. Given the Lipschitz norm, $X^\sharp$ becomes a Banach space. Any closed ball $B_r(a)\subset X$ can be written as
(*) $B_r(a)=\lbrace x\in X\colon |\varphi(x)-\varphi(a)|\le r \text{ for all } \varphi\in X^\sharp \text{ such that }\|\varphi\|\le 1\rbrace$.
Indeed, $\subset$ is obvious, and the reverse inclusion follows by considering $\varphi(x)=d(x,a)-d(0,a)$.

It’s natural to use (*) to relate immetries to submetries, because the former are defined by $f^{-1}(B_r(f(x)))=B_r(x)$ and the latter by $f(B_r(x))=B_r(f(x))$.

In the previous post I considered four implications:

1. If $f$ is an immetry, then $f^\sharp$ is a submetry
2. If $f$ is a submetry, then $f^\sharp$ is an immetry
3. If $f^\sharp$ is an immetry, then $f$ is a submetry
4. If $f^\sharp$ is a submetry, then $f$ is an immetry

I proved (1) already. Implication (2) is also easy to prove: if $f\colon Y\to X$ is a submetry, then $\|\varphi\circ f\|=\|\varphi\|$ for every $\varphi\in X^\sharp$, because pairs of points that (almost) realize $\|\varphi\|$ can be lifted through $f$.

Here is a proof of (4). If $f^{\sharp}\colon X^\sharp\to Y^\sharp$ is a submetry, then $\lbrace \varphi\in X^\sharp \colon \|\varphi\|\le 1\rbrace = \lbrace \psi\circ f\colon \psi\in Y^\sharp, \|\psi\|\le 1\rbrace$. We can use this in (*) to obtain, for any $a\in X$ and $r\ge 0$,
$B_r(a)=\lbrace x\in X\colon |\psi(f(x))-\psi(f(a))|\le r \text{ for all } \psi\in Y^\sharp \text{ such that }\|\psi\|\le 1\rbrace$.
The latter set is nothing but $\lbrace x\in X\colon f(x)\in B_r(f(a))\rbrace$, which means $f$ is an immetry.

I already noted that (3) fails for general metric spaces, the inclusion $f\colon \mathbb Q\to \mathbb R$ being a counterexample. However, this counterexample is not impressive, because $f(B_r(x))$ is dense in $B_r(f(x))$. One could still hope that (3) holds for proper metric spaces. By definition, a metric space is proper if every closed ball is compact. (Or, equivalently, every bounded sequence has a convergent subsequence.) In metric geometry properness is a very common assumption which excludes incomplete spaces and infinite-dimensional spaces. Its relevance to submetries can be seen from their definition $f(B_r(x))=B_r(f(x))$. It requires the image $f(B_r(x))$ to be closed, which is not very likely to happen unless $B_r(x)$ is compact. (No such issues arise on the immetry side, because $f^{-1}(B_r(f(x)))$ is always closed.)

However, it turns out (3) is false even for compact spaces. The counterexample was already present on this blog:

Indeed, the Lipschitz norm of a function $\varphi$ defined on an interval is equal to the essential supremum of $|\varphi'|$. The composition with the metric quotient map given above preserves $\mathrm{ess\,sup}\, |\varphi'|$. Hence, the induced map $[0,2]^\sharp\to [0,3]^\sharp$ is an immetry.

Lipschitz duality: immetries and submetries

Continuation of an earlier post that considered submetries, namely the maps $f\colon Y\to X$ between metric spaces such that $f(B_r(y))=B_r(f(y))$ for all $y\in Y$ and all $r\ge 0$. (Here and in what follows $B_r$ is a closed ball.) The dual notion is an immetry: a map $f\colon X\to Y$ such that $f^{-1}(B_r(f(x)))=B_r(x)$ for all $x\in X$ and all $r\ge 0$. Immetries can be characterized by the condition $d(f(a),f(b))=d(a,b)$, which means they are nothing but isometric embeddings. I just made up the word to introduce symmetry between submetry and immetry. And then tried to look it up.

In what sense are these dual? Recall that reversal of arrows in a “categorical” definition of an immetry did not produce a submetry. Let’s try another approach. Let our metric spaces be pointed: they all contain the point $0$ which is fixed by all maps under consideration. The Lipschitz dual $X^\sharp$ of a metric space $X$ consists of all Lipschitz maps $\varphi\colon X\to \mathbb R$. This is naturally a vector space. Moreover, it is a Banach space with the norm $\displaystyle \|\varphi\|=\sup\frac{|\varphi(a)-\varphi(b)|}{d(a,b)}$ if we identify the functions that differ by a constant. A Lipschitz map $f\colon X\to Y$ induces a bounded linear operator $f^\sharp\colon Y^\sharp\to X^\sharp$. If $f$ is 1-Lipschitz, then so is $f^\sharp$ (i.e., its operator norm is at most 1).

It would be nice to have the following:

• $f$ is an immetry iff $f^\sharp$ is a submetry
• $f$ is a submetry iff $f^\sharp$ is an immetry

but that’s too much to hope for. For example, the inclusion of $\mathbb Q$ into $\mathbb R$ is not a submetry, but it induces the identity map $\mathbb R^\sharp\to\mathbb Q^\sharp$. Let’s go through these one by one.

Suppose $f\colon X\to Y$ is an immetry. Then we think of $X$ as a subset of $Y$, and $f^\sharp$ is simply the restriction operator. To prove that it’s a submetry, we should verify the 2-point lifting property: given any $\varphi,\psi\in X^\sharp$ and any $\Phi\in Y^\sharp$ that extends $\varphi$, we must find $\Psi\in Y^\sharp$ that extends $\psi$ and satisfies $\|\Psi-\Phi\|=\|\psi-\varphi\|$. This is easy: extend $\psi-\varphi$ in a norm-preserving way (by McShane-Whitney) and add $\Phi$.

I also wrote down an (easy) proof that $f$ being a submetry implies that $f^\sharp$ is an immetry, but WP ate it. Specifically, having pressed “Save Draft”, I was asked to re-login (the cookie expired). Having done so, I was presented with a 10 min old draft.

We already know that $f^\sharp$ being an immetry does not imply that $f$ is a submetry. Whether $f^\sharp$ being an submetry implies that $f$ is an immetry is left as an exercise for the reader.

Compactness of operators, and a lot of projections

Let $\mathcal{H}$ be an infinite-dimensional Hilbert space. Claim: an operator $T\colon\mathcal{H}\to\mathcal{H}$ is compact if and only if $Te_n\to 0$ for every orthonormal sequence $\lbrace e_n\rbrace$.
Continue reading “Compactness of operators, and a lot of projections”

Re: “Democracy on the high seas” by Colin Carroll

A slightly modified version of the riddle discussed in detail by Colin Carroll. I recap the key parts of his description below. Rule 4 was added by me in order to eliminate any probabilistic issues from the problem.

You are the captain of a pirate ship with a crew of N people ordered by rank. Your crew just managed to plunder 100 Pieces of Eight. Now you are to propose a division of the 100 PoE, and the crew will vote on the division. The captain doesn’t vote except to break a tie. If the proposal fails, the captain walks the plank, and the first mate becomes captain, the third in command becomes first mate, and so on. Each pirate votes according to the following ordered priorities:

1. They do not want to die.
2. They want to maximize their own profit.
3. They like to kill people, including crewmates.
4. They prefer to be on good terms with the higher ranked crewmates.

The question is: what division do you, as the captain, suggest?

More fractals: Laakso spaces and slit carpets

In early 2000s Tomi Laakso published two papers which demonstrated that metric spaces could behave in very Euclidean ways without looking Euclidean at all. One of his examples became known as the Laakso graph space, since it is the limit of a sequence of graphs. In fact, the best known version of the construction was introduced by Lang and Plaut, who modified Laakso’s original example to make it more symmetric. The building block is this graph with 6 edges:

Each edge is assigned length 1/4, so that the distance between the leftmost and rightmost points is 1. Next, replace each edge with a copy of the building block. This increases the number of edges by the factor of 6; their length goes down by the factor of 4 (so that the leftmost-rightmost distance is still 1). Repeat.

The resulting metric space is doubling: every ball can be covered by a fixed number (namely, 6) balls of half the size. This is the typical behavior of subsets of Euclidean spaces. Yet, the Laakso space does not admit a bi-Lipschitz embedding into any Euclidean space (in fact, even into any uniformly convex Banach space). It remains the simplest known example of a doubling space without such an embedding. Looking back at the building block, one recognizes the cycle $C_4$ as the source of non-embeddability (a single cycle forces a certain amount of distortion; adding cycles withing cycles ad infinitum forces infinite distortion). The extra edges on the left and right are necessary for the doubling condition.

In some sense, the Laakso space is the antipode of the Cantor set: instead of deleting the middle part repeatedly, we duplicate it. But it’s also possible to construct it with a ‘removal’ process very similar to Cantor’s. Begin with the square and slit it horizontally in the center; let the length of the slit be 1/3 of the sidelength. Then repeat as with the Cantor set, except in addition to cutting left and right of the previous cut, we also do it up and down. Like this:

Our metric space if the square minus the slits, equipped with the path metric: the distance between two points is the infimum of the length of curves connecting them within the space. Thus, the slits seriously affect the metric. This is how the set will look after a few more iterations:

I called this a slit pseudo-carpet because it has nonempty interior, unlike true Sierpinski carpets. To better see the similarity with the Laakso space, multiply the vertical coordinate by $\epsilon$ and let $\epsilon\to 0$ (equivalently, redefine the length of curves as $\int |x'|$ instead of $\int \sqrt{|x'|^2+|y'|^2}$). This collapses all vertical segments remaining in the set, leaving us with a version of the Laakso graph space.

Finally, some Scilab code. The Laakso graph was plotted using the Chaos game, calling the function below with parameters laakso(0.7,50000).

function laakso(angle,steps) s=1/(2+2*cos(angle)); xoffset = [0,1-s,s,s,1/2,1/2]; yoffset = [0,0,0,0,s*sin(angle),-s*sin(angle)]; rotation =[0,0,angle,-angle,-angle,angle]; sc=s*cos(rotation); ss=s*sin(rotation); point = zeros(steps,2); vert = grand(1,steps,'uin',1,6); for j = 2:steps point(j,:) = point(j-1,:)*[sc(vert(j)),ss(vert(j));-ss(vert(j)),sc(vert(j))] + [xoffset(vert(j)), yoffset(vert(j))]; end plot(point(:,1),point(:,2),'linestyle','none','markstyle','.','marksize',1); endfunction

Apropos of et al

I have 111 MathSciNet reviews posted, and there are three more articles on my desk that I should be reviewing instead of blogging. Even though I think of canceling my AMS membership, I don’t mind helping the society pay their bills (MathSciNet brings about 37% of the AMS revenue, according to their 2010-11 report.)

Sure, reviews need to be edited, especially when written by non-native English speakers like myself. Still, I’m unhappy with the edited version of my recent review:

This was the approach taken in the foundational paper by J. Heinonen et al. [J. Anal. Math. 85 (2001), 87-139]

The paper was written by J. Heinonen, P. Koskela, N. Shanmugalingam, and J. T. Tyson. Yes, it’s four names. Yes, the 14-letter name is not easy to pronounce without practice. But does the saving of 45 bytes justify omitting the names of people who spent many months, if not years, working on the paper? Absolutely not. The tradition of using “et al” for papers with more than 3 authors belongs to the age of typewriters.

P.S. I don’t think MathSciNet editors read my blog, so I emailed them.

P.P.S. The names are now restored. In the future I’ll be sure to add in “comments to the editor” that names should not be replaced by et al.

This ain’t like dusting crops, boy

The hyperspace is a set of sets equipped with a metric or at least with a topology. Given a metric space $X$, let $\mathcal{H}(X)$ be the set of all nonempty closed subsets of $X$ with the Hausdorff metric: $d(A,B) if no matter where you are in one set, you can jump into the other by traveling less than $r$. So, the distance between letters S and U is the length of the longer green arrow.

The requirement of closedness ensures $d(A,B)>0$ for $A\ne B$. If $X$ is unbounded, then $d(A,B)$ will be infinite for some pairs of sets, which is natural: the hyperspace contains infinitely many parallel universes which do not interact, being at infinite distance from one another.

Every continuous surjection $f\colon X\to Y$ has an inverse $f^{-1}\colon Y\to \mathcal{H}(X)$ defined in the obvious way: $f^{-1}(y)=f^{-1}(y)$. Yay ambiguous notation! The subset of $\mathcal{H}(X)$ that consists of the singletons is naturally identified with $X$, so for bijective maps we recover the usual inverse.

Exercise: what conditions on $f$ guarantee that $f^{-1}$ is (a) continuous; (b) Lipschitz? After the previous post it should not be surprising that

• Even if $f$ is open and continuous, $f^{-1}$ may be discontinuous.
• If $f$ is a Lipschitz quotient, then $f^{-1}$ is Lipschitz.

Proofs are not like dusting crops—they are easier.