Convex up to a reparametrization

Given a function {f\colon \mathbb R\rightarrow \mathbb R}, we have several ways of checking if it is convex. But how to recognize the functions that are convex up to a reparametrization, meaning that there is a homeomorphism {h} of the real line such that the composition {g = f\circ h} is convex?

Let’s start by listing some necessary conditions for {f} to have this property.

1. Since convex functions are continuous, {f} must be continuous.

2. Convex functions are quasiconvex, meaning that each sublevel set {U_t = \{x\colon g(x) < t\}} is convex (in this case, an open interval). This property is preserved under reparametrization. So, {f} must be quasiconvex as well.

3. Nonconstant convex functions are unbounded above. So, {f} must have this property too.

Is this all? No. For example, let {f(x) = -x} if {x< 0} and {f(x)=\arctan x} if {x\ge 0}. This is a continuous quasiconvex function that is unbounded above. But it can’t be made convex by a reparametrization, because as {x\rightarrow +\infty}, it increases while staying bounded. This leads to another condition:

4. If {f} is not monotone, then its limits as {x\rightarrow+\infty} and {x\rightarrow -\infty} must both be {+\infty}.

And still we are not done. The function

{f(x) = \begin{cases} x+1, \quad &x<-1 \\ 0, \quad & -1 \le x\le 1 \\ x-1, \quad & x>1 \end{cases} }

satisfies all of 1–4. But it has a flat spot (the interval {[-1,1]} where it is constant) which does not achieve its minimum. This can’t happen to convex functions: if a convex function is constant on an interval, then (by virtue of being above its tangent lines) it has to attain its minimum there. This leads to the 5th condition:

5. {f} is either strictly monotone, or attains its minimum on some closed interval {I_{\min}(f)} (possibly unbounded or a single point). In the latter case, it is strictly monotone on each component of {\mathbb R\setminus I_{\min}(f)}.

Finally, we have enough: together, 1–5 is the necessary and sufficient condition for {f} to be convex up to a reparametrization. The proof of sufficiency turns out to be easy. A continuous strictly monotone function is a homeomorphism between its domain and range. The strictly monotone (and unbounded) parts of {f} give us homeomorphisms such that the composition of {f} with them is affine. Consequently, there is a homeomorphism {h} of the real line such that {f\circ h} is one of the following five functions:

  1. {g(x) \equiv 1} if {I_{\min}(f) = \mathbb R}
  2. {g(x) = \max(x,0)} if {I_{\min}(f)} is unbounded, but not all of {\mathbb R}
  3. {g(x) = \max(|x|-1, 0)} if {I_{\min}(f)} is a bounded nondegenerate interval
  4. {g(x) =|x|} if {I_{\min}(f)} is a single point
  5. {g(x) =x} if {I_{\min}(f)} is empty

Incidentally, we also see there are five equivalence classes of convex functions on {\mathbb R}, the equivalence being any homeomorphic change of variables.

What functions on {\mathbb R^n} are convex up to reparametrization? Unclear. (Related Math SE question)


Condition number and maximal rotation angle

The condition number {K(A)} of an invertible matrix {A} is the product of norms {\|A\|\,\|A^{-1}\|}, or, in deciphered form, the maximum of the ratio {|Au|/|Av|} taken over all unit vectors {u,v}. It comes up a lot in numerical linear algebra and in optimization problems. One way to think of the condition number is in terms of the image of the unit ball under {A}, which is an ellipsoid. The ratio of longest and shortest axes of the ellipsoid is {K(A)}.

But for positive definite matrices {A} the condition number can also be understood in terms of rotation. First, note that the positivity of {\langle Av,v\rangle } says precisely that that the angle between {v} and {Av} is always less than {\pi/2}. Let {\gamma} be the maximum of such angles taken over all {v\ne 0} (or over all unit vectors, the same thing). Then

\displaystyle  K(A)=\frac{1+\sin\gamma }{1-\sin\gamma} \ \ \ \  \ \ \ \ \ (1)

One can also make (1) a little less geometric by introducing {\delta=\delta(A)} as the largest number such that {\langle Av,v\rangle \ge \delta|Av|\,|v|} holds for all vectors {v}. Then {\delta=\cos \gamma}, and (1) takes the form

\displaystyle  K=\frac{1+\sqrt{1-\delta^2}}{1-\sqrt{1-\delta^2}} =  \left(\frac{1+\sqrt{1-\delta^2}}{\delta}\right)^2  \ \ \ \ \ \ \ \ \ \ (2)

Could this be of any use? The inequality

\displaystyle \langle Av,v\rangle \ge \delta\,|Av|\,|v| \ \ \ \ \ \ \ \ \ \ (3)

is obviously preserved under addition of matrices. Therefore, it is preserved by integration. In particular, if the Hessian of a twice differentiable convex function {u} satisfies (3) at every point, then integration along a line segment from {x} to {y} yields

\displaystyle    \langle \nabla u(x)- \nabla u(y),x-y\rangle \ge \delta\,|\nabla u(x)-\nabla u(y)|\,|x-y|  \ \ \ \ \ \ \ \ \ \ (4)

Conversely, if {u} is a twice differentiable convex function such that (4) holds, then (by differentiation) its Hessian satisfies (3), and therefore admits a uniform bound on the condition number by virtue of (2). Thus, for such functions inequality (4) is equivalent to uniform boundedness of the condition number of the Hessian.

But the Hessian itself does not appear in (4). Condition (4) expresses “uniform boundedness of the condition number of the Hessian” without requiring {u} to be twice differentiable. As a simple example, take {u(x_1,x_2)=|x|^{4/3}}. The Hessian matrix is

\displaystyle  \frac{4}{9}|x|^{-4/3} \begin{pmatrix} x_1^2+3x_2^2 & -2x_1x_2 \\ -2x_1x_2 & 3x_1^2+x_2^2 \end{pmatrix}

The eigenvalues are {\frac{4}{3}|x|^{-1/3}} and {\frac{4}{9}|x|^{-1/3}}. Thus, even though the eigenvalues blow up at the origin and decay at infinity, the condition number of the Hessian remains equal to {3}. Well, except that the second derivatives do not exist at the origin. But if we use the form (4) instead, with {\delta = \sqrt{3}/2}, then non-differentiability becomes a non-issue.

Let’s prove (2). It suffices to work in two dimensions, because both {K(A)} and {\delta(A)} are determined by the restrictions of {A} to two-dimensional subspaces. In two dimensions we can represent the linear map {A} as {z\mapsto \alpha z+\beta \bar z} for some complex numbers {\alpha,\beta}. Actually, {\alpha} is real and positive because {A} is symmetric positive definite. As {z} runs through unimodular complex numbers, the maximum of {|\alpha z+\beta \bar z|} is {\alpha+|\beta|} and the minimum is {\alpha-|\beta|}. Therefore, {K(A)=\frac{1+|\beta|/\alpha}{1-|\beta|/\alpha}}.

When {|z|=1}, the angle {\gamma} that the vector {\alpha z+\beta \bar z} forms with {z} is equal to the argument of {\bar z (\alpha z+\beta \bar z)=\alpha+\beta \bar z^2}. The latter is maximized when {0, \alpha, \alpha+\beta \bar z^2} form a right triangle with hypotenuse {\alpha}.

Proof by picture
Proof by picture

Hence, {\sin\gamma = |\beta|/\alpha}. This proves {K(A)=\frac{1+|\beta|/\alpha}{1-|\beta|/\alpha}}, and (2) follows.

There are similar observations in the literature on quasiconformality of monotone maps, including an inequality similar to (2) (for general matrices), but I have not seen either (1) or (2) stated as an identity for positive definite matrices.