Condition number and maximal rotation angle

The condition number {K(A)} of an invertible matrix {A} is the product of norms {\|A\|\,\|A^{-1}\|}, or, in deciphered form, the maximum of the ratio {|Au|/|Av|} taken over all unit vectors {u,v}. It comes up a lot in numerical linear algebra and in optimization problems. One way to think of the condition number is in terms of the image of the unit ball under {A}, which is an ellipsoid. The ratio of longest and shortest axes of the ellipsoid is {K(A)}.

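For a concrete feel, here is a minimal numerical sketch in Python/NumPy; the matrix {A} below is an arbitrary example, nothing canonical. It checks the three descriptions of {K(A)} against each other: the product of norms, the ratio of the axes of the ellipse (the singular values), and the maximum of {|Au|/|Av|} over unit vectors.

```python
import numpy as np

# An arbitrary invertible matrix, used only as an illustration.
A = np.array([[3.0, 1.0],
              [0.5, 2.0]])

# Definition: product of the operator (spectral) norms of A and A^{-1}.
K_norms = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Ellipsoid description: ratio of the longest to the shortest semi-axis
# of the image of the unit circle, i.e. of the singular values of A.
s = np.linalg.svd(A, compute_uv=False)
K_axes = s.max() / s.min()

# Brute force: maximum of |Au|/|Av| over (many sampled) unit vectors u, v.
t = np.linspace(0, 2 * np.pi, 2000)
U = np.column_stack((np.cos(t), np.sin(t)))   # unit vectors in the plane
lengths = np.linalg.norm(U @ A.T, axis=1)     # |Au| for each sampled u
K_ratio = lengths.max() / lengths.min()       # approximate, by sampling

print(K_norms, K_axes, K_ratio, np.linalg.cond(A, 2))  # all (approximately) agree
```
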
But for positive definite matrices {A} the condition number can also be understood in terms of rotation. First, note that the positivity of {\langle Av,v\rangle } says precisely that the angle between {v} and {Av} is always less than {\pi/2}. Let {\gamma} be the maximum of such angles taken over all {v\ne 0} (or over all unit vectors, which is the same thing). Then

\displaystyle  K(A)=\frac{1+\sin\gamma }{1-\sin\gamma} \ \ \ \  \ \ \ \ \ (1)

One can also make (1) a little less geometric by introducing {\delta=\delta(A)} as the largest number such that {\langle Av,v\rangle \ge \delta|Av|\,|v|} holds for all vectors {v}. Then {\delta=\cos \gamma}, and (1) takes the form

\displaystyle  K=\frac{1+\sqrt{1-\delta^2}}{1-\sqrt{1-\delta^2}} =  \left(\frac{1+\sqrt{1-\delta^2}}{\delta}\right)^2  \ \ \ \ \ \ \ \ \ \ (2)

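Both identities are easy to test numerically. The sketch below (Python/NumPy; the positive definite matrix is an arbitrary example) estimates {\gamma} and {\delta} by sampling unit vectors on a fine grid, then compares the result with {K(A)} via (1) and (2).

```python
import numpy as np

# An arbitrary symmetric positive definite matrix (example only).
A = np.array([[2.0, 0.7],
              [0.7, 1.0]])

# Sample unit vectors v and measure the angle between v and Av.
t = np.linspace(0, 2 * np.pi, 100000)
V = np.column_stack((np.cos(t), np.sin(t)))
AV = V @ A.T                                                   # rows are Av
cosines = np.sum(V * AV, axis=1) / np.linalg.norm(AV, axis=1)  # <Av,v>/(|Av| |v|)

delta = cosines.min()       # delta(A): worst-case cosine, so delta = cos(gamma)
gamma = np.arccos(delta)    # maximal rotation angle

K = np.linalg.cond(A, 2)
print(K)
print((1 + np.sin(gamma)) / (1 - np.sin(gamma)))       # right-hand side of (1)
print(((1 + np.sqrt(1 - delta**2)) / delta) ** 2)      # right-hand side of (2)
```
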
Could this be of any use? The inequality

\displaystyle \langle Av,v\rangle \ge \delta\,|Av|\,|v| \ \ \ \ \ \ \ \ \ \ (3)

is preserved under addition of matrices: if {A} and {B} both satisfy (3), then {\langle (A+B)v,v\rangle \ge \delta(|Av|+|Bv|)\,|v| \ge \delta|(A+B)v|\,|v|} by the triangle inequality. Therefore, it is also preserved by integration. In particular, if the Hessian of a twice differentiable convex function {u} satisfies (3) at every point, then integration along the line segment between {x} and {y} yields

\displaystyle    \langle \nabla u(x)- \nabla u(y),x-y\rangle \ge \delta\,|\nabla u(x)-\nabla u(y)|\,|x-y|  \ \ \ \ \ \ \ \ \ \ (4)

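In more detail (a quick sketch; here {H} denotes the Hessian of {u}, a symbol not used elsewhere in this post): since

\displaystyle  \nabla u(x)-\nabla u(y)=\int_0^1 H(y+t(x-y))\,(x-y)\,dt,

taking the inner product with {x-y}, applying (3) with {v=x-y} at each point of the segment, and then using the triangle inequality for integrals gives

\displaystyle  \langle \nabla u(x)-\nabla u(y),x-y\rangle \ge \delta\,|x-y|\int_0^1 |H(y+t(x-y))\,(x-y)|\,dt \ge \delta\,|x-y|\,|\nabla u(x)-\nabla u(y)|.
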
Conversely, if {u} is a twice differentiable convex function such that (4) holds, then (by differentiation) its Hessian satisfies (3), and therefore admits a uniform bound on the condition number by virtue of (2). Thus, for such functions inequality (4) is equivalent to uniform boundedness of the condition number of the Hessian.

But the Hessian itself does not appear in (4). Condition (4) expresses “uniform boundedness of the condition number of the Hessian” without requiring {u} to be twice differentiable. As a simple example, take {u(x_1,x_2)=|x|^{4/3}}, where {|x|=\sqrt{x_1^2+x_2^2}}. Away from the origin, the Hessian matrix is

\displaystyle  \frac{4}{9}|x|^{-8/3} \begin{pmatrix} x_1^2+3x_2^2 & -2x_1x_2 \\ -2x_1x_2 & 3x_1^2+x_2^2 \end{pmatrix}

The eigenvalues are {\frac{4}{3}|x|^{-2/3}} and {\frac{4}{9}|x|^{-2/3}}. Thus, even though the eigenvalues blow up at the origin and decay at infinity, the condition number of the Hessian remains equal to {3}. Well, except that the second derivatives do not exist at the origin. But if we use the form (4) instead, with {\delta = \sqrt{3}/2}, then non-differentiability becomes a non-issue.

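To see this concretely, here is a small Python/NumPy check (a sketch only; the sampling scheme is ad hoc). It verifies that the condition number of the Hessian of {u(x)=|x|^{4/3}} is {3} at points of very different scales, and that inequality (4) holds with {\delta=\sqrt{3}/2} on random pairs of points, including pairs close to the origin.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_u(x):
    # gradient of u(x) = |x|^(4/3), namely (4/3)|x|^(-2/3) x  (and 0 at the origin)
    r = np.linalg.norm(x)
    return np.zeros_like(x) if r == 0 else (4/3) * r**(-2/3) * x

def hessian_u(x):
    # Hessian of u away from the origin, written out explicitly
    x1, x2 = x
    r = np.hypot(x1, x2)
    return (4/9) * r**(-8/3) * np.array([[x1**2 + 3*x2**2, -2*x1*x2],
                                         [-2*x1*x2, 3*x1**2 + x2**2]])

# 1) The condition number of the Hessian is 3 wherever the Hessian exists.
for _ in range(5):
    x = rng.normal(size=2) * rng.choice([1e-6, 1.0, 1e6])
    print(np.linalg.cond(hessian_u(x)))        # prints 3.0 up to rounding

# 2) Inequality (4) with delta = sqrt(3)/2, sampled at many scales.
delta = np.sqrt(3) / 2
for _ in range(10000):
    x = rng.normal(size=2) * rng.choice([1e-3, 1.0, 1e3])
    y = rng.normal(size=2) * rng.choice([1e-3, 1.0, 1e3])
    g = grad_u(x) - grad_u(y)
    lhs = np.dot(g, x - y)
    rhs = delta * np.linalg.norm(g) * np.linalg.norm(x - y)
    assert lhs >= rhs - 1e-12 * max(1.0, rhs)
print("inequality (4) held on all sampled pairs")
```
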
Let’s prove (2). It suffices to work in two dimensions, because both {K(A)} and {\delta(A)} are determined by the restrictions of {A} to two-dimensional subspaces. In two dimensions we can represent the linear map {A} as {z\mapsto \alpha z+\beta \bar z} for some complex numbers {\alpha,\beta}. Since {A} is symmetric positive definite, {\alpha} is real with {\alpha>|\beta|\ge 0}. As {z} runs through unimodular complex numbers, the maximum of {|\alpha z+\beta \bar z|} is {\alpha+|\beta|} and the minimum is {\alpha-|\beta|}. Therefore, {K(A)=\frac{\alpha+|\beta|}{\alpha-|\beta|}=\frac{1+|\beta|/\alpha}{1-|\beta|/\alpha}}.

When {|z|=1}, the angle that the vector {\alpha z+\beta \bar z} forms with {z} is equal to the argument of {\bar z (\alpha z+\beta \bar z)=\alpha+\beta \bar z^2}. As {z} varies, the point {\alpha+\beta \bar z^2} traces the circle of radius {|\beta|} centered at {\alpha}, so its argument is maximized when {0}, {\alpha}, {\alpha+\beta \bar z^2} form a right triangle with the right angle at {\alpha+\beta \bar z^2} and hypotenuse of length {\alpha}.

Proof by picture

Hence, {\sin\gamma = |\beta|/\alpha}. Combined with {K(A)=\frac{1+|\beta|/\alpha}{1-|\beta|/\alpha}}, this proves (1), and (2) follows by writing {\sin\gamma=\sqrt{1-\delta^2}}.

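The complex-variable computation can also be checked numerically. In the sketch below (Python/NumPy, with the same arbitrary example matrix as above), {\alpha} and {\beta} are read off from the entries of {A}; the formulas {\alpha=(a_{11}+a_{22})/2} and {\beta=(a_{11}-a_{22})/2+i\,a_{12}} are my unpacking of the representation {z\mapsto \alpha z+\beta \bar z}, not something stated in the post.

```python
import numpy as np

# The same arbitrary symmetric positive definite example as before.
A = np.array([[2.0, 0.7],
              [0.7, 1.0]])

# Identify A with z -> alpha*z + beta*conj(z), using
# A = [[alpha + Re(beta), Im(beta)], [Im(beta), alpha - Re(beta)]].
alpha = (A[0, 0] + A[1, 1]) / 2
beta = (A[0, 0] - A[1, 1]) / 2 + 1j * A[0, 1]

# Extreme values of |alpha*z + beta*conj(z)| over the unit circle ...
t = np.linspace(0, 2 * np.pi, 100000)
z = np.exp(1j * t)
w = alpha * z + beta * np.conj(z)              # w = Az in complex notation
print(np.abs(w).max(), alpha + abs(beta))      # both ~ alpha + |beta|
print(np.abs(w).min(), alpha - abs(beta))      # both ~ alpha - |beta|

# ... and the maximal angle between z and Az is arcsin(|beta|/alpha).
angles = np.abs(np.angle(np.conj(z) * w))
gamma = angles.max()
print(gamma, np.arcsin(abs(beta) / alpha))

# Identity (1) once more, now with gamma obtained from the proof's picture.
print(np.linalg.cond(A, 2), (1 + np.sin(gamma)) / (1 - np.sin(gamma)))
```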

There are similar observations in the literature on quasiconformality of monotone maps, including an inequality similar to (2) (for general matrices), but I have not seen either (1) or (2) stated as an identity for positive definite matrices.
