## Orthogonality in normed spaces

For a vector ${x}$ in a normed space ${X}$, define the orthogonal complement ${x^\perp}$ to be the set of all vectors ${y}$ such that ${\|x+ty\|\ge \|x\|}$ for all scalars ${t}$. In an inner product space (real or complex), this agrees with the usual definition of orthogonality. If ${\langle x, y\rangle=0}$, then ${\|x+ty\|^2 = \|x\|^2+|t|^2\|y\|^2\ge \|x\|^2}$. Conversely, ${\|x+ty\|^2 - \|x\|^2 = 2\,\mathrm{Re}\,\langle x, ty\rangle + o(t)}$ as ${t\to 0}$, and the right-hand side can be nonnegative for all small ${t}$ only if ${\langle x, y\rangle=0}$.

Let’s see which properties of the orthogonal complement survive in a general normed space. For one thing, ${x^\perp=X}$ if and only if ${x=0}$. Another trivial property is that ${0\in x^\perp}$ for all ${x}$. More importantly, ${x^\perp}$ is a closed set that contains some nonzero vectors (provided ${\dim X\ge 2}$).

• Closed because the complement is open: if ${\|x+ty\| < \|x\|}$ for some ${t}$, the same inequality holds with that ${t}$ for all vectors sufficiently close to ${y}$.
• Contains a nonzero vector because the Hahn-Banach theorem provides a norming functional for ${x}$, i.e., a unit-norm linear functional ${f\in X^*}$ such that ${f(x)=\|x\|}$. Any ${y\in \ker f}$ is orthogonal to ${x}$, because ${\|x+ty\|\ge |f(x+ty)| = f(x) = \|x\|}$.

In general, ${x^\perp}$ is not a linear subspace; it need not even have empty interior. For example, consider the orthogonal complement of the first basis vector ${(1,0)}$ in the plane with the ${\ell_1}$ (taxicab) norm: it is $\{(a, b)\colon |b|\ge |a|\}$.

This example also shows that orthogonality is not symmetric in general normed spaces: ${(1,1)\in (1,0)^\perp}$ but ${(1,0)\notin (1,1)^\perp}$. This is why I avoid the notation ${y \perp x}$ here.
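The taxicab example is easy to probe numerically. The sketch below is only a grid test over real ${t}$ in a bounded range, not a proof; the helper names `lp_norm` and `in_complement` are made up for this illustration. A bounded range is enough in principle, since ${\|x+ty\|\ge |t|\|y\|-\|x\|}$ takes care of large ${|t|}$, but a grid can still miss a violation that occurs only on a very short interval.

```python
def lp_norm(v, p):
    """l_p norm of a finite sequence; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def in_complement(x, y, p=1, t_max=10.0, steps=20000, tol=1e-12):
    """Grid check of ||x + t*y||_p >= ||x||_p for real t in [-t_max, t_max].

    Returns True if y passes the test on every grid point, i.e. y looks like
    an element of the orthogonal complement of x in the l_p norm.
    """
    base = lp_norm(x, p)
    for k in range(steps + 1):
        t = -t_max + 2.0 * t_max * k / steps
        shifted = [a + t * b for a, b in zip(x, y)]
        if lp_norm(shifted, p) < base - tol:
            return False
    return True


# Asymmetry in the l_1 plane: (1,1) passes for x = (1,0), but not vice versa.
print(in_complement((1, 0), (1, 1)))
print(in_complement((1, 1), (1, 0)))
# In the Euclidean norm the test reduces to the usual orthogonality.
print(in_complement((1, 0), (0, 1), p=2))
print(in_complement((1, 0), (1, 1), p=2))
```

The tolerance `tol` only guards against rounding when the norm is exactly constant along part of the line, as happens for ${x=(1,0)}$, ${y=(1,1)}$ in ${\ell_1}$.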

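In fact, ${x^\perp}$ is the union of the kernels of all norming functionals of ${x}$, so it is a linear subspace only when the norming functional is unique. Containment in one direction was already proved. Conversely, suppose ${y\in x^\perp}$ with ${x}$ and ${y}$ nonzero; they are linearly independent, since no nonzero multiple of ${x}$ lies in ${x^\perp}$. Define a linear functional ${f}$ on the span of ${x,y}$ so that ${f(ax+by) = a\|x\|}$. Because ${y\in x^\perp}$, we have ${|f(ax+by)| = |a|\,\|x\|\le |a|\,\|x+(b/a)y\| = \|ax+by\|}$ for ${a\ne 0}$, with equality when ${b=0}$, so ${f}$ has norm 1. Its Hahn-Banach extension is a norming functional for ${x}$ that vanishes on ${y}$.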
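For the first basis vector in the ${\ell_1}$ plane this description can be checked by hand: the dual norm is the max norm, so the norming functionals of ${(1,0)}$ are exactly ${f_s(a,b)=a+sb}$ with ${|s|\le 1}$. The sketch below compares "some ${f_s}$ vanishes on ${y}$" with a direct grid test of the definition on a few sample vectors. The function names and the sample list are illustrative choices, and the grid test is a numerical heuristic rather than a proof.

```python
def norming_slope(y):
    """For x = (1, 0) in the l_1 plane, return s in [-1, 1] with
    f_s(y) = a + s*b = 0, or None if no norming functional vanishes on y."""
    a, b = y
    if b == 0:
        return 0.0 if a == 0 else None
    s = -a / b
    return s if abs(s) <= 1 else None


def in_complement_of_e1(y, t_max=10.0, steps=20000, tol=1e-12):
    """Grid check of ||(1,0) + t*y||_1 >= 1 for real t in [-t_max, t_max]."""
    a, b = y
    for k in range(steps + 1):
        t = -t_max + 2.0 * t_max * k / steps
        if abs(1 + t * a) + abs(t * b) < 1 - tol:
            return False
    return True


samples = [(1, 1), (-1, 2), (0, 1), (1, 0.5), (2, -1), (1, 0), (0, 0)]
# The two descriptions of the complement should agree on every sample.
agree = all(
    (norming_slope(y) is not None) == in_complement_of_e1(y) for y in samples
)
print(agree)
```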

Consider ${X=L^p[0,1]}$ as an example. A function ${f}$ satisfies ${1\in f^\perp}$ precisely when its ${L^p}$ norm is minimal among all translates ${f+c}$ by constants ${c}$. This means, by definition, that its “${L^p}$-estimator” is zero. In the special cases ${p=1,2,\infty}$ the ${L^p}$-estimator is known as the median, mean, and midrange, respectively. Increasing ${p}$ gives more influence to outliers, so ${1\le p\le 2}$ is the more useful range for it.
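To see the three special cases side by side, here is a minimal sketch that computes the ${L^p}$-estimator of a finite sample, which can be thought of as a step function on ${[0,1]}$ with pieces of equal length. The function name `lp_estimator` and the sample are assumptions made for illustration. Since the cost is convex in ${c}$ and its minimizer lies between the smallest and largest value, a plain ternary search is enough here; it is chosen for simplicity, not efficiency.

```python
def lp_estimator(values, p, iters=200):
    """Approximate a minimizer of c -> ||values - c||_p by ternary search.

    For finite p the p-th power of the norm is minimized, which has the
    same minimizers; p = float('inf') uses the max norm.
    """
    def cost(c):
        devs = [abs(v - c) for v in values]
        if p == float("inf"):
            return max(devs)
        return sum(d ** p for d in devs)

    lo, hi = min(values), max(values)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        # Convexity: a minimizer always remains in the interval that is kept.
        if cost(m1) < cost(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


sample = [0, 1, 2, 3, 10]              # one outlier at 10
print(lp_estimator(sample, 1))             # close to the median, 2
print(lp_estimator(sample, 2))             # close to the mean, 3.2
print(lp_estimator(sample, float("inf")))  # close to the midrange, 5
```

The outlier at 10 does not move the median at all, pulls the mean to 3.2, and determines the midrange entirely, which matches the remark about larger ${p}$ giving outliers more influence.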