The Walsh basis

If you ask a random passerby to give you an orthonormal basis for L^2[0,1], they will probably respond with e_n(t)=\exp(2\pi i nt), n\in \mathbb Z. There is a lot to like about this exponential basis: most importantly, it diagonalizes the \frac{d}{dt} operator: \frac{d}{dt}e_n=2\pi i n e_n. This property makes the exponential basis indispensable in the study of differential equations. However, I prefer to describe the Walsh basis, which has several advantages:

  • the basis functions take just two values \pm 1, which simplifies the computation of coefficients
  • the proof of the basis property is easier than for the exponential basis
  • there is a strong connection to probability: the Walsh expansion can be seen as conditional expectation, and the partial sums form a Doob martingale
  • partial sums converge a.e. for any L^1 function, which is not the case for the exponential basis.

First, introduce the Rademacher functions r_n=\mathrm{sign}\, \sin (2^{n+1} \pi t), n=0,1,2,\dots (The enumeration is slightly different from what I used in class.) These are r_0,r_1,r_2,r_3:

Rademacher functions

Alternatively, one can define r_n as the function which takes the values +1,-1 alternately, starting with +1, on the dyadic intervals \displaystyle \bigg[\frac{j}{2^{n+1}},\frac{j+1}{2^{n+1}}\bigg), j=0,1,\dots,2^{n+1}-1.

To define the nth Walsh function W_n, express the index n as a sum of distinct powers of 2, i.e., n=2^{p_1}+2^{p_2}+\dots with p_1>p_2>\dots, and let W_n=r_{p_1}r_{p_2}\dots . For example, W_{13}=r_3r_2r_0 because 13=2^3+2^2+2^0. Since the binary representation is unique, the definition makes sense. We also have W_0=1 because the product of an empty set of numbers is 1.
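The definition translates directly into a few lines of code. Here is a minimal Python sketch (the function names rademacher and walsh are mine, chosen for illustration). It uses the dyadic-interval description of r_n, so it returns \pm 1 at the jump points, where \mathrm{sign}\,\sin would give 0.

```python
def rademacher(n, t):
    """r_n(t) for 0 <= t < 1: +1 on even-numbered dyadic intervals
    of length 2^-(n+1), and -1 on odd-numbered ones."""
    j = int(t * 2 ** (n + 1))  # index of the dyadic interval containing t
    return 1 if j % 2 == 0 else -1


def walsh(n, t):
    """W_n(t): the product of r_p(t) over the positions p of the 1-bits of n."""
    value = 1
    p = 0
    while n > 0:
        if n & 1:  # 2^p appears in the binary expansion of n
            value *= rademacher(p, t)
        n >>= 1
        p += 1
    return value
```

For instance, walsh(13, t) multiplies rademacher(3, t), rademacher(2, t) and rademacher(0, t), and walsh(0, t) returns 1 because the loop body never runs.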

In class I checked that the set \lbrace W_n\colon n=0,1,2,\dots\rbrace is orthonormal. Also, for any integer k\ge 0 the linear span of \lbrace W_n\colon 0\le n< 2^k \rbrace is the space V_k of all functions that are constant on the dyadic intervals of length 2^{-k}. This follows by observing that \lbrace W_n\colon 0\le n< 2^k \rbrace\subset V_k, that these 2^k functions are linearly independent (being orthonormal), and that the dimension of V_k is 2^k.
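Orthonormality of the first 2^k Walsh functions can also be confirmed by machine. The sketch below is in Python, with helper names of my own. It relies on the fact that W_n with n<2^k is constant on dyadic intervals of length 2^{-k}, so the integral of W_mW_n is exactly the average of the product over the midpoints of those intervals. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction


def walsh(n, t):
    """W_n(t) via the dyadic-interval description of the Rademacher functions."""
    value, p = 1, 0
    while n >> p:
        # bit p of n is set and r_p(t) = -1: flip the sign
        if (n >> p) & 1 and int(t * 2 ** (p + 1)) % 2 == 1:
            value = -value
        p += 1
    return value


def inner_product(m, n, k):
    """Exact integral of W_m * W_n over [0,1], valid when m, n < 2^k."""
    size = 2 ** k
    midpoints = [Fraction(2 * j + 1, 2 * size) for j in range(size)]
    return Fraction(sum(walsh(m, t) * walsh(n, t) for t in midpoints), size)


def gram_matrix(k):
    """Matrix of inner products of W_0, ..., W_{2^k - 1}."""
    size = 2 ** k
    return [[inner_product(m, n, k) for n in range(size)] for m in range(size)]
```

If the set is orthonormal, gram_matrix(k) is the identity matrix of size 2^k.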

To prove that the Walsh basis is indeed a basis, suppose that h\in L^2[0,1] is orthogonal to all W_n. Since h\perp V_k for all k, the integral of h over any dyadic interval is zero (note that the characteristic function of any dyadic interval belongs to some V_k). But any subinterval I\subset [0,1] can be written as a disjoint countable union of dyadic intervals: just take the maximal dyadic intervals that are contained in I. (Any two dyadic intervals are either nested or disjoint, so the maximal ones are disjoint. You don't necessarily get the right type of endpoints, but as long as we work with integrals, the difference between open and closed intervals is immaterial.) Thus, the integral of h over any subinterval of [0,1] is zero. By the Lebesgue differentiation theorem, for a.e. t we have \displaystyle h(t)=\lim_{\delta\to 0}\frac{1}{2\delta}\int_{t-\delta}^{t+\delta} h =0. Thus h=0 as required.

The proof is even simpler if we use the non-centered form of the Lebesgue differentiation theorem: for a.e. t the average \frac{1}{b-a}\int_a^b h approaches h(t) as a,b\to t in such a way that a\le t\le b. Armed with this theorem, we can consider the sequence of dyadic intervals containing t, and immediately obtain h(t)=0 a.e.

Having proved that \lbrace W_n\rbrace is a basis, let’s expand something in it. For example, this moderately ugly function f:

Ugly function

I used Maple to compute the coefficients c_n=\langle f, W_n\rangle and plotted the partial sums \sum_{n=0}^N c_n W_n for N=1,3,7,15:

Partial sums

Such partial sums (those that use 2^k basis functions) are particularly nice: they are obtained simply by averaging f over each dyadic interval of length 2^{-k}. In probability theory this is known as conditional expectation. The conditional expectation is a contraction in any L^p space, including L^1 which gives so much trouble to the exponential basis. The highly oscillatory parts of f are killed by the dyadic averaging; in contrast, when integrated against the exponentials, they may cleverly accumulate and destroy the convergence of partial sums.
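The claim that a partial sum with 2^k terms is the same as dyadic averaging can be tested on a step function. In the Python sketch below (not the Maple computation mentioned above; all names are mine), f is assumed to be constant on the dyadic intervals of length 2^{-m} and is stored as a list of its 2^m values. The coefficients c_n are then exact finite sums.

```python
from fractions import Fraction


def walsh_on_grid(n, m):
    """Values of W_n on the 2^m dyadic intervals of length 2^-m (needs n < 2^m)."""
    values = []
    for j in range(2 ** m):
        sign = 1
        for p in range(m):
            # r_p = -1 on interval j exactly when floor(j / 2^(m-p-1)) is odd
            if (n >> p) & 1 and (j >> (m - p - 1)) & 1:
                sign = -sign
        values.append(sign)
    return values


def partial_sum(f, k):
    """Sum of c_n W_n over 0 <= n < 2^k, where c_n = <f, W_n>."""
    size = len(f)
    m = size.bit_length() - 1  # len(f) is assumed to be 2^m with m >= k
    result = [Fraction(0)] * size
    for n in range(2 ** k):
        w = walsh_on_grid(n, m)
        c = Fraction(sum(fj * wj for fj, wj in zip(f, w)), size)
        result = [r + c * wj for r, wj in zip(result, w)]
    return result


def dyadic_average(f, k):
    """Replace f by its average over each dyadic interval of length 2^-k."""
    block = len(f) // 2 ** k
    out = []
    for start in range(0, len(f), block):
        out.extend([Fraction(sum(f[start:start + block]), block)] * block)
    return out
```

With f = [3, 1, 4, 1, 5, 9, 2, 6], the lists partial_sum(f, k) and dyadic_average(f, k) coincide for k = 0, 1, 2, 3.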

Playing with iterated function systems

After Colin Carroll posted several fractal experiments with Matlab, I decided to do something of the sort. One difference is that I use Scilab, an open-source alternative to Matlab.

The first experiment: drawing the Sierpinski carpet using the Chaos Game. Namely, given a finite family of strict contractions f_1,\dots,f_r\colon \mathbb R^2\to \mathbb R^2 and an initial point p_0, plot the sequence p_{n+1}=f_{j_n}(p_n), where j_n \in \{1,\dots,r\} is chosen randomly at each step. To simplify matters, let f_j be the similarity transformation with scaling factor s\in (0,1) and the fixed point v_j; that is, f_j(p)=sp+(1-s)v_j.

A canonical example is: v_1,v_2,v_3 are the vertices of an equilateral triangle, s=1/2. This produces the fractal known as the Sierpinski gasket. For a different example, set s=1/3 and let v_1,\dots,v_8 be the vertices of a square together with the midpoints of its sides. The resulting fractal is known as the Sierpinski carpet.

Sierpinski Carpet

This image was obtained by calling the Scilab function given below as Scarpet(1/3, 100000). The function is essentially a translation of Colin’s code to Scilab. Caution: if you copy and paste this code, watch out for line breaks and encoding of quote marks.

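function Scarpet(scale,steps)
    // Chaos Game: each step applies a randomly chosen similarity to the current point
    b=1-scale;
    // vertices of the square [-1,1]^2 together with the midpoints of its sides
    x = [1,0,-1,-1,-1,0,1,1];
    y = [1,1,1,0,-1,-1,-1,0];
    sides=length(x);
    point = zeros(steps,2);   // the orbit; it starts at the origin
    vert = grand(1,steps,'uin',1,sides);   // random indices, uniform on 1..sides
    for j = 2:steps
        // similarity with ratio scale and fixed point (x(k),y(k)), where k=vert(j)
        point(j,:) = scale*point(j-1,:) + b*[x(vert(j)),y(vert(j))];
    end
    plot(point(:,1),point(:,2),'linestyle','none','markstyle','.','marksize',1);
endfunction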

Regardless of the choice of initial point p_0, the set of cluster points of the sequence (p_n) is exactly the invariant set K, namely the unique nonempty compact set such that K=\bigcup_{j=1}^r f_j(K). This is proved, for example, in the book Integral, Probability, and Fractal Measures by Gerald Edgar.

The scaling factor s=1/3 for the carpet is chosen so that the images of the original square under the eight similarities touch, but do not overlap. With a smaller factor the fractal looks like dust (a totally disconnected set), while with s\ge 1/2 it becomes a solid square. The intermediate range 1/3<s<1/2 is tricky: I think that K has measure zero, but can’t even prove that it’s nowhere dense.
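The touching-without-overlapping claim is easy to check by hand, and also by machine. Below is a small Python sketch under the same setup as the carpet above: the square is [-1,1]^2, so the image of the square under the similarity with fixed point v is the square with center (1-s)v and side 2s. The function names are mine.

```python
from fractions import Fraction
from itertools import combinations

# vertices of the square [-1,1]^2 and midpoints of its sides, listed around the boundary
VERTICES = [(1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0)]


def image_centers(s):
    """Centers of the eight image squares; each has side 2s."""
    return [((1 - s) * x, (1 - s) * y) for x, y in VERTICES]


def interiors_overlap(s):
    """True if some two image squares have overlapping interiors."""
    return any(
        abs(a[0] - b[0]) < 2 * s and abs(a[1] - b[1]) < 2 * s
        for a, b in combinations(image_centers(s), 2)
    )


def neighbors_touch(s):
    """True if consecutive image squares around the ring share at least a boundary point."""
    centers = image_centers(s)
    pairs = zip(centers, centers[1:] + centers[:1])
    return all(max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= 2 * s for a, b in pairs)
```

With s = Fraction(1, 3) the neighbors touch and no interiors overlap. With s = Fraction(1, 4) the squares are separated, and with s = Fraction(2, 5) they overlap.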

It’s also possible to draw K in the opposite way, by removing points rather than adding them. To this end, let P be the convex hull of the set \{v_1,\dots,v_r\}; that is, a solid convex polygon. It’s not hard to see that K\subset P. Therefore, \bigcup_{j=1}^r f_j(K)\subset \bigcup_{j=1}^r f_j(P), but since the set on the left is K itself, we get K\subset \bigcup_{j=1}^r f_j(P). By induction, K\subset P_n for every n, where P_0=P and P_{n+1}=\bigcup_{j=1}^r f_j(P_n). In fact K=\bigcap_{n=1}^{\infty} P_n: the sets P_n are compact and nested (because f_j(P)\subset P by convexity), so their intersection is a nonempty compact set that satisfies the same invariance equation as K.
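The two descriptions of K can be compared numerically. The Python sketch below (function names are mine) does this for the carpet. Membership in P_n is decided recursively: p lies in P_n if and only if f_j^{-1}(p) lies in P_{n-1} for some j. The Chaos Game is started at a vertex, which is a fixed point of one of the maps and hence belongs to K, so every point of the orbit should lie in every P_n. Rational arithmetic keeps the boundary cases exact.

```python
import random
from fractions import Fraction

VERTICES = [(1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0)]


def in_P_n(p, n, s):
    """Is p in P_n, where P_0 = [-1,1]^2 and P_{n+1} is the union of the f_j(P_n)?"""
    x, y = p
    if abs(x) > 1 or abs(y) > 1:
        return False  # P_n is contained in P_0
    if n == 0:
        return True
    for vx, vy in VERTICES:
        # f_j(q) = s*q + (1-s)*v_j, hence f_j^{-1}(p) = (p - (1-s)*v_j) / s
        q = ((x - (1 - s) * vx) / s, (y - (1 - s) * vy) / s)
        if in_P_n(q, n - 1, s):
            return True
    return False


def chaos_game(steps, s, seed=0):
    """Orbit of the Chaos Game started at the vertex (1,1), which lies in K."""
    rng = random.Random(seed)
    p = (Fraction(1), Fraction(1))
    orbit = [p]
    for _ in range(steps):
        vx, vy = rng.choice(VERTICES)
        p = (s * p[0] + (1 - s) * vx, s * p[1] + (1 - s) * vy)
        orbit.append(p)
    return orbit
```

For s = Fraction(1, 3), every point of chaos_game(200, s) passes in_P_n(p, 4, s), while the center (0, 0) is in P_0 but already fails to be in P_1.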

triangle
Fat Sierpinski gasket: s=3/5 instead of 1/2

The above example is ifs(3,3/5,11), calling the Scilab code below.

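function ifs(sides,scale,steps)
    // vertices of a regular polygon inscribed in the unit circle
    b=1-scale; t=2*%pi*(1:sides)/sides; x=cos(t); y=sin(t);
    // each column of xpols, ypols holds the vertices of one polygon
    xpols=x'; ypols=y';
    for j=2:steps
        // shrink all current polygons, then place a copy next to each vertex
        xpols=scale*xpols; ypols=scale*ypols;
        xpolsnew=[]; ypolsnew=[];
        for k=1:sides
            xpolsnew=[xpolsnew xpols+b*x(k)*ones(xpols)];
            ypolsnew=[ypolsnew ypols+b*y(k)*ones(ypols)];
        end
        xpols=xpolsnew; ypols=ypolsnew;
    end
    a=gca(); a.data_bounds=[-1,-1;1,1];
    [m,n]=size(xpols);
    // draw all n polygons, each filled with color index 1
    xfpolys(xpols,ypols,ones(n,1))
endfunction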

The final example is an “upper bound” for the fat pentagonal fractal that Colin created with the Chaos Game: the points v_1,\dots,v_5 are the vertices of a regular pentagon, and s=1/2. The function was called as ifs(5,1/2,8). Again, I think that the invariant set has measure zero, but can’t even prove that the interior is empty. (Or find a reference where this is already done.)

Pentagonal set with s=1/2

For the sake of completeness

Let’s prove the completeness of \ell^p. The argument consists of two steps.

Claim 1. Suppose X is a normed space in which every absolutely convergent series converges; that is, \sum_{n=1}^{\infty} x_n converges whenever x_n\in X are such that \sum_{n=1}^{\infty} \|x_n\| converges. Then the space is complete.

Proof. Take a Cauchy sequence \{y_n\}\subset X. For j=1,2,\dots find an integer n_j such that \|y_n-y_m\|<2^{-j} as long as n,m\ge n_j. (This is possible because the sequence is Cauchy.) We may also arrange that n_1<n_2<\dots. Let n_0=1 and consider the series \sum_{j=1}^{\infty} (y_{n_{j}}-y_{n_{j-1}}). Since \|y_{n_{j}}-y_{n_{j-1}}\|<2^{-(j-1)} for j\ge 2, this series converges absolutely, and therefore converges by the hypothesis. Its partial sums simplify (telescope) to y_{n_j}-y_1. Hence the subsequence \{y_{n_j}\} has a limit. It remains to apply a general theorem about metric spaces: if a Cauchy sequence has a convergent subsequence, then the entire sequence converges. This proves Claim 1.

Claim 2. Every absolutely convergent series in \ell^p converges.

Proof. The elements of \ell^p are functions from \mathbb N to \mathbb C, so let’s write them as such: f_j\colon \mathbb N\to \mathbb C. (This avoids confusion of indices.) Suppose the series \sum_{j=1}^{\infty} \|f_j\| converges. Then for any n the series \sum_{j=1}^{\infty} |f_j(n)| also converges, by the Comparison Test, because |f_j(n)|\le \|f_j\|. Hence \sum_{j=1}^{\infty} f_j(n) converges (absolutely convergent implies convergent for series of real or complex numbers). Let f(n) = \sum_{j=1}^{\infty} f_j(n). So far the convergence is only pointwise, so we are not done. We still have to show that the series converges in \ell^p, that is, its tails have small \ell^p norm: \sum_{n=1}^\infty |\sum_{j=k}^{\infty} f_j(n)|^p \to 0 as k\to\infty.

What we need now is a dominating function, so that we can apply the Dominated Convergence Theorem. Namely, we need a function g\colon \mathbb N\to [0,\infty) such that
(1) \sum_{n=1}^{\infty} g(n)<\infty, and
(2) |\sum_{j=k}^{\infty} f_j(n)|^p \le g(n) for all k,n.

Set g=(\sum_{j=1}^{\infty} |f_j|)^p. Then (2) follows from the triangle inequality. Also, g is the increasing limit of functions g_k =(\sum_{j=1}^k |f_j|)^p, for which we have
\sum_n g_k(n) \le (\sum_{j=1}^k \|f_j\|)^p \le (\sum_{j=1}^{\infty} \|f_j\|)^p<\infty
using the triangle inequality in \ell^p. Therefore, \sum_n g(n)<\infty by the Monotone Convergence Theorem.

Almost norming functionals, Part 2

Let E be a real Banach space with the dual E^*. Fix \delta\in (0,1) and call a linear functional e^*\in E^* almost norming for e if \|e\|=\|e^*\|=1 and e^*(e)\ge \delta. In Part 1 I showed that in any Banach space there exists a continuous selection of almost norming functionals. Here I will prove that there is no uniformly continuous selection in \ell_1.

Claim. Let S be the unit sphere in \ell_1^n, the n-dimensional \ell_1-space. Suppose that f\colon S\to \ell_{\infty}^n is a map such that f(e) is almost norming for e in the above sense. Then the modulus of continuity \omega_f(t)=\sup\{\|f(a)-f(b)\|\colon \|a-b\|\le t\} satisfies \omega_f(2/n)\ge 2\delta.

(If a uniformly continuous selection were available in \ell_1, it would yield selections in \ell_1^n with a modulus of continuity independent of n.)

Proof. Write f=(f_1,\dots,f_n). For any \epsilon\in \{-1,1\}^n we have n^{-1}\epsilon \in S, hence

\sum\limits_{i=1}^n \epsilon_i f_i(n^{-1}\epsilon)\ge n\delta for all \epsilon\in \{-1,1\}^n. Sum over all \epsilon and change the order of summation:

\sum\limits_{i=1}^n \sum\limits_{\epsilon}\epsilon_i f_i(n^{-1}\epsilon)\ge n2^n\delta

There exists i\in\{1,2,\dots,n\} such that

\sum\limits_{\epsilon}\epsilon_i f_i(n^{-1}\epsilon) \ge 2^n \delta

Fix this i from now on. Define \tilde \epsilon to be the same \pm vector as \epsilon, but with the ith component flipped. Rewrite the previous sum as

\sum\limits_{\epsilon} -\epsilon_i f_i(n^{-1}\tilde \epsilon)\ge 2^n\delta

and add them together:

\sum\limits_{\epsilon}\epsilon_i [f_i(n^{-1}\epsilon)-f_i(n^{-1}\tilde \epsilon)]\ge 2^{n+1}\delta

Since \|n^{-1}\epsilon-n^{-1}\tilde \epsilon\|=2/n, each of the 2^n terms of the last sum is at most \omega_f(2/n). It follows that 2^n \omega_f(2/n) \ge 2^{n+1}\delta, as claimed.
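As a sanity check of the flip-and-add step, one can evaluate the last sum for a concrete selection. The Python sketch below uses the coordinatewise sign map, which is my choice of example and not part of the argument: on the points n^{-1}\epsilon it is norming with \delta=1, and the sum comes out as exactly 2^{n+1} for every coordinate i.

```python
from itertools import product


def sign_selection(e):
    """Hypothetical example of a selection: f(e) = (sign e_1, ..., sign e_n)."""
    return [1 if c >= 0 else -1 for c in e]


def flip_sum(f, n, i):
    """Sum over all sign vectors eps of eps_i * [f_i(eps/n) - f_i(eps~/n)],
    where eps~ is eps with the i-th component flipped (i is 0-based here)."""
    total = 0
    for eps in product((-1, 1), repeat=n):
        flipped = list(eps)
        flipped[i] = -flipped[i]
        e = [c / n for c in eps]
        e_tilde = [c / n for c in flipped]
        total += eps[i] * (f(e)[i] - f(e_tilde)[i])
    return total
```

Each of the 2^n terms equals 2 for this map, which matches the bound \omega_f(2/n)\ge 2\delta with \delta=1.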

A relation between polynomials

This is a brief foray into algebra from a 2006 REU project at Texas A&M.

Given two polynomials P,Q \in \mathbb C[z_1,\dots,z_n], we write Q\preccurlyeq P if there is a differential operator T\in \mathbb C[\frac{\partial}{\partial z_1},\dots, \frac{\partial}{\partial z_n}] such that Q=T P.

The relation \preccurlyeq is reflexive and transitive, but is not antisymmetric. If both Q\preccurlyeq P and P\preccurlyeq Q hold, we say that P and Q are \partial-equivalent, denoted P\thicksim Q.

A polynomial is \partial -homogeneous if it is \partial -equivalent to a homogeneous polynomial. Obviously, any polynomial in one variable has this property. Polynomials in more than one variable usually do not have it.
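A one-variable example makes the equivalence concrete: z^2+1 and z^2 are \partial-equivalent, since z^2+1=(1+\frac12 \frac{d^2}{dz^2})z^2 and z^2=(1-\frac12 \frac{d^2}{dz^2})(z^2+1). The Python sketch below checks both identities. The representation of polynomials and operators as coefficient lists is my own choice for illustration.

```python
from fractions import Fraction


def derivative(p):
    """Derivative of the polynomial with coefficient list p = [a_0, a_1, ...]."""
    return [k * a for k, a in enumerate(p)][1:]


def apply_operator(t, p):
    """Apply T = sum_k t[k] (d/dz)^k to the polynomial p."""
    result = [Fraction(0)] * len(p)
    current = list(p)  # the k-th derivative of p, starting with k = 0
    for c in t:
        for k, a in enumerate(current):
            result[k] += c * a
        current = derivative(current)
    return result
```

Here apply_operator([1, 0, Fraction(1, 2)], [0, 0, 1]) returns the coefficients of z^2+1, and apply_operator([1, 0, Fraction(-1, 2)], [1, 0, 1]) returns those of z^2.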

The interesting thing about \partial -homogeneous polynomials is that they are refinable, meaning that one has a nontrivial identity of the form P(z)=\sum_{j\in\mathbb Z^n} c_{j} P(\lambda z-j) where c_{j}\in \mathbb C, j\in \mathbb Z^n, and only finitely many of the coefficients c_j are nonzero. The value of \lambda does not matter as long as |\lambda|\ne 0,1. Conversely, every \lambda -refinable polynomial is \partial -homogeneous.
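To see refinability in the simplest inhomogeneous case, take P(z)=z^2+1 and \lambda=2. Solving a small linear system by hand (this is my own computation, not taken from the REU project) gives P(z)=\frac38 P(2z-1)-\frac12 P(2z)+\frac38 P(2z+1). The Python sketch below verifies the identity in exact arithmetic. Since both sides are polynomials of degree 2, agreement at three or more points proves it.

```python
from fractions import Fraction


def P(z):
    """The example polynomial P(z) = z^2 + 1."""
    return z * z + 1


# coefficients c_j of the refinement identity for lambda = 2 (specific to this P)
COEFFS = {-1: Fraction(3, 8), 0: Fraction(-1, 2), 1: Fraction(3, 8)}


def refined(z):
    """Right-hand side: the sum of c_j * P(2z - j) over the nonzero c_j."""
    return sum(c * P(2 * z - j) for j, c in COEFFS.items())
```

For example, refined(Fraction(1, 3)) and P(Fraction(1, 3)) are equal.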

Controlled bilipschitz extension

A map f\colon X\to Y is L-bilipschitz if L^{-1} |a-b| \le |f(a)-f(b)| \le L |a-b| for all a,b\in X. This definition makes sense if X and Y are general metric spaces, but let’s suppose they are subsets of the plane \mathbb R^2.
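For a finite set of points in the plane, the smallest admissible L can be computed by brute force over all pairs. The Python sketch below does this; the function name is mine. It is only meant to make the definition concrete and says nothing about extensions.

```python
import math
from itertools import combinations


def bilipschitz_constant(points, f):
    """Smallest L >= 1 with |a-b|/L <= |f(a)-f(b)| <= L|a-b| for all pairs
    of the given points; math.inf if f glues two distinct points together."""
    best = 1.0
    for a, b in combinations(points, 2):
        d = math.dist(a, b)
        image_d = math.dist(f(a), f(b))
        if d == 0:
            continue  # repeated point, no constraint
        if image_d == 0:
            return math.inf
        best = max(best, image_d / d, d / image_d)
    return best
```

For instance, the dilation p \mapsto 2p gives a constant of 2 on any set with at least two distinct points, and a rotation gives 1 up to rounding.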

Definition 1. A set A\subset \mathbb R^2 has the BL extension property if any bilipschitz map f\colon A\to\mathbb R^2 can be extended to a bilipschitz map F\colon \mathbb R^2\to\mathbb R^2. (Extension means that F is required to agree with f on A.)

Lines and circles have the BL extension property. This was proved in the early 1980s independently by Tukia, Jerison and Kenig, and Latfullin.

Definition 2. A set A\subset \mathbb R^2 has the controlled BL extension property if there exists a constant C such that any L-bilipschitz map f\colon A\to\mathbb R^2 can be extended to a C L-bilipschitz map F\colon \mathbb R^2\to\mathbb R^2.

Clearly, Definition 2 asks for more than Definition 1. I can prove that a line has the controlled BL extension property, even with a modest constant such as C=2000. (Incidentally, one cannot take C=1.) I still can’t prove the controlled BL extension property for a circle.

Update: the extension from a line is done in this paper.