As a service* to math students everywhere (especially those taking calculus), I started Mathematics.SE Index. The plan is to have a thematic catalog of common exercises in college-level mathematics, each linked to a solution posted on Math.SE.

As of now, the site has reasonably complete sections on Limits and Series, with a rudimentary section on binomial sums. All lists are automatically generated. Initial filtering was done with a Data Explorer SQL query, using tags and keywords in the question body. The query also took into account the view count (i.e., how often the problem is searched for), and the existence of upvoted answers.

The results of the query were processed with a Google Sheets script: a bunch of regular expressions extracted LaTeX markup with a desired pattern, checked its integrity [not of academic kind], and transformed it into WordPress-compatible markup.

Plans for the near future: integrals (especially improper), basic proofs by induction, maybe some group theory and differential equations… depends on how easy it is to teach these topics to regular expressions.

For some reason I wanted to construct polynomials approximating this piecewise constant function :

Of course approximation cannot be uniform, since the function is not continuous. But it can be achieved in the sense of convergence of graphs in the Hausdorff metric: their limit should be the “graph” shown above, with the vertical line included. In concrete terms, this means for every there is such that for the polynomial satisfies

and also

How to get such explicitly? I started with the functions when is large. The idea is that as , the limit of is what is wanted: when , when . Also, for each there is a Taylor polynomial that approximates uniformly on . Since the Taylor series is alternating, it is not hard to find suitable . Let’s shoot for in the Taylor remainder and see where this leads:

Degree polynomial for

Degree polynomial for

Degree polynomial for

Degree polynomial for

Degree polynomial for

The results are unimpressive, though:

To get within of the desired square-ness, we need . This means . Then, to have the Taylor remainder bounded by at , we need . Instead of messing with Stirling’s formula, just observe that does not even begin to decrease until exceeds , which is more than . That’s a … high degree polynomial. I would not try to ask a computer algebra system to plot it.

To avoid dealing with , it is better to use odd degrees. For comparison, I used the same or smaller degrees as above: .

Looks good. But I don’t know of a way to estimate the degree of Bernstein polynomial required to obtain Hausdorff distance less than a given (say, ) from the square function.
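For readers who want to experiment with the Bernstein approach, here is a minimal sketch. It assumes the target is a step function on [0, 1] that jumps from 0 to 1 at 1/2; the function names `step` and `bernstein` are illustrative, not taken from the original computation. With odd degree n, the sample points k/n never hit the jump.

```python
from math import comb

def step(t):
    # Assumed target: 0 to the left of 1/2, 1 to the right of it.
    # For odd n the value at exactly 1/2 is never sampled.
    return 1.0 if t > 0.5 else 0.0

def bernstein(f, n, x):
    # Degree-n Bernstein polynomial of f on [0, 1], evaluated at x.
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

if __name__ == "__main__":
    for n in (5, 25, 125):
        values = [round(bernstein(step, n, x), 4) for x in (0.25, 0.45, 0.5, 0.55, 0.75)]
        print(n, values)
```

The printed rows show how slowly the values near the jump approach 0 and 1 as the degree grows, which is one way to get a feel for the degree needed for a given Hausdorff distance.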

The winding map is a humble example that is conjectured to be extremal in a long-standing open problem. Its planar version is defined in polar coordinates by

$$(r, \theta) \mapsto (r, 2\theta)$$

All this map does is stretch every circle around the origin by the factor of two: tangentially, without changing its radius. As a result, the circle winds around itself twice. The map is not injective in any neighborhood of the origin $r = 0$.
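The tangential stretching can be checked numerically. The sketch below assumes the planar winding map doubles the polar angle and keeps the radius; it approximates the derivative matrix by central differences and computes the two singular values from the closed-form expression for a 2×2 matrix. The helper names are illustrative.

```python
from math import atan2, cos, sin, hypot, sqrt

def winding(x, y):
    # Planar winding map in Cartesian form: same radius, doubled angle.
    r, theta = hypot(x, y), atan2(y, x)
    return r * cos(2 * theta), r * sin(2 * theta)

def jacobian(F, x, y, h=1e-6):
    # Central-difference approximation of the 2x2 derivative matrix.
    fxp, fxm = F(x + h, y), F(x - h, y)
    fyp, fym = F(x, y + h), F(x, y - h)
    return [[(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
            [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)]]

def singular_values(J):
    # Singular values of a 2x2 matrix from its Frobenius norm and determinant.
    (a, b), (c, d) = J
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = sqrt(max(s * s - 4 * det * det, 0.0))
    return sqrt((s + disc) / 2), sqrt(max((s - disc) / 2, 0.0))

if __name__ == "__main__":
    # Sample point chosen away from the origin and from the branch cut of atan2.
    print(singular_values(jacobian(winding, 0.3, 0.4)))
```

Away from the origin the two values should come out close to 2 (tangential direction) and 1 (radial direction).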

The 3D version of the winding map has the same formula, but in cylindrical coordinates. It winds the space around the $z$-axis, like this:

In the tangential direction the space is stretched by the factor of $2$; the radial coordinate is unchanged. More precisely: the singular values of the derivative matrix (which exists everywhere except when $r = 0$) are $2, 1, 1$. Hence, the Jacobian determinant is $2$, which makes sense since the map covers the space by itself, twice.

In general, when the singular values of the matrix $A$ are $\sigma_1 \ge \dots \ge \sigma_n$, the ratio $\sigma_1 \cdots \sigma_n / \sigma_n^n$ is called the inner distortion of $A$. The word “inner” refers to the fact that $\sigma_n$ is the radius of the ball inscribed into the image of unit ball under $A$; so, the inner distortion compares this inner radius of the image of unit ball to its volume.

For a map, like above, the inner distortion is the (essential) supremum of the inner distortion of its derivative matrices over its domain. So, the inner distortion of is , in every dimension. Another example: the linear map has inner distortion .

It is known that there is a constant such that if the inner distortion of a map is less than , the map is locally injective: every point has a neighborhood in which is injective. This was proved by Martio, Rickman, and Väisälä in 1971. They conjectured that is optimal: that is, the winding map has the least inner distortion among all maps that are not locally injective.

But at present, there is still no explicit nontrivial lower estimate for , for example we don’t know if inner distortion less than implies local injectivity.

Find the equation of tangent line to parabola … borrring calculus drill.

Okay. Draw two tangent lines to the parabola, then. Where do they intersect?

If the points of tangency on the parabola $y = x^2$ are $(a, a^2)$ and $(b, b^2)$, then the tangent lines are $y = 2ax - a^2$ and $y = 2bx - b^2$. Equate and solve:

$$x = \frac{a+b}{2}$$

Neat! The $x$-coordinate of the intersection point is midway between $a$ and $b$.

What does the $y$-coordinate of the intersection tell us? It simplifies to

$$y = ab,$$

the geometric meaning of which is not immediately clear. But maybe we should look at the vertical distance from intersection to the parabola itself. That would be

$$x^2 - y = \left(\frac{a+b}{2}\right)^2 - ab = \left(\frac{a-b}{2}\right)^2$$

This is the square of the distance from the midpoint to $a$ and $b$. In other words, the squared radius of the smallest “disk” covering the set $\{a, b\}$.

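Same happens in higher dimensions, where parabola is replaced with the paraboloid $z = |\mathbf{x}|^2$, $\mathbf{x} \in \mathbb{R}^n$.

Indeed, the tangent planes at $\mathbf{a}$ and $\mathbf{b}$ are $z = 2\,\mathbf{a}\cdot\mathbf{x} - |\mathbf{a}|^2$ and $z = 2\,\mathbf{b}\cdot\mathbf{x} - |\mathbf{b}|^2$. Equate and solve:

$$2(\mathbf{a} - \mathbf{b})\cdot\mathbf{x} = |\mathbf{a}|^2 - |\mathbf{b}|^2$$

So, $\mathbf{x}$ lies on the equidistant plane from $\mathbf{a}$ and $\mathbf{b}$. And, as above,

$$|\mathbf{x}|^2 - z = |\mathbf{x} - \mathbf{a}|^2 = |\mathbf{x} - \mathbf{b}|^2$$

is the square of the radius of the smallest disk centered at $\mathbf{x}$ that covers both $\mathbf{a}$ and $\mathbf{b}$.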

The above observations are useful for finding the smallest disk (or ball) covering given points. For simplicity, I stick to two dimensions: covering points on a plane with the smallest disk possible. The algorithm is:

1. Given points $(x_i, y_i)$, $i = 1, \dots, n$, write down the equations of tangent planes to the paraboloid $z = x^2 + y^2$. These are $z = 2(x_i x + y_i y) - (x_i^2 + y_i^2)$.

2. Find the point $(x, y, z)$ that minimizes the vertical distance to the paraboloid, that is $x^2 + y^2 - z$, and lies (non-strictly) below all of these tangent planes.

3. The $(x, y)$ coordinates of this point give the center of the smallest disk covering the points (known as the Chebyshev center of the set). Also, $\sqrt{x^2 + y^2 - z}$ is the radius of this disk, known as the Chebyshev radius.

The advantage conferred by the paraboloid model is that at step 2 we are minimizing a quadratic function subject to linear constraints. Implementation in Sage:

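points = [[1,3], [1.5,2], [3,2], [2,-1], [-1,0.5], [-1,1]]
# One constraint per point p: (tangent plane at p) - z >= 0, i.e. (x, y, z) lies below that plane
constraints = [lambda x, p=q: 2*x[0]*p[0]+2*x[1]*p[1]-p[0]^2-p[1]^2-x[2] for q in points]
# Objective: vertical distance from (x, y, z) to the paraboloid z = x^2 + y^2
target = lambda x: x[0]^2+x[1]^2-x[2]
# Minimize starting from the initial guess (0, 0, 0)
m = minimize_constrained(target,constraints,[0,0,0])
# Plot the disk with center (m[0], m[1]) and radius sqrt(x^2 + y^2 - z), together with the points
circle((m[0],m[1]),sqrt(m[0]^2+m[1]^2-m[2]),color='red') + point(points)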
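As a sanity check on the optimizer's output, one can compare it with a brute-force search. The sketch below is not the paraboloid method: it is a plain Python enumeration that relies on the standard fact that the smallest enclosing disk is determined by two or three of the points. The function names are illustrative, and the cubic-or-worse running time makes it suitable only for small inputs with at least two points.

```python
from itertools import combinations
from math import hypot

def circle_from_two(p, q):
    # Disk having the segment pq as a diameter.
    cx, cy = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    return cx, cy, hypot(p[0] - cx, p[1] - cy)

def circle_from_three(p, q, s):
    # Circumscribed circle of a triangle; None if the points are collinear.
    (ax, ay), (bx, by), (cx, cy) = p, q, s
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, hypot(ax - ux, ay - uy)

def covers(c, points, tol=1e-9):
    # True if every point lies in the closed disk c = (cx, cy, r), up to tol.
    return all(hypot(x - c[0], y - c[1]) <= c[2] + tol for x, y in points)

def smallest_disk(points):
    # Try all disks defined by pairs and triples; keep the smallest that covers everything.
    candidates = [circle_from_two(p, q) for p, q in combinations(points, 2)]
    for triple in combinations(points, 3):
        c = circle_from_three(*triple)
        if c is not None:
            candidates.append(c)
    return min((c for c in candidates if covers(c, points)), key=lambda c: c[2])

if __name__ == "__main__":
    points = [[1, 3], [1.5, 2], [3, 2], [2, -1], [-1, 0.5], [-1, 1]]
    print(smallest_disk(points))  # (center x, center y, radius)
```

The center and radius printed here can be compared with `m[0]`, `m[1]` and the radius computed in the Sage snippet.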

Credit: this post is an expanded version of a comment by David Speyer on last year’s post Covering points with caps, where I considered the same problem on a sphere.

Every subset $A \subset \mathbb{R}^n$ inherits the metric from $\mathbb{R}^n$, namely $d(a,b) = |a-b|$. But we can also consider the intrinsic metric on $A$, defined as follows: $d_A(a,b)$ is the infimum of the lengths of curves that connect $a$ to $b$ within $A$. Let’s assume there is always such a curve of finite length, and therefore $d_A$ is always finite. All the properties of a metric hold, and we also have $d_A(a,b) \ge |a-b|$ for all $a, b \in A$.

If $A$ happens to be convex, then $d_A(a,b) = |a-b|$ because any two points are joined by a line segment. There are also some nonconvex sets for which $d_A$ coincides with the Euclidean distance: for example, the punctured plane $\mathbb{R}^2 \setminus \{(0,0)\}$. Although we can’t always get from $a$ to $b$ in a straight line, the required detour can be as short as we wish.

On the other hand, for the set the intrinsic distance is sometimes strictly greater than Euclidean distance.

For example, the shortest curve from to has length , while the Euclidean distance is . This is the worst ratio for pairs of points in this set, although proving this claim would be a bit tedious. Following Gromov (Metric structures on Riemannian and non-Riemannian spaces), define the distortion of as the supremum of the ratios over all pairs of distinct points . (Another term in use for this concept: optimal constant of quasiconvexity.) So, the distortion of the set is .

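Gromov observed (along with posing the Knot Distortion Problem) that every simple closed curve in a Euclidean space (of any dimension) has distortion at least $\pi/2$. That is, the least distorted closed curve is the circle, for which the half-length/diameter ratio is exactly $\pi/2$.

Here is the proof. Parametrize the curve by arclength: $\gamma\colon [0, L] \to \mathbb{R}^n$, extended $L$-periodically. For $0 \le t \le L/2$ define $\Gamma(t) = \gamma(t + L/2) - \gamma(t)$ and let $r = \min_t |\Gamma(t)|$. The curve $\Gamma$ connects two antipodal points of magnitude at least $r$, and stays outside of the open ball of radius $r$ centered at the origin. Therefore, its length is at least $\pi r$ (projection onto a convex subset does not increase the length). On the other hand, $\Gamma$ is a 2-Lipschitz map, which implies that its length is at most $L$. Thus, $r \le L/\pi$. Take any $t$ that realizes the minimum of $|\Gamma|$. The points $a = \gamma(t)$ and $b = \gamma(t + L/2)$ satisfy $|a - b| = r \le L/\pi$, while the intrinsic distance between them along the curve is $L/2$. Done.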
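A quick numerical illustration of the extremal case: the sketch below samples a circle by a regular polygon and computes the largest ratio of intrinsic distance (the shorter way around the polygon) to Euclidean distance over pairs of vertices. Restricting to vertices only gives a lower bound for the distortion of a general polygon, so this is an illustration under that assumption, not a general-purpose distortion calculator. The function name is made up for this example.

```python
from math import cos, sin, pi, dist

def vertex_distortion(points):
    # Closed polygon through `points` in order; edges wrap around at the end.
    n = len(points)
    edge = [dist(points[i], points[(i + 1) % n]) for i in range(n)]
    total = sum(edge)
    best = 0.0
    for i in range(n):
        arc = 0.0
        for j in range(i + 1, n):
            arc += edge[j - 1]                 # length from vertex i to vertex j one way around
            intrinsic = min(arc, total - arc)  # the shorter of the two ways around
            best = max(best, intrinsic / dist(points[i], points[j]))
    return best

if __name__ == "__main__":
    n = 1000
    circle = [(cos(2 * pi * k / n), sin(2 * pi * k / n)) for k in range(n)]
    print(vertex_distortion(circle), pi / 2)
```

For a fine sampling of the circle the maximum is attained at antipodal vertices and the two printed numbers should nearly agree.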

Follow-up question: what are the least distorted closed surfaces (say, in $\mathbb{R}^3$)? It’s natural to expect that a sphere, with distortion $\pi/2$, is the least distorted. But this is false. An exercise from Gromov’s book (which I won’t spoil): Find a closed convex surface in $\mathbb{R}^3$ with distortion less than $\pi/2$. (Here, “convex” means the surface bounds a convex solid.)

Mathematical reflections, not those supposedly practiced in metaphilosophy.

Given a function defined for , we have two basic ways to reflect it about : even reflection and odd reflection . Here is the even reflection of the exponential function :

The extended function is not differentiable at . The odd reflection, pictured below, is not even continuous at . But to be fair, it has the same slope to the left and to the right of , unlike the even reflection.

Can we reflect a function preserving both continuity and differentiability? Yes, this is what higher-order reflections are for. They define not just in terms of but also involve values at other points, like . Here is one such smart reflection:

Indeed, letting , we observe continuity: both sides converge to . Taking derivatives of both sides, we get

where the limits of both sides as again agree: they are .

A systematic way to obtain such reflection formulas is to consider what they do to monomials: , , , etc. A formula that reproduces the monomials up to degree will preserve the derivatives up to order . For example, plugging or into (1) we get a valid identity. With the equality breaks down: on the left, on the right. As a result, the curvature of the graph shown above is discontinuous: at it changes the sign without passing through .

To fix this, we’ll need to use a third point, for example . It’s better not to use points like , because when the original domain of is a bounded interval , we probably want the reflection to be defined on all of .

So we look for coefficients such that holds as identity for . The linear system , , has the solution , , . This is our reflection formula, then:

And this is the result of reflecting according to (2):

Now the curvature of the graph is continuous. One could go on, but since the human eye is not sensitive to discontinuities of the third derivative, I’ll stop here.
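The linear systems behind such reflection formulas are easy to solve exactly. The sketch below assumes a formula of the form f(−x) = c_1 f(s_1 x) + … + c_m f(s_m x) with distinct positive scales s_j, and requires it to reproduce the monomials 1, x, …, x^(m−1). The scales 1, 1/2, 1/4 used in the demo are an illustrative choice, and the function name is hypothetical.

```python
from fractions import Fraction

def reflection_coefficients(scales):
    # Solve sum_j c_j * s_j**k = (-1)**k for k = 0..m-1 by exact Gauss-Jordan elimination.
    # Scales must be distinct, otherwise the system is singular.
    m = len(scales)
    s = [Fraction(v) for v in scales]
    rows = [[sj**k for sj in s] + [Fraction((-1)**k)] for k in range(m)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(m):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

if __name__ == "__main__":
    print(reflection_coefficients([1, Fraction(1, 2)]))                  # two-point formula
    print(reflection_coefficients([1, Fraction(1, 2), Fraction(1, 4)]))  # three-point formula
```

With scales 1 and 1/2 the coefficients are −3 and 4; with scales 1, 1/2, 1/4 they are 5, −20, 16. Adding a fourth scale gives a formula that also matches the third derivative.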

In case you don’t believe the last paragraph, here is the reflection with three continuous derivatives, given by

and below it, the extension given by (2). For these plots I used Desmos because plots in Maple (at least in my version) have pretty bad aliasing.

Also, cubic splines have only two continuous derivatives and they connect dots naturally.

Then he dropped two in at once, and leant over the bridge to see which of them would come out first; and one of them did; but as they were both the same size, he didn’t know if it was the one which he wanted to win, or the other one. – A. A. Milne

It’s useful to have a way of measuring how different two sticks (or fir cones) are in size, shape, and their position in a river. Yes, we have the Hausdorff distance between sets, but it does not take into account the orientation of sticks. And it performs poorly when the sticks are broken: the Hausdorff distance between these blue and red curves does not capture the disparity of their shapes:

Indeed, is relatively small here, because from any point of red curve one can easily jump to some point of the blue curve, and the other way around. However, this kind of measurement completely ignores the fact that curves are meant to be traveled along in a continuous, monotone way.

There is a concept of distance that is better suited for comparing curves: the Fréchet distance . Wikipedia gives this (folklore) description:

Imagine a dog walking along one curve and the dog’s owner walking along the other curve, connected by a leash. Both walk continuously along their respective curve from the prescribed start point to the prescribed end point of the curve. Both may vary their speed, and even stop, at arbitrary positions and for arbitrarily long. However, neither can backtrack. The Fréchet distance between the two curves is the length of the shortest leash that is sufficient for traversing both curves in this manner.

To get started, let’s compute this distance for two oriented line segments and . The length of the leash must be at least in order to begin the walk, and at least to finish. So,

In fact, equality holds here. In order to bound from above, we just need one parametrization of the segments. Take the parametrization proportional to length:

Then is a quadratic polynomial of . Without doing any computations, we can say the coefficient of is nonnegative, because cannot be negative for any . Hence, this polynomial is a convex function of , which implies that its maximum on the interval is attained at an endpoint. And the endpoints we already considered. (By the way, this proof works in every CAT(0) metric space.)
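The segment case can be checked numerically with the discrete Fréchet distance, a standard dynamic-programming approximation in which the dog and the owner hop between sample points instead of moving continuously. This is a sketch of that discrete variant, not of the continuous definition; the segment endpoints in the demo are made up for illustration.

```python
from math import dist

def discrete_frechet(P, Q):
    # ca[i][j] = smallest leash length needed to reach the pair (P[i], Q[j])
    # when each walker either stays put or advances to the next sample.
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]

def sample_segment(a, b, k):
    # k + 1 equally spaced points on the oriented segment from a to b.
    return [tuple(a[c] + (b[c] - a[c]) * t / k for c in range(len(a)))
            for t in range(k + 1)]

if __name__ == "__main__":
    P = sample_segment((0, 0), (4, 0), 50)
    Q = sample_segment((0, 1), (4, 3), 50)
    # Distances between starting points and between endpoints are 1 and 3.
    print(discrete_frechet(P, Q))
```

For two segments sampled with the same number of points, the result equals the larger of the two endpoint distances, in agreement with the argument above.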

In general, the Fréchet distance is not realized by constant-speed parametrization. Consider these two curves, each with a long detour:

It would be impractical for the dog and the owner to go on the detour at the same time. One should go first while the other waits for his/her/its turn. In particular, we see symmetry breaking here: even for two perfectly symmetric curves, the Fréchet-optimal parametrizations would not be symmetric to each other.

It is not obvious from the definition of whether it is a metric; as usual, it’s the triangle inequality that is suspect. However, indeed satisfies the triangle inequality. To prove this, we should probably formalize the definition of . Given two continuous maps from into (or any metric space), define

where and range over all nondecreasing functions from onto itself. Actually, we can require and to be strictly increasing (it only takes a small perturbation), which in dog/owner terms means they are not allowed to stop, but can mosey along as slowly as they want. Then we don’t need both and , since

So, given we can pick such that is within of ; then pick such that is within of . Then

For a bounded set on the plane (or in any Euclidean space) one can define the circumcenter and circumradius as follows: is the smallest radius of a closed disk containing , and is the center of such a disk. (Other terms in use: Chebyshev center and Chebyshev radius.)

The fact that is well-defined may not be obvious: what if there are multiple disks of radius that contain ? To investigate, introduce the farthest distance function . By definition, is where attains its minimum. The function is convex, being the supremum of a family of convex functions. However, that does not guarantee the uniqueness of its minimum. We have two issues here:

is not strictly convex

the supremum of an infinite family of strictly convex functions can fail to be strictly convex (like on the interval ).

The first issue is resolved by squaring . Indeed, attains its minimum at the same place where does, and where each term is strictly convex.

Also, we don’t want to lose strict convexity when taking the supremum over . For this purpose, we must replace strict inequality by something more robust. The appropriate substitute is strong convexity: a function is strongly convex if there is such that is convex. Let’s say that is -convex in this case.

Since is a convex (in fact linear) function of , we see that is -convex. This property passes to supremum: subtracting from the supremum is the same as subtracting it from each term. Strong convexity implies strict convexity and with it, the uniqueness of the minimum point. So, , the minimum of , is uniquely defined. (Finding it in practice may be difficult. The spherical version of this problem is considered in Covering points with caps).

Having established uniqueness, it is natural to ask about stability, or more precisely, the continuity of and with respect to . Introduce the Hausdorff distance on the set of bounded subsets. By definition, if is contained in -neighborhood of , and is contained in -neighborhood of . It is easy to see that , and therefore

In words, the circumradius is a -Lipschitz function of the set.

What about the circumcenter? If the set is shifted by units in some direction, the circumcenter moves by the same amount. So it may appear that it should also be a -Lipschitz function of . But this is false.

Observe (or recall from middle-school geometry) that the circumcenter of a right triangle is the midpoint of its hypotenuse:

Consider two right triangles:

Vertices . The right angle is at , and the circumcenter is the midpoint of the opposite side: .

Vertices . The right angle is at
and the circumcenter is at .

The Hausdorff distance between these two triangles is merely , yet the distance between their circumcenters is . So, Lipschitz continuity fails, and the most we can hope for is Hölder continuity with exponent .

And indeed, the circumcenter is locally -Hölder continuous. To prove this, suppose . The -convexity of implies that

On the other hand, since everywhere,

Putting things together,

Thus, as long as remains bounded above, we have an inequality of the form , which is exactly -Hölder continuity.

Remark. The proof uses no information about other than the -convexity of the squared distance function. As such, it applies to every CAT(0) space.

In the novella Flatland by Edwin A. Abbott, the Sphere leads the Square “downward to the lowest depth of existence, even to the realm of Pointland, the Abyss of No dimensions”:

I caught these words, “Infinite beatitude of existence! It is; and there is nothing else beside It.” [...] “It fills all Space,” continued the little soliloquizing Creature, “and what It fills, It is. What It thinks, that It utters; and what It utters, that It hears; and It itself is Thinker, Utterer, Hearer, Thought, Word, Audition; it is the One, and yet the All in All. Ah, the happiness, ah, the happiness of Being!”

Indeed, Pointland (a one-point space) is zero-dimensional by every concept of dimension that I know of. Yet there is something smaller: Nothingland — empty space, — whose non-existent inhabitants must be perpetually enjoying the happiness of Non-Being.

What is the dimension of Nothingland?

In topology, the empty set has dimension $-1$. This fits the inductive definition of topological dimension, which is the smallest number $d$ such that the space can be minced by removing a subset of dimension $\le d-1$. (Let’s say a space has been minced if what’s left has no connected subsets other than points.)

Thus, a nonempty finite (or countable) set has dimension $0$: it’s minced already, so we remove nothing, a set of dimension $-1$. A line or a curve is one-dimensional: they can be minced by removing a zero-dimensional subset, like rational numbers.

The Flatland itself can be minced by removing a one-dimensional subset (e.g., circles with rational radius and rational coordinates of the center), so it is two-dimensional. And so on.

The convention $\dim \varnothing = -1$, helpful in the definition, gets in the way later. For example, the topological dimension is subadditive under products: $\dim (X \times Y) \le \dim X + \dim Y$ … unless both $X$ and $Y$ are empty, because then $-1 \le -2$ is false. So the case $X = Y = \varnothing$ must be excluded from the product theorem. We would not have to do this if $\dim \varnothing$ was defined to be $-\infty$.

Next, consider the Hausdorff dimension. Its definition is not inductive, but one has to introduce other concepts first. First, define the $d$-dimensional premeasure on scale $\delta > 0$:

$$\mathcal{H}^d_\delta(X) = \inf \sum_i (\operatorname{diam} U_i)^d$$

where the infimum is taken over all covers of $X$ by nonempty subsets $U_i$ with $\operatorname{diam} U_i \le \delta$. Requiring $U_i$ to be nonempty avoids the need to define the diameter of Nothingland, which would be another story. The empty space can be covered by empty family of nonempty subsets. The sum of empty set of numbers is $0$, and so $\mathcal{H}^d_\delta(\varnothing) = 0$.

Then we define the $d$-dimensional Hausdorff measure:

$$\mathcal{H}^d(X) = \lim_{\delta \to 0} \mathcal{H}^d_\delta(X)$$

and finally,

$$\dim_H X = \inf \{ d : \mathcal{H}^d(X) = 0 \}$$

If in this last infimum we require $d \ge 0$, the result is $\dim_H \varnothing = 0$. But why make this restriction? The $d$-dimensional pre-measures and measures make sense for all real $d$. It’s just that for nonempty $X$, we are raising some small (or even zero) numbers to negative power, getting something large as a result. Consequently, every nonempty space has $\mathcal{H}^d(X) = \infty$ for all $d < 0$.

But $\mathcal{H}^d_\delta(\varnothing) = 0$, from the sum of empty collection of numbers being zero. Hence, $\mathcal{H}^d(\varnothing) = 0$ for all real $d$, and this leads to $\dim_H \varnothing = -\infty$.

To have $\dim_H \varnothing = -\infty$ is also convenient because the Hausdorff dimension is superadditive under products: $\dim_H (X \times Y) \ge \dim_H X + \dim_H Y$. This inequality was proved for general metric spaces as recently as 1995, by John Howroyd. If we don’t have $\dim_H \varnothing = -\infty$, then both factors $X$ and $Y$ must be assumed nonempty.

So… should Nothingland have topological dimension $-1$ and Hausdorff dimension $-\infty$? But that would violate the inequality $\dim X \le \dim_H X$ which holds for every other separable metric space. In fact, for such spaces the topological dimension is simply the infimum of the Hausdorff dimension over all metrics compatible with the topology.

I am inclined to let the dimension of Nothingland be $-\infty$ for every concept of dimension.