Filtering Stack Exchange users

Mathematics Stack Exchange site has grown rapidly. Not on par with Stack Overflow, but it left the rest of the network behind in terms of the number of questions:

Questions, lots of questions
Questions, lots of questions

I decided to try a new way of navigating the raging waters of Math.SE: restricting the set of users. There are no built-in “follow” or “block” options (status-declined), but there is a well documented API.

Question 1: Where to keep the whitelist of users? I use a shared Google Document with a semicolon-separated list of UserIds. The document was made available to “anyone with a link” and also “published to the web”. Not the most elegant solution, but it works for me. I do not want to tie the list to browser’s local storage, and have no plans to build an app with write access to my Google Drive.

Question 2: How to access the list? I did not find an easy way to retrieve the content of a Google Document in a structured form (JSON), the way it can be done for Sheets. So the data is grabbed from published HTML page:

function whitelist() {
    $.get('https://docs.google.com/document/d/1eRNBmafOTkpV3Kg-AGFdd_zrvC3q4du6SmbiWHYgnJw/pub', function (data) {
        var bigList = data.split('<span>')[1].split('</span>')[0].split(';');
        getQuestions(bigList);
    }, 'html');
}

Question 3: How to pass the list to API? There is a reason it’s called bigList: the length exceeds 100, which is the maximum number of UserIds that API accepts. I split it into blocks of 50. Not 100, because each API call returns at most 100 questions making it likely that some users will not be represented in the results. So I opt for a few more calls in order to get more results. As long as the length of the list stays in triple digits, I’m not in danger of hitting the limit of 30 API calls per seconds. To be on the safe side, I make API calls recursively instead of all at the same time; the script waits for one call to be returned before making another. This creates noticeable delay, which is fine with me.

function getQuestions(bigList) {
    var smallList = [];
    if (bigList.length > 50) {
        smallList = bigList.slice(0, 49);
        bigList = bigList.slice(50);
    } else {
        smallList = bigList;
        bigList = '';
    }
    var callUrl = 'https://api.stackexchange.com/2.1/users/' + smallList.join(';') + '/questions/no-answers?pagesize=100&order=desc&sort=activity&site=math&filter=!C0hDELdnxnK9L*SWwZxFWieLPolYBGg3R';
    $.get(callUrl, function (data) {
        questions = questions.concat(data.items);
        if (bigList !== '') {
            getQuestions(bigList);
        } else {
            insertInPage(questions);
        }
    });
}

The parameter no-answers limits the results to questions without an answer. There are other options. The parameter site=math is the only piece of code that is specifies the site as Mathematics; the same approach should work with every SE site. The last parameter, filter=!C0hDELdnxnK9L*SWwZxFWieLPolYBGg3R tells the API to return only certain fields; this saves bandwidth and speeds up parsing on the client side. The filter is automatically created by selecting the desired items. (It took me a while to understand why I kept getting empty objects when using a filter. Turns out, you have to keep the items field in the wrapper, because all other information goes under it.)

Question 4: How to present the questions? I use the front page of the site, replacing its contents with the questions returned by API.

Questions by whitelisted users
Questions by whitelisted users

There is nothing interesting in this part: the function insertInPage(questions) loops through the array of questions, inserting the data into HTML template consistent with the front page. The result goes into $('#question-mini-list').html(html);

And that’s about it. It is refreshing to use an open platform like Stack Exchange. This kind of filtering, if used by several users with a shared whitelist, comes close to creating a smaller “overlay” site on top of the existing ocean of questions.

Unpacked Chrome extension with the above script is available. Obviously, the Google Doc URL in the function whitelist() would have to be changed in order to use a whitelist different from mine. The extension takes action only when the user navigates to http://math.stackexchange.com/?tab=whitelist. I decided against modifying SE navigation bars to include this link. To avoid triggering any API throttles, I pin a browser tab with this URL, and open any questions from there in a new tab.

Update: since the exclusion of questions with answers is not always desirable, I added another feature to the extension. Navigating to http://math.stackexchange.com/?tab=recent returns recent questions by whitelisted users without accepted answers.

Random queen

This is a sequel to Random knight… don’t expect more than you usually get from sequels. I do not plan to do other chess pieces, although “random pawn” is an intriguing idea.

Queen begins its random walk at d1. After two moves, it can get anywhere on the board.

2 queen moves
2 queen moves

It is not surprising that the most likely outcome is the return to initial state. It is less obvious why the second most likely position is d3.

After 3 moves, the most likely positions are near the center of the board.

3 queen moves
3 queen moves

After 4 moves, a concentric-square pattern emerges.

4 queen moves
4 queen moves

After 10 moves, the distribution is visually identical to the steady-state distribution.

10 queen moves; ≈ steady state.
10 queen moves; ≈ steady state

As was noted in the Random knight post, the steady state distribution is the normalized number of moves from a given position. Hence, all probabilities are rational. For the knight they happen to be unit fractions: 1/168, 1/112, 1/84, 1/56, 1/42. The queen probabilities are uglier: 3/208, 23/1456, 25/1456, 27/1456. Random queen is distributed more uniformly than random knight: it spends between 1.44% and 1.85% of its time in any given square. For the knight, the range is from 0.59% to 2.38%.

But the champion in uniform coverage of the chessboard is the rook: when moving randomly, it spends 1/64 of time on every square.

Random knight

A knight walks randomly on the standard chessboard. What is the proportion of time that it will spend at e4?

The answer (1/42) is not hard to get, but I’ll take the computational approach first (testing Scilab 5.5.0 beta 1, by the way). Begin by placing the knight at b1 and letting it make five random moves. This is the distribution of its position after five moves:

1 knight 5 moves
1 knight 5 moves

Unsurprisingly, half of the board is black; actually more than half because the knight can’t get to h8 in five moves. the other half isn’t — you can even get to h8 in five moves (want to find all possible ways to do this?).

After ten moves, the colors become more uniform.

1 knight 10 moves
1 knight 10 moves

After 200 (or any large even number) of moves, the distribution is little changed. But you may notice that it is centrally symmetric, while the previous one isn’t quite.

1 knight 200 moves
1 knight 200 moves

Let’s repeat the process beginning with two knights at b1 and g1. After five moves of each:

2 knights 5 moves
2 knights 5 moves

After ten moves:

2 knights 10 moves
2 knights 10 moves

After a large number of moves (does not matter, even or odd), the variety of colors is greatly reduced:

steady state distribution
steady state distribution

Indeed, this is the distribution which also describes the proportion of time that the knight (wherever it started) will spend at a given square Q.

Precisely, the proportion of time spent at Q is P(Q)=N(Q)/336 where N(Q) is the number of legal moves from Q. For the sixteen central squares P(Q) = 8/336 = 1/42, while for the corners we get 2/336 = 1/168.

Here is a quick argument to support the above. Let Q1 and Q2 be two squares such that the move from Q1 to Q2 is legal. The proportion of time that the knight makes this move is P(Q1)/N(Q1). Similarly, the time proportion of Q2-Q1 move is P(Q2)/N(Q2). Since the process is symmetric in time (we have a reversible Markov chain), P(Q1)/N(Q1)=P(Q2)/N(Q2). In other words, the ratio P/Q is the same for any two squares that are joined by a legal move; it follows that P/Q is the same for all cells. Finding the coefficient of proportionality is a matter of counting, since the sum of P(Q) over all Q is 1.

The Scilab code I wrote for this post is largely not knight-specific. The function update receives the initial distribution of a chess piece and the set of its legal moves. It computes the distribution after one random move.

function newState=update(state,moves)
    [maxK,n]=size(moves)
    [n,n]=size(state)
    newState=zeros(state)
    for i=1:n 
        for j=1:n
            neighborCount=0
            contribution=zeros(state)
            for k=1:maxK
                pos=[i,j]+moves(k,:)
                if (min(pos)>=1)&(max(pos)<=n) then
                   neighborCount=neighborCount+1
                   contribution(pos(1),pos(2))=state(i,j)
                end
            end
            newState=newState+contribution/neighborCount
        end
    end
endfunction

This set is given in the function knight which calls update iteratively.

function state=knight(state,iter)
    moves=[2 1;1 2;-2 1;1 -2;2 -1;-1 2;-2 -1;-1 -2]
    for i=1:iter
        state=update(state,moves)
    end
endfunction

For example, this is how I plotted the distribution after 10 random moves starting at b1:

initial = zeros(8,8)
initial(8,2)=1
state = knight(initial,10)
isoview(0,0,9,9)
Matplot(state*500)
f = gcf()
f.color_map = hotcolormap(32)

NB: getting the correct color representation of matrices from Matplot requires an up-to-date version of Scilab; in version 5.4.0 Matplot smoothens colors, producing intriguing but less informative images.

stars?
knight sky?

Institutions ranked by the number of AMS Fellows: 2014 update

Adding 2014 Fellows did not change much compared to 2013 rankings. Indiana gains 13 places, rising from #47(tie) to #34(tie). Syracuse loses three places, dropping from #331(tie) to #334(tie).

Podium 231

  • 1. University of California, Berkeley: 34
  • 2. University of California, Los Angeles: 33
  • 3. Rutgers The State University of New Jersey New Brunswick: 30
  • 4. University of Michigan: 29
  • 5. Massachusetts Institute of Technology: 28
  • 6. University of Wisconsin, Madison: 23
  • 7. Cornell University: 21
  • 7. New York University, Courant Institute: 21
  • 7. University of Texas at Austin: 21
  • 10. Princeton University: 20
  • 10. University of Chicago: 20
  • 10. University of Illinois, Urbana-Champaign: 20
  • 10. University of Washington: 20
  • 14. Stanford University: 18
  • 15. University of California, San Diego: 17
  • 16. Pennsylvania State University: 16
  • 16. University of California, Santa Barbara: 16
  • 16. University of Pennsylvania: 16
  • 19. Brown University: 15
  • 19. Purdue University: 15
  • 19. Stony Brook University: 15
  • 22. University of California, Irvine: 14
  • 22. University of Maryland: 14
  • 22. University of Minnesota-Twin Cities: 14
  • 25. Duke University: 13
  • 25. Eidgenössische Technische Hochschule Zürich (ETH Zürich): 13
  • 27. Georgia Institute of Technology: 12
  • 27. Ohio State University, Columbus: 12
  • 29. CUNY Graduate Center: 11
  • 29. Harvard University: 11
  • 29. Northwestern University: 11
  • 29. Texas A&M University: 11
  • 29. University of Illinois at Chicago: 11
  • 34. Indiana University, Bloomington: 10
  • 34. Johns Hopkins University, Baltimore: 10
  • 34. University of British Columbia: 10
  • 34. University of Toronto: 10
  • 38. Rice University: 9
  • 38. University of North Carolina at Chapel Hill: 9
  • 40. California Institute of Technology: 8
  • 40. Institute for Advanced Study: 8
  • 40. University of Nebraska-Lincoln: 8
  • 40. University of Utah: 8
  • 40. Vanderbilt University: 8
  • 45. Boston University: 7
  • 45. Brandeis University: 7
  • 45. University of Oregon: 7
  • 45. University of Southern California: 7
  • 49. Carnegie Mellon University: 6
  • 49. Columbia University: 6
  • 49. Michigan State University: 6
  • 49. Tel Aviv University: 6
  • 49. The Hebrew University of Jerusalem: 6
  • 49. Université Pierre et Marie Curie (Paris VI): 6
  • 49. University of Arizona: 6
  • 49. University of Georgia: 6
  • 49. University of Virginia: 6
  • 58. Microsoft Research: 5
  • 58. North Carolina State University: 5
  • 58. Polytechnic Institute of New York University: 5
  • 58. University of California, Davis: 5
  • 58. University of Notre Dame: 5
  • 58. University of Oxford: 5
  • 58. Williams College: 5
  • 65. Ecole Polytechnique Fédérale de Lausanne (EPFL): 4
  • 65. Louisiana State University, Baton Rouge: 4
  • 65. Northeastern University: 4
  • 65. Université Paris-Diderot: 4
  • 65. University of California, Riverside: 4
  • 65. University of Colorado, Boulder: 4
  • 65. University of Rochester: 4
  • 65. University of Tennessee, Knoxville: 4
  • 65. Virginia Polytechnic Institute and State University: 4
  • 65. Yale University: 4
  • 75. Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences: 3
  • 75. Australian National University: 3
  • 75. Barnard College, Columbia University: 3
  • 75. Florida State University: 3
  • 75. Harvey Mudd College: 3
  • 75. Københavns Universitet: 3
  • 75. KTH Royal Institute of Technology: 3
  • 75. KU Leuven: 3
  • 75. Mathematical Institute, Oxford University: 3
  • 75. McGill University: 3
  • 75. Norwegian University of Science and Technology: 3
  • 75. Tsinghua University: 3
  • 75. Università degli Studi di Milano: 3
  • 75. Université Paris-Sud (Paris XI): 3
  • 75. University of Cambridge: 3
  • 75. University of Connecticut, Storrs: 3
  • 75. University of Melbourne: 3
  • 75. University of Warwick: 3
  • 75. Washington University: 3
  • 75. Weizmann Institute of Science: 3
  • 95. Académie des sciences, Institut de France: 2
  • 95. American Mathematical Society: 2
  • 95. Auburn University: 2
  • 95. Bar-Ilan University: 2
  • 95. Boston College: 2
  • 95. Brigham Young University: 2
  • 95. Case Western Reserve University: 2
  • 95. Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV): 2
  • 95. Dartmouth College: 2
  • 95. Emory University: 2
  • 95. IDA Center for Communications Research: 2
  • 95. Imperial College: 2
  • 95. Indian Institute of Technology: 2
  • 95. Indiana University-Purdue University Indianapolis: 2
  • 95. Institut des Hautes Etudes Scientifiques (IHES): 2
  • 95. Institute of Mathematics, University of Paderborn: 2
  • 95. Lund University: 2
  • 95. Oklahoma State University: 2
  • 95. Oregon State University: 2
  • 95. Rheinische Friedrich-Wilhelms-Universität Bonn: 2
  • 95. Rutgers The State University of New Jersey Newark: 2
  • 95. Sapienza – Università di Roma: 2
  • 95. Shanghai Jiao Tong University: 2
  • 95. Smith College: 2
  • 95. Steklov Institute of Mathematics of the Russian Academy of Sciences: 2
  • 95. Tata Institute of Fundamental Research: 2
  • 95. Technical University of Denmark: 2
  • 95. The Fields Institute: 2
  • 95. Tulane University: 2
  • 95. Universidad Nacional Autónoma de México: 2
  • 95. Universität Wien: 2
  • 95. Université de Genève: 2
  • 95. Université de Montréal: 2
  • 95. University of Bristol: 2
  • 95. University of Florida: 2
  • 95. University of Freiburg: 2
  • 95. University of Kansas: 2
  • 95. University of Kentucky: 2
  • 95. University of Leeds: 2
  • 95. University of Manchester: 2
  • 95. University of Memphis: 2
  • 95. University of Missouri-Columbia: 2
  • 95. University of New Hampshire: 2
  • 95. University of Oklahoma: 2
  • 95. University of Oslo: 2
  • 95. Vienna University of Technology: 2
  • 95. Wayne State University: 2
  • 95. Wesleyan University: 2
  • 143. Aarhus University: 1
  • 143. Åbo Akademi University: 1
  • 143. American University: 1
  • 143. Amgen: 1
  • 143. Amherst College: 1
  • 143. Arizona State University: 1
  • 143. Ateneo de Manila University: 1
  • 143. Austrian Academy of Sciences: 1
  • 143. B.I.Verkin Institute for Low Temperature Physics and Engineering: 1
  • 143. Baylor University: 1
  • 143. Beijing Normal University: 1
  • 143. Bielefeld University: 1
  • 143. Carleton University: 1
  • 143. Center for Communications Research, Princeton, New Jersey: 1
  • 143. Centre for Quantum Geometry of Moduli Spaces (QGM), Aarhus University: 1
  • 143. Centre of Mathematics for Applications, University of Oslo: 1
  • 143. Centrum Wiskunde & Informatica and University of Amsterdam: 1
  • 143. Chalmers University of Technology: 1
  • 143. Chennai Mathematical Institute: 1
  • 143. Chern Institute of Mathematics, Nankai University: 1
  • 143. Chinese Academy of Sciences: 1
  • 143. City University of Hong Kong: 1
  • 143. Cleveland State University: 1
  • 143. CMLA, École Normale Supérieure de Cachan: 1
  • 143. CNRS and Université Paris-Sud (Paris XII): 1
  • 143. CNRS, École Normale Supérieure de Lyon: 1
  • 143. CNRS, Institut de Mathématiques de Toulouse: 1
  • 143. Collège de France: 1
  • 143. Croatian Academy of Sciences and Arts: 1
  • 143. CUNY Lehman College: 1
  • 143. Ecole des hautes études en sciences sociales (EHESS): 1
  • 143. El Colegio Nacional: 1
  • 143. Eötvös Loránd University: 1
  • 143. Freie Universität Berlin: 1
  • 143. Grambling State University: 1
  • 143. Haverford College: 1
  • 143. Hillman University: 1
  • 143. Hong Kong University of Science and Technology: 1
  • 143. IBM Research: 1
  • 143. IMFUFA, Roskilde University: 1
  • 143. Institució Catalana de Recerca i Estudis Avançats (ICREA): 1
  • 143. Institut de Mecanique Celeste et de Calcul des Ephemerides (IMCCE): 1
  • 143. Institut mathématique de Jussieu: 1
  • 143. Institut Universitaire de France: 1
  • 143. Institute for Information Transmission Problems: 1
  • 143. Institute of Mathematical Sciences, The Chinese University of Hong Kong: 1
  • 143. Institute of Mathematics, Academia Sinica, ROC: 1
  • 143. Instituto de Matematica Pura e Aplicada (IMPA): 1
  • 143. Instituto de Matemáticas Universidad Nacional Autónoma de México: 1
  • 143. Instituto de Matematicas, Universidad de Talca: 1
  • 143. Jacobs University: 1
  • 143. Johannes Kepler University Linz : 1
  • 143. Kent State University, Kent: 1
  • 143. King Abdullah University of Science and Technology: 1
  • 143. King’s College London: 1
  • 143. Kobe University: 1
  • 143. Korea Institute for Advanced Study: 1
  • 143. Kyoto University: 1
  • 143. Langorigami.com: 1
  • 143. Lawrence Berkeley National Laboratory: 1
  • 143. Lehigh University: 1
  • 143. Loyola University of Chicago: 1
  • 143. Ludwig-Maximilians-Universität München (LMU München): 1
  • 143. Macalester College: 1
  • 143. Massey University: 1
  • 143. Math for America: 1
  • 143. Mathematical Institute, Leiden University: 1
  • 143. Mathematical Institute, Linkoeping University: 1
  • 143. Mathematics Institute, Freiburg University: 1
  • 143. Max Planck Institute for Gravitational Physics (Albert Einstein Institute): 1
  • 143. Max Planck Institute for Mathematics in the Sciences: 1
  • 143. McMaster University: 1
  • 143. Montana State University: 1
  • 143. Moravian College: 1
  • 143. Mount Holyoke College: 1
  • 143. Muenster University: 1
  • 143. Nagoya University: 1
  • 143. Nanyang Technological University: 1
  • 143. National Science Foundation: 1
  • 143. National Security Agency: 1
  • 143. National Taiwan University: 1
  • 143. National Tsing Hua University, Taiwan: 1
  • 143. New Jersey Institute of Technology: 1
  • 143. New Mexico State University, Las Cruces: 1
  • 143. Nihon University: 1
  • 143. Northern Illinois Univeresity: 1
  • 143. Northrop Grumman Corporation: 1
  • 143. Open University, U.K.: 1
  • 143. Osaka University: 1
  • 143. Philipps-Universität Marburg: 1
  • 143. Pitzer College: 1
  • 143. Pohang University of Science and Technology (POSTECH): 1
  • 143. Polish Academy of Sciences: 1
  • 143. Pomona College: 1
  • 143. Pontifícia Universidade Católica do Rio de Janeiro: 1
  • 143. Queen Mary, University of London: 1
  • 143. Queen’s University: 1
  • 143. Reed College: 1
  • 143. Rensselaer Polytechnic Institute: 1
  • 143. Research Institute for Mathematical Sciences, Kyoto University: 1
  • 143. Roskilde University: 1
  • 143. Ruprecht-Karls-Universität Heidelberg: 1
  • 143. Rutgers The State University of New Jersey Camden: 1
  • 143. Saint Petersburg State University: 1
  • 143. San Francisco State University: 1
  • 143. Seoul National University: 1
  • 143. Silesian University of Technology: 1
  • 143. Sobolev Institute of Mathematics of the Siberian Branch of the Russian Academy of Sciences: 1
  • 143. Society for Industrial and Applied Mathematics (SIAM): 1
  • 143. Spelman College: 1
  • 143. St Louis University: 1
  • 143. St Olaf College: 1
  • 143. St. Petersburg Department of Steklov Institute of Mathematics of Russian Academy of Sciences: 1
  • 143. Stevens Institute of Technology: 1
  • 143. SUNY, Maritime College: 1
  • 143. Technion – Israel Institute of Technology: 1
  • 143. Technische Universität Berlin: 1
  • 143. Technische Universität Darmstadt: 1
  • 143. The Abdus Salam International Centre for Theoretical Physics: 1
  • 143. The Chinese University of Hong Kong: 1
  • 143. The Institute for System Research of the Russian Academy of Sciences: 1
  • 143. The Institute of Mathematical Sciences, Chennai, India: 1
  • 143. The OEIS Foundation Incorporated: 1
  • 143. The University of British Columbia: 1
  • 143. The University of Liverpool: 1
  • 143. The University of Western Australia: 1
  • 143. Tomakomai National College of Technology: 1
  • 143. TU Dortmund University: 1
  • 143. Tufts University: 1
  • 143. Univeristy of Edinburgh: 1
  • 143. Universidad Autónoma de Madrid: 1
  • 143. Universidad de Concepción: 1
  • 143. Universidad de Valladolid: 1
  • 143. Universidad del País Vasco: 1
  • 143. Universität Bielefeld: 1
  • 143. Universität des Saarlandes: 1
  • 143. Universität Konstanz, Germany: 1
  • 143. Universitat Politecnica de Catalunya: 1
  • 143. Universität Regensburg: 1
  • 143. Universität Zürich: 1
  • 143. Université Bordeaux 1: 1
  • 143. Université de Caen: 1
  • 143. Université du Québec à Montréal: 1
  • 143. Université Paris-Nord (Paris XIII): 1
  • 143. Universiteit Gent: 1
  • 143. Universiteit Leiden: 1
  • 143. Universiteit Utrecht: 1
  • 143. Universiteit van Amsterdam: 1
  • 143. University at Buffalo: 1
  • 143. University College London: 1
  • 143. University of Aberdeen: 1
  • 143. University of Alabama at Birmingham: 1
  • 143. University of Alaska Fairbanks: 1
  • 143. University of Auckland: 1
  • 143. University of Basel: 1
  • 143. University of Central Florida: 1
  • 143. University of Cincinnati: 1
  • 143. University of Edinburgh: 1
  • 143. University of Hawaii at Manoa: 1
  • 143. University of Heidelberg: 1
  • 143. University of Helsinki: 1
  • 143. University of Houston: 1
  • 143. University of Liverpool: 1
  • 143. University of Massachusetts Boston: 1
  • 143. University of Massachusetts, Amherst: 1
  • 143. University of Miami: 1
  • 143. University of Michoacan: 1
  • 143. University of Minnesota Rochester: 1
  • 143. University of Minnesota-Duluth: 1
  • 143. University of New Mexico: 1
  • 143. University of New South Wales: 1
  • 143. University of Nice: 1
  • 143. University of North Carolina at Charlotte: 1
  • 143. University of Pittsburgh: 1
  • 143. University of San Francisco: 1
  • 143. University of South Carolina: 1
  • 143. University of Texas at Dallas: 1
  • 143. University of Texas at San Antonio: 1
  • 143. University of Tokyo: 1
  • 143. University of Toledo: 1
  • 143. University of Vermont: 1
  • 143. University of Victoria: 1
  • 143. University of Warsaw: 1
  • 143. University of Wisconsin, Milwaukee: 1
  • 143. Victoria University of Wellington: 1
  • 143. Wake Forest University: 1
  • 143. Waseda University: 1
  • 143. Western Washington University: 1
  • 143. Westfälische Wilhelms-Universität Münster: 1
  • 143. Wolfram Research: 1
  • 143. York University: 1
  • 334. Syracuse University: 0

Structure of the 3n+1 stopping time

Returning to the stopping time of the {3n+1} process (first part here):

3n+1 flow chart
3n+1 flow chart

Here is the plot of the stopping time embellished with a few logarithmic functions.

Order in chaos
Order in chaos

This structure is explained by looking at the number of {3x+1} operations that a number experiences before reaching {1}.

No {3x+1} operations. Such number are obviously of the form {2^m}, with stopping time {m}. These creates the points {(2^m,m)} which lie on the curve {y=\log_2 x}.

One {3x+1} operation. To find such numbers, follow their orbit backward: a series of multiplication by {2}, then {(x-1)/3} operation, then more multiplications by {2}. This leads to numbers of the form {2^n \dfrac{2^{m}-1}{3}} where {m} must be even in order for {2^m-1} to be divisible by {3}. The stopping time is {n+m+1}. Since {2^n \dfrac{2^{m}-1}{3} \approx \dfrac{2^{n+m}}{3}}, the corresponding points lie close to the curve {y=1+\log_2(3x)}. Also notice that unlike the preceding case, clusters appear: there may be multiple pairs {(n,m)} with even {m} and the same {n+m}. The larger the sum {n+2m} is, the more such pairs occur.

Two {3x+1} operations. Tracing the orbit backwards again, we find that these are numbers of the form

{2^p\dfrac{2^n \dfrac{2^{m}-1}{3} -1}{3}}

It is straightforward to work out the conditions on {(m,n)} which allow both divisions by {3} to proceed. They are: either {n} is odd and {m \equiv 4 \mod 6}, or {n} is even and {m\equiv 2 \mod 6}. In any event, the stopping time is {m+n+p+2} and the number itself is approximately {2^{m+n+p}/9}. On the above chart, these points lie near the curve {y=2+\log_2(9x)}. Clustering will be more prominent than in the previous case, because we now deal with triples {(n,m,p)} that will be nearby each other if {n+m+p} is the same.

k) {k} operations of {3x+1} kind. These yield the points near the curve {y=k+\log_2(3^k x)}, or, to put it another way, {y=k\log_2 6+\log_2(x)}. The plot above shows such curves for {k=0,1,\dots,11}.

Down with sines!

Suppose you have a reasonable continuous function {f} on some interval, say {f(x)=e^x} on {[-1,1]}, and you want to approximate it by a trigonometric polynomial. A straightforward approach is to write

\displaystyle    f(x) \approx \frac12 A_0+\sum_{n=1}^N (A_n\cos n \pi x +B_n \sin n \pi x)

where {A_n} and {B_n} are the Fourier coefficients:

\displaystyle    A_n= \int_{-1}^1 e^x \cos \pi n x \,dx = (-1)^n \frac{e-e^{-1}}{1+ \pi^2 n^2 }

\displaystyle    B_n= \int_{-1}^1 e^x \sin \pi n x \,dx = (-1)^{n-1} \frac{\pi n (e-e^{-1})}{1+ \pi^2 n^2 }

(Integration can be done with the standard Calculus torture device). With {N=4}, we get

\displaystyle    e^{x} \approx 1.175 -0.216\cos\pi x +0.679 \sin \pi x +0.058\cos 2\pi x -0.365 \sin 2\pi x -0.026\cos 3\pi x +0.247 \sin 3\pi x +0.015\cos 4\pi x   -0.186 \sin 4\pi x

which, frankly, is not a very good deal for the price.

Would not buy again
Would not buy again

Still using the standard Fourier expansion formulas, one can improve approximation by shifting the function to {[0,2]} and expanding it into the cosine Fourier series.

\displaystyle    f(x-1) \approx \frac12 A_0+\sum_{n=1}^N A_n\cos \frac{n \pi x}{2}

where

\displaystyle    A_n= \int_{0}^2 e^{x-1} \cos \frac{\pi n x}{2} \,dx = \begin{cases} \dfrac{e-e^{-1}}{1+ \pi^2 k^2 }  \quad & n=2k \\ {} \\   (-1)^k \dfrac{4(e+e^{-1})}{4+ \pi^2 (2k-1)^2 } \quad & n = 2k-1 \end{cases}

Then replace {x} with {x+1} to shift the interval back. With {N=4}, the partial sum is

\displaystyle    e^x \approx 1.175 -0.890\cos \frac{\pi(x+1)}{2} +0.216\cos \pi(x+1)-0.133\cos \frac{3\pi(x+1)}{2} +0.058\cos 2\pi (x+1)

which gives a much better approximation with fewer coefficients to calculate.

When the sines don't get in the way.
When the sines don’t get in the way

To see what is going on, one has to look beyond the interval on which {f} is defined. The first series actually approximates the periodic extension of {f}, which is discontinuous because the endpoint values are not equal:

Periodic extension of a continuous function may be discontinuous
Periodic extension of a continuous function may be discontinuous

Cosines, being even, approximate the symmetric periodic extension of {f}, which is continuous whenever {f} is.

Symmetric periodic extension preserves continuity
Symmetric periodic extension preserves continuity

Discontinuities hurt the quality of Fourier approximation more than the lack of smoothness does.

Just for laughs I included the pure sine approximation, also with {N=4}.

Down with sines
Down with sines

3 calculus 3 examples

The function {f(x,y)=\dfrac{xy}{x^2+y^2}} might be the world’s most popular example demonstrating that the existence of partial derivatives does not imply differentiability.

xy/(x^2+y^2)
xy/(x^2+y^2)

But in my opinion, it is somewhat extreme and potentially confusing, with discontinuity added to the mix. I prefer

\displaystyle  f(x,y)=\frac{xy}{\sqrt{x^2+y^2}}

pictured below.

xy/sqrt(x^2+y^2)
xy/sqrt(x^2+y^2)

This one is continuous. In fact, it is Lipschitz continuous because the first-order partials {f_x} and {f_y} are bounded. The restriction of {f} to the line {y=x} is {f(x,y)=x^2/\sqrt{2x^2} = |x|/\sqrt{2}}, which is a familiar single-variable example of a nondifferentiable function.

To unify the analysis of such examples, let {f(x,y)=xy\,g(x^2+y^2)}. Then

\displaystyle    f_x = y g+ 2x^2yg'

With {g(t)=t^{-1/2}}, where {t=x^2+y^2}, we get

\displaystyle    f_x = O(t^{1/2}) t^{-1/2} + O(t^{3/2})t^{-3/2} = O(1),\quad t\rightarrow 0

By symmetry, {f_y} is bounded as well.

My favorite example from this family is more subtle, with a deceptively smooth graph:

Looks like xy
Looks like xy

The formula is

\displaystyle    f(x,y)=xy\sqrt{-\log(x^2+y^2)}

Since {f} decays almost quadratically near the origin, it is differentiable at {(0,0)}. Indeed, the first order derivatives {f_x} and {f_y} are continuous, as one may observe using {g(t)=\sqrt{-\log t}} above.

And the second-order partials {f_{xx}} and {f_{yy}} are also continuous, if just barely. Indeed,

\displaystyle    f_{xx} = 6xy g'+ 4x^3yg''

Since the growth of {g} is sub-logarithmic, it follows that {g'(t)=o(t^{-1})} and {g''(t)=o(t^{-2})}. Hence,

\displaystyle    f_{xx} = O(t) o(t^{-1}) + O(t^{2}) o(t^{-2}) = o(1),\quad t\rightarrow 0

So, {f_{xx}(x,y)\rightarrow 0 = f_{xx}(0,0)} as {(x,y)\rightarrow (0,0)}. Even though the graph of {f_{xx}} looks quite similar to the first example in this post, this one is continuous. Can’t trust these plots.

Despite its appearance, f_{xx} is continuous
Despite its appearance, f_{xx} is continuous

By symmetry, {f_{yy}} is continuous as well.

But the mixed partial {f_{xy}} does not exist at {(0,0)}, and tends to {+\infty} as {(x,y)\rightarrow (0,0)}. The first claim is obvious once we notice that {f_x(0,y)= y\, g(y^2)} and {g} blows up at {0}. The second one follows from

\displaystyle    f_{xy} = g + 2(x^2+y^2) g' + 4x^2y^2 g''

where {g\rightarrow\infty} while the other two terms tend to zero, as in the estimate for {f_{xx}}. Here is the graph of {f_{xy}}.

Up, up and away
Up, up and away

This example is significant for the theory of partial differential equations, because it shows that a solution of the Poisson equation {f_{xx}+f_{yy} = h } with continuous {h} may fail to be in {C^2} (twice differentiable, with continuous derivatives). The expected gain of two derivatives does not materialize here.

The situation is rectified by upgrading the continuity condition to Hölder continuity. Then {f} indeed gains two derivatives: if {h\in C^\alpha} for some {\alpha\in (0,1)}, then {f\in C^{2,\alpha}}. In particular, the Hölder continuity of {f_{xx} } and {f_{yy} } implies the Hölder continuity of {f_{xy} }.

Integrate by parts twice and solve for the integral

The dreaded calculus torture device that works for exactly two integrals, {\int e^{ax}\sin bx\,dx} and {\int e^{ax}\cos bx\,dx}.

Actually, no. A version of it (with one integration by parts) works for {\int x^n\,dx}:

\displaystyle    \int x^n\,dx = x^n x - \int x\, d(x^n) = x^{n+1} - n \int x^n\,dx

hence (assuming {n\ne -1})

\displaystyle  \int x^n\,dx = \frac{x^{n+1}}{n+1} +C

Yes, this is more of a calculus joke. A more serious example comes from Fourier series.

The functions {\sin nx}, {n=1,2,\dots}, are orthogonal on {[0,\pi]}, in the sense that

\displaystyle    \int_0^\pi \sin nx \sin mx \,dx =0 , \quad m\ne n

This is usually proved using a trigonometric identity that converts the product to a sum. But the double integration by parts give a nicer proof, because no obscure identities are needed. No boundary terms will appear because the sines vanish at both endpoints:

\displaystyle    \int_0^\pi \sin nx \sin mx \,dx = \frac{n}{m} \int_0^\pi \cos nx \cos mx \,dx = \frac{n^2}{m^2} \int_0^\pi \sin nx \sin mx \,dx

All integrals here must vanish because {n^2/m^2\ne 1}. As a bonus, we get the orthogonality of cosines, {\int_0^\pi \cos nx \cos mx \,dx=0}, with no additional effort.

The double integration by parts is also a more conceptual proof, because it gets to the heart of the matter: eigenvectors of a symmetric matrix (operator) that correspond to different eigenvalues are orthogonal. The trigonometric form is incidental, the eigenfunction property is essential. Let’s try this one more time, for the mixed boundary value problem {f(a)=0}, {f'(b)=0}. Suppose that {f} and {g} satisfy the boundary conditions, {f''=\lambda f}, and {g''=\mu g}. Since {fg'} and {f'g} vanish at both endpoints, we can pass the primes easily:

\displaystyle    \int_a^b fg= \frac{1}{\mu}\int_a^b fg'' = -\frac{1}{\mu}\int_a^b f'g' = \frac{1}{\mu}\int_a^b f''g = \frac{\lambda}{\mu} \int_a^b fg

If {\lambda\ne \mu}, all integrals must vanish.

Asymptotics of 3n+1 stopping time

It is a well-known open problem whether the following process terminates for every positive integer:

3n+1 flow chart
3n+1 flow chart

Experiments suggest that it does, possibly with unintended side effects.

Since for any odd integer {n} the number {3n+1} is even, it is convenient to replace the step {n:=3n+1} with {n:=(3n+1)/2}, as show below:

Optimized flow
(3n+1)/2 flow chart

As a function of {n}, the stopping time has a nice patterned graph:

Stopping time
Stopping time

An odd integer {n} is of the form {4k+1} or {4k+3}. In the first case, {(3n+1)/2 = 6k+2} is even, while in the second case {(3n+1)/2 = 6k+5} is odd. So, if {n} is picked randomly from all odd integers of certain size, the probability of {(3n+1)/2} being even is {1/2}. Similarly, for an even {n}, the number {n/2} can be even or odd with probability {1/2}.

This leads to a stochastic model of the process:

Stochastic flow
Stochastic flow

The graph of stopping time in the stochastic model is, of course random. It looks nothing like the nice pattern of the deterministic process.

Stopping time, stochastic version
Stopping time, stochastic version

However, smoothing out both graphs by moving window of width {200} or so, we see the similarity:

Moving averages, deterministic and stochastic
Moving averages, deterministic and stochastic

The stochastic process is much easier to analyze. Focusing on the logarithm {\log x}, we see that it changes either by {\log(1/2)} or by approximately {\log (3/2)}. The expected value of the change is {\displaystyle \frac12\log (3/4)}. This suggests that we can expect the logarithm to drop down to {\log 1=0} in about {\displaystyle \frac{2}{\log (4/3)}\log x} steps. (Rigorous derivation requires more tools from probability theory, but is still routine.)

The curve {\displaystyle \frac{2}{\log (4/3)}\log x} fits the experimental data nicely. (The red curve, being randomly generated, is different from the one on the previous graph.)

Logarithmic growth
Logarithmic growth

For an in-depth investigation, see Lower bounds for the total stopping time of 3X+1 iterates by Applegate and Lagarias.

For the computations, I used Scilab. The function hail(n,m) calculates the stopping times up to given value of n, and takes moving average with window size m (which can be set to 1 for no averaging).

function hail(n,m)
    steps=zeros(1:n);
    steps(1)=0
    for i=2:n 
        k=i;
        s=0;
        while k>=i 
            if modulo(k,2)==0 then 
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        steps(i)=s+steps(k);
    end
    total = cumsum(steps) 
    for i=1:n-m
        average(i)=(total(i+m)-total(i))/m;
    end
    plot(average,'+');
endfunction 

As soon as the result of computations drops below the starting value, the number of remaining steps is fetched from the array that is already computed. This speeds up the process a bit.

The second function follows the stochastic model, for which the aforementioned optimization is not available. This is actually an interesting point: it is conceivable that the stochastic model would be more accurate if it also used the pre-computed stopping time once {x} drops below the starting value. This would change the distribution of stopping times, resulting in wider fluctuations after averaging.

function randomhail(n,m)
    rsteps=zeros(1:n);
    rsteps(1)=0
    for i=2:n 
        k=i;
        s=0;
        while k>1 
            if grand(1,1,"bin",1,1/2)==0 then 
                k=k/2; 
                s=s+1;
            else 
                k=(3*k+1)/2;
                s=s+1;
            end
        end
        rsteps(i)=s;
    end
    rtotal = cumsum(rsteps) 
    for i=1:n-m
        raverage(i)=(rtotal(i+m)-rtotal(i))/m;
    end
    plot(raverage,'r+');
endfunction

Fun with TI-83: billions upon billions of cosines

Okay, maybe not billions. But by taking cosines repeatedly, one can find the solution of the equation {\cos x = x} with high precision in under a minute.

Step 1: Enter any number, for example 0, and press Enter.
Step 2: Enter cos(Ans) and press Enter

Step 2
Step 2

Step 3: Keep pushing Enter. (Unfortunately, press-and-hold-to-repeat does not work on TI-83). This will repeatedly execute the command cos(Ans).

Step 3
Step 3

After a few iterations, the numbers begin to settle down:

Convergence
Convergence

and eventually stabilize at 0.7390851332

Limit
Limit

Explanation: the graph of cosine meets the line {y = x} at one point: this is a unique fixed point of the function {f(x)=\cos x}.

Fixed point of f(x)=cos x
Fixed point of f(x)=cos x

Since the derivative {f'(x)=-\sin x} at the fixed point is less than 1 in absolute value, the fixed point is attracting.

Now try the same with the equation {10 \cos x =x}.

Beginning
Beginning

This time, the numbers flat out refuse to converge:

Chaos!
Chaos!

Explanation: the graph of {f(x)=10\cos x} meets the line {y = x} at seven point: thus, this function has seven fixed points.

Seven fixed points
Seven fixed points

And it so happens that {|f'(x)|>1} at each of those fixed points. This makes them repelling. The sequence has nowhere to converge, because every candidate for the limit pushes it away. All that’s left to it is to jump chaotically around the interval {[-10,10]}. Here are the first 1024 terms, plotted with OpenOffice:

There is some system in this chaos
There is a pattern in this chaos…

Clearly, the distribution of the sequence is not uniform. I divided the interval {[-10,10]} into subintervals of length {0.05} and counted the number of terms falling into each.

Distribution of terms
Distribution of terms

What is going on here? Stay tuned.