New Tricks for this Old Dog

Udacity is offering an introductory statistics course this summer, beginning June 25th.  I’ve enrolled, to see how the Big Boys do it.  This is going to put a lot of pressure on traditional universities–especially here in Texas, where we’re busily hammering out the $10,000 Bachelor’s degree.  I figure if I don’t get up with the leaders of the buffalo herd, I’m gonna get trampled or left behind.

Tip from Meep at the Conservative Commune.

Generating Irrational Probabilities

Over at Jan Nordgreen’s excellent Thnik Again! blog, frequent commenter Richard Sabey posed the question “I want to throw a fair coin and use the result to make some decision with probability 1/sqrt(2). How?”

This was initially a stumper; not a surprise since Richard is a pretty formidable mathematician who has more number theory in his left pinky than I do in my whole head.  Nevertheless, I chewed on this problem for a few days and set it aside until Jan decided to egg folks on by repeating the question.  While my solution (and Richard’s–he had one ready, of course) are posted at the link, I want to explain the Irrational Probability Algorithm here a bit more clearly.

The general problem is to generate a sequence of Bernoulli (binary, yes/no) events such that the long run probability (or proportion) of a YES event is some irrational number in the interval (0, 1), using only repeated tosses of a fair coin, which has a rational probability for heads or tails of 1/2.  The difficulty is that no irrational number can be expressed as ratio (that’s what a mathematician means by “irrational”), so there is no counting technique that will do the job with frequencies of heads or tails.

The solution comes from recognizing this inapplicability of counting methods.  Instead, the solution takes advantage of the fact that, while irrational numbers cannot be precisely expressed as ratios, they can be approximated by ratios to any desired degree of precision (“any” means as far as you want to go, a billion decimal places is certainly feasible).  So, express the irrational target proportion as a binary fraction (the binary point followed by a sequence of zeroes and ones) to the desired precision, and use it as the probability generator. Richard’s algorithm does this implicitly, using the most significant digit of a floating point representation that is repeatedly trimmed and doubled, mine uses the binary expansion explicitly as a series of digits.

Here are our two implementations:

Algorithm S:

 x <- irrational 
do { 
    x <- 2x 
    t <- cointoss() 
    if (t == TAILS) { 
       if (x < 1) 
          x <- x-1 
    } else { 
      if (x >=1) 
    } enddo 

Algorithm A:

digits <- binaryExpansion(irrational) 
digitpointer <- 0 
    digitpointer <- digitpointer + 1 
    t <- cointoss() 
until (t NOT= digits[digitpointer] ) 
if (t = HEADS) 

Algorithm S is limited by the number of digits available in the floating point representation of the irrational number, Algorithm A has a similar limitation in the number of digits in the (explicit) binary representation.  However, this isn’t as big a deal as it might first appear…

How fast does this thing run?  The basic loop in both algorithms is essentially checking to see if the sequence of coin tosses match the binary expansion; if so, it continues indefinitely.  The probability of a match at each step is just 1/2, so this is just a geometric series, with an expected length of 2.  The probability of requiring more than d steps is 2-d.  E.g. the chance of requiring more than 10 steps is less than 1 in a million (210 ≈ 103)