Over at Jan Nordgreen’s excellent Thnik Again! blog, frequent commenter Richard Sabey posed the question “I want to throw a fair coin and use the result to make some decision with probability 1/sqrt(2). How?”

This was initially a stumper; not a surprise since Richard is a pretty formidable mathematician who has more number theory in his left pinky than I do in my whole head. Nevertheless, I chewed on this problem for a few days and set it aside until Jan decided to egg folks on by repeating the question. While my solution (and Richard’s–he had one ready, of course) are posted at the link, I want to explain the Irrational Probability Algorithm here a bit more clearly.

The general problem is to generate a sequence of Bernoulli (binary, yes/no) events such that the long run probability (or proportion) of a YES event is some **irrational** number in the interval (0, 1), using only repeated tosses of a fair coin, which has a **rational** probability for heads or tails of 1/2. The difficulty is that no irrational number can be expressed as ratio (that’s what a mathematician means by “irrational”), so there is no counting technique that will do the job with frequencies of heads or tails.

The solution comes from recognizing this inapplicability of counting methods. Instead, the solution takes advantage of the fact that, while irrational numbers cannot be precisely expressed as ratios, they can be approximated by ratios to any desired degree of precision (“any” means as far as you want to go, a billion decimal places is certainly feasible). So, express the irrational target proportion as a binary fraction (the binary point followed by a sequence of zeroes and ones) to the desired precision, and use it as the probability generator. Richard’s algorithm does this implicitly, using the most significant digit of a floating point representation that is repeatedly trimmed and doubled, mine uses the binary expansion explicitly as a series of digits.

Here are our two implementations:

Algorithm S:

```
x <- irrational
do {
x <- 2x
t <- cointoss()
if (t == TAILS) {
if (x < 1)
return(NOEVENT)
else
x <- x-1
} else {
if (x >=1)
return(EVENT)
} enddo
```

Algorithm A:

```
digits <- binaryExpansion(irrational)
digitpointer <- 0
repeat
digitpointer <- digitpointer + 1
t <- cointoss()
until (t NOT= digits[digitpointer] )
if (t = HEADS)
return(NOEVENT)
else
return(EVENT)
```

Algorithm S is limited by the number of digits available in the floating point representation of the irrational number, Algorithm A has a similar limitation in the number of digits in the (explicit) binary representation. However, this isn’t as big a deal as it might first appear…

How fast does this thing run? The basic loop in both algorithms is essentially checking to see if the sequence of coin tosses match the binary expansion; if so, it continues indefinitely. The probability of a match at each step is just 1/2, so this is just a geometric series, with an expected length of 2. The probability of requiring more than *d* steps is *2*^{-d}. E.g. the chance of requiring more than 10 steps is less than 1 in a million (2^{10} ≈ 10^{3})

Indeed you have a method that approximates an irrational number. However, it can never equal the irrational. The sequence produced is bounded above by the irrational and the sequence has no greatest element.

True dat! The probability would only equal the irrational if an infinite number of random values were generated, some of which took an infinite amount of time, so they were precisely equal to the irrational number. However, those are set of measure zero, so the approximation is good enough.

I don’t think I agree – even with an infinite number of values. So, for example, if we consider the set of elements, S, strictly less than 1, then there is no element in S which is greater than all the others even though S is bounded above (for example by 1). For suppose that y were a greatest element . Then y would be an element of S and y < 1, and then we have y<1/2(y+1)<1, and so 1/2(y+1) is an element of S but is greater than the supposedly greatest element y. And we don't like contradictions do we.

See, for example, Wikipedia: Supremum ; least upper bound property with respect to Q (rational numbers).

Touche! We can never attain the infinite in finite time. You’ve provided the perfect example of why statisticians have a love-hate relationship with mathematicians. Two jokes below are pertinent. Bur first, Marcus’ argument, unwrapped for mere statisticians:

The “largest element” argument tells us that the two sets of random sequences (those larger than x and those smaller) DO INDEED have smallest and largest elements, respectively, above and below our target irrational number, 0 < X < 1. And therefore, there is an open interval (U-infimum, L-supremum) containing the target number, CONTAINING NO RANDOMLY GENERATED NUMBERS WHATSOEVER. And since this interval has non-zero width, the proportion of "hits" is not exactly X, but instead L/(L+U), where L and U indicate the number of random values falling into the Lower and Upper regions, respectively. Clearly, this will always be a rational number. Close to X, but still rational, so never, ever, exactly there. The fancy name for this is Convergence in Probability.

The size of the Unvisited Interval and the difference between X and its rational approximation are related to the length of the longest (observed) sequence used to generate a random variate. Straightfoward (but subtle) to calculate, and definitely a Good Homework Problem for my mathematical statistics students, all of whom need to pay closer attention in their Real Analysis courses.

JOKE 1: An engineer, a statistician, and a mathematician are traveling by rail through Colorado. The engineer looks out the window and exclaims "Look, the cattle here are black!" The statistician, well-accustomed to natural variation, chides him, saying "Some of the cattle here are black." Rolling his eyes, the mathematician corrects them both, "Here in Colorado, there exists at least one cow that is black, on one side."

JOKE 2: The freshman class at Some Institute of Technology is invited to a Welcoming Dance, and they dutifully attend. Being generally introverted, the students do not immediately mix to find dance partners, but line up along opposite walls of the hall, men on one side, women on the other. The Dean rushes to change this state of affairs by suggesting an Algorithmic Meeting. He will slowly count to ten, and on each count, the two groups will advance towards one another, far enough to cut the distance between lines in half. Before he can begin, an astute mathematics major objects "That's not going to work. The two lines will approach one each other, but never meet!" A rather more practical engineering major replies, "That may be true, but after 8 or 9 moves, we'll be CLOSE ENOUGH FOR ALL PRACTICAL PURPOSES."

Thank you for joke 2. Did you miss the punch line from joke 1 😉

Well, I didn’t highlight it. Did YOU miss it?

Bernoulli’s law of large numbers requires we first know the probability of the coin showing heads. It says nothing about the case where we are ignorant of the probability.

Also keep in mind the Law is true almost always, that is there exists sequences that do not converge to any particular number if any (true their measure is zero, but they do exist). Convergence in measure lacks Cauchy’s criteria , not knowing the initial probability leaves that criteria in the dark. The implication is that almost all subsequences converge but we can not prove they converge to the same number.

Even if I was told the probability is one over the square root of 2 I would still be in the dark with respect the Cauchy criteria because I can never be sure I wasn’t dealing with a bad sequence and even if the sequence was kosher a problem remains. I do not know what the final infinite digits look like. So who knows where I am heading?

I contend no real coin can ever have irrational probability.