I just began the new Coursera MOOC on probability, given by UPenn Dr. Santosh S. Venkatesh, and at the very first lecture, my interest was piqued by this intriguing idea, called the Chevalier de Méré's Paradox.. If I ask you which is more likely:

- Getting a "6" four times in four throws of a die
- Getting a "double 6" twenty-four times in twenty-four throws of a pair of dice

You probably won't come up with actual numbers, but your intuition will align nicely with reality in telling you that while the probability associated with (1) is pretty small.. the probability associated with (2) is astronomically so:

- $P(all\ 6s) = P(6) ^ 4 = \frac{1}{6} ^ 4 \approx ~0.00077$
- $P(all\ \left\{6,6\right\}s) = P(\left\{6,6\right\})^{24} = \frac{1}{36} ^{24} \approx ~4.45 \times 10^{-38}$

However if I change the wording of the question slightly:

- Getting a "6"
*at least once*in four throws of a die - Getting a "double 6"
*at least once*in twenty-four throws of a pair of dice

You might be then tempted to answer that they are now both equally likely, by a reasoning which could be similar to this : there are a certain number of independent events and I'm asked about the probability of their *union*, therefore I must consider the sum of their probabilities, i.e.:

- $P(at\ least\ one\ 6) = P(6) \cdot 4 = \frac{1}{6} \cdot 4 = \frac{2}{3}$
- $P(at\ least\ one\ \left\{6,6\right\}) = P(\left\{6,6\right\}) \cdot 24 = \frac{1}{36} \cdot {24} = \frac{2}{3}$

But in that case of course your intuition would be completely wrong (just consider what would happen to the probability of getting a "6" in *six* throws of a die, under this reasoning). In the case of (1), the independent events that we should be counting are not the individual results, from 1 to 6, but rather, the different *configurations* in which the four throws could end up, which is quite different. Let's examine a simpler case with only two throws instead of four. The 36 different configurations are:

```
pd.DataFrame([(t1, t2, 'Yes' if 6 in {t1, t2} else '')
for t1 in range(1, 7)
for t2 in range(1, 7)],
columns=['Throw 1', 'Throw 2', 'Has 6'])
```

The third row highlights the configurations where a "6" occurs, and you'll notice that there are 11 of them (and not 12, as intuition would lead you to falsely believe), which entails that:

$ P(at\ least\ a\ 6\ in\ two\ throws) = \frac{11}{36} = 0.30\overline{5} $

We notice that this is a bit less than $\frac{1}{3}$ of course. By a similar reasoning, and by noticing that it's actually easier to count the configurations which we must reject (instead of those that are of interest), we arrive at the conclusion that in four throws of a die and twenty-four throws of a pair of dice:

- $P(at\ least\ one\ 6) = 1 - \frac{5^4}{6^4} \approx 0.512 $
- $P(at\ least\ one\ \left\{6,6\right\}) = 1 - \frac{35^{24}}{36^{24}} \approx 0.491 $

Finally, if we are still a bit skeptical, nothing beats a little simulation to bring intuition, math and reality together:

```
from random import randint
n = 100000
print(sum([1 for s in [{randint(1, 6)
for _ in range(4)}
for _ in range(n)] if 6 in s]) / n)
print(sum([1 for s in [{(randint(1, 6), randint(1, 6))
for _ in range(24)}
for _ in range(n)] if (6, 6) in s]) / n)
```

## No comments:

## Post a Comment

Note: Only a member of this blog may post a comment.