## Randomizing a Parameter


In an earlier chapter we saw that Poissonizing the number of i.i.d. Bernoulli trials has a remarkable effect on the relation between the number of successes and the number of failures. In other situations too, randomizing the parameter of a standard model can affect supposedly well-understood relations between random variables.

In this section we will study one simple example of how randomizing a parameter affects dependence and independence.

### Tossing a Random Coin

Suppose I have three coins. Coin 1 lands heads with chance 0.25, Coin 2 with chance 0.5, and Coin 3 with chance 0.75. I pick a coin at random and toss it twice. Let's define some notation:

• $X$ is the label of the coin that I pick.
• $Y$ is the number of heads in the two tosses.

Then $X$ is uniform on $\{1, 2, 3\}$, and given $X$, the conditional distribution of $Y$ is binomial with $n=2$ and $p$ corresponding to the given coin. Here is the joint distribution table for $X$ and $Y$, along with the marginal of $X$.

import numpy as np
from scipy import stats
from datascience import make_array
from prob140 import Table

x = make_array(1, 2, 3)
y = np.arange(3)

def jt(x, y):
    # joint probability P(X = x, Y = y): a uniform choice of coin,
    # then a binomial (2, p) number of heads for that coin
    if x == 1:
        return (1/3)*stats.binom.pmf(y, 2, 0.25)
    if x == 2:
        return (1/3)*stats.binom.pmf(y, 2, 0.5)
    if x == 3:
        return (1/3)*stats.binom.pmf(y, 2, 0.75)

dist_tbl = Table().values('X', x, 'Y', y).probability_function(jt)
dist = dist_tbl.toJoint()
dist.marginal('X')

|                    | X=1      | X=2      | X=3      |
|--------------------|----------|----------|----------|
| Y=2                | 0.020833 | 0.083333 | 0.187500 |
| Y=1                | 0.125000 | 0.166667 | 0.125000 |
| Y=0                | 0.187500 | 0.083333 | 0.020833 |
| Sum: Marginal of X | 0.333333 | 0.333333 | 0.333333 |
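If the prob140 library isn't at hand, the same joint table can be rebuilt with numpy and scipy alone. This is a sketch for checking the values, not the method used in the text.

```python
import numpy as np
from scipy import stats

# Sanity check, independent of prob140: rows are y = 0, 1, 2
# (the display above lists Y = 2 first); columns are Coins 1, 2, 3.
p_heads = np.array([0.25, 0.5, 0.75])   # chance of heads for each coin
y_vals = np.arange(3)

# P(X = x, Y = y) = (1/3) * P(Y = y | X = x), with Y | X binomial (2, p)
joint = (1/3) * stats.binom.pmf(y_vals[:, None], 2, p_heads)

print(np.round(joint, 6))
print(joint.sum(axis=0))   # marginal of X: 1/3 for each coin
```

The column sums recover the uniform marginal of $X$, and the full table sums to 1.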

And here is the posterior distribution of $X$ given each different value of $Y$:

dist.conditional_dist('X', 'Y')

|                     | X=1      | X=2      | X=3      | Sum |
|---------------------|----------|----------|----------|-----|
| Dist. of X \| Y=2   | 0.071429 | 0.285714 | 0.642857 | 1.0 |
| Dist. of X \| Y=1   | 0.300000 | 0.400000 | 0.300000 | 1.0 |
| Dist. of X \| Y=0   | 0.642857 | 0.285714 | 0.071429 | 1.0 |
| Marginal of X       | 0.333333 | 0.333333 | 0.333333 | 1.0 |
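As a check on the prob140 output, the posterior table can also be computed directly by Bayes' rule: divide each row of the joint table by the corresponding marginal probability of $Y$. A minimal sketch:

```python
import numpy as np
from scipy import stats

# Bayes' rule by hand: joint table (rows y = 0, 1, 2; columns Coins 1, 2, 3)
p_heads = np.array([0.25, 0.5, 0.75])
y_vals = np.arange(3)
joint = (1/3) * stats.binom.pmf(y_vals[:, None], 2, p_heads)

marginal_y = joint.sum(axis=1, keepdims=True)   # P(Y = y), one value per row
posterior = joint / marginal_y                  # P(X = x | Y = y); rows sum to 1

print(np.round(posterior, 6))
```

The row for $y = 2$ comes out as 0.071429, 0.285714, 0.642857, matching the table above.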

As we have seen in earlier examples, when the given number of heads is low, the posterior distribution favors the coin that is biased towards tails. When the given number of heads is high, it favors the coin that is biased towards heads.

### Are the Two Tosses Independent?

We have always assumed that tosses of a coin are independent of each other. But hidden within that assumption was another, unspoken one: we knew which coin we were tossing. That is, the chance of heads $p$ was a fixed number. Now we don't know which coin we are tossing, so we have to be careful.

Let $H_i$ be the event that Toss $i$ lands heads. Then

$$P(H_1) = \frac{1}{3}\cdot 0.25 ~+~ \frac{1}{3}\cdot 0.5 ~+~ \frac{1}{3}\cdot 0.75 ~=~ 0.5 ~=~ P(H_2)$$

So each toss is equally likely to land heads or tails. Now let's find $P(H_1H_2)$. If the two tosses were independent, the answer would be $0.5^2 = 0.25$.

$$P(H_1H_2) = \frac{1}{3}\cdot 0.25^2 ~+~ \frac{1}{3}\cdot 0.5^2 ~+~ \frac{1}{3}\cdot 0.75^2 ~=~ 0.2917 ~ \ne P(H_1)P(H_2)$$
(1/3)*(0.25**2 + 0.5**2 + 0.75**2)

0.29166666666666663

The two tosses are not independent. Because the coin itself is random, knowing the result of Toss 1 tells you something about which coin was picked, and hence affects the probability that Toss 2 lands heads.

$$P(H_2 \mid H_1) = \frac{P(H_1H_2)}{P(H_1)} = \frac{0.2917}{0.5} = 0.5834 > 0.5 = P(H_2)$$
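The division-rule calculation above can be reproduced in a couple of lines:

```python
# P(H_1) and P(H_1 H_2) by conditioning on the coin, then the division rule
p_h1 = (1/3)*(0.25 + 0.5 + 0.75)             # = 0.5
p_h1h2 = (1/3)*(0.25**2 + 0.5**2 + 0.75**2)  # about 0.2917
p_h2_given_h1 = p_h1h2 / p_h1                # about 0.5833, bigger than 0.5
```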

Knowing that the first toss landed heads makes it more likely that Coin 3 was picked, and hence increases the conditional chance that the second toss will be a head.

This example shows that you have to be careful about how data can affect probabilities. To make justifiable conclusions based on your data, keep assumptions in mind when you calculate probabilities, and use the division rule to update probabilities as more data comes in.
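The dependence can also be seen empirically. Here is a quick simulation (a sketch; the seed and sample size are arbitrary choices) that estimates $P(H_2 \mid H_1)$ as a long-run proportion:

```python
import numpy as np

# Simulate the experiment: pick a coin at random, toss it twice
rng = np.random.default_rng(0)
p_heads = np.array([0.25, 0.5, 0.75])

n = 1_000_000
coins = rng.integers(0, 3, size=n)   # which coin was picked, each trial
p = p_heads[coins]
toss1 = rng.random(n) < p            # True means heads
toss2 = rng.random(n) < p

# proportion of heads on Toss 2 among trials where Toss 1 was heads
print(toss2[toss1].mean())           # close to 0.5833, not 0.5
```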