Independence, Revisited
In this section we will remind ourselves about what can happen to independence when parameters are randomized. First, let’s go over some basics.
Average Conditional Probabilities
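Let $X$ have density $f_X$ and let $A$ be an event. Then
$$P(A, X \in dx) ~ = ~ P(X \in dx)P(A \mid X = x) ~ \sim ~ f_X(x)dx \, P(A \mid X = x)$$
So
$$P(A) ~ = ~ \int_{-\infty}^{\infty} P(A \mid X = x) f_X(x)dx$$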
In more compact notation, $P(A) = E(P(A \mid X))$. This is an example of finding expectation by conditioning.
One Toss of a Random Coin
Let $X$ have any density on the unit interval $(0, 1)$. Think of the value of $X$ as the probability that a coin lands heads. Toss the coin once. Recall that our definition of “given $X=p$” means that
$$P(\text{coin lands heads} \mid X = p) ~ = ~ p$$
Let $X$ have density $f_X$. Then
$$P(\text{coin lands heads}) ~ = ~ \int_0^1 p \, f_X(p)dp ~ = ~ E(X)$$
Thus if $X$ is uniform on $(0, 1)$, then the chance that the coin lands heads is $1/2$. If $X$ has the beta $(r, s)$ distribution then the chance that the coin lands heads is $r/(r+s)$.
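As a numerical check of these two values, here is a minimal sketch that approximates $\int_0^1 p \, f_X(p)dp$ with a midpoint rule. The helper names `beta_density` and `chance_of_heads` and the choice $r = 2$, $s = 3$ are illustrative assumptions, not part of the text. The sketch assumes $r, s \ge 1$ so that the density stays bounded on $(0, 1)$.

```python
import math

def beta_density(p, r, s):
    """Density of the beta (r, s) distribution at p in (0, 1)."""
    const = math.gamma(r + s) / (math.gamma(r) * math.gamma(s))
    return const * p ** (r - 1) * (1 - p) ** (s - 1)

def chance_of_heads(density, n=100_000):
    """Midpoint-rule approximation to the integral of p * density(p) over (0, 1)."""
    h = 1 / n
    midpoints = ((i + 0.5) * h for i in range(n))
    return sum(p * density(p) for p in midpoints) * h

# Uniform prior: the density is 1 on (0, 1), so the answer should be near 1/2.
uniform_heads = chance_of_heads(lambda p: 1.0)

# Beta (2, 3) prior (illustrative choice): the answer should be near 2/(2+3).
beta_heads = chance_of_heads(lambda p: beta_density(p, 2, 3))

print(uniform_heads, beta_heads)
```

The two printed values should be close to $1/2$ and $2/5$ respectively, up to discretization error.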
Two Tosses of the Random Coin
Let $X$ be uniform on $(0, 1)$. Given $X = p$, toss a $p$-coin twice and observe the results of the tosses.
We have just observed that $P(\text{first toss is a head}) = 1/2$. The first toss behaves like the toss of a fair coin. The same calculation shows that the chance that the second toss is a head (based on no knowledge of the first toss) is also $1/2$.
Now let’s figure out the chance that both the tosses land heads. We know that $P(\text{both tosses are heads} \mid X = p) = p^2$. So
$$P(\text{both tosses are heads}) ~ = ~ \int_0^1 p^2 \cdot 1 \, dp ~ = ~ \frac{1}{3}$$
That’s greater than $1/4$ which is the chance of two heads given that you are tossing a fair coin twice. The results of the two tosses are not independent.
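The lack of independence can also be seen empirically. The following is a minimal simulation sketch under the setup above: draw $p$ uniformly on $(0, 1)$, then toss a $p$-coin twice. The function name `simulate_two_tosses`, the seed, and the number of trials are arbitrary choices made for illustration.

```python
import random

def simulate_two_tosses(n_trials, seed=0):
    """Return empirical P(first head), P(second head), P(both heads)."""
    rng = random.Random(seed)
    first = second = both = 0
    for _ in range(n_trials):
        p = rng.random()            # X = p, uniform on (0, 1)
        t1 = rng.random() < p       # first toss of the p-coin
        t2 = rng.random() < p       # second toss of the same p-coin
        first += t1
        second += t2
        both += t1 and t2
    return first / n_trials, second / n_trials, both / n_trials

p_first, p_second, p_both = simulate_two_tosses(200_000)
print(p_first, p_second, p_both, p_first * p_second)
```

Each marginal proportion should be near $1/2$, while the proportion of trials with two heads should be near $1/3$ rather than the product of the marginals, which is near $1/4$.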
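Let’s see what’s going on here. We know that
$$P(\text{first toss is a head}) ~ = ~ \frac{1}{2} ~~~~ \text{and} ~~~~ P(\text{both tosses are heads}) ~ = ~ \frac{1}{3}$$
Therefore
$$P(\text{second toss is a head} \mid \text{first toss is a head}) ~ = ~ \frac{1/3}{1/2} ~ = ~ \frac{2}{3} ~ > ~ \frac{1}{2} ~ = ~ P(\text{second toss is a head})$$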
Clearly, knowing that the first toss is a head tells us something about $X$, which is then reflected in the chance that the second toss is also a head.
To quantify this idea, we will find the posterior density of $X$ given that the first toss is a head. Let $A$ be the event that the first toss is heads. The posterior density given this event is proportional to the prior times the likelihood of $A$. Thus it can be calculated as
$$f_{X \vert A} (p) ~ \propto ~ 1 \cdot p ~ = ~ p, ~~~~ p \in (0, 1)$$
This posterior density of $X$ given that the first toss is a head is not uniform. It rises linearly and puts more of its mass on values near 1 than near 0.
This makes sense: given that the first toss is a head, we are more inclined to believe that the coin is biased towards heads than towards tails.
The normalizing constant is easy to find: $\int_0^1 p \, dp = 1/2$, so the posterior density given that the first toss is a head is $f_{X \vert A} (p) = 2p$ for $p \in (0, 1)$.
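To double-check our earlier calculation, we can find $P(\text{second toss is a head} \mid \text{first toss is a head})$ using this posterior density.
$$P(\text{second toss is a head} \mid \text{first toss is a head}) ~ = ~ \int_0^1 p \cdot 2p \, dp ~ = ~ \frac{2}{3}$$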
This is consistent with our earlier calculation.
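The "prior times likelihood, then normalize" recipe can be carried out numerically on a grid. The sketch below is one possible illustration, assuming a midpoint grid on $(0, 1)$. The variable names and grid size are hypothetical choices. It recovers the posterior $2p$ and the conditional chance $2/3$.

```python
n = 100_000
h = 1 / n
grid = [(i + 0.5) * h for i in range(n)]   # midpoints of n equal subintervals

prior = [1.0 for _ in grid]                # uniform density on (0, 1)
likelihood = grid                          # P(first toss is a head | X = p) = p

# Posterior is proportional to prior times likelihood.
unnormalized = [f * l for f, l in zip(prior, likelihood)]

# The normalizing constant is P(first toss is a head), close to 1/2.
total = sum(unnormalized) * h
posterior = [u / total for u in unnormalized]

# P(second head | first head) = integral of p * posterior(p) over (0, 1).
p_second_given_first = sum(p * f for p, f in zip(grid, posterior)) * h

print(total, p_second_given_first)
```

The same grid approach should work for any bounded prior density on $(0, 1)$ by changing the `prior` list.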