21.2. The Beta-Binomial Distribution#

As in the previous section, let $X$ have the beta $(r, s)$ prior, and given $X = p$ let $S_n$ be the number of heads in the first $n$ tosses of a $p$-coin.

All the calculations we carried out in the previous section were under the condition that $S_n = k$, but we never needed to find the probability of this event. It was part of the constant that made the posterior density of $X$ integrate to 1.

We can now find $P(S_n = k)$ by writing the posterior density in two ways:

  • By recalling that it is the beta $(r+k, s+n-k)$ density:

$$
f_{X \mid S_n = k}(p) ~ = ~ C(r+k, s+n-k)\,p^{r+k-1}(1-p)^{s+n-k-1}, ~~~~ 0 < p < 1
$$

  • By using Bayes’ Rule:

$$
f_{X \mid S_n = k}(p) ~ = ~ \frac{C(r, s)\,p^{r-1}(1-p)^{s-1} \cdot \binom{n}{k}p^k(1-p)^{n-k}}{P(S_n = k)}, ~~~~ 0 < p < 1
$$

Now equate constants:

$$
\frac{C(r, s)\binom{n}{k}}{P(S_n = k)} ~ = ~ C(r+k, s+n-k)
$$

21.2.1. Beta-Binomial Probabilities#

So for $k$ in the range 0 through $n$,

$$
P(S_n = k) ~ = ~ \binom{n}{k} \frac{C(r, s)}{C(r+k, s+n-k)}
$$

where $C(r, s)$ is the constant in the beta $(r, s)$ density, given by

$$
C(r, s) ~ = ~ \frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)}
$$

That’s not as awful as it looks. A better way to think of the formula is

$$
P(S_n = k) ~ = ~ \binom{n}{k} \cdot \frac{\text{constant in the prior beta}}{\text{constant in the posterior beta given } k \text{ heads in } n \text{ tosses}}
$$

This discrete distribution is called the beta-binomial distribution with parameters $r$, $s$, and $n$. It is the distribution of the number of heads in $n$ tosses of a coin that lands heads with a probability picked according to the beta $(r, s)$ distribution.
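The formula above is easy to compute directly. Here is a minimal sketch in Python (the helper names `C` and `beta_binomial_pmf` are ours, not from the text) that evaluates the beta-binomial pmf and confirms the probabilities sum to 1:

```python
from math import comb, gamma

def C(a, b):
    # Constant in the beta (a, b) density: Gamma(a+b) / (Gamma(a) Gamma(b))
    return gamma(a + b) / (gamma(a) * gamma(b))

def beta_binomial_pmf(k, n, r, s):
    # P(S_n = k) = binom(n, k) * C(r, s) / C(r+k, s+n-k)
    return comb(n, k) * C(r, s) / C(r + k, s + n - k)

# The probabilities over k = 0, 1, ..., n should sum to 1.
n, r, s = 10, 2, 3
total = sum(beta_binomial_pmf(k, n, r, s) for k in range(n + 1))
```

If SciPy is available, `scipy.stats.betabinom(n, r, s).pmf(k)` computes the same quantity and can be used as an independent check.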


One $(r, s)$ pair is particularly interesting: $r = s = 1$. That’s the case when $X$ has the uniform prior. The distribution of $S_n$ reduces to

$$
P(S_n = k) ~ = ~ \frac{n!}{k!(n-k)!} \cdot \frac{1!}{0!\,0!} \cdot \frac{k!\,(n-k)!}{(n+1)!} ~ = ~ \frac{1}{n+1}
$$

There’s no $k$ in the answer! The conclusion is that if you choose $p$ uniformly between 0 and 1 and toss a $p$-coin $n$ times, the distribution of the number of heads is uniform on $\{0, 1, 2, \ldots, n\}$.

If you choose $p$ uniformly between 0 and 1, then the conditional distribution of $S_n$ given that $p$ was the selected value is binomial $(n, p)$. But the unconditional distribution of $S_n$ is uniform.
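This uniform-prior case can be verified numerically; a quick sketch (variable and helper names are ours), checking that every value of $k$ gets probability $1/(n+1)$:

```python
from math import comb, gamma

def C(a, b):
    # Constant in the beta (a, b) density
    return gamma(a + b) / (gamma(a) * gamma(b))

# Uniform prior: r = s = 1. Each k = 0, ..., n should have probability 1/(n+1).
n = 7
probs = [comb(n, k) * C(1, 1) / C(1 + k, 1 + n - k) for k in range(n + 1)]
```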

21.2.2. Checking by Integration#

If you prefer, you can find the distribution of $S_n$ directly, by conditioning on $X$.

$$
\begin{align*}
P(S_n = k) ~ &= ~ \int_0^1 P(S_n = k \mid X = p) f_X(p)\, dp \\
&= ~ \int_0^1 \binom{n}{k} p^k (1-p)^{n-k}\, C(r, s)\, p^{r-1}(1-p)^{s-1}\, dp \\
&= ~ \binom{n}{k} C(r, s) \int_0^1 p^{r+k-1}(1-p)^{s+n-k-1}\, dp \\
&= ~ \binom{n}{k} C(r, s) \frac{1}{C(r+k, s+n-k)}
\end{align*}
$$
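The integration step can also be checked numerically; a sketch using a simple midpoint-rule sum (parameter values chosen arbitrarily for illustration; helper names are ours):

```python
from math import comb, gamma

def C(a, b):
    # Constant in the beta (a, b) density
    return gamma(a + b) / (gamma(a) * gamma(b))

n, k, r, s = 10, 4, 2, 3

def integrand(p):
    # P(S_n = k | X = p) * f_X(p)
    return comb(n, k) * p**k * (1 - p)**(n - k) * C(r, s) * p**(r - 1) * (1 - p)**(s - 1)

# Midpoint-rule approximation of the integral over (0, 1)
m = 100_000
numeric = sum(integrand((i + 0.5) / m) for i in range(m)) / m

# Closed form from the calculation above
closed_form = comb(n, k) * C(r, s) / C(r + k, s + n - k)
```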

21.2.3. Expectation#

Given $X = p$, the conditional distribution of $S_n$ is binomial $(n, p)$. Therefore

$$
E(S_n \mid X = p) ~ = ~ np
$$

or, equivalently,

$$
E(S_n \mid X) ~ = ~ nX
$$

By iteration,

$$
E(S_n) ~ = ~ E(nX) ~ = ~ nE(X) ~ = ~ n \cdot \frac{r}{r+s}
$$

The expected proportion of heads in $n$ tosses is

$$
E\Big(\frac{S_n}{n}\Big) ~ = ~ \frac{r}{r+s}
$$

which is the expectation of the prior distribution of $X$.
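To check the expectation formula, compute the mean of the beta-binomial pmf directly (illustrative parameter values; helper name `C` is ours):

```python
from math import comb, gamma

def C(a, b):
    # Constant in the beta (a, b) density
    return gamma(a + b) / (gamma(a) * gamma(b))

n, r, s = 12, 3, 5
pmf = [comb(n, k) * C(r, s) / C(r + k, s + n - k) for k in range(n + 1)]

# E(S_n) computed from the pmf; should equal n * r / (r + s) = 12 * 3/8 = 4.5
mean = sum(k * p for k, p in enumerate(pmf))
```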

In the next section we will examine the long run behavior of this random proportion.

21.2.4. Endnote#

The unconditional probability $P(S_n = k)$ appeared in the denominator of our calculation of the posterior density of $X$ given $S_n$. Because of the simplifications that result from using conjugate priors, we were able to calculate the denominator in a couple of different ways. But often the calculation can be intractable, especially in high-dimensional settings. Methods of dealing with this problem are covered in more advanced courses.