Sums of IID Samples


After the dry, algebraic discussion of the previous section, it is a relief to finally be able to compute some variances.

Let $X_1, X_2, \ldots, X_n$ be random variables with sum

$$S_n ~ = ~ \sum_{i=1}^n X_i$$

The variance of the sum is

$$Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i) ~ + ~ \mathop{\sum \sum}_{1 \le i \ne j \le n} Cov(X_i, X_j)$$

We say that the variance of the sum is the sum of all the variances and all the covariances.
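If you would like a numerical sanity check of this formula, here is a sketch (not part of the text; the trivariate normal distribution and its covariance matrix below are arbitrary illustrative choices). The variance of the simulated sum should match the sum of every entry of the empirical covariance matrix, since the diagonal entries are the variances and the off-diagonal entries are the covariances.

import numpy as np

rng = np.random.default_rng(0)

# Arbitrary covariance matrix for three correlated variables (illustration only)
cov = [[1.0, 0.5, 0.2],
       [0.5, 2.0, 0.3],
       [0.2, 0.3, 1.5]]
x = rng.multivariate_normal(mean=[0, 0, 0], cov=cov, size=100_000)

s = x.sum(axis=1)                  # X_1 + X_2 + X_3 in each replication
emp_cov = np.cov(x, rowvar=False)  # empirical variances and covariances

# Var(S_n) should equal the sum of all variances and all covariances,
# that is, the sum of every entry of the covariance matrix.
print(s.var(ddof=1), emp_cov.sum())

The two printed numbers agree up to simulation error.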

If $X_1, X_2, \ldots, X_n$ are independent, then all the covariance terms in the formula above are 0.

Therefore if $X_1, X_2, \ldots, X_n$ are independent then

$$Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i)$$

Thus for independent random variables $X_1, X_2, \ldots, X_n$, both the expectation and the variance add up nicely:

$$E(S_n) ~ = ~ \sum_{i=1}^n E(X_i) ~~~~~~~~~~ Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i)$$
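As a quick check of this additivity (again just a sketch; the three distributions and their parameters are arbitrary choices, not from the text), summing independent draws from different distributions gives a mean close to the sum of the means and a variance close to the sum of the variances.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Three independent variables with known means and variances (illustration only)
x1 = rng.exponential(scale=2.0, size=n)  # mean 2, variance 4
x2 = rng.uniform(0, 6, size=n)           # mean 3, variance 3
x3 = rng.poisson(lam=5, size=n)          # mean 5, variance 5

s = x1 + x2 + x3
print(s.mean())         # close to 2 + 3 + 5 = 10
print(s.var(ddof=1))    # close to 4 + 3 + 5 = 12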

When the random variables are i.i.d., this simplifies even further.

Sum of an IID Sample

Let $X_1, X_2, \ldots, X_n$ be i.i.d., each with mean $\mu$ and SD $\sigma$. You can think of $X_1, X_2, \ldots, X_n$ as draws at random with replacement from a population, or as the results of independent replications of the same experiment.

Let $S_n$ be the sample sum, as above. Then

$$E(S_n) ~ = ~ n\mu ~~~~~~~~~~ Var(S_n) ~ = ~ n\sigma^2 ~~~~~~~~~~ SD(S_n) ~ = ~ \sqrt{n}\sigma$$

This implies that as the sample size $n$ increases, the distribution of the sum $S_n$ shifts to the right (when $\mu$ is positive) and becomes more spread out: the center $n\mu$ grows like $n$, while the spread $\sqrt{n}\sigma$ grows only like $\sqrt{n}$.
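A small simulation sketch (the normal population and the particular values of $\mu$, $\sigma$, and $n$ below are illustrative assumptions, not from the text) confirms the formulas $E(S_n) = n\mu$ and $SD(S_n) = \sqrt{n}\sigma$.

import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 10, 2      # population mean and SD (arbitrary choices)
n = 100                # sample size
reps = 100_000         # number of simulated samples

# Each row is one i.i.d. sample of size n; each row sum is one value of S_n
sums = rng.normal(mu, sigma, size=(reps, n)).sum(axis=1)

print(sums.mean(), n * mu)                   # E(S_n) = n * mu = 1000
print(sums.std(ddof=1), np.sqrt(n) * sigma)  # SD(S_n) = sqrt(n) * sigma = 20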

Here is one of the most important applications of these results.

Variance of the Binomial

Let $X$ have the binomial $(n, p)$ distribution. We know that

$$X ~ = ~ I_1 + I_2 + \cdots + I_n$$

where $I_1, I_2, \ldots, I_n$ are i.i.d. indicators, each taking the value 1 with probability $p$. Each of these indicators has expectation $p$ and variance $pq = p(1-p)$. Therefore

$$E(X) ~ = ~ np ~~~~~~~~~~ Var(X) ~ = ~ npq ~~~~~~~~~~ SD(X) ~ = ~ \sqrt{npq}$$
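A short sketch (the choice $n = 100$, $p = 0.3$ is arbitrary, not from the text) makes the indicator decomposition concrete: simulating $X$ as a sum of independent indicators gives an empirical mean near $np$ and an empirical variance near $npq$, matching SciPy's binomial variance.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 100, 0.3                      # arbitrary parameters for illustration

# Each row holds n i.i.d. indicators; the row sum is one binomial (n, p) value
indicators = rng.random(size=(100_000, n)) < p
x = indicators.sum(axis=1)

print(x.mean(), n * p)                 # close to E(X) = np
print(x.var(ddof=1), n * p * (1 - p))  # close to Var(X) = npq
print(stats.binom.var(n, p))           # SciPy agrees: npq = 21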

For example, if $X$ is the number of heads in 100 tosses of a coin, then

$$E(X) ~ = ~ 100 \times 0.5 ~ = ~ 50 ~~~~~~~~~~ SD(X) ~ = ~ \sqrt{100 \times 0.5 \times 0.5} ~ = ~ 5$$

Here is the distribution of $X$. You can see that there is almost no probability outside the range $E(X) \pm 3SD(X)$.

from datascience import *
from prob140 import *
import numpy as np
from scipy import stats

k = np.arange(25, 75, 1)                      # values of X with visible probability
binom_probs = stats.binom.pmf(k, 100, 0.5)    # binomial (100, 0.5) probabilities
binom_dist = Table().values(k).probability(binom_probs)
Plot(binom_dist, show_ev=True, show_sd=True)  # mark E(X) and the SD on the plot
