Sums of IID Samples
After the dry, algebraic discussion of the previous section it is a relief to finally be able to compute some variances.
Let $X_1, X_2, \ldots , X_n$ be random variables with sum
$$
S_n = \sum_{i=1}^n X_i
$$
The variance of the sum is
$$
Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i) ~ + ~ \mathop{\sum \sum}_{1 \le i \ne j \le n} Cov(X_i, X_j)
$$
We say that the variance of the sum is the sum of all the variances and all the covariances.
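To make this concrete, here is a small numerical sketch; the covariance matrix and the simulation below are illustrative choices, not part of the text. For jointly distributed variables with a known covariance matrix, the empirical variance of their sum should match the sum of all the variances and all the covariances.

import numpy as np

# Illustrative check (hypothetical covariance matrix, not from the text):
# Var(S_n) should equal the sum of the variances plus the sum of the covariances.
cov = np.array([[ 4.0, 1.0, -0.5],
                [ 1.0, 9.0,  2.0],
                [-0.5, 2.0,  1.0]])
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=1_000_000)
empirical = draws.sum(axis=1).var()                           # empirical variance of the sum
theoretical = np.trace(cov) + (cov.sum() - np.trace(cov))     # variances + covariances
print(empirical, theoretical)                                 # both are about 19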
If $X_1, X_2 \ldots , X_n$ are independent, then all the covariance terms in the formula above are 0.
Therefore if $X_1, X_2, \ldots, X_n$ are independent, then
$$
Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i)
$$
Thus for independent random variables $X_1, X_2, \ldots, X_n$, both the expectation and the variance add up nicely:
$$
E(S_n) ~ = ~ \sum_{i=1}^n E(X_i) ~~~~~~ Var(S_n) ~ = ~ \sum_{i=1}^n Var(X_i)
$$
When the random variables are i.i.d., this simplifies even further.
Sum of an IID Sample
Let $X_1, X_2, \ldots, X_n$ be i.i.d., each with mean $\mu$ and SD $\sigma$. You can think of $X_1, X_2, \ldots, X_n$ as draws at random with replacement from a population, or the results of independent replications of the same experiment.
Let $S_n$ be the sample sum, as above. Then
$$
E(S_n) ~ = ~ n\mu ~~~~~~ Var(S_n) ~ = ~ n\sigma^2 ~~~~~~ SD(S_n) ~ = ~ \sqrt{n}\sigma
$$
This implies that as the sample size $n$ increases, the distribution of the sum $S_n$ shifts to the right and becomes more spread out: the center grows proportionately to $n$ while the spread grows proportionately to $\sqrt{n}$.
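As a rough numerical illustration (the exponential distribution and the sample sizes below are arbitrary choices, not from the text), simulation agrees with these formulas:

import numpy as np

# Illustrative sketch: exponential(1) draws have mu = 1 and sigma = 1, so the sum of
# n = 100 i.i.d. draws should have expectation n*mu = 100 and SD sqrt(n)*sigma = 10.
rng = np.random.default_rng(0)
n, reps = 100, 100_000
sums = rng.exponential(scale=1.0, size=(reps, n)).sum(axis=1)
print(sums.mean(), sums.std())   # close to 100 and 10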
Here is one of the most important applications of these results.
Variance of the Binomial
Let $X$ have the binomial $(n, p)$ distribution. We know that
$$
X ~ = ~ \sum_{i=1}^n I_i
$$
where $I_1, I_2, \ldots, I_n$ are i.i.d. indicators, each taking the value 1 with probability $p$. Each of these indicators has expectation $p$ and variance $pq = p(1-p)$. Therefore
$$
E(X) ~ = ~ np ~~~~~~ Var(X) ~ = ~ npq ~~~~~~ SD(X) ~ = ~ \sqrt{npq}
$$
For example, if $X$ is the number of heads in 100 tosses of a coin, then
$$
E(X) ~ = ~ 100 \times 0.5 ~ = ~ 50 ~~~~~~ SD(X) ~ = ~ \sqrt{100 \times 0.5 \times 0.5} ~ = ~ 5
$$
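As a quick check, the same two numbers can be computed directly from scipy.stats:

from scipy import stats

# binomial (100, 0.5): E(X) = np = 50 and SD(X) = sqrt(npq) = 5
print(stats.binom.mean(100, 0.5), stats.binom.std(100, 0.5))   # 50.0  5.0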
Here is the distribution of $X$. You can see that there is almost no probability outside the range $E(X) \pm 3SD(X)$.
import numpy as np
from scipy import stats
from datascience import *
from prob140 import *

k = np.arange(25, 75, 1)                                   # possible values shown in the plot
binom_probs = stats.binom.pmf(k, 100, 0.5)                 # binomial (100, 0.5) probabilities
binom_dist = Table().values(k).probability(binom_probs)    # distribution table
Plot(binom_dist, show_ev=True, show_sd=True)               # mark E(X) and E(X) +/- SD(X)
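To back up the visual impression with a number, here is a short additional check (not in the original cell): the probability that $X$ falls in the range $E(X) \pm 3SD(X)$, that is, between 35 and 65 heads inclusive.

# P(35 <= X <= 65) for X binomial (100, 0.5); the value is very close to 1
stats.binom.cdf(65, 100, 0.5) - stats.binom.cdf(34, 100, 0.5)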