13.3. Sums of Independent Variables#

After the dry, algebraic discussion of the previous section it is a relief to finally be able to compute some variances.


13.3.1. The Variance of a Sum#

Let $X_1, X_2, \ldots, X_n$ be random variables with sum

$$
S_n = \sum_{i=1}^n X_i
$$

The variance of the sum is

$$
\begin{align*}
\text{Var}(S_n) &= \text{Cov}(S_n, S_n) \\
&= \sum_{i=1}^n \sum_{j=1}^n \text{Cov}(X_i, X_j) ~~~~ \text{(bilinearity)} \\
&= \sum_{i=1}^n \text{Var}(X_i) + \mathop{\sum\sum}_{1 \le i \ne j \le n} \text{Cov}(X_i, X_j)
\end{align*}
$$

We say that the variance of the sum is the sum of all the variances and all the covariances.

  • The first sum has $n$ terms.

  • The second sum has $n(n-1)$ terms.

Since $\text{Cov}(X_i, X_j) = \text{Cov}(X_j, X_i)$, the second sum can be written as $2\mathop{\sum\sum}_{1 \le i < j \le n} \text{Cov}(X_i, X_j)$. But we will use the form given above.
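As a quick numerical check (the covariance matrix below is a made-up example, not from the text), the identity $\text{Var}(S_n) = \text{Cov}(S_n, S_n)$ says that the variance of the sum is the sum of all $n^2$ entries of the covariance matrix: the $n$ diagonal entries are the variances, and the $n(n-1)$ off-diagonal entries are the covariances.

```python
import numpy as np

# Hypothetical covariance matrix of (X_1, X_2, X_3); diagonal entries are variances
cov_matrix = np.array([[4.0, 1.0, 0.5],
                       [1.0, 9.0, 2.0],
                       [0.5, 2.0, 16.0]])

# Var(S_n) = Cov(S_n, S_n) = sum of all entries of the covariance matrix
var_sum = cov_matrix.sum()

# Split: the diagonal holds the n variances, the rest are the n(n-1) covariances
variances = np.trace(cov_matrix)
covariances = var_sum - variances

print(var_sum, variances, covariances)  # 36.0 = 29.0 + 7.0
```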

13.3.2. Sum of Independent Random Variables#

If $X_1, X_2, \ldots, X_n$ are independent, then all the covariance terms in the formula above are 0.

Therefore if $X_1, X_2, \ldots, X_n$ are independent, then

$$
\text{Var}(S_n) = \sum_{i=1}^n \text{Var}(X_i)
$$

Thus for independent random variables $X_1, X_2, \ldots, X_n$, both the expectation and the variance add up nicely:

$$
E(S_n) = \sum_{i=1}^n E(X_i), ~~~~~~ \text{Var}(S_n) = \sum_{i=1}^n \text{Var}(X_i)
$$
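A small simulation (with made-up independent variables, not from the text) confirms that the variances add when there are no covariance terms:

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 100_000

# Three independent variables with variances 1, 4, and 9
X1 = rng.normal(0, 1, reps)
X2 = rng.normal(0, 2, reps)
X3 = rng.normal(0, 3, reps)

S = X1 + X2 + X3

# Empirical variance of the sum should be close to 1 + 4 + 9 = 14
print(S.var())
```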

When the random variables are i.i.d., this simplifies even further.


13.3.3. Sum of an IID Sample#

Let $X_1, X_2, \ldots, X_n$ be i.i.d., each with mean $\mu$ and SD $\sigma$. You can think of $X_1, X_2, \ldots, X_n$ as draws at random with replacement from a population, or as the results of independent replications of the same experiment.

Let Sn be the sample sum, as above. Then

$$
E(S_n) = n\mu ~~~~~~ \text{Var}(S_n) = n\sigma^2 ~~~~~~ \text{SD}(S_n) = \sqrt{n}\sigma
$$

This implies that as the sample size $n$ increases, the distribution of the sum $S_n$ shifts to the right and becomes more spread out. The expectation grows linearly in $n$, but the SD grows more slowly, in proportion to $\sqrt{n}$.
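Here is a sketch of that behavior, using a small made-up population and draws at random with replacement:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population; mu and sigma are its mean and SD
population = np.array([1, 2, 2, 3, 5])
mu, sigma = population.mean(), population.std()

n = 100  # sample size
# 50,000 replications of the sample sum of n draws with replacement
sums = rng.choice(population, size=(50_000, n)).sum(axis=1)

print(sums.mean(), n * mu)             # empirical mean vs n*mu
print(sums.std(), np.sqrt(n) * sigma)  # empirical SD vs sqrt(n)*sigma
```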

Quick Check

Suppose the sizes of 100 random households are i.i.d. with expectation 2.5 and SD 1.9. Let S be the total number of people in all 100 households, that is, the sum of all the household sizes.

(a) Pick one of the following values for E(S): 25, 250, 2500

(b) Pick one of the following values for SD(S): 19, 190, 1900

Here is an important application of the formula for the variance of an i.i.d. sample sum.

13.3.4. Variance of the Binomial#

Let X have the binomial (n,p) distribution. We know that

$$
X = \sum_{j=1}^n I_j
$$

where $I_1, I_2, \ldots, I_n$ are i.i.d. indicators, each taking the value 1 with probability $p$. Each of these indicators has expectation $p$ and variance $pq = p(1-p)$. Therefore

$$
E(X) = np ~~~~~~ \text{Var}(X) = npq ~~~~~~ \text{SD}(X) = \sqrt{npq}
$$

For example, if X is the number of heads in 100 tosses of a coin, then

$$
E(X) = 100 \times 0.5 = 50 ~~~~~~ \text{SD}(X) = \sqrt{100 \times 0.5 \times 0.5} = 5
$$
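The indicator-based formulas can be cross-checked against scipy's binomial moments (a sanity check, not part of the text):

```python
import numpy as np
from scipy import stats

n, p = 100, 0.5
q = 1 - p

# scipy's binomial moments agree with np, npq, and sqrt(npq)
print(stats.binom.mean(n, p), n * p)              # both 50
print(stats.binom.var(n, p), n * p * q)           # both 25
print(stats.binom.std(n, p), np.sqrt(n * p * q))  # both 5
```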

Here is the distribution of $X$. You can see that there is almost no probability outside the range $E(X) \pm 3\text{SD}(X)$.

import numpy as np
from scipy import stats
from datascience import Table
from prob140 import Plot

k = np.arange(25, 75, 1)                    # possible values of X to display
binom_probs = stats.binom.pmf(k, 100, 0.5)  # binomial (100, 0.5) probabilities
binom_dist = Table().values(k).probabilities(binom_probs)
Plot(binom_dist, show_ev=True, show_sd=True)  # mark E(X) and SD(X) on the plot
(Plot: the binomial (100, 0.5) distribution, with the expectation and SD marked.)

Quick Check

A die is rolled 45 times. Find the expectation and standard deviation of the number of times the face with six spots appears.


13.3.5. Variance of the Poisson, Revisited#

We showed earlier that if $X$ has the Poisson $(\mu)$ distribution then $E(X) = \mu$, $\text{Var}(X) = \mu$, and $\text{SD}(X) = \sqrt{\mu}$. Now we have a way to understand the formula for the SD.

One way in which a Poisson (μ) distribution can arise is as an approximation to a binomial (n,p) distribution where n is large, p is small, and np=μ. The expectation of the binomial becomes the parameter of the approximating Poisson distribution, which is also the expectation of the Poisson.

Now let’s compare the standard deviations. The standard deviation of the binomial is

$$
\sqrt{npq} ~ \approx ~ \sqrt{np} ~~~~ \text{because the small } p \text{ implies } q \approx 1
$$

But $np = \mu$ in this setting, so the SD of the binomial is approximately $\sqrt{\mu}$. That's the SD of its approximating Poisson $(\mu)$ distribution.
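A quick numerical comparison (with made-up values of $n$ and $p$) shows how close the two SDs are:

```python
import numpy as np
from scipy import stats

# Hypothetical large n and small p, chosen so that np = mu = 5
n, p = 10_000, 0.0005
mu = n * p

binom_sd = stats.binom.std(n, p)    # sqrt(npq)
poisson_sd = stats.poisson.std(mu)  # sqrt(mu)

print(binom_sd, poisson_sd)  # nearly equal, since q is close to 1
```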