Bilinearity in Matrix Notation
As a preliminary to regression, we will express bilinearity in a compact form using matrix notation. The results of this section are not new. They are simply restatements of familiar results about variances and covariances, using new notation and matrix representations.
Let X be a p×1 vector of predictor variables. We know that for an m×p matrix A and an m×1 vector b,
$$\mathrm{Var}(AX+b) = A\,\Sigma_X A^T.$$

The results below are special cases of this.
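The identity can be checked numerically. The sketch below uses NumPy with simulated data; the sample size, covariance matrix, and the particular A and b are arbitrary illustrative choices. Because the sample covariance is itself bilinear, the identity holds exactly (up to floating point) for sample covariances, not just in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n observations of a p = 3 predictor vector X.
# A is an arbitrary m x p matrix, b an arbitrary m-vector.
n, p, m = 100_000, 3, 2
X = rng.multivariate_normal(np.zeros(p),
                            [[2.0, 0.5, 0.1],
                             [0.5, 1.0, 0.3],
                             [0.1, 0.3, 1.5]],
                            size=n)
A = rng.standard_normal((m, p))
b = rng.standard_normal(m)

Sigma_X = np.cov(X, rowvar=False)          # sample covariance matrix of X
lhs = np.cov(X @ A.T + b, rowvar=False)    # sample covariance of AX + b
rhs = A @ Sigma_X @ A.T                    # A Sigma_X A^T

print(np.allclose(lhs, rhs))               # True
```

Note that adding b has no effect on the covariance, as the identity asserts.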
Linear Combinations
To define two generic linear combinations of elements of X, let
$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_p \\ c_1 & c_2 & \cdots & c_p \end{bmatrix} = \begin{bmatrix} a^T \\ c^T \end{bmatrix}
\qquad \text{and} \qquad
b = \begin{bmatrix} b \\ d \end{bmatrix}.$$

Then
$$AX+b = \begin{bmatrix} a_1X_1 + a_2X_2 + \cdots + a_pX_p + b \\ c_1X_1 + c_2X_2 + \cdots + c_pX_p + d \end{bmatrix} = \begin{bmatrix} a^TX + b \\ c^TX + d \end{bmatrix}.$$

Covariance of Two Linear Combinations
The covariance of the two linear combinations is the (1,2) element of the covariance matrix of $AX+b$, which is the (1,2) element of $A\Sigma_X A^T$.
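Writing $A$ in block form by its rows makes the location of these entries explicit:

$$A\,\Sigma_X A^T
= \begin{bmatrix} a^T \\ c^T \end{bmatrix}
  \Sigma_X
  \begin{bmatrix} a & c \end{bmatrix}
= \begin{bmatrix}
    a^T\Sigma_X a & a^T\Sigma_X c \\
    c^T\Sigma_X a & c^T\Sigma_X c
  \end{bmatrix}.$$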
$$\mathrm{Cov}(a^TX+b,\; c^TX+d) = a^T\Sigma_X c.$$

Variance of a Linear Combination
The variance of the first linear combination is the (1,1) element of $A\Sigma_X A^T$.
$$\mathrm{Var}(a^TX+b) = a^T\Sigma_X a.$$

Covariance Vector
To predict Y based on X we will need to work with the covariance of Y and each of the elements of X. Let
$$\sigma_{X_i,Y} = \mathrm{Cov}(X_i, Y)$$

and define the covariance vector of X and Y to be
$$\Sigma_{X,Y} = \begin{bmatrix} \sigma_{X_1,Y} \\ \sigma_{X_2,Y} \\ \vdots \\ \sigma_{X_p,Y} \end{bmatrix}.$$

It will be convenient to also have a notation for the transpose of the covariance vector:
$$\Sigma_{Y,X} = \Sigma_{X,Y}^T = \begin{bmatrix} \sigma_{X_1,Y} & \sigma_{X_2,Y} & \cdots & \sigma_{X_p,Y} \end{bmatrix}.$$

By the linearity of covariance,
$$\mathrm{Cov}(a^TX,\,Y) = \mathrm{Cov}\Big(\sum_{i=1}^p a_iX_i,\; Y\Big) = \sum_{i=1}^p a_i\,\mathrm{Cov}(X_i,Y) = a^T\Sigma_{X,Y}.$$
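The three special cases above can also be checked numerically. The NumPy sketch below uses simulated data; the sample size, coefficient vectors, and the particular linear model generating Y are arbitrary illustrative choices. As with the matrix identity, all three relations hold exactly for sample covariances because the sample covariance is bilinear.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n draws of a p = 3 predictor vector X (with correlated
# components), and a response Y that depends linearly on X plus noise.
n, p = 100_000, 3
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(n)

a = np.array([2.0, -1.0, 3.0])
c = np.array([0.5, 4.0, -2.0])
b, d = 7.0, -3.0                      # shifts; they do not affect (co)variances

Sigma_X = np.cov(X, rowvar=False)     # p x p covariance matrix of X
# Covariance vector Sigma_{X,Y}: Cov(X_i, Y) for each i
Sigma_XY = np.array([np.cov(X[:, i], Y)[0, 1] for i in range(p)])

# Cov(a^T X + b, c^T X + d) = a^T Sigma_X c
assert np.allclose(np.cov(X @ a + b, X @ c + d)[0, 1], a @ Sigma_X @ c)

# Var(a^T X + b) = a^T Sigma_X a
assert np.allclose(np.var(X @ a + b, ddof=1), a @ Sigma_X @ a)

# Cov(a^T X, Y) = a^T Sigma_{X,Y}
assert np.allclose(np.cov(X @ a, Y)[0, 1], a @ Sigma_XY)

print("all three identities hold")
```

The `ddof=1` in `np.var` matches the divisor `np.cov` uses by default, so the left and right sides agree exactly rather than only approximately.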