Random Vectors
A vector-valued random variable, or more simply a random vector, is a list of random variables defined on the same space. We will think of it as a column.
$$
X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}
$$

For ease of display, we will sometimes write $X = [X_1 ~ X_2 ~ \ldots ~ X_n]^T$ where $M^T$ is notation for the transpose of the matrix $M$.
The mean vector of $X$ is $\mu = [\mu_1 ~ \mu_2 ~ \ldots ~ \mu_n]^T$ where $\mu_i = E(X_i)$.
The covariance matrix of $X$ is the $n \times n$ matrix $\Sigma$ whose $(i, j)$ element is $\mathrm{Cov}(X_i, X_j)$.
The $i$th diagonal element of $\Sigma$ is the variance of $X_i$. The matrix is symmetric because of the symmetry of covariance.
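As a rough numerical illustration of these definitions, the sketch below estimates a mean vector and a covariance matrix from simulated draws using NumPy. The particular random vector (built from independent standard normals), the seed, and all variable names are assumptions made for the example, not part of the text above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility

# Hypothetical example: build X = [X1 X2 X3]^T from i.i.d. standard normals Z1, Z2, Z3.
# Each column of `x` is one draw of the random vector; each row holds one X_i.
num_draws = 100_000
z = rng.standard_normal((3, num_draws))
x = np.vstack([z[0], z[0] + z[1], z[1] - z[2]])

# Empirical mean vector: one entry per element of X
mu_hat = x.mean(axis=1)

# Empirical covariance matrix; np.cov treats each row as a variable by default
sigma_hat = np.cov(x)

print(mu_hat.round(2))
print(sigma_hat.round(2))

# The diagonal holds the variances, and the matrix is symmetric
print(np.allclose(np.diag(sigma_hat), x.var(axis=1, ddof=1)))
print(np.allclose(sigma_hat, sigma_hat.T))
```

For this particular construction the true mean vector is zero and the true covariance matrix works out by hand to have rows $[1 ~ 1 ~ 0]$, $[1 ~ 2 ~ 1]$, and $[0 ~ 1 ~ 2]$, so the printed estimates should be close to those values.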
Linear Transformation: Mean Vector
Let $A$ be an $m \times n$ numerical matrix and $b$ an $m \times 1$ numerical vector. Consider the $m \times 1$ random vector $Y = AX + b$. Then the $i$th element of $Y$ is

$$
Y_i = A_{i*}X + b(i)
$$

where $A_{i*}$ denotes the $i$th row of $A$ and $b(i)$ denotes the $i$th element of $b$. Written longhand,

$$
Y_i = a_{i1}X_1 + a_{i2}X_2 + \cdots + a_{in}X_n + b_i
$$

where $a_{ij}$ is the $(i, j)$ entry of $A$ and $b_i = b(i)$.
Thus $Y_i$ is a linear combination of the elements of $X$. Therefore by linearity of expectation,

$$
E(Y_i) = A_{i*}\mu + b(i)
$$

Let $\mu_Y$ be the mean vector of $Y$. Then by the calculation above,

$$
\mu_Y = A\mu + b
$$

Linear Transformation: Covariance Matrix
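$\mathrm{Cov}(Y_i, Y_j)$ can be calculated using bilinearity of covariance, after noting that adding the constants $b(i)$ and $b(j)$ does not change the covariance.

$$
\begin{aligned}
\mathrm{Cov}(Y_i, Y_j) &= \mathrm{Cov}(A_{i*}X, A_{j*}X) \\
&= \mathrm{Cov}\Big( \sum_{k=1}^n a_{ik}X_k, \sum_{l=1}^n a_{jl}X_l \Big) \\
&= \sum_{k=1}^n \sum_{l=1}^n a_{ik}a_{jl}\mathrm{Cov}(X_k, X_l) \\
&= \sum_{k=1}^n \sum_{l=1}^n a_{ik}\mathrm{Cov}(X_k, X_l)t_{lj} ~~~~ \text{where } t_{lj} = A^T(l, j)
\end{aligned}
$$

This is the $(i, j)$ element of $A\Sigma A^T$. So if $\Sigma_Y$ denotes the covariance matrix of $Y$, then

$$
\Sigma_Y = A\Sigma A^T
$$

Constraints on Σ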
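The argument in this section rests on the formula $\Sigma_Y = A\Sigma A^T$, so a quick numerical sanity check of it, together with $\mu_Y = A\mu + b$, may be useful first. The sketch below uses NumPy; the matrix `A`, the vector `b`, the seed, and the simulated distribution of $X$ are made-up values for illustration. It relies on the fact that empirical means and covariances obey the same two identities as the true ones, so the checks should hold up to floating-point rounding rather than only approximately.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

n, m, num_samples = 3, 2, 5000

# Columns of `x` are draws of a hypothetical X with correlated entries
z = rng.standard_normal((n, num_samples))
x = np.vstack([z[0], z[0] + z[1], z[1] - 2 * z[2]])

# Illustrative (assumed) transformation Y = AX + b
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])   # m x n
b = np.array([[5.0], [-1.0]])      # m x 1

# b is broadcast across columns, so every draw of X is mapped to AX + b
y = A @ x + b

# Empirical mean vectors (kept as columns) and covariance matrices
mu_x = x.mean(axis=1, keepdims=True)
sigma_x = np.cov(x)
mu_y = y.mean(axis=1, keepdims=True)
sigma_y = np.cov(y)

# Compare the directly computed moments of Y with the two formulas
mean_ok = np.allclose(mu_y, A @ mu_x + b)
cov_ok = np.allclose(sigma_y, A @ sigma_x @ A.T)
print(mean_ok, cov_ok)
```

Note that `b` shifts the mean of $Y$ but does not appear in the covariance check at all, which matches the formula for $\Sigma_Y$.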
We know that $\Sigma$ has to be symmetric and that all the elements on its main diagonal must be non-negative. Also, no matter what $A$ is, the diagonal elements of $\Sigma_Y$ must all be non-negative as they are the variances of the elements of $Y$. By the formula for $\Sigma_Y$, the $i$th diagonal element is $A_{i*}\Sigma A_{i*}^T$, and $A_{i*}$ can be any $1 \times n$ vector, so this means

$$
a\Sigma a^T \ge 0 ~~ \text{for all } 1 \times n \text{ vectors } a
$$

which is the same as saying

$$
a^T\Sigma a \ge 0 ~~ \text{for all } n \times 1 \text{ vectors } a
$$

because $a\Sigma a^T$ is a scalar and therefore the same as its transpose.
That is, $\Sigma$ must be positive semidefinite. Usually, we will be working with positive definite covariance matrices, because if $a^T\Sigma a = 0$ for some nonzero $a$ then $Var(a^TX) = 0$, so the linear combination $a^TX$ of the elements of $X$ is constant. Hence you can write some of the elements as linear combinations of the others (plus a constant) and just study a reduced set of elements.
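The distinction between positive definite and merely positive semidefinite can be seen in a small NumPy sketch. The covariance matrix `sigma` below is an assumed example (it is the covariance matrix of $[Z_1 ~~ Z_1 + Z_2 ~~ Z_2 - Z_3]^T$ for uncorrelated $Z_1, Z_2, Z_3$ with variance 1, worked out by hand), and the matrix `B` is a hypothetical transformation that introduces a redundant element.

```python
import numpy as np

# Assumed example covariance matrix (see the lead-in for where it comes from)
sigma = np.array([[1.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])

# A symmetric matrix is positive definite exactly when all its eigenvalues are > 0
eigenvalues = np.linalg.eigvalsh(sigma)
print(eigenvalues.min() > 0)

# Spot check: a^T Sigma a >= 0 for many random vectors a (one per row)
rng = np.random.default_rng(2)  # arbitrary seed
random_as = rng.standard_normal((1000, 3))
quad_forms = np.einsum("ki,ij,kj->k", random_as, sigma, random_as)
print(np.all(quad_forms >= 0))

# Degenerate case: W = [X1, X2, X1 + X2]^T = B X, so Sigma_W = B Sigma B^T
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])
sigma_w = B @ sigma @ B.T

# a = [1, 1, -1]^T picks out W1 + W2 - W3, which is identically 0
a = np.array([1.0, 1.0, -1.0])
print(a @ sigma_w @ a)                    # quadratic form is zero
print(np.linalg.eigvalsh(sigma_w).min())  # smallest eigenvalue is zero up to rounding
```

Here `sigma_w` is positive semidefinite but not positive definite: the third element of $W$ is the sum of the first two, so it can be dropped and the remaining two elements studied on their own.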