Joint Distributions



Suppose $X$ and $Y$ are two random variables defined on the same outcome space. For example, in three tosses of a coin, $X$ could be the number of heads in the first two tosses and $Y$ the number of heads in the last two tosses.

We will use the notation $P(X = x, Y = y)$ for the probability that $X$ has the value $x$ and $Y$ has the value $y$. In our example,

$$ P(X = 1, Y = 2) = P(\text{THH}) = \frac{1}{8} $$

The joint distribution of $X$ and $Y$ consists of all the probabilities $P(X=x, Y=y)$ where $x$ ranges over all the possible values of $X$ and $y$ ranges over all the possible values of $Y$.
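One way to make this definition concrete is to enumerate the eight equally likely outcomes of three tosses and tally the pair $(X, Y)$ for each. The sketch below uses only the Python standard library, not the prob140 methods used in this section. The names `outcomes`, `x_and_y`, and `joint` are illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three tosses, e.g. ('T', 'H', 'H')
outcomes = list(product('HT', repeat=3))

def x_and_y(outcome):
    """Return (X, Y): heads in the first two tosses, heads in the last two."""
    x = outcome[:2].count('H')
    y = outcome[1:].count('H')
    return x, y

# Count the outcomes that produce each pair, then divide by the number of outcomes
counts = Counter(x_and_y(o) for o in outcomes)
joint = {(x, y): Fraction(counts[(x, y)], len(outcomes))
         for x in range(3) for y in range(3)}

for pair, p in sorted(joint.items()):
    print(pair, p)
```

Here `joint[(1, 2)]` is $\frac{1}{8}$, in agreement with the calculation for THH above.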

In our example, both $X$ and $Y$ have values in the range 0, 1, 2, and so there are nine pairs of values. We could use product to list them all, but the domain method extends to two variables and is simpler to use. It takes as its arguments the name of one variable, the range of that variable, the name of the other variable, and the range of that variable.

joint_table = Table().domain('X', np.arange(3), 'Y', np.arange(3))
X Y
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2

This display contains no probabilities yet, so let's put them in. For now, we will simply make an array of probabilities in the order in which the outcomes appear. Later we will see how to replace the array by a function that will compute the probability of each outcome.

# Probabilities in the same order as the rows of joint_table: (0, 0), (0, 1), ..., (2, 2)
probs = make_array(1/8, 1/8, 0, 1/8, 2/8, 1/8, 0, 1/8, 1/8)
joint_table = joint_table.probability(probs)
X Y Probability
0 0 0.125
0 1 0.125
0 2 0
1 0 0.125
1 1 0.25
1 2 0.125
2 0 0
2 1 0.125
2 2 0.125

This table displays the joint distribution. To check that this is indeed a distribution, we can add up all the probabilities. The sum is 1, as it should be for a distribution.
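The check can be written out in a few lines. This is a minimal sketch in plain Python that re-enters the nine probabilities from the table as a list. The names `probs` and `total` are only illustrative, and the sketch does not call any prob140 method.

```python
import math

# The nine joint probabilities, in the row order of the table above
probs = [1/8, 1/8, 0, 1/8, 2/8, 1/8, 0, 1/8, 1/8]

# A distribution needs non-negative probabilities that sum to 1
total = sum(probs)
all_non_negative = all(p >= 0 for p in probs)

print(total, all_non_negative)
assert math.isclose(total, 1.0)
```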


A Joint Distribution Table

Though the table above does display the joint distribution, it is more conventional and also more useful to display the same data in a different way.

The prob140 method toJoint converts the table above into a JointDistribution object that is displayed as a conventional joint distribution table for $X$ and $Y$.

joint_dist = joint_table.toJoint()
X=0 X=1 X=2
Y=2 0.000 0.125 0.125
Y=1 0.125 0.250 0.125
Y=0 0.125 0.125 0.000

This way of displaying the information makes it easier to understand the relation between the two variables, as we will soon see. For now, observe that each cell corresponds to a pair $(x, y)$, where $x$ is a value of $X$ and $y$ a value of $Y$. In the cell you see $P(X = x, Y = y)$, the probability of the pair $(x, y)$.

For example, the cell whose labels are X=1 and Y=0 contains the probability 0.125. That is because

$$ P(X = 1, Y = 0) = P(\text{HTT}) = \frac{1}{8} = 0.125 $$

You can check all the other cells in the same way.

The table shows that the most likely pair of values is $X = 1$ and $Y = 1$, with probability 0.25. Two outcomes make this happen: HTH and THT.

Finding Probabilities

The table contains complete information about $X$ and $Y$. To find the probability of any event determined by $X$ and $Y$, simply identify the cells that make the event happen, and add up their chances. This is the random variable version of the Fundamental Method of finding probabilities (see Section 2.4).

For example,

\begin{align*} P(X > Y ) &= P(X = 1, Y = 0) + P(X = 2, Y = 0) + P(X = 2 , Y = 1) \\ &= 0.125 + 0 + 0.125 \\ &= 0.25 \end{align*}
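The same "identify the cells and add" recipe can be sketched in plain Python. The dictionary `joint` below re-enters the table by hand, and `prob` is a hypothetical helper written for this illustration, not a prob140 method.

```python
# Joint distribution from the table, keyed by (x, y)
joint = {
    (0, 0): 0.125, (0, 1): 0.125, (0, 2): 0.0,
    (1, 0): 0.125, (1, 1): 0.25,  (1, 2): 0.125,
    (2, 0): 0.0,   (2, 1): 0.125, (2, 2): 0.125,
}

def prob(event):
    """Add the probabilities of all cells (x, y) for which event(x, y) is True."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

p_x_greater = prob(lambda x, y: x > y)
print(p_x_greater)  # 0.25
```

Any other event determined by $X$ and $Y$ can be handled by changing the condition, for example `prob(lambda x, y: x == y)` for $P(X = Y)$.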
