4.1. Joint Distributions
Suppose
The joint distribution of
4.1.1. Example
In three tosses of a coin, let
All the other probabilities are
The constraints on
4.1.2. Joint Distribution Table
The prob140 library contains a method for displaying the joint distribution of two random variables. As a first step, you need the possible values of each of the two variables. In our example, both X and Y take the values 0, 1, and 2.
import numpy as np
k = np.arange(3)   # possible values 0, 1, 2
Now let’s define a function that takes x and y as arguments and returns the joint probability P(X = x, Y = y).
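def joint_probability(x, y):
    # Use `and`, not `&`: the bitwise operator binds tighter than ==,
    # so `x == 1 & y == 1` does not test the two conditions separately.
    if x == 1 and y == 1:
        return 2/8
    elif abs(x - y) < 2:
        return 1/8
    else:
        return 0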
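One way to sanity-check a function like joint_probability is to enumerate the eight equally likely outcomes of three tosses and tally the pairs directly. The definitions of X and Y used in this sketch are an assumption made for illustration (X as the number of heads in the first two tosses, Y as the number of heads in the last two), chosen because they reproduce the probabilities returned by the function. The helper name joint_from_outcomes is hypothetical, and exact fractions are used in place of floats.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def joint_from_outcomes():
    """Tally P(X = x, Y = y) over the 8 equally likely outcomes of 3 tosses.

    Assumed definitions (for illustration only):
    X = heads among tosses 1 and 2, Y = heads among tosses 2 and 3.
    """
    counts = Counter()
    for tosses in product('HT', repeat=3):
        x = tosses[:2].count('H')   # heads among the first two tosses
        y = tosses[1:].count('H')   # heads among the last two tosses
        counts[(x, y)] += 1
    return {pair: Fraction(n, 8) for pair, n in counts.items()}

def joint_probability(x, y):
    # Same rule as in the text, written with `and` and exact fractions
    if x == 1 and y == 1:
        return Fraction(2, 8)
    elif abs(x - y) < 2:
        return Fraction(1, 8)
    else:
        return Fraction(0)

enumerated = joint_from_outcomes()
matches = all(
    enumerated.get((x, y), Fraction(0)) == joint_probability(x, y)
    for x in range(3) for y in range(3)
)
print(matches)  # True
```

Pairs that never occur, such as (0, 2), are simply absent from the tally, which is why the comparison falls back to a probability of 0.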
The syntax for constructing a joint distribution object is analogous to that for constructing a univariate distribution, with some modifications due to the higher dimension.
We have to specify the name of each of the two variables as well as its possible values, and then we will specify the function that we have defined to find the joint probabilities. The call is
Table().values(variable_name_1, values_1, variable_name_2, values_2).probability_function(function_name)
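where function_name is a function that takes one value of each of the two variables as its arguments and returns the corresponding joint probability.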
joint_dist = Table().values('X', k, 'Y', k).probability_function(joint_probability)
joint_dist
| X=0 | X=1 | X=2 |
---|---|---|---|
Y=2 | 0.000 | 0.125 | 0.125 |
Y=1 | 0.125 | 0.250 | 0.125 |
Y=0 | 0.125 | 0.125 | 0.000 |
This display of the joint distribution object joint_dist is called a joint distribution table for X and Y.
Each cell corresponds to a pair (x, y) of possible values and contains the probability P(X = x, Y = y).
Joint distribution tables are analogous to the contingency tables you saw in Data 8 when you were analyzing the relation between two categorical variables. In contingency tables, each cell contains the number of individuals in one particular pair of categories. In joint distribution tables, such as the one above, each cell contains the probability of one particular pair of values.
To check that we do indeed have a distribution over all the possible values of the pair (X, Y), we can confirm that the probabilities in all the cells add up to 1:
joint_dist.total_probability()
1.0
In fact this is a double check, as the method for constructing the joint distribution object returns an error if the probabilities don’t sum to 1.
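The same check can be sketched in plain Python, independently of prob140: sum the joint probabilities over every pair of possible values and compare the total with 1. Exact fractions are used so that no floating-point tolerance is needed. The dictionary joint simply copies the table above; it is an illustrative stand-in, not a prob140 object.

```python
from fractions import Fraction

eighth = Fraction(1, 8)

# Joint probabilities copied from the table, keyed by (x, y)
joint = {
    (0, 0): eighth,      (1, 0): eighth,     (2, 0): Fraction(0),
    (0, 1): eighth,      (1, 1): 2 * eighth, (2, 1): eighth,
    (0, 2): Fraction(0), (1, 2): eighth,     (2, 2): eighth,
}

total = sum(joint.values())
print(total)  # 1
```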
4.1.3. Finding Probabilities
The table contains complete information about the relation between X and Y.
For example, consider the event
Let’s visualize this using the joint distribution table of X and Y.
We will use a method that is of fundamental importance to everything that follows in this course: we will define a function called the indicator of the event. The function returns a Boolean: True (which Python treats as 1) if the event occurs, and False (treated as 0) otherwise. In this example, for any pair (i, j), the indicator returns True exactly when i and j are equal.
def indicator_equal(i, j):
    return i == j   # Note the == sign. This is a comparison that results in a Boolean.
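Because Python treats True as 1 and False as 0 in arithmetic, a Boolean-valued indicator can be used directly in sums. The following is a minimal plain-Python illustration, with indicator_equal redefined so that the block runs on its own:

```python
def indicator_equal(i, j):
    return i == j

print(indicator_equal(1, 1) == 1)   # True: the Boolean True behaves like 1
print(indicator_equal(0, 2) == 0)   # True: the Boolean False behaves like 0

# Summing the indicator over all nine pairs counts the pairs in the event
values = range(3)
cells_in_event = sum(indicator_equal(x, y) for x in values for y in values)
print(cells_in_event)  # 3
```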
The event method applied to a joint distribution object allows us to visualize the event and also displays the probability of the event. The arguments are:
- The name of a function that is the indicator of the event; it takes two arguments, say a and b, and returns the Boolean corresponding to whether or not the pair (a, b) is in the event
- The name of the random variable whose value is the first coordinate a
- The name of the random variable whose value is the second coordinate b
joint_dist.event(indicator_equal, 'X', 'Y')
P(Event) = 0.5
| X=0 | X=1 | X=2 |
---|---|---|---|
Y=2 | | | 0.125 |
Y=1 | | 0.25 | |
Y=0 | 0.125 | | |
The display P(Event) = 0.5 is consistent with our earlier answer.
You can see that the visible cells all lie along the diagonal, where the values of X and Y are equal.
However, if you just want to see the probability of the event without the table display, add a semicolon at the end of the line. That prevents the returned table from being printed.
joint_dist.event(indicator_equal, 'X', 'Y');
P(Event) = 0.5
def indicator_y_at_least_x(i, j):
    return j >= i
joint_dist.event(indicator_y_at_least_x, 'X', 'Y')
P(Event) = 0.75
| X=0 | X=1 | X=2 |
---|---|---|---|
Y=2 | 0.000 | 0.125 | 0.125 |
Y=1 | 0.125 | 0.25 | |
Y=0 | 0.125 | | |
The visible cells form the upper triangle corresponding to points whose coordinates (x, y) satisfy y ≥ x.
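To make the selection of cells concrete, here is a rough plain-Python sketch that keeps only the cells where an indicator is true and prints them with Y decreasing down the rows, as in the display above. It imitates the layout only and makes no claim about how prob140 implements event; the helper name visible_cells is hypothetical.

```python
def joint_probability(x, y):
    if x == 1 and y == 1:
        return 2/8
    elif abs(x - y) < 2:
        return 1/8
    else:
        return 0

def indicator_y_at_least_x(i, j):
    return j >= i

def visible_cells(indicator, values):
    """One text row per value of Y (largest first); blank where the indicator is False."""
    rows = []
    for y in sorted(values, reverse=True):
        cells = [
            f"{joint_probability(x, y):.3f}" if indicator(x, y) else "     "
            for x in values
        ]
        rows.append(f"Y={y} | " + " | ".join(cells))
    return rows

rows = visible_cells(indicator_y_at_least_x, range(3))
print("\n".join(rows))
```

Swapping in a different indicator function changes which cells are left blank, without any change to the joint probabilities themselves.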
Quick Check
Without using Python, find
4.1.4. The General Calculation
As we have seen in these examples, saying that random variables
In the case of the event
In the case of the event
The probability of the event is
- Identify all pairs of possible values (x, y) that are in the event.
- Add the probabilities P(X = x, Y = y) of all those pairs.
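The two steps above can be sketched in plain Python as a filter followed by a sum. The function name event_probability and the use of exact fractions are assumptions made for illustration; they are not part of prob140.

```python
from fractions import Fraction

def joint_probability(x, y):
    # Joint probabilities of the running example, as exact fractions
    if x == 1 and y == 1:
        return Fraction(2, 8)
    elif abs(x - y) < 2:
        return Fraction(1, 8)
    return Fraction(0)

def event_probability(in_event, values_x, values_y):
    # Step 1: identify all pairs of possible values that are in the event
    pairs = [(x, y) for x in values_x for y in values_y if in_event(x, y)]
    # Step 2: add the probabilities of those pairs
    return sum((joint_probability(x, y) for x, y in pairs), Fraction(0))

k = range(3)
p_equal = event_probability(lambda x, y: x == y, k, k)
p_y_at_least_x = event_probability(lambda x, y: y >= x, k, k)
print(p_equal, p_y_at_least_x)  # 1/2 3/4
```

These agree with the values 0.5 and 0.75 displayed by the event method in the examples.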
Expressed more compactly,
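One way to write the compact form, using B (a symbol introduced here as an assumption) for the set of pairs (x, y) that make up the event:

```latex
P\big((X, Y) \in B\big) ~=~ \sum_{(x, y) \in B} P(X = x, Y = y)
```

For instance, for the event that X and Y are equal in the example, B consists of the pairs (0, 0), (1, 1), and (2, 2), and the sum is 1/8 + 2/8 + 1/8 = 1/2.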