{
"cells": [
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"tags": [
"remove_cell"
]
},
"outputs": [],
"source": [
"# HIDDEN\n",
"from datascience import *\n",
"from prob140 import *\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('fivethirtyeight')\n",
"import numpy as np\n",
"from itertools import product"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Equality ##"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We know what it means for two numbers to be equal: they are at the same spot on the number line. Equality of random variables, however, can be of more than one kind.\n",
"\n",
"### Equal ###\n",
"Two random variables $X$ and $Y$ defined on the same outcome space are *equal* if their values are the same for every outcome in the space. The notation is $X = Y$ and it means that\n",
"\n",
"$$\n",
"X(\\omega) = Y(\\omega) \\text{ for all } \\omega \\in \\Omega\n",
"$$\n",
"\n",
"This is the usual definition of the equality of two mathematical functions. Informally, it says that when $X$ has the value $10$ then $Y$ must be $10$ too; when $X$ is $11$, $Y$ must be $11$; and so on.\n",
"\n",
"An example will make this clear. Let $N_H$ be the number of heads in three tosses of a coin, and let $N_T$ be the number of tails in the same three tosses. \n",
"\n",
"Now consider the new random variable $M = 3 - N_T$. The two random variables $N_H$ and $M$ are equal. For every possible outcome of the three tosses, the value of $N_H$ is equal to the value of $M$.\n",
"\n",
"We write this simply as $N_H = M$. Equivalently, $N_H = 3 - N_T$."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"remove-input",
"hide-output"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" VIDEO"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# VIDEO: Two Kinds of Equality\n",
"from IPython.display import YouTubeVideo\n",
"\n",
"YouTubeVideo(\"Z-_RSmBktHM\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Equal in Distribution ###\n",
"$N_H$ and $N_T$, as defined above, are not equal. For example,\n",
"\n",
"$$\n",
"N_H(\\text{TTT}) = 0 ~~~ \\text{but} ~~~ N_T(\\text{TTT}) = 3\n",
"$$ \n",
"\n",
"However, there is a sense in which the number of heads \"behaves in the same way\" as the number of tails. The two random variables have the same probability distribution.\n",
"\n",
"The outcome space is `three_tosses`:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('H', 'H', 'H'),\n",
" ('H', 'H', 'T'),\n",
" ('H', 'T', 'H'),\n",
" ('H', 'T', 'T'),\n",
" ('T', 'H', 'H'),\n",
" ('T', 'H', 'T'),\n",
" ('T', 'T', 'H'),\n",
" ('T', 'T', 'T')]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"coin = make_array('H', 'T')\n",
"three_tosses = list(product(coin, repeat=3))\n",
"three_tosses"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are only eight outcomes, so it is easy to inspect the table and write the distributions of $N_H$ and $N_T$. Both take the values $0, 1, 2, 3$ with probabilities $1/8, 3/8, 3/8, 1/8$ respectively. This distribution is shown in the table below."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
"Value Probability 0 0.125 1 0.375 2 0.375 3 0.125

"
],
"text/plain": [
"Value | Probability\n",
"0 | 0.125\n",
"1 | 0.375\n",
"2 | 0.375\n",
"3 | 0.125"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dist = Table().values(np.arange(4)).probabilities(make_array(1, 3, 3, 1)/8)\n",
"dist"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We say that $N_H$ and $N_T$ are *equal in distribution*. \n",
"\n",
"In general, two random variables $X$ and $Y$ are equal in distribution if they have the same probability distribution. \n",
"\n",
"That is, they have the same set of possible values and the same probabilities for all those values. \n",
"\n",
"Equality in distribution is denoted as\n",
"\n",
"$$\n",
"X \\stackrel{d}{=} Y\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Relation between the Equalities ###\n",
"Equality is stronger than equality in distribution. If two random variables are the same, outcome by outcome, then they must have the same distribution because they are the same function on the outcome space. \n",
"\n",
"That is, for any two random variables $X$ and $Y$,\n",
"\n",
"$$\n",
"X = Y \\implies X \\stackrel{d}{=} Y\n",
"$$\n",
"\n",
"But as the example of heads and tails in three tosses shows, the converse need not be true."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Quick Check\n",
"The random variables below are defined on the probability space consisting of 16 equally likely outcomes of 4 tosses of a coin: $HHHH, HHHT, HHTH$, and so on.\n",
"\n",
"$X$: number of heads in the first two tosses\n",
"\n",
"$Y$: number of heads in the last two tosses\n",
"\n",
"$Z$: number of tails in the last two tosses\n",
"\n",
"For each of the following pairs, fill in the blank with $=$ or $\\stackrel{d}{=}$, picking the stronger one if both are applicable. If neither of the symbols applies, explain why not.\n",
"\n",
"(i) $X ~~ \\underline{~~~~~~~~~~~~~} ~~ Y$\n",
"\n",
"(ii) $Y ~~ \\underline{~~~~~~~~~~~~~} ~~ Z$\n",
"\n",
"(iii) $X ~~ \\underline{~~~~~~~~~~~~~} ~~ Z$\n",
"\n",
"(iv) $Y ~~ \\underline{~~~~~~~~~~~~~} ~~ 2-Z$\n",
"\n",
"(v) $X ~~ \\underline{~~~~~~~~~~~~~} ~~ 2-Z$\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Answer\n",
":class: dropdown\n",
"\n",
"Fill (i), (ii), (iii), and (v) with $\\stackrel{d}{=} $ and (iv) with $=$.\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example: Two Cards Dealt from a Small Deck ###\n",
"A deck contains 10 cards, labeled 1, 2, 2, 3, 3, 3, 4, 4, 4, 4. Two cards are dealt at random without replacement. Let $X_1$ be the label on the first card and $X_2$ be the label on the second card.\n",
"\n",
"**Question 1:** Are $X_1$ and $X_2$ equal?\n",
"\n",
"**Answer 1:** No, because for example the outcome could be $(3,1)$ in which case $X_1 = 3$ and $X_2 = 1$.\n",
"\n",
"**Question 2:** Are $X_1$ and $X_2$ equal in distribution?\n",
"\n",
"**Answer 2:** Let's find the two distributions and compare. Clearly the possible values are 1, 2, 3, and 4 in each case. The distribution of $X_1$ is easy: \n",
"\n",
"$$\n",
"P(X_1 = i ) = \\frac{i}{10} , ~~ i = 1, 2, 3, 4\n",
"$$\n",
"\n",
"When a distribution is defined by a formula like this, you can define a function that does what the formula says:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"def prob1(i):\n",
" return i/10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can create a probability distribution object for $X_1$ using `values` as before but now with the `probability_function` method.\n",
"\n",
"The argument to `probability_function` is the name of the function that takes $i$ as its argument and returns $P(X_1 = i)$."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
"Value Probability 1 0.1 2 0.2 3 0.3 4 0.4

"
],
"text/plain": [
"Value | Probability\n",
"1 | 0.1\n",
"2 | 0.2\n",
"3 | 0.3\n",
"4 | 0.4"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"possible_i = np.arange(1, 5, 1)\n",
"dist_X1 = Table().values(possible_i).probability_function(prob1)\n",
"dist_X1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convince yourself that the function `prob2` below returns $P(X_2 = i)$ for each $i$. The event has been partitioned according to the value of $X_1$."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"def prob2(i):\n",
" if i == 1:\n",
" return (9/10)*(1/9)\n",
" else:\n",
" return (i/10)*((i-1)/9) + ((10-i)/10)*(i/9)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
"Value Probability 1 0.1 2 0.2 3 0.3 4 0.4

"
],
"text/plain": [
"Value | Probability\n",
"1 | 0.1\n",
"2 | 0.2\n",
"3 | 0.3\n",
"4 | 0.4"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dist_X2 = Table().values(possible_i).probability_function(prob2)\n",
"dist_X2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two distributions are the same! Here is yet another example of symmetry in sampling without replacement. The conclusion is\n",
"\n",
"$$\n",
"X_1 \\stackrel{d}{=} X_2\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Quick Check\n",
"$X_1$ and $X_2$ are the results of the first and second of two draws made at random without replacement from the 10 digits $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$. True or false:\n",
"\n",
"(i) $X_1$ has the uniform distribution on the 10 digits.\n",
"\n",
"(ii) $X_2$ has the uniform distribution on the 10 digits.\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Answer\n",
":class: dropdown\n",
"Both are true. For (ii), calculate $P(X_2 = 0)$ as an example to see what's going on.\n",
"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.4"
}
},
"nbformat": 4,
"nbformat_minor": 1
}