{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true,
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# HIDDEN\n",
"from datascience import *\n",
"from prob140 import *\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('fivethirtyeight')\n",
"%matplotlib inline\n",
"import math\n",
"from scipy import stats\n",
"from scipy import misc\n",
"from myst_nb import glue"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Method of Indicators ##"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The method of indicators is a powerful way to find expected counts. It rests on the observation that among $n$ trials, the number of \"good\" results can be counted by coding each \"good\" result as 1 and every other result as 0, and then adding the 1's and 0's.\n",
"\n",
"If $N$ is the total number of good results among $n$ trials, then\n",
"\n",
"$$\n",
"N = I_1 + I_2 + \\cdots + I_n\n",
"$$\n",
"\n",
"where for each $j$ in the range 1 through $n$, the random variable $I_j$ is the indicator of \"the result of the $j$th trial is good\". \n",
"\n",
"Now recall that if $I_A$ is the indicator of an event $A$, then $E(I_A) = P(A)$. That is, the expectation of an indicator is the probability of the event that it indicates.\n",
"\n",
"So\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(N) &= E(I_1) + E(I_2) + \\cdots + E(I_n) \\\\ \n",
"&= P(\\text{result of Trial } 1 \\text{ is good}) +\n",
"P(\\text{result of Trial } 2 \\text{ is good}) + \\cdots +\n",
"P(\\text{result of Trial } n \\text{ is good})\n",
"\\end{align*}\n",
"$$\n",
"\n",
"Additivity holds whether the trials are dependent or independent; no assumption about the relationship between the indicators is needed."
]
},
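{
"cell_type": "markdown",
"metadata": {},
"source": [
"The coding step can be sketched in a short simulation. The cell below is an illustration rather than part of the derivation: it assumes five rolls of a fair die, with a six counted as a \"good\" result, and the names `num_trials`, `num_reps`, and `sim_rng` are made up for the example. It compares the average of the count with the sum of the averages of the indicators."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative assumptions: 5 rolls of a fair die, and a six is a 'good' result\n",
"sim_rng = np.random.default_rng(0)\n",
"num_trials = 5\n",
"num_reps = 100_000\n",
"\n",
"# Each row is one repetition of num_trials rolls\n",
"rolls = sim_rng.integers(1, 7, size=(num_reps, num_trials))\n",
"\n",
"# Code each good result as 1 and every other result as 0\n",
"indicators = (rolls == 6).astype(int)\n",
"\n",
"# N = I_1 + I_2 + ... + I_n, computed separately for each repetition\n",
"good_counts = indicators.sum(axis=1)\n",
"\n",
"# Average of N compared with the sum of the averages of the indicators\n",
"avg_good_count = good_counts.mean()\n",
"sum_of_indicator_avgs = indicators.mean(axis=0).sum()\n",
"print(avg_good_count, sum_of_indicator_avgs, num_trials / 6)"
]
},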
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" VIDEO"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# VIDEO: Method of Indicators\n",
"from IPython.display import YouTubeVideo\n",
"\n",
"vid_method_ind = YouTubeVideo('hOIcQUYUNsM')\n",
"glue(\"vid_method_ind\", vid_method_ind)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{dropdown} See More\n",
":icon: video\n",
"{glue:}`vid_method_ind`\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Expectation of the Binomial ###\n",
"Let $X$ have the binomial $(n, p)$ distribution. Then $X$ can be thought of as the number of successes in $n$ i.i.d. Bernoulli $(p)$ trials, and we can write\n",
"\n",
"$$\n",
"X = I_1 + I_2 + \\cdots + I_n\n",
"$$\n",
"\n",
"where for each $j$ in the range 1 through $n$, $I_j$ is the indicator of \"Trial $j$ is a success\". Thus\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(X) &= E(I_1) + E(I_2) + \\cdots + E(I_n) ~~~~ \\text{(additivity)} \\\\\n",
"&= np ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \\text{(}E(I_j) = p \\text{ for all } j \\text{)}\n",
"\\end{align*}\n",
"$$\n",
"\n",
"Examples of use:\n",
"- The expected number of heads in 100 tosses of a coin is $100 \\times 0.5 = 50$. \n",
"- The expected number of heads in 25 tosses is 12.5. Remember that the expectation of an integer-valued random variable need not be an integer. \n",
"- The expected number of times green pockets win in 20 independent spins of a roulette wheel is $20 \\times \\frac{2}{38} \\approx 1.053$."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"k = np.arange(11)\n",
"probs = stats.binom.pmf(k, 10, 0.75)\n",
"bin_10_75 = Table().values(k).probabilities(probs)\n",
"Plot(bin_10_75, show_ev=True)\n",
"plt.title('Binomial (10, 0.75)');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that the calculation of $E(X)$ didn't use the independence of the trials. Additivity of expectation holds whether or not the random variables being added are independent. This will be very helpful in the next example."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" VIDEO"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# VIDEO: Expectation: Binomial and Hypergeometric\n",
"\n",
"vid_exp_binom_hyp = YouTubeVideo('lz-UuQqvUOE')\n",
"glue(\"vid_exp_binom_hyp\", vid_exp_binom_hyp)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{dropdown} See More\n",
":icon: video\n",
"{glue:}`vid_exp_binom_hyp`\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Expectation of the Hypergeometric ###\n",
"Let $X$ have the hypergeometric $(N, G, n)$ distribution. Then $X$ can be thought of as the number of good elements in $n$ draws made at random without replacement from a population of $N = G+B$ elements of which $G$ are good and $B$ bad. Then\n",
"\n",
"$$\n",
"X = I_1 + I_2 + \\cdots + I_n\n",
"$$\n",
"\n",
"where for each $j$ in the range 1 through $n$, $I_j$ is the indicator of \"Draw $j$ results in a good element\". Thus\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E(X) &= E(I_1) + E(I_2) + \\cdots + E(I_n) ~~~~ \\text{(additivity)} \\\\ \\\\\n",
"&= n\\frac{G}{N} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \n",
"\\text{(}E(I_j) = \\frac{G}{N} \\text{ for all } j \\text{ by symmetry)}\n",
"\\end{align*}\n",
"$$\n",
"\n",
"This is the same answer as for the binomial, with the population proportion of good elements $G/N$ replacing $p$.\n",
"\n",
"Examples of use:\n",
"- The expected number of red cards in a bridge hand of 13 cards is $13 \\times \\frac{26}{52} = 6.5$. \n",
"- The expected number of Independent voters in a simple random sample of 200 people drawn from a population in which 10% of the voters are Independent is $200 \\times 0.1 = 20$. \n",
"\n",
"These answers are intuitively clear, and we now have a theoretical justification for them."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Number of hearts in a poker hand \n",
"N = 52\n",
"G = 13\n",
"n = 5\n",
"k = np.arange(6)\n",
"probs = stats.hypergeom.pmf(k, N, G, n)\n",
"hyp_dist = Table().values(k).probabilities(probs)\n",
"Plot(hyp_dist, show_ev=True)\n",
"plt.title('Hypergeometric (N=52, G=13, n=5)');"
]
},
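{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough numerical check, the expectation can also be computed directly from the definition and compared with $nG/N$. The cell below is a sketch for the hearts-in-a-poker-hand example; the variable names `pop_size`, `num_good`, and `num_draws` are chosen here for illustration so that they don't overwrite `N`, `G`, and `n`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from scipy import stats\n",
"\n",
"# Hearts in a 5-card poker hand: population 52, good elements 13, draws 5\n",
"pop_size, num_good, num_draws = 52, 13, 5\n",
"possible_values = np.arange(num_draws + 1)\n",
"pmf_values = stats.hypergeom.pmf(possible_values, pop_size, num_good, num_draws)\n",
"\n",
"# Expectation from the definition: sum of value times probability\n",
"ev_from_definition = np.sum(possible_values * pmf_values)\n",
"\n",
"# Expectation from the method of indicators: n * G / N\n",
"ev_from_indicators = num_draws * num_good / pop_size\n",
"print(ev_from_definition, ev_from_indicators)"
]
},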
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Quick Check\n",
"A deck contains $40$ blue cards and $60$ gold cards. Ten cards are drawn at random. Find the expected number of blue cards drawn\n",
"\n",
"(a) if the cards are drawn with replacement\n",
"\n",
"(b) if the cards are drawn without replacement\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Answer\n",
":class: dropdown\n",
"(a) $4$ by the binomial expectation formula\n",
"\n",
"(b) $4$ by the hypergeometric expectation formula\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Number of Missing Classes ###\n",
"A population consists of four classes of individuals, in the proportions 0.4, 0.3, 0.2, and 0.1. A random sample of $n$ individuals is chosen so that the choices are mutually independent. What is the expected number of classes that are missing in the sample?\n",
"\n",
"If $M$ is the number of missing classes, then\n",
"\n",
"$$\n",
"M = I_1 + I_2 + I_3 + I_4\n",
"$$\n",
"\n",
"where for each $j$, $I_j$ is the indicator of \"Class $j$ is missing in the sample\". \n",
"\n",
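"For Class $j$ to be missing in the sample, all $n$ selected individuals have to be from the other classes. Because the choices are independent, the chance of this is $(1 - p_j)^n$, where $p_j$ is the proportion of Class $j$ in the population. Thus\n",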
"\n",
"$$\n",
"E(M) = E(I_1) + E(I_2) + E(I_3) + E(I_4)\n",
"= 0.6^n + 0.7^n + 0.8^n + 0.9^n\n",
"$$\n",
"\n",
"The four indicators aren't independent (for instance, they can't all be 1 at once, because at least one class must appear in the sample), but that doesn't affect the additivity of expectation."
]
},
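{
"cell_type": "markdown",
"metadata": {},
"source": [
"The formula can be evaluated for a particular sample size and compared with a simulation. The cell below is a sketch under assumptions not in the text: the sample size of 10 is picked only for illustration, and the names `class_props`, `sample_size`, and `class_rng` are made up for the example. Each simulated sample is represented by its multinomial class counts, and a class is missing when its count is 0."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"class_props = np.array([0.4, 0.3, 0.2, 0.1])\n",
"sample_size = 10  # assumed for illustration\n",
"\n",
"# Exact value: sum over classes of P(class j is missing) = (1 - p_j)^n\n",
"exact_expected_missing = np.sum((1 - class_props) ** sample_size)\n",
"\n",
"# Simulation: each row holds the class counts in one sample of independent draws\n",
"class_rng = np.random.default_rng(0)\n",
"class_counts = class_rng.multinomial(sample_size, class_props, size=100_000)\n",
"\n",
"# Number of missing classes in each sample, then the average over samples\n",
"missing_per_sample = np.sum(class_counts == 0, axis=1)\n",
"simulated_expected_missing = missing_per_sample.mean()\n",
"print(exact_expected_missing, simulated_expected_missing)"
]
},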
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" VIDEO"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# VIDEO: Applying the Method of Indicators\n",
"vid_apply_moi = YouTubeVideo('mxj4Gr_QUCM')\n",
"glue(\"vid_apply_moi\", vid_apply_moi)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{dropdown} See More\n",
":icon: video\n",
"{glue:}`vid_apply_moi`\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Quick Check\n",
"A deck of 52 cards is dealt (at random without replacement) to four players, so that each player gets a hand of 13 cards. To find the expected number of hands that have no aces, which would you use?\n",
"\n",
"(i) Four indicators, one for each ace\n",
"\n",
"(ii) Four indicators, one for each hand\n",
"\n",
"(iii) Thirteen indicators, one for each card in a hand\n",
"\n",
"(iv) Fifty-two indicators, one for each card in the deck\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{admonition} Answer\n",
":class: dropdown\n",
"(ii)\n",
"\n",
"```"
]
}
],
"metadata": {
"anaconda-cloud": {},
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}