(i.e., the axioms of probability)
Probabilities are proportions: $\hspace{2em} 0 \le \P\{A\} \le 1$
Everything: $\hspace{2em} \P\{ \Omega \} = 1$
Complements: $\hspace{2em} \P\{ \text{not } A\} = 1 - \P\{A\}$
Disjoint events: If $\hspace{2em} \P\{A \text{ and } B\} = 0$ then $\hspace{2em} \P\{A \text{ or } B\} = \P\{A\} + \P\{B\}$.
Independence: $A$ and $B$ are independent iff $\P\{A \text{ and } B\} = \P\{A\} \P\{B\}$.
Conditional probability: $$\P\{A \;|\; B\} = \frac{\P\{A \text{ and } B\}}{ \P\{B\} }$$
Probabilities via conditioning:
$$\P\{ A \text{ and } B\} = \P\{A\} \P\{B \;|\; A\} $$
Bayes' rule:
$$\P\{B \;|\; A\} = \frac{\P\{B\} \P\{A \;|\; B\}}{ \P\{A\} } .$$
The mathematical definition of a "random variable" is "a mapping from the probability space $\Omega$ into the real numbers".
A random variable is just a variable, with some extra structure: a probability distribution.
We create random variables in programming all the time:
X = rng.random()
What's X? We don't know! But we do know X is a number, and so can do algebra with it, e.g., declare that Y is twice X squared plus one:
Y = 2 * X**2 + 1
Do you want to know what X is, for reals? I could do print(X) but too bad, I won't: for the analogy to hold up, X should be the abstract instantiation of a draw from rng.random().
Example: the martingale
Let $G_i$ be a random variable that takes values either 0 or 2, with probability 1/2 each. Let $W_n = G_1 G_2 \cdots G_n$, and $X = \max_n W_n$.
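def sim_X(max_n=20):
    # each G[i] is 0 or 2, with probability 1/2 each
    G = rng.choice([0, 2], size=max_n)
    W = [np.nan for _ in G]
    for k in range(len(G)):
        # W[k] is the product of the first k+1 values of G
        W[k] = np.prod(G[:(k+1)])
    X = np.max(W)
    return X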
Well, $W_n$ could be 0 or $2^n$, and $\mathbb{P}\{ W_n = 2^n \} = 2^{-n}$, so $\mathbb{P}\{ W_n = 0 \} = 1 - 2^{-n}$.
Now, $X$ could be 0, 2, 4, 8, etcetera; and $X = 2^n$ for $n \ge 1$ if $W_n = 2^n$ but $W_{n+1} = 0$. This happens with probability $2^{-(n+1)}$, since we need the first $n$ values of $G_i$ to be 2, and $G_{n+1}$ to be 0. Or, $X = 0$ with probability 1/2.
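As a rough check of this calculation, here is a sketch that simulates many copies of $X$ and compares the observed frequencies with $2^{-(n+1)}$. It is an illustrative rewrite rather than the function above: the name `sim_many_X`, the seed, and the number of replicates are assumptions, and it uses `np.cumprod` to compute all the running products at once.

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily, for reproducibility

def sim_many_X(num_reps, max_n=20):
    # each row is one sequence G_1, ..., G_max_n of 0s and 2s
    G = rng.choice([0, 2], size=(num_reps, max_n))
    # W[:, n-1] = G_1 * ... * G_n for each row
    W = np.cumprod(G, axis=1)
    # X is the largest running product in each row
    return W.max(axis=1)

X = sim_many_X(100_000)

# compare observed frequencies to theory for the first few possible values
for value, prob in [(0, 1/2), (2, 1/4), (4, 1/8), (8, 1/16)]:
    print(f"P(X = {value}): simulated {np.mean(X == value):.4f}, theory {prob:.4f}")
```

Only the first few values are compared, because the simulation stops at `max_n` steps while the calculation above lets $n$ be arbitrarily large.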
To say how $X$ behaves, we need to specify the probability of each possible outcome. For instance:
$X$ is the number rolled on a fair die: $\P\{X = k\} = 1/6$ for $k \in \{1, 2, 3, 4, 5, 6\}$.
$X$ is uniformly chosen in $[0, 1)$: $\P\{X < x\} = x$ for $0 \le x \le 1$.
$X$ is the number of times I get "heads" when flipping a fair coin before my first "tails": $\P\{X \ge k\} = 2^{-k}$.
For some of these, the set of possible values is discrete, while for others it is continuous.
A continuous distribution has a density function, i.e., a function $f_X(x)$ so that $$ \P\{ a \le X \le b \} = \int_a^b f_X(x) dx . $$
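The three example distributions listed above can be simulated directly, which gives a quick way to check the stated probabilities. This is only a sketch: the seed, the sample size, and the particular events checked are arbitrary choices, and the coin-flip example is built from `rng.geometric`, which counts flips up to and including the first "tails".

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily
size = 100_000

die = rng.integers(1, 7, size=size)         # fair die: 1, ..., 6 (upper bound excluded)
unif = rng.random(size=size)                # uniform on [0, 1)
heads = rng.geometric(0.5, size=size) - 1   # number of heads before the first tails

print("P(die = 3):   ", np.mean(die == 3), "theory", 1/6)
print("P(unif < 0.3):", np.mean(unif < 0.3), "theory", 0.3)
print("P(heads >= 2):", np.mean(heads >= 2), "theory", 2**-2)
```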
Example: Let $X \sim \text{Binomial}(10, 3/4)$. What is $\P\{X \ge 2\}$?
Answer: By complements, and then additivity of disjoint events, $$ \P\{X \ge 2\} = 1 - \P\{X = 0\} - \P\{X = 1\} . $$ [By Wikipedia](https://en.wikipedia.org/wiki/Binomial_distribution), if $X \sim \text{Binomial}(n, p)$ then $\P\{X = k\} = \binom{n}{k} p^k (1-p)^{n-k}$, so this is $$ 1 - (1/4)^{10} - 10 (3/4) (1/4)^9 . $$
# number of simulated draws (out of 10^6) with X < 2, the complement of {X >= 2}
np.sum(rng.binomial(n=10, p=0.75, size=1000000) < 2)
28
1 - 0.25 ** 10 - 10 * (3/4) * (0.25 ** 9)
0.9999704360961914
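The same number can be computed from the probability mass function, or looked up with scipy. The sketch below uses $n = 10$ and $p = 0.75$, the values in the code above; the variable names are just for illustration.

```python
from math import comb
from scipy.stats import binom

n, p = 10, 0.75  # the values used in the code above

# P{X >= 2} = 1 - P{X = 0} - P{X = 1}, from the probability mass function
by_hand = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2))

# the same thing from scipy: sf(k) is P{X > k}, so sf(1) is P{X >= 2}
by_scipy = binom.sf(1, n, p)

print(by_hand, by_scipy)
```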
Example: Let $X \sim \text{Normal}(0, 1)$. What is $\P\{ |X| > 2 \}$?
Answer: By additivity of disjoint events, and then by complements, $$ \P\{ |X| > 2 \} = \P\{ X < -2 \} + \P\{ X > 2 \} = 1 - \P\{ -2 \le X \le 2 \} . $$ [By Wikipedia](https://en.wikipedia.org/wiki/Normal_distribution), $X$ has density $e^{-x^2/2}/\sqrt{2\pi}$, so this is $$ 1 - \int_{-2}^2 \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} dx . $$
For a numerical answer, we can simulate, or go to scipy:
np.mean(np.abs(rng.normal(size=1000000)) > 2)
0.045344
from scipy.stats import norm
1 - (norm.cdf(2) - norm.cdf(-2))
0.04550026389635842
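Another option is to evaluate the integral of the standard normal density $e^{-x^2/2}/\sqrt{2\pi}$ numerically. The sketch below does this with `scipy.integrate.quad`; the helper name `normal_density` is made up for this example.

```python
import numpy as np
from scipy.integrate import quad

def normal_density(x):
    # density of the Normal(0, 1) distribution
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# quad returns (value of the integral, estimate of the numerical error)
inside, err = quad(normal_density, -2, 2)

# P{|X| > 2} = 1 - P{-2 <= X <= 2}
print(1 - inside)
```

This should agree with the `norm.cdf` calculation up to numerical integration error.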
The cumulative distribution function (CDF) of $X$ is $$ F_X(x) = \P\{ X \le x \} . $$
The probability density function (PDF) of $X$, if it exists, is $$ f_X(x) = \frac{d}{dx} \P\{ X \le x \} , $$ or $$ \text{"} f_X(x) dx = \P\{ X = x \} dx \text{"} . $$
For discrete random variables, probabilities are sums. For continuous random variables, they are integrals of the PDF... but, integrals are just fancy sums, anyways.
Exercise: The Exponential(1) distribution has cumulative distribution function $\P\{ X < x \} = 1 - e^{-x}$. Plot this (a) empirically, using rng.exponential(size=1000000), and (b) by plotting this function.
x = rng.exponential(size=1000)
xvals = np.arange(6)
pvals = [np.mean(x < u) for u in xvals] # list comprehension
plt.plot(xvals, pvals, label='simulation')
plt.plot(xvals, 1 - np.exp(-xvals), label='theory')
plt.legend();
For a random variable $X$ and a function $f( )$, the expected value of the random variable $f(X)$ is the weighted average of its possible values: $$ \E[f(X)] = \sum_x f(x) \P\{X = x\} . $$
The simplest example of this is the mean of $X$: $$ \E[X] = \sum_x x \P\{X = x\} . $$
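For instance, here is a small sketch of these two sums for a fair die, taking $f(x) = x^2$ as an arbitrary example function.

```python
import numpy as np

values = np.arange(1, 7)   # possible values of a fair die
probs = np.full(6, 1/6)    # each value has probability 1/6

mean_X = np.sum(values * probs)              # E[X]
mean_X_squared = np.sum(values**2 * probs)   # E[f(X)] with f(x) = x^2

print(mean_X, mean_X_squared)
```

Note that $\E[X^2]$ is not the same as $\E[X]^2$: the function is applied to each value before averaging.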
Example: If $X$ is Binomial($n$, $p$), then $\P\{X = x\} = \binom{n}{x} p^x (1-p)^{n-x}$, so (by Wikipedia): $$ \E[X] = \sum_{x=0}^n x \binom{n}{x} p^x (1-p)^{n-x} = np .$$
Example: If $X$ is Exponential($\lambda$), then $X$ has density $\lambda e^{-\lambda x}$ so $$ \E[X] = \int_0^\infty x \lambda e^{-\lambda x} dx = \frac{1}{\lambda} .$$
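Both of these means can be checked numerically: the Binomial one by summing over the probability mass function, and the Exponential one by integrating against the density. The parameter values below are arbitrary choices for the sake of the example.

```python
import numpy as np
from math import comb
from scipy.integrate import quad

# Binomial(n, p): sum of x * P{X = x} over x = 0, ..., n
n, p = 10, 0.25  # example values, chosen arbitrarily
binom_mean = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
print(binom_mean, n * p)

# Exponential(lam): integral of x * density over [0, infinity)
lam = 2.0  # example rate, chosen arbitrarily
exp_mean, err = quad(lambda x: x * lam * np.exp(-lam * x), 0, np.inf)
print(exp_mean, 1 / lam)
```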
For any random variables $X$ and $Y$, $$ \E[X + Y] = \E[X] + \E[Y] . $$
Example: Suppose a random student has on average \$12.50 in their pocket. In a class of 30, what is the expected total amount of money in everyone's pockets?
Example: Suppose also that the average amount of federal loans for a UO student is \$7,000/year. What is the average sum of a student's annual loan amount and the amount in their pockets?
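By additivity, the answers are $30 \times 12.50 = 375$ dollars and $7000 + 12.50 = 7012.50$ dollars. The sketch below checks this by simulation. The distributions are pure assumptions (the examples only give the means): pocket money is taken to be Exponential, and the loan amount is deliberately made to depend on pocket money, to illustrate that additivity does not need independence.

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily
num_classes = 20_000

# ASSUMPTION: pocket money is Exponential with mean 12.50
pocket = rng.exponential(scale=12.50, size=(num_classes, 30))

# expected total in a class of 30
class_totals = pocket.sum(axis=1)
print(np.mean(class_totals), 30 * 12.50)

# ASSUMPTION: loans have mean 7000 and go up with pocket money (so, not independent)
one_student = pocket.ravel()
loans = 7000 + 100 * (one_student - 12.50)
print(np.mean(loans + one_student), 7000 + 12.50)
```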
If $X$ and $Y$ are independent then $$ \E[X Y] = \E[X] \E[Y] .$$
Example: Let $U$ and $V$ be independent and Uniform on $[0, 1]$. What is the expected area of a rectangle with width $U$ and height $V$?
Non-example: Let $U$ be Uniform on $[0, 1]$. What is the expected area of a square with width $U$?
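A quick simulation shows the difference between these two: for the rectangle, independence gives $\E[UV] = \E[U]\E[V] = 1/4$, while for the square, $U$ is certainly not independent of itself, and $\E[U^2] = \int_0^1 u^2 du = 1/3$. The seed and sample size here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily
size = 1_000_000

U = rng.random(size=size)
V = rng.random(size=size)   # drawn separately, so independent of U

rect_area = np.mean(U * V)     # rectangle: theory says 1/4
square_area = np.mean(U * U)   # square: theory says 1/3, not 1/4

print(rect_area, square_area)
```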
The variance of $X$ is $$ \var[X] = \E[X^2] - \E[X]^2 , $$ and the standard deviation is $$ \sd[X] = \sqrt{\var[X]} .$$
Equivalent definition: if $X$ has mean $\mu$, then $$ \var[X] = \E[(X-\mu)^2] . $$
The standard deviation tells us how much $X$ tends to differ from the mean.
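The two definitions of variance can be compared on a sample, replacing expectations by sample averages. This is a sketch with an arbitrarily chosen distribution and seed; `np.var` computes the second form by default.

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily
x = rng.exponential(scale=5, size=100_000)

var_1 = np.mean(x**2) - np.mean(x)**2     # E[X^2] - E[X]^2
var_2 = np.mean((x - np.mean(x))**2)      # E[(X - mu)^2]

print(var_1, var_2, np.var(x))
print(np.sqrt(var_2), np.std(x))          # standard deviation
```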
If $X$ and $Y$ are independent, then $$ \var[X + Y] = \var[X] + \var[Y]. $$
Example: Suppose the amount of rain that falls each day is independent, with mean 0.1 inches and variance 0.25 square inches. What are the mean and variance of the total amount of rain in a week?
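By additivity of means and (using independence) of variances, the weekly total has mean $7 \times 0.1 = 0.7$ and variance $7 \times 0.25 = 1.75$. The sketch below checks this by simulation. The example does not say how daily rainfall is distributed, so the Gamma distribution here is an assumption, with shape and scale chosen to match the stated mean and variance.

```python
import numpy as np

rng = np.random.default_rng(123)  # seed chosen arbitrarily

# ASSUMPTION: daily rainfall is Gamma distributed, with
# mean = shape * scale = 0.1 and variance = shape * scale**2 = 0.25
scale = 0.25 / 0.1     # variance / mean
shape = 0.1 / scale    # mean / scale

# each row is one week of seven independent days
daily = rng.gamma(shape, scale, size=(200_000, 7))
weekly = daily.sum(axis=1)

print(np.mean(weekly), 7 * 0.1)
print(np.var(weekly), 7 * 0.25)
```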
x = rng.exponential(scale=5, size=100000)
mean_x = np.mean(x)
sd_x = np.std(x)
plt.hist(x, bins=100)
plt.title(f"mean={mean_x:.2f}, sd={sd_x:.2f}")
plt.vlines(mean_x, 0, 10000, 'red')
plt.vlines([mean_x - sd_x, mean_x + sd_x], 0, 10000, 'red', ':');
x = rng.normal(loc=2, scale=7, size=100000)
mean_x = np.mean(x)
sd_x = np.std(x)
plt.hist(x, bins=100)
plt.title(f"mean={mean_x:.2f}, sd={sd_x:.2f}")
plt.vlines(mean_x, 0, 4000, 'red')
plt.vlines([mean_x - sd_x, mean_x + sd_x], 0, 4000, 'red', ':');
x = rng.binomial(n=50, p=0.4, size=100000)
mean_x = np.mean(x)
sd_x = np.std(x)
plt.hist(x, bins=100)
plt.title(f"mean={mean_x:.2f}, sd={sd_x:.2f}")
plt.vlines(mean_x, 0, 12000, 'red')
plt.vlines([mean_x - sd_x, mean_x + sd_x], 0, 12000, 'red', ':');