Instructions: Please answer the following questions and submit your work by editing this Jupyter notebook and submitting it on Canvas. Questions may involve math, programming, or neither, but you should make sure to explain your work: i.e., you should usually have a cell with at least a few sentences explaining what you are doing.
Wikipedia tells us that there are two ways to define the Geometric distribution, so that if $X$ has the Geometric distribution with parameter $p$, then either $$ \mathbb{P}\{ X = k \} = (1 - p)^{k-1} p, $$ or $$ \mathbb{P}\{ X = k \} = (1 - p)^{k} p. $$ Find out, empirically, which of the two distributions is used by numpy, by simulating a large number of draws as follows:
import numpy as np
rng = np.random.default_rng()
# x = rng.geometric( ... )
and counting the proportion of those draws that take the value $k$. Do this for two different values of $p$ and $0 \le k \le 10$, and compare the proportions to the two formulas above. Which is it?
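One possible way to organize the comparison is sketched below. The helper name `compare_geometric`, its keyword arguments, the number of draws, and the two values of $p$ are all illustrative assumptions, not part of the assignment. The first formula is only evaluated for $k \ge 1$ (it is set to zero at $k = 0$), since it does not define a probability there.

```python
import numpy as np

def compare_geometric(p, n_draws=200_000, k_max=10, seed=None):
    """Hypothetical helper: empirical proportions vs. both candidate formulas."""
    rng = np.random.default_rng(seed)
    x = rng.geometric(p, size=n_draws)
    ks = np.arange(0, k_max + 1)
    # Proportion of draws equal to each k in 0..k_max.
    empirical = np.array([np.mean(x == k) for k in ks])
    # First formula: (1 - p)^(k - 1) p, treated as zero at k = 0.
    formula_a = np.where(ks >= 1, (1 - p) ** (ks - 1.0) * p, 0.0)
    # Second formula: (1 - p)^k p, defined for k >= 0.
    formula_b = (1 - p) ** ks * p
    return ks, empirical, formula_a, formula_b

# Two arbitrary values of p, chosen only for illustration.
for p in (0.2, 0.5):
    ks, emp, fa, fb = compare_geometric(p, seed=0)
    err_a = np.abs(emp - fa).max()
    err_b = np.abs(emp - fb).max()
    print(f"p={p}: max|emp - first| = {err_a:.4f}, max|emp - second| = {err_b:.4f}")
```

The formula with the smaller maximum discrepancy is the better match; looking at the empirical proportion at $k = 0$ alone is also informative.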
For each $i \ge 1$, let $D_i$ be a random number drawn independently and uniformly from $\{1, 2, 3, 4, 5, 6\}$. Let $$ K = \min\{ k \ge 1 \;:\; D_{k+1} < D_k \} , $$ i.e., $K$ is defined by the fact that $D_{K+1}$ is the first number that is smaller than the one before it. Finally, let $$ X = \sum_{i=1}^K D_i . $$
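As a small illustration of the definition (the specific values are made up), suppose the first four draws are $D_1 = 3$, $D_2 = 5$, $D_3 = 5$, $D_4 = 2$. Ties do not count as a decrease, so the first decrease is $D_4 < D_3$, which gives $K = 3$ and $$ X = D_1 + D_2 + D_3 = 3 + 5 + 5 = 13 . $$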
a. Describe in words how to simulate $X$ using fair dice.
b. Write a function to simulate $X$ (in python).
The function should have one argument, `size`, that determines the number of independent samples of $X$ that are returned.
c. Make a plot describing the distribution of $X$, and estimate its mean (by simulating at least $10^5$ values).
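A minimal sketch of parts (b) and (c) follows, under a few assumptions: the function name `simulate_x` is made up, and the optional `rng` keyword is an addition beyond the single `size` argument the question asks for, included only so that runs can be made reproducible. The loop rolls one die at a time and stops at the first roll that is strictly smaller than the previous one, without adding that roll to the sum.

```python
import numpy as np

def simulate_x(size, rng=None):
    """Return `size` independent samples of X (hypothetical implementation)."""
    # `rng` is an optional extra, not required by the assignment.
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(size, dtype=int)
    for j in range(size):
        prev = rng.integers(1, 7)      # D_1, uniform on {1, ..., 6}
        total = prev
        while True:
            nxt = rng.integers(1, 7)   # next die roll
            if nxt < prev:             # first strict decrease: stop here
                break
            total += nxt               # roll is part of the sum
            prev = nxt
        out[j] = total
    return out

x = simulate_x(10**5)
print("estimated mean of X:", x.mean())

# Empirical distribution: proportions[v] is the fraction of samples equal to v.
proportions = np.bincount(x) / len(x)
```

The `proportions` array can be passed to a bar plot (for example `plt.bar(np.arange(len(proportions)), proportions)` with matplotlib) to display the distribution of $X$.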