Homework 10: A bootstrap.
Instructions: Please answer the following questions and submit your work by editing this Jupyter notebook and submitting it on Canvas. Questions may involve math, programming, or neither, but you should make sure to explain your work: i.e., you should usually have a cell with at least a few sentences explaining what you are doing.
Also, please be sure to always specify units of any quantities that have units, and label axes of plots (again, with units when appropriate).
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng(123)
1. Mosquitos, now uncertain
Recall the per-kid mosquito bite and odor (in ppm) data from homeworks #7 and #8. Here are the data:
bites = np.array([4, 5, 4, 2, 4, 8, 4, 6, 7, 5, 4, 0, 5, 7, 5, 3, 2, 0, 3, 4, 5, 3, 6, 1, 2, 3, 5,
20, 31]) # <-- new
odor = np.array([ 2.8, 4.4, 6.9, 2.3, 5.9, 10.2, 3.2, 7.6, 6.3, 4.5, 4.3,
0. , 8.2, 5.4, 7.6, 3.3, 3.9, 0.1, 2.7, 4.7, 2.1, 4.3,
11.3, 1.7, 2.8, 2.9, 8.5,
5.2, 9.8]) # <-- new
In homework #8 you fit a Negative Binomial model: if $Y_i$ is the number of bites the $i^\text{th}$ kid got, and $X_i$ is their "odor" value, then $$\begin{aligned} Y_i \sim \text{NegBinom}(\text{mean}= \exp(a X_i + b), \text{n}=n) , \end{aligned}$$ and you might have fit the values of $a$ and $b$ and $n$ with the following code:
import scipy.optimize
import scipy.stats

def L(abn, bites):
    # Negative log-likelihood of (a, b, n); reads the global `odor` array.
    a, b, n = abn
    mu = np.exp(a * odor + b)
    # Convert (mean, n) to scipy's (n, p) parametrization.
    p = 1 / (1 + mu/n)
    return -1 * np.sum(scipy.stats.nbinom.logpmf(bites, n=n, p=p))

initial_guess = [1.2, 0.7, 1]
mle = scipy.optimize.minimize(L, x0=initial_guess,
                              args=(bites,),
                              bounds=((None, None), (None, None), (0, None)))
mle.x
array([0.20504255, 0.52464076, 3.89894044])
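As a sanity check on the parametrization used in `L`, the following sketch confirms that setting $p = 1 / (1 + \mu / n)$ gives a Negative Binomial with mean $\mu$ and variance $\mu + \mu^2 / n$. The values of `mu` and `n` here are illustrative choices, not estimates from the data.

```python
import numpy as np
import scipy.stats

# Illustrative (hypothetical) means and dispersion; any positive values work.
mu = np.array([0.5, 2.0, 10.0])
n = 3.9

# Same conversion as in L: from (mean, n) to scipy's (n, p).
p = 1 / (1 + mu / n)

nb_mean = scipy.stats.nbinom.mean(n, p)
nb_var = scipy.stats.nbinom.var(n, p)

print("mean matches mu:          ", np.allclose(nb_mean, mu))
print("variance is mu + mu^2 / n:", np.allclose(nb_var, mu + mu**2 / n))
```

Because the variance is $\mu + \mu^2 / n$, the extra term shrinks as $n$ grows, and the model approaches a Poisson (variance equal to the mean) for large $n$.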
Now, your job is to use the parametric bootstrap to describe the uncertainty of these estimates. To do this:
(a) Write a function that simulates new bites data from this model at the estimated parameter values.
(b) Apply the same method of parameter estimation to at least 1,000 simulated datasets, and describe the resulting distribution of estimates.
(c) Communicate the results, in particular addressing how strongly odor affects number of mosquito bites and whether the Poisson model is a good fit (make sure to communicate uncertainty in these results).
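One possible shape for steps (a) and (b) is sketched below; it is a starting point under stated assumptions rather than a reference solution. The helper names `simulate_bites`, `neg_log_lik`, and `fit` are hypothetical, the odor values and estimates are copied from above, each fit starts from the estimated parameters, and the lower bound on $n$ is set slightly above zero (an assumption made here to avoid dividing by zero). Only 200 replicates are run to keep the sketch quick; the assignment asks for at least 1,000.

```python
import numpy as np
import scipy.optimize
import scipy.stats

rng = np.random.default_rng(123)

odor = np.array([2.8, 4.4, 6.9, 2.3, 5.9, 10.2, 3.2, 7.6, 6.3, 4.5, 4.3,
                 0.0, 8.2, 5.4, 7.6, 3.3, 3.9, 0.1, 2.7, 4.7, 2.1, 4.3,
                 11.3, 1.7, 2.8, 2.9, 8.5, 5.2, 9.8])  # ppm
mle_x = np.array([0.20504255, 0.52464076, 3.89894044])  # (a, b, n) from the fit above

def simulate_bites(abn, odor, rng):
    """Draw one bite count per kid from the model at parameters abn."""
    a, b, n = abn
    mu = np.exp(a * odor + b)
    p = 1 / (1 + mu / n)
    return rng.negative_binomial(n, p)

def neg_log_lik(abn, bites, odor):
    """Negative log-likelihood, with odor passed explicitly."""
    a, b, n = abn
    mu = np.exp(a * odor + b)
    p = 1 / (1 + mu / n)
    return -np.sum(scipy.stats.nbinom.logpmf(bites, n=n, p=p))

def fit(bites, odor, x0):
    """Re-estimate (a, b, n) by maximum likelihood."""
    res = scipy.optimize.minimize(
        neg_log_lik, x0=x0, args=(bites, odor),
        bounds=((None, None), (None, None), (1e-6, None)))
    return res.x

num_boot = 200  # increase to at least 1000 for the assignment
boot = np.array([fit(simulate_bites(mle_x, odor, rng), odor, mle_x)
                 for _ in range(num_boot)])

# Percentile intervals for each parameter (columns: a, b, n).
ci = np.percentile(boot, [2.5, 97.5], axis=0)
print(ci)
```

For part (c), the column of `boot` for $a$ describes the uncertainty in how strongly odor affects bites, and the column for $n$ bears on the Poisson question, since large values of $n$ correspond to a nearly Poisson model.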