Instructions: Please answer the following questions, then submit your work by editing this Jupyter notebook and uploading it to Canvas. Questions may involve math, programming, or neither, but you should make sure to explain your work: i.e., you should usually have a cell with at least a few sentences explaining what you are doing.
Also, please be sure to always specify units of any quantities that have units, and label axes of plots (again, with units when appropriate).
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng()
Make up a situation in which we'd have measured at least 3 quantitative variables in at least 500 observations. You should have some positively correlated pairs of variables and some negatively correlated pairs. It does not have to be realistic or serious.
(a) Describe it in words.
(b) Simulate some data that looks at least roughly like what you'd expect real data to look like.
(c) Make plots of the data: histograms of each variable, and scatter plots of each pair of variables.
(d) Compute the correlation matrix for your simulated dataset and explain why correlations are positive or negative.
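As a rough illustration of how parts (b) and (d) fit together, here is a minimal sketch under a made-up scenario: 500 trick-or-treaters, with hours spent out, pieces of candy collected, and hours of sleep that night. The scenario, the variable names, the seed, and every numeric parameter are assumptions chosen for illustration, not part of the assignment; your own situation should differ.

```python
import numpy as np

rng = np.random.default_rng(seed=123)  # arbitrary seed, only so the sketch is reproducible
n = 500                                # number of observations (hypothetical trick-or-treaters)

# Hours spent trick-or-treating (hours), between half an hour and four hours.
hours_out = rng.uniform(0.5, 4.0, size=n)

# Pieces of candy collected (count): Poisson with mean 20 pieces per hour out,
# so more time out tends to mean more candy (positive correlation).
candy = rng.poisson(lam=20 * hours_out)

# Hours of sleep that night (hours): decreases with time out, plus noise,
# clipped at zero so it can never be negative (negative correlation with hours_out).
sleep = np.clip(10 - 0.8 * hours_out + rng.normal(0, 0.5, size=n), 0, None)

# Stack the variables as columns and compute the 3 x 3 correlation matrix.
data = np.column_stack([hours_out, candy, sleep])
corr = np.corrcoef(data, rowvar=False)
print(np.round(corr, 2))
```

Passing `rowvar=False` tells `np.corrcoef` that each column, rather than each row, is a variable. In this sketch the sign of each correlation follows from how the variable was built: candy is generated with a mean that increases with `hours_out`, while sleep is generated with a mean that decreases with it.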
Note: By "looks at least roughly like what you'd expect", I mean that variables should be in real units and should not take totally unreasonable values. So, counts should actually be integers, weights should not be negative numbers, and so on. For instance, if one of your variables is "number of pieces of candy obtained by a trick-or-treater", then the values should be nonnegative integers, and should not be in the millions. (If they're in the thousands, that's probably not realistic, but close enough.)
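One way to satisfy these constraints is to pick distributions whose support already matches the quantity, and then check the result. The following is a sketch only: the choice of Poisson and lognormal distributions, the seed, and the parameter values (mean of 40 pieces of candy, median weight of 30 kg) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility
n = 500

# Counts: Poisson draws are nonnegative integers by construction.
candy = rng.poisson(lam=40, size=n)  # pieces of candy (count)

# Strictly positive quantities: a lognormal draw is always greater than zero.
weight_kg = rng.lognormal(mean=np.log(30), sigma=0.2, size=n)  # weight (kg)

# Quick sanity checks on type and range before plotting anything.
assert np.issubdtype(candy.dtype, np.integer)
assert candy.min() >= 0 and candy.max() < 1000
assert weight_kg.min() > 0
```

If you instead start from a normal distribution, rounding with `np.round(...).astype(int)` and clipping with `np.clip(..., 0, None)` give integer, nonnegative values, at the cost of slightly distorting the distribution near zero.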