Statistical Engineering

Bivariate Normal Density

Here is a simple algorithm for sampling from a bivariate normal distribution.

The joint probability density of \(x_1\) and \(x_2\) is the bivariate normal density:

\[f(x_1, x_2)=const \times \exp \Bigg(- \frac{1}{2(1-\rho^2)} \Bigg[ \Big( \frac{x_1 -\mu_1}{\sigma_1} \Big)^2 -2 \rho \, \frac{x_1 -\mu_1}{\sigma_1} \, \frac{x_2 -\mu_2}{\sigma_2} + \Big( \frac{x_2 -\mu_2}{\sigma_2} \Big)^2 \Bigg] \Bigg) \]

where \[const = \frac{1}{2 \pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \]
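As a quick check on the formula, the density can be evaluated directly. The following is a minimal sketch, not a library routine: the function name `bivariate_normal_pdf` and its default arguments are illustrative assumptions, and the quadratic form is taken to be multiplied as a whole by \(-1/(2(1-\rho^2))\).

```python
import math


def bivariate_normal_pdf(x1, x2, mu1=0.0, mu2=0.0,
                         sigma1=1.0, sigma2=1.0, rho=0.0):
    """Evaluate the bivariate normal density at (x1, x2).

    Hypothetical helper for illustration; requires sigma1, sigma2 > 0
    and -1 < rho < 1 so that the constant is finite.
    """
    if sigma1 <= 0.0 or sigma2 <= 0.0:
        raise ValueError("standard deviations must be positive")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")

    # Standardized deviations from the means.
    u = (x1 - mu1) / sigma1
    v = (x2 - mu2) / sigma2

    one_minus_rho2 = 1.0 - rho * rho
    const = 1.0 / (2.0 * math.pi * sigma1 * sigma2 * math.sqrt(one_minus_rho2))

    # Quadratic form inside the exponential.
    q = (u * u - 2.0 * rho * u * v + v * v) / one_minus_rho2
    return const * math.exp(-0.5 * q)


# With rho = 0 the density factors into two univariate normal densities.
print(bivariate_normal_pdf(1.0, 2.0))
```

When \(\rho = 0\) the result should equal the product of the two univariate normal densities, which is a convenient sanity check.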

To sample from this density,

Generate two independent (and therefore uncorrelated) standard normal variates, \(z_1\) and \(z_2\).

\[z_i \sim N(\mu = 0, \sigma^2 = 1), \quad i = 1, 2\]

where “\(\sim\)” is read “… is distributed as …”

Compute the correlated \(x_1\) and \(x_2\).

\[x_1 =\mu_1+\sigma_1 z_1\]

\[x_2 =\mu_2+\sigma_2 \Big( \rho z_1 + \sqrt{1-\rho^2} \, z_2 \Big)\]

\(x_1\) and \(x_2\) will have means \(\mu_1\) and \(\mu_2\), standard deviations \(\sigma_1\) and \(\sigma_2\), and correlation \(\rho\).
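The two steps can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: it takes the second variate to be \(x_2 = \mu_2 + \sigma_2 (\rho z_1 + \sqrt{1-\rho^2}\, z_2)\), the usual construction for inducing correlation \(\rho\), and the names `sample_bivariate_normal` and `sample_correlation` are hypothetical helpers.

```python
import math
import random


def sample_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, rng):
    """Draw one (x1, x2) pair; rng is a random.Random instance."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")

    # Step 1: two independent standard normal variates.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)

    # Step 2: shift and scale z1; mix z1 and z2 to induce correlation rho.
    x1 = mu1 + sigma1 * z1
    x2 = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x1, x2


def sample_correlation(pairs):
    """Pearson correlation of a list of (x1, x2) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    s11 = sum((p[0] - m1) ** 2 for p in pairs)
    s22 = sum((p[1] - m2) ** 2 for p in pairs)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    return s12 / math.sqrt(s11 * s22)


rng = random.Random(2024)  # fixed seed so the run is repeatable
pairs = [sample_bivariate_normal(10.0, -3.0, 2.0, 0.5, 0.6, rng)
         for _ in range(20000)]
print("sample correlation:", round(sample_correlation(pairs), 3))
```

With a large number of draws, the sample means, standard deviations, and correlation should settle near the values passed in, which is an easy way to confirm the construction.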

Cautions:

1) While it is almost always possible to calculate means and standard deviations, that doesn’t mean the data have a normal distribution.

2) Using a bivariate normal density because it is convenient, without checking how well it agrees with the data, is dangerous.

3) Using the parameter estimates \(\bar{x}\) and \(s\) uncritically, as though they were the population parameters \(\mu\) and \(\sigma\) themselves, is also dangerous, especially with small samples (which notoriously underestimate \(\sigma\)) or when estimating small probabilities, \(P_{fail} < 0.001\).
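The small-sample effect in the third caution can be seen in a short simulation. The sketch below assumes normally distributed data with true \(\sigma = 1\) and averages the sample standard deviation \(s\) (with the \(n-1\) denominator) over many repeated samples. For comparison it also computes the theoretical ratio \(E[s]/\sigma\) for normal samples, often written \(c_4(n)\). The function names are illustrative.

```python
import math
import random


def c4(n):
    """Theoretical E[s] / sigma for a normal sample of size n."""
    return (math.sqrt(2.0 / (n - 1))
            * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0))


def mean_sample_sd(n, trials, rng):
    """Average s over many normal samples of size n (true sigma = 1)."""
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(sample) / n
        # Sample standard deviation with the n - 1 denominator.
        total += math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return total / trials


rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (3, 5, 10, 30):
    print(n, round(mean_sample_sd(n, 20000, rng), 3), round(c4(n), 3))
```

Both columns should fall below 1 and approach it as \(n\) grows, so on average \(s\) understates \(\sigma\), most noticeably for the smallest samples.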