Link Functions and Links

The link function connects an unbounded continuous variable with a response bounded on (0, 1). The (0, 1) response can be thought of either as binary (it happened or it did not) or as a continuous probability of happening.

Example: Logistic Link

Logit (log-odds) link definition: \(z = log(p/(1-p))\)

 

\[z = f(size = a) = \beta_0 + \beta_1 a\]

\[z = g(POD) = log(p/(1-p))\]

\[exp(z) = \frac{p}{1-p}\]

Solve for p:

\[exp(z) (1-p) = p\]

\[exp(z) - exp(z) p = p\]

\[exp(z) = p + exp(z) p = p(1 + exp(z))\]

\[p = \frac{exp(z)}{1+exp(z)}\]

logistic link:

\[POD(a) = \frac{exp(\beta_0 + \beta_1 a)}{1 + exp(\beta_0 + \beta_1 a)} \tag{1}\]
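The algebra above, \(p = exp(z)/(1+exp(z))\) with \(z = \beta_0 + \beta_1 a\), can be checked numerically. The following is a minimal Python sketch, not part of the original material: the function names (`logit`, `inv_logit`, `pod`) and the parameter values are assumptions chosen only for illustration.

```python
import math

def logit(p):
    """Log-odds link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse link: maps any real z back to a probability, exp(z) / (1 + exp(z))."""
    # Two algebraically equivalent branches so exp() never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def pod(a, beta0, beta1):
    """POD(a) for the linear predictor z = beta0 + beta1 * a."""
    return inv_logit(beta0 + beta1 * a)

# Hypothetical parameter values, for illustration only.
beta0, beta1 = -4.0, 2.0
for a in (0.5, 2.0, 4.0):
    p = pod(a, beta0, beta1)
    # Applying the link to POD(a) should recover z = beta0 + beta1 * a.
    print(f"a = {a:.1f}  z = {beta0 + beta1 * a:+.1f}  POD = {p:.4f}  logit(POD) = {logit(p):+.4f}")
```

With these assumed values the linear predictor is zero at \(a = 2\), so POD there is exactly 0.5, and applying the link to POD(a) returns the original z.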

Finding values for \(\beta_0\) and \(\beta_1 \)

While the logistic transformation results in a linear relationship between \(a\) and \(z\), the resulting error (see note 1) structure is binomial, NOT Gaussian. That means the familiar methods of least-squares regression are not appropriate. That doesn’t mean you can’t coerce a computer program to give you an answer using OLS with a binary or proportion response. It does mean, however, that the answer will be wrong.

Parameter values are determined by the method of maximum likelihood, i.e., by finding the values that maximize the likelihood that the experiment turned out the way it did.

This is accomplished by defining a likelihood function and maximizing it numerically. The likelihood of an individual observation is given by \(POD(a_i)\), Eq. (1), for “hits” and by \(1 - POD(a_i)\) for “misses.” The total likelihood is the product of the individual likelihoods.
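As a rough illustration of the procedure just described, the sketch below builds the log of the total likelihood (a sum of log POD for hits and log(1 - POD) for misses) and maximizes it with Newton-Raphson iterations. This is a minimal sketch under stated assumptions, not the method used in any particular POD software: the hit/miss data are invented for illustration, and the function names (`log_likelihood`, `fit_hit_miss`) are hypothetical.

```python
import math

def pod(a, b0, b1):
    """POD(a) = exp(z) / (1 + exp(z)) with z = b0 + b1 * a."""
    z = b0 + b1 * a
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(b0, b1, sizes, hits):
    """Log of the product of POD(a_i) for hits and 1 - POD(a_i) for misses."""
    total = 0.0
    for a, y in zip(sizes, hits):
        p = pod(a, b0, b1)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_hit_miss(sizes, hits, max_iter=50, tol=1e-10):
    """Maximize the log-likelihood over (b0, b1) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for a, y in zip(sizes, hits):
            p = pod(a, b0, b1)
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to b0
            g1 += (y - p) * a      # gradient with respect to b1
            h00 += w               # entries of the negative Hessian
            h01 += w * a
            h11 += w * a * a
        det = h00 * h11 - h01 * h01
        # Newton step: solve (negative Hessian) * d = gradient for the 2x2 case.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    return b0, b1

# Invented hit/miss data (1 = hit, 0 = miss), for illustration only.
sizes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
hits = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

b0_hat, b1_hat = fit_hit_miss(sizes, hits)
print("beta0 =", b0_hat, "beta1 =", b1_hat)
print("log-likelihood at fit:", log_likelihood(b0_hat, b1_hat, sizes, hits))
```

The sum of logs is maximized rather than the raw product because it has the same maximizer and avoids numerical underflow when many probabilities are multiplied. Note that the invented data deliberately include overlapping hits and misses; if the hits and misses were perfectly separated by size, the maximum likelihood estimates would not be finite.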

Logistic (and Probit) regression are discussed in detail in my POD Workshop and Short Course.


Notes:

  1. Error here is the difference between the observed response (either 0 or 1) and the expected response provided by the model, \(POD(a) = \frac{exp(\beta_0 + \beta_1 a)}{1 + exp(\beta_0 + \beta_1 a)}\).