Statistical Engineering

Answer-Shopping

These data do not support the hypothesis.
Well … The first one does, but the second and third don’t.
Now the fourth …

“Outliers”

Many of us have encountered well-meaning colleagues culling their data for “outliers” – observations that don’t agree with their theory – the removal of which turns a muddy scenario into a clear one. Likely wrong, but clear. One is reminded of Thomas Huxley’s observation: “The great tragedy of science – the slaying of a beautiful hypothesis by an ugly fact.” (more about Outliers here)

But answer-shopping can be more than cherry-picking your data. It can also mean running several different approaches to the same problem and then choosing the answer you like, rather than the most plausible one.

This is especially dangerous when a decision-maker does not understand the theoretical differences among the statistical methods, has no criterion for choosing the best answer, and so chooses the most favorable one.

Confidence bounds on the Weibull curve provide an example. We use the loglikelihood ratio (LLR) and the asymptotic chi-square criterion to construct bounds on the probability model because the LLR is considered the statistical “gold standard,” i.e. its coverage is consistently close to nominal. (If we say the coverage is 95%, then it is close to 95%, not, say, the 80% that other methods might produce.)
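As a rough illustration of the idea, here is a minimal sketch in Python, not the implementation used here. It assumes complete (uncensored) data, a two-parameter Weibull with scale `eta` and slope `beta`, and a joint region with 2 degrees of freedom for the chi-square criterion. The data values, the function names and the grid limits are all hypothetical choices made for the example.

```python
import math

# Hypothetical failure times, for illustration only.
failure_times = [105, 83, 123, 64, 91, 79, 148, 110, 97, 135]

def weibull_loglik(data, eta, beta):
    """Log-likelihood of complete (uncensored) data under a Weibull(eta, beta)."""
    return sum(
        math.log(beta / eta) + (beta - 1.0) * math.log(x / eta) - (x / eta) ** beta
        for x in data
    )

def fit_weibull(data, lo=0.01, hi=100.0, tol=1e-10):
    """Maximum-likelihood fit: bisect the score equation for beta, then solve for eta."""
    xmax = max(data)
    mean_log = sum(math.log(x) for x in data) / len(data)

    def score(beta):
        # Ratios are scaled by xmax so that large powers cannot overflow.
        w = [(x / xmax) ** beta for x in data]
        weighted = sum(wi * math.log(x) for wi, x in zip(w, data)) / sum(w)
        return weighted - 1.0 / beta - mean_log  # increasing in beta

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    eta = xmax * (sum((x / xmax) ** beta for x in data) / len(data)) ** (1.0 / beta)
    return eta, beta

def lr_statistic(data, eta, beta, fit):
    """Twice the drop in log-likelihood from the MLE to a candidate (eta, beta)."""
    return 2.0 * (weibull_loglik(data, *fit) - weibull_loglik(data, eta, beta))

def chi2_quantile_2df(conf):
    """Chi-square quantile with 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - conf)

def quantile_bounds(data, p, conf=0.95, steps=200):
    """Range of the p-th quantile over grid points (eta, beta) inside the LR region."""
    fit = fit_weibull(data)
    eta_hat, beta_hat = fit
    cutoff = chi2_quantile_2df(conf)
    lo, hi = math.inf, -math.inf
    for i in range(steps + 1):
        beta = beta_hat * (0.2 + 2.8 * i / steps)    # assumed grid: 0.2x to 3x the MLE
        for j in range(steps + 1):
            eta = eta_hat * (0.5 + 1.5 * j / steps)  # assumed grid: 0.5x to 2x the MLE
            if lr_statistic(data, eta, beta, fit) <= cutoff:
                x_p = eta * (-math.log(1.0 - p)) ** (1.0 / beta)
                lo, hi = min(lo, x_p), max(hi, x_p)
    return lo, hi

if __name__ == "__main__":
    print(fit_weibull(failure_times))
    print(quantile_bounds(failure_times, p=0.001))
```

A parameter pair is inside the region when twice its drop in log-likelihood from the maximum does not exceed the chi-square quantile. The grid scan is crude but transparent: if the region touches the edge of the grid, the grid limits need widening.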

Historically, constructing loglikelihood ratio bounds was computationally onerous, so approximate methods, such as inverting the Fisher information matrix or constructing a Cheng and Iles\(^{(1)}\) ellipse, were used. (I used them myself 30+ years ago.)

But today readily accessible computing makes such approximate methods obsolete – unless you are answer-shopping. Offering a choice of an inferior approximate method when a better method exists is disingenuous, so only two methods are used here: the asymptotic chi-square loglikelihood ratio for \(X\) and the binomial for \(F(X)\), discussed next.

There is another method that augments, rather than competes with, the loglikelihood ratio for constructing bounds: the binomial-based confidence intervals\(^{(2)}\) for \(F(X)\). The loglikelihood ratio method constructs simultaneous bounds on the model parameters directly, and thus indirectly on the range of \(X\) at a given (cumulative) probability, such as 1/1000. In contrast, the binomial bounds are on \(F(X)\), i.e. on the range of the probability at a given value of \(X\).
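A minimal sketch of binomial bounds on \(F(X)\) follows. It assumes the exact (Clopper-Pearson) form of the interval, which may differ in detail from the version in the cited reference. The function names are hypothetical, and `k` is taken to be the number of the `n` observations at or below the chosen value of \(X\).

```python
import math

def binom_cdf(k, n, p):
    """P(K <= k) for K ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def _bisect_decreasing(f, lo=0.0, hi=1.0, tol=1e-12):
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binomial_bounds(k, n, conf=0.95):
    """Two-sided bounds on F(x) when k of n observations are <= x."""
    alpha = 1.0 - conf
    if k == 0:
        lower = 0.0
    else:
        # Lower bound solves P(K >= k | p) = alpha / 2.
        lower = _bisect_decreasing(
            lambda p: alpha / 2.0 - (1.0 - binom_cdf(k - 1, n, p))
        )
    if k == n:
        upper = 1.0
    else:
        # Upper bound solves P(K <= k | p) = alpha / 2.
        upper = _bisect_decreasing(lambda p: binom_cdf(k, n, p) - alpha / 2.0)
    return lower, upper

if __name__ == "__main__":
    # Example: 3 of 10 observations fall at or below the value of interest.
    print(binomial_bounds(3, 10))
```

Note that these bounds use only the count `k` and the sample size `n`; no distribution is assumed for \(X\), which is why they complement rather than compete with the model-based bounds.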

Distribution shopping.

Weibull is the safe choice unless you have a theoretical reason for insisting on another density, perhaps the exponential or Rayleigh distribution, which are special cases of the Weibull with slope \(\beta = 1\) and \(\beta = 2\), respectively. Even then, you should let the data suggest the slope: if the slope is not (approximately) the assumed value, fixing it produces optimistic bounds.
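To make the special cases concrete, write the Weibull cumulative distribution with slope \(\beta\) and a scale parameter \(\eta\) (the symbol \(\eta\) and the Rayleigh parameter \(\sigma\) are notation assumed here, not used elsewhere in this discussion):

\[
F(x) = 1 - \exp\left[-\left(\frac{x}{\eta}\right)^{\beta}\right], \qquad x \ge 0 .
\]

Setting \(\beta = 1\) gives the exponential distribution with mean \(\eta\):

\[
F(x) = 1 - \exp\left(-\frac{x}{\eta}\right) .
\]

Setting \(\beta = 2\) gives the Rayleigh distribution with \(\sigma = \eta / \sqrt{2}\):

\[
F(x) = 1 - \exp\left(-\frac{x^{2}}{\eta^{2}}\right) = 1 - \exp\left(-\frac{x^{2}}{2\sigma^{2}}\right) .
\]

In both cases one parameter is fixed in advance rather than estimated, so the bounds do not reflect any uncertainty about the slope.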

NOTES:

(1) Cheng, R.C.H. and T.C. Iles, “Confidence Bands for Cumulative Distribution Functions of Continuous Random Variables,” Technometrics, vol. 25, no. 1, pp. 77–86 (1983)
(2) Meeker and Escobar, Statistical Methods for Reliability Data, Wiley, 1998, p5-52
