Statistical Engineering

Runouts

Analyzing Fatigue Data with Runouts

Here is a brief description of the statistical treatment of runouts in fatigue testing. A runout is a test that is interrupted before the specimen fails. Examples include testing discontinued after \(10^7\) cycles, or specimen failure outside the gage section after N cycles.

Runouts:

Runouts do not contain the same information about the placement of an s-N curve as failures do, and ignoring this fact will lead to serious errors in estimated material response. To illustrate this, consider this thought experiment: A test is performed very near the runout strength of a material and it does not fail before \(10^7\) cycles. A similar test, run at 20 ksi below runout, is also stopped after \(10^7\) cycles. Including the first runout in an ordinary least-squares regression, as if it were a failure, would have little effect on the final position of the curve, because that point lies close to where the curve already is. But including the second, very low point, as if it were a failure, will greatly lower the resulting curve. (Similar anticonservative situations can also occur, so we can’t just shrug it off as being “conservative.”)

Ordinary Least Squares

When all fatigue specimens fail, Ordinary Least-Squares (OLS) is the accepted method for estimating the parameters of the s-N model. This method has been the basis of engineering data analysis for the 200 years since Gauss popularized it. Here’s how it works.
Some mathematical model is proposed which relates stress (or strain, and/or temperature, or whatever) with cycles to failure, N. The goal is to choose parameters for the model which “best” fit the data. Gauss said that “best” means that the sum of the squared residuals is a minimum. (A residual is the difference between an observation and the model prediction.) Another way of saying the same thing is that the variance of the observations about the predicted behavior is as small as possible.

Given this criterion for goodness, the OLS method first writes the equation for the sum of the squares of the differences between the observed and expected lives. This relationship is then differentiated with respect to each of the model parameters, and these derivatives are set equal to zero. The simultaneous solution to these equations (the “Normal” equations) provides the desired least-squares estimates of the parameter values. (Statisticians don’t talk about “measuring” a parameter value; they “estimate” it. That’s because the estimate will change slightly given different, or new, data, something that wouldn’t happen with something that could be measured without error.)

Now, if the equation chosen to represent the s-N behavior is linear in the model parameters, then the solution to the Normal equations can be written down directly. Consider \(y = X b\), where \(y\) is the column vector of the dependent variable observations (such as \(\log(N)\)), \(X\) is an \(n \times m\) matrix of life-controlling variables (such as stress or log(stress), and/or temperature, or whatever), \(n\) is the number of observations, and \(m\) is the number of model parameters, including the offset. \(b\) is the column vector, of length \(m\), of model parameters. (The first column of \(X\) is all ones, unless the offset is defined to be zero and the equation is forced to go through the origin.) The general solution is \(\hat b = (X'X)^{-1} X'y\), where the prime indicates matrix transpose, and the \(^{-1}\) indicates that its antecedent is to be inverted. The “hat” (a caret above a parameter) indicates a statistical estimate rather than a known value.
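As a rough illustration of the Normal-equations solution, here is a minimal Python sketch that forms \(X'X\) and \(X'y\) and solves for \(\hat b\) by Gaussian elimination. The function name `solve_normal_equations`, the log-log form of the model, and the stress and life numbers are all made up for the example; they are not data or code from any real test program.

```python
import math

def solve_normal_equations(X, y):
    """Return b_hat solving (X'X) b = X'y for a small, well-conditioned X."""
    n, m = len(X), len(X[0])
    # Form X'X (m x m) and X'y (length m).
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(m)]
           for i in range(m)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(m)]
    # Augmented matrix [X'X | X'y], reduced by elimination with partial pivoting.
    a = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= factor * a[col][c]
    # Back-substitution.
    b = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, m))
        b[i] = (a[i][m] - tail) / a[i][i]
    return b

# Hypothetical data: every specimen failed (no runouts).
stress = [100.0, 90.0, 80.0, 70.0, 60.0]   # ksi
cycles = [1.6e4, 4.0e4, 1.0e5, 3.2e5, 1.3e6]

# Assumed model: log10(N) = b0 + b1 * log10(stress).
# The first column of X is all ones (the offset).
X = [[1.0, math.log10(s)] for s in stress]
y = [math.log10(n) for n in cycles]

b_hat = solve_normal_equations(X, y)
print("offset b0 = %.3f, slope b1 = %.3f" % (b_hat[0], b_hat[1]))
```

For real work a library least-squares routine is the safer choice; explicitly forming \(X'X\) is shown here only because it mirrors the formula.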

Censored!

The foregoing is a summary of current engineering practice, used with success for 200 years. But what about a “censored” observation? What about a specimen which didn’t fail after N cycles? How can you calculate a residual for that? (Quick answer: you can’t.) Because it could have failed at any point after it was suspended, the “error” (residual) cannot be defined, thus it can’t be included in the summed squared error to be minimized, and so you can’t get there from here. You can PRETEND that it DID fail, and calculate a residual, and so on, but you would be WRONG for the reasons we explored with our thought experiment. What can we do?

R. A. Fisher’s Idea

This problem was solved in the comparatively recent past. The solution is based on some really groovy stuff thought up by R.A. Fisher in the early decades of the last century, and brought into engineering practice only about 15 years ago. Fisher looked at the problem of parameter estimation using a different criterion for goodness. Fisher said the “best” parameter value would be the one which maximized the likelihood that the experiment would have turned out the way it actually did. He said that you could choose any parameter values you wanted, but some would be more likely to be the true values, given the experimental results. Pretty simple, right? (Quick answer: Right. Pay attention here.)

Likelihood

What’s a “likelihood?” Picture the s-N data with a best-fit line through it. Now imagine a (normal) distribution of lives scattered about the line, at a constant stress, for example. The likelihood (of the line’s being in the right place) is the ordinate of the probability density which is centered at the model value. (It’s the height of the density curve at the observed value of N.) Obviously, if the line is nowhere near the data, the normal distribution won’t be centered appropriately, and the ordinates evaluated at the N values will be low. We want to put the curve through the data so its likelihood is maximized.

We can do this maximum likelihood stuff the same way we did the least-squares: beginning with the likelihood equation, which is just the product of all those individual likelihoods. For practical purposes it’s helpful to take the log of the likelihood equation because it turns all those products into sums (of logarithms). Next, differentiate this equation with respect to the model parameters. (See how much easier it is to differentiate a sum than a series of products?) These derivatives are set equal to zero and solved simultaneously. This usually requires an iterative solution. Now, because the logarithm is a monotone increasing function, it reaches a maximum when the variable of which it is the logarithm reaches a maximum, so the solution to the maximum of the log of the likelihood occurs at the same parameter values as the maximum of the likelihood function itself. So we’re done. (See how easy this is?)

How Well Does it work?

So, how do parameters estimated with Fisher’s maximum likelihood criterion compare with those estimated using Gauss’s least-squares criterion? They are EXACTLY the same. Not close – exact. (OK, a little fine print: given that the errors are normally distributed, which is usually the case.) So what? That means that if there were NO censored observations, this new-fangled method still produces results identical to those of the method we’ve been using for 200 years, a comforting situation.
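To see why the two criteria agree, here is a sketch of the algebra, assuming independent, normally distributed errors with a constant standard deviation. The symbols \(\sigma\) (that standard deviation) and \(x_i\) (the \(i\)-th row of \(X\)) are notation introduced for this sketch only.

\[
\log L(b, \sigma) = \sum_{i=1}^{n} \log\left[ \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(y_i - x_i b)^2}{2\sigma^2} \right) \right] = -n \log\left(\sigma\sqrt{2\pi}\right) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - x_i b)^2
\]

Only the last term involves \(b\), and it is the summed squared residuals with a minus sign in front. So, whatever the value of \(\sigma\), making the log likelihood as large as possible means making the sum of squares as small as possible, which is Gauss’s criterion. (More fine print: the agreement is exact for \(b\); the maximum likelihood estimate of \(\sigma^2\) divides the sum of squares by \(n\), where the usual least-squares estimate divides by \(n - m\).)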

So what about a runout? Well, it could be represented by the ordinate at the N cycles where it was discontinued, OR by the ordinate at a few cycles more, OR at even more cycles after that, since it could have failed at any of those cycle-counts. Since we don’t know exactly where the failure would have occurred, only that it has to be after the N observed cycles, the relative likelihood (of the curve being in the right place) is that fraction of the area under the normal curve to the right of the suspension, since the data were right censored. Pretty spiffy, ain’t it? This definition of likelihood also works for left censored observations and for interval censored tests. (An example of interval censoring could be a test which failed over the weekend. The cycle counter was working Friday afternoon, but the specimen was found failed Monday morning, and the cycle count is in doubt. Here the likelihood would be the area under a normal curve between the last known cycle count, and the cycle count estimated by the test frequency and the duration of the interval.)
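The right-censored likelihood just described can be sketched in a few lines of Python. Failures contribute the height of the normal density; runouts contribute the area to the right of the suspension. Everything specific here is an assumption made for illustration: the data are invented to mimic the earlier thought experiment, the model is log10(life) linear in stress, and the maximization uses a simple expectation-maximization (EM) iteration, which is one of several possible ways to do it and not necessarily how any commercial package works.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides pdf() and cdf()

def fit_line(x, y):
    """Closed-form least squares for y = b0 + b1 * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def log_likelihood(b0, b1, sigma, x, y, failed):
    total = 0.0
    for xi, yi, fi in zip(x, y, failed):
        z = (yi - (b0 + b1 * xi)) / sigma
        if fi:
            # Failure: height of the normal density at the observed life.
            total += math.log(STD_NORMAL.pdf(z) / sigma)
        else:
            # Runout: area under the density to the right of the suspension.
            total += math.log(STD_NORMAL.cdf(-z))
    return total

def fit_censored(x, y, failed, iterations=500):
    """EM sketch: start from the naive fit, then repeatedly replace each
    runout by its expected life given survival and refit by least squares."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    sigma = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / n)
    for _ in range(iterations):
        filled, extra_var = [], 0.0
        for xi, yi, fi in zip(x, y, failed):
            if fi:
                filled.append(yi)
                continue
            mu = b0 + b1 * xi
            z = (yi - mu) / sigma
            lam = STD_NORMAL.pdf(z) / STD_NORMAL.cdf(-z)
            filled.append(mu + sigma * lam)               # E[y | y > runout]
            extra_var += sigma ** 2 * (1.0 - lam * (lam - z))  # Var[y | y > runout]
        b0, b1 = fit_line(x, filled)
        sse = sum((yf - b0 - b1 * xi) ** 2 for xi, yf in zip(x, filled))
        sigma = math.sqrt((sse + extra_var) / n)
    return b0, b1, sigma

# Hypothetical data: five failures, then two runouts stopped at 10^7 cycles,
# one near the runout strength and one 20 ksi below it.
stress = [100.0, 90.0, 80.0, 70.0, 60.0, 40.0, 20.0]   # ksi
log_life = [4.2, 4.6, 5.0, 5.5, 6.1, 7.0, 7.0]          # log10(cycles)
failed = [True, True, True, True, True, False, False]

# Naive fit: pretend the runouts were failures.
nb0, nb1 = fit_line(stress, log_life)
nsigma = math.sqrt(sum((yi - nb0 - nb1 * xi) ** 2
                       for xi, yi in zip(stress, log_life)) / len(stress))
# Censored fit: treat the runouts as runouts.
cb0, cb1, csigma = fit_censored(stress, log_life, failed)

ll_naive = log_likelihood(nb0, nb1, nsigma, stress, log_life, failed)
ll_censored = log_likelihood(cb0, cb1, csigma, stress, log_life, failed)
pred_naive_20 = nb0 + nb1 * 20.0
pred_censored_20 = cb0 + cb1 * 20.0
print("predicted log10(life) at 20 ksi, naive:    %.2f" % pred_naive_20)
print("predicted log10(life) at 20 ksi, censored: %.2f" % pred_censored_20)
```

With these made-up numbers the naive fit is dragged down at low stress by the 20 ksi runout, while the censored fit is not, which is the behavior the thought experiment describes.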

How Do You Do It?

OK, OK, I left one teensy part out: How do you actually DO the censored regression? Well, I had to write my own FORTRAN routines when we first began using the method nearly three decades ago because none of the commercial software packages did it, including SAS. Turns out I did it the hard way, too, since I was unaware of iteratively reweighted least squares (IRLS) at the time, but that’s another story.

I know of two commercial packages available now: SAS’s proc LIFEREG, and S-Plus. I use S-Plus. It has a steep, steep learning curve, but once you get the hang of it you can do almost anything statistical using it. Further, the newer releases beginning with S-Plus 2000 have a GUI (in addition to the old command line) and are much easier than earlier versions.

Update: Professor William Q. Meeker has graciously made his S-Plus adjunct software available to anyone using S-Plus. It includes the codes he used in his recent book, Meeker and Escobar, Statistical Methods for Reliability Data, Wiley, 1998. To get the software, which requires S-Plus to use, visit his website.

Further Update: When my son began his Ph.D. work at Purdue more than a decade ago, they used R, an open-source statistical package with a syntax similar to S-Plus. Over the intervening years I have switched from S-Plus to R, exclusively, for two reasons: R is free and supported by some of the best minds in international applied statistics, and it has 20,000 (give or take) special purpose add-on programs (also free). S-Plus is expensive, and doesn’t have the breadth that R has. I estimate (from the user-group mailing volume) that R has about 100 times the number of users that S-Plus has. Unfortunately, Prof. Meeker’s software only works with S-Plus, so I have written my own to use with R’s routines. You can find out more about R here. To return to the StatisticalEngineering website, just close the new window.

Well, that’s the long and short of it: You MUST use censored regressions with runout s-N data. The form of the equation you select to represent the behavior is your choice – log(life) vs. log(stress), or log(life) vs. 1/stress, whatever – but always use the correct procedure: censored regression.

References:

Meeker and Escobar, Statistical Methods for Reliability Data, Wiley, 1998. This is an excellent book, 680 pages chock-full of ideas, information, and techniques presented in an unstuffy and intuitive manner. Buy this book.

Another, older, reference is Lawless, Statistical Models and Methods for Lifetime Data, Wiley, 1982. This, too, is a great book, but it’ll be rough sledding for a statistical newbie.