image thumbnail
from New Statistics Tools in MATLAB(R) and the Statistics ToolboxTM by Dan Doherty
M-files used in the webinar held on November 18, 2004.

survivalscript.m
%% Controler reliability
% Today we will use the Distribution Fitting Tool (DFT) to analyze data on
% the survival time for a motion controller.  This data set is "censored,"
% as we will explain.  However, the tool can be used to fit distributions
% to any type of data, not just reliability or survival data.
load webinardata
censplot(obstime,censored);

%%
% Let's bring up a plot of the survival times.  This plot is not generated
% by the DFT, but is just an illustration of censoring.  Each of the 100
% lines represents the survival time for a controller.  Notice there is a
% clump of times near 100 hours, then a smooth distribution of times up
% to 1600 hours.
%
% About 2/3 of the controllers failed before the end of the test at 1600
% hours, but the remainder were still in operation.  This is an example of
% "censoring," where the end of the test prevents us from seeing all of the
% survival times.

%% Distribution fitting tool
% Let's load the data set into the DFT using the command line to save time,
% even though it is more typical to load via the GUI.
dfittool(obstime,censored)

%%
% The spike at low values is visible.  This is common with reliability
% data.  It represents a phenomenon known as "infant mortality."  Some
% items with defects are prone to early failure, but the ones that survive
% past this early stage show a much better distribution of failure times.
%
% We have several options here.  We could fit the data as is.  We could use
% the "exclusion rule" of the distribution fitting tool to set aside the
% early failures and fit just to the rest of the data.  Or we could omit
% those early failures entirely.
%
% We usually detect early failures before we ship to customers.  That's
% because we go through a "burn-in" process where we run the controllers
% for a while in-house, discarding those that fail during the burn-in
% period.  For that reason we'll simply import a new data set that omits
% the failures before 200 hours.  Again, we could do that using the Data
% button, but instead we'll just re-start the tool.
close all force
obstime2 = obstime(obstime>100);
censored2 = censored(lifetime>100);
dfittool(obstime2, censored2)

%% Fitting distributions
% The DFT has about 16 distributions built in, plus the ability to fit
% custom distributions that you specify. A common distribution for modeling
% lifetime data is the Weibull distribution.  Let's fit that.
%
% Notice that the Weibull density provides a good fit to the histogram, but
% we only see the density up to 1600 hours.  That's because the test only
% ran that long, and points beyond there were censored.
%
% The DFT provides a number of display types in addition to the density.
% Another way to look at this distribution is to switch to the cumulative
% distribution function (cdf).  This is the probability of failure before a
% given time.  The jagged curve represents the observed proportions of
% failures before each time.  The smooth curve represents the Weibull fit.
%
% We'll also try fitting a couple of other distributions.  Electronic
% components are often modeled with an exponential distribution,
% when components appear not to age.  Mechanical components, on the other
% hand, do tend to wear out or age.  The exponential fit is not good.
% We'll also try a nonparametric or "smooth" fit.  This is just a
% smoothed-out version of the empirical curve.  It fits better, but
% looks no better then the Weibull.

%%
% It's easier to compare fits using a Weibull probability plot.  This is
% like a cdf plot but with the y axis distorted so that the Weibull fit
% looks like a straight line.  The line matches our data reasonably well.
% We'll delete the other two fits.
%
% Other distributions used for modeling lifetime data include the
% gamma and Burbaum-Saunders distributions, but we won't try them here.
% Go back to the density display.

h1 = probplot('weibull',obstime2,censored2);
set(h1(2),'color','g','linestyle','-');
pe = expfit(obstime2,censored2);
h = probplot(gca,@expcdf,pe);
set(h,'color','r','linestyle','-');
f = @(x)ksdensity(obstime2,x,'cens',censored2,'function','cdf');
h = probplot(gca,f);
set(h,'color','c','linestyle','-');

%% Predicting reliability
% We're satisfied with the Weibull fit.  Let's use it to predict the times
% at which various proportions our controllers would survive.  Go to the
% Evaluate screen, choose the Weibull fit, choose the Quantile function,
% enter ".1:.1:.9" (without the quotes) into the "At p=" field, and click
% on the Apply button.  Interpret the results.

pw = wblfit(obstime2,censored2);
[nlogl,pcov] = wbllike(pw,obstime2,censored2);
[x,xlo,xup] = wblinv((0.1:0.1:0.9)',pw(1),pw(2),pcov,0.05);
disp('estimated survival times')
sprintf('   %12.2f\t%12.2f\t%12.2f\n', [x xlo xup]')

%% Generate m code
% Finally, let's write an m file.  The Statistics ToolboxTM has command-line
% functions to do the things we just did with the Distribution Fitting
% Tool.  The tool can automatically write an m file that uses these
% command-line functions.
%
% I'll save the file under the name "generated.m".  If I run it with the
% data omitting early failures, I see that it reproduces the results in the
% tool.  Now run it with all the data.  Note that the Weibull distribution
% does not model the early failures well, even when they are used in
% estimating the parameters.
generated(obstime2,censored2);
figure; generated(obstime,censored)

%% Additional topics
% It would be possible to fit a mixture of two distributions (one for each
% failure mode), but we won't do that here. Take a look at the Statistics
% ToolboxTM demos "Fitting Custom Univariate Distributions" for information
% on that.  While we look at the demo page, we'll also point out the
% "Distribution Functions" demo for looking at pdf and cdf functions of
% different distributions, and the "Random Number Generation" demo for
% generating samples.  We'll run the latter, switch to the gamma
% distribution, manipulate the parameter values, and see how the random
% sample changes.  We could also export the sample to the workspace.

Contact us at files@mathworks.com