MATLAB Examples

Improving an Engine Cooling Fan Using Design for Six Sigma Techniques

This demo uses MATLAB, the Statistics Toolbox, the Curve Fitting Toolbox, and the Optimization Toolbox to improve the design of an engine cooling fan using Design for Six Sigma Techniques. The popular Define, Measure, Analyze, Improve, and Control (DMAIC) Six Sigma approach is followed. See the MATLAB Digest Article of same title (July 2006) that accompanies this demo at


Defining the Problem

We have an engine cooling fan design that is unable to pull enough airflow through the radiator to keep the engine cool during difficult conditions (i.e. stop and go traffic, hot weather). From a system level analysis, we estimate that we need airflow of at least 875 ft^3/min to keep the engine cool during difficult conditions. We need to evaluate the current design and develop an alternative design that can achieve our target airflow.

Measuring Cooling Fan Performance

We have a database of 10,000 measurements (historical production data) of the existing cooling fan performance. Let's evaluate the data to get an understanding of the current fan's performance.

load OriginalFan
ylabel('Max Airflow (ft^3/min)')
title('Historical Production Data')

The above plot shows that the data appears to be centered around 842 ft^3/min and has 4-6 ft^3/min of variation. While the plot is illuminating, it doesn't tell us much about the underlying data. Specifically, does it follow a normal distribution and is the process in control (high quality data)? Let's look into the data in more detail. The data in 'originalfan' is a series of measurements taken on a daily basis. 200 fans/day were tested over a 50 day period. We can use statistical process control (SPC) methods to check our data for anomalies. One common technique is to evaluate the data using Xbar and S charts.

xbarplot(reshape(originalfan,50,200)) % reshape the data into daily sets
ylabel('Daily Average Airflow (ft^3/min)')
ylabel('Daily Standard Deviation (ft^3/min)')

The Xbar chart shows the daily averages relative to the control limits (CL = average value, UCL/LCO are +/-3 standard deviations). The S-chart shows the variability (standard deviation) for each day with the control limits defined similar to the Xbar chart. For a process to be considered in control, only 3 out of every 1,000 samples should be outside of the control limits for each plot. Only one day was outside of the control limits (day 20, on the S-chart). No abnormalities were found upon review of the day 20 data and we conclude that the process is in control (the outlier was deemed to be one of the 3 out of 1,0000 possible occurrences).

Implicit in the use of the Xbar/S-chart the assumption that the data follows a normal distribution. Let's verify this is the case. Use the graphical tool, DFITTOOL to fit a normal distribution to the original fan dataset. Set the bin value to 100 under (Data --> select originaldata, Set Bin Rules). Create a distribution fit to the data (New Fit --> Fit Name: Normal Distribution, Apply).

 To start: dfittool(originalfan,[],[],'Original Fan')

You should get results similar to the picture below.

pic = imread('distfit.png','BackgroundColor',[1 1 1]); % white background
if exist('imshow')
    % If you don't have the image processing toolbox, use:
    image(pic), axis equal, axis off
clear pic % clean up memory/no longer needed

Note: You can load in the session used to generate the picture above using the "historical_data.dfit" file. It can be found in the same location as this M-file.

Alternatively, we can perform the same operations as in DFITTOOL at the MATLAB command line:

h1 = findobj(gca,'Type','patch');   % let's format the plot
clr = [.9 .9 1];
[historic_mean, historic_stdev] = normfit(originalfan)
xlabel('Airflow (ft^3/min)')
ylabel('Frequency (counts)')
title('Airflow Histogram')
shg % bring the graph forward
historic_mean =


historic_stdev =


Look at the distribution of historical data. We don't come close to the 875 ft^3/min that we want. We need to improve our design.

Analyzing Factors that Affect Fan Performance

We'll evaluate the cooling fan using design of experiments. The factors that we can modify and control are:

 Distance from radiator:  1 to 1.5 inches     (column 1)
 Pitch angle:             15 to 35 degrees    (column 2)
 Blade tip clearance:     1 to 2 inches       (column 3)

and the response is the cooling fan airflow rate (ft^3/min). The factors are stored in columns, rows are the test points.

Based upon prior experience, we know that there will be a nonlinear response to one or more factors. Therefore, we will use a response surface design to account for this nonlinearity.

Generate the experimental runs for a Box-Behnken design in coded (normalized) variables [-1, 0, +1]:

CodedValue = bbdesign(3)
CodedValue =

    -1    -1     0
    -1     1     0
     1    -1     0
     1     1     0
    -1     0    -1
    -1     0     1
     1     0    -1
     1     0     1
     0    -1    -1
     0    -1     1
     0     1    -1
     0     1     1
     0     0     0
     0     0     0
     0     0     0

Randomize the order of the runs, display in real-world units, and perform the experiment in the order specified.

format short g
runorder = randperm(15);    % random permutation of the runs
bounds = [1 1.5;15 35;1 2]; % min and max values for each factor
RealValue = coded2real(CodedValue,bounds);
disp({'Run Number','Distance','Pitch','Clearance'})
disp([runorder' RealValue])
    'Run Number'    'Distance'    'Pitch'    'Clearance'

            6            1           15          1.5
           14            1           35          1.5
           15          1.5           15          1.5
            3          1.5           35          1.5
           13            1           25            1
            8            1           25            2
           12          1.5           25            1
           10          1.5           25            2
            2         1.25           15            1
            4         1.25           15            2
            7         1.25           35            1
           11         1.25           35            2
            9         1.25           25          1.5
            5         1.25           25          1.5
            1         1.25           25          1.5

The experiment was performed and the following result was obtained:

TestResult =[837 864 829 856 880 879 872 874 834 833 860 859 874 876 875]';
disp({'Run Number','Distance','Pitch','Clearance','Airflow'})
disp(sortrows([runorder' RealValue TestResult]))
    'Run Number'    'Distance'    'Pitch'    'Clearance'    'Airflow'

            1         1.25           25          1.5          875
            2         1.25           15            1          834
            3          1.5           35          1.5          856
            4         1.25           15            2          833
            5         1.25           25          1.5          876
            6            1           15          1.5          837
            7         1.25           35            1          860
            8            1           25            2          879
            9         1.25           25          1.5          874
           10          1.5           25            2          874
           11         1.25           35            2          859
           12          1.5           25            1          872
           13            1           25            1          880
           14            1           35          1.5          864
           15          1.5           15          1.5          829

From the experimental test results, we see that the airflow rate is quite sensitive to the factors we have changed. We also note that there are four experimental runs that meet or exceed our target airflow rate of 875 ft^3/min (runs 1,4,7, and 13). However, we do not know which, if any, of these runs are the optimal. In addition, we do not know how robust the design is to variation in the factors. We'll create a model based upon the experimental data thus far and use the model to estimate the airflow sensitivity to expected variation in the factors at the optimal settings.

Improving the Cooling Fan Performance

The Box-Behnken design allows us to test for nonlinear (quadratic) effects. The form of our quadratic model is:

We can use REGSTATS to estimate and compare the coefficients and visually display the magnitudes of the coefficients (for normalized values) in a bar chart.

coeffname = {'Distance' 'Pitch' 'Clearance' 'D*P' 'D*C' ...
    'P*C' 'D^2' 'P^2' 'C^2'};
s = regstats(TestResult, CodedValue, 'quadratic');
h = bar(s.beta(2:10)); set(h,'facecolor',[.8 .8 .9]);
set(gcf,'units','normalized','position',[.05 .4 .7 .4])
ylabel('Airflow (ft^3/min)')
xlabel('Normalized Coefficient')
title('Quadratic Model Coefficients')

From the bar chart, we see that Pitch and Pitch^2 are dominant factors. We can also determine the magnitude and direction of the effect from the bar plot of the coefficients. In addition to the coefficients, we can get the p-values associated with the coefficients to determine their importance in our model.

disp([s.beta s.tstat.pval])
disp('Coefficient Names')
    'Coefficient'    'p-value'

          875  2.0882e-015
       -3.625   0.00013403
        13.25  2.2422e-007
       -0.125      0.73163
  1.9286e-013            1
         0.75      0.18443
            0            1
        0.625      0.27267
      -29.125  3.0313e-008
        0.625      0.27267

Coefficient Names

Based upon the p-values, the constant term, distance, pitch, and pitch^2 are all statistically significant. The R-squared value of our overall model and standard deviation are:

Rsquare = s.rsquare
rmse = sqrt(s.mse)
Rsquare =


rmse =


We can evaluate our model fit by looking at the response surface. Using the RSTOOL to visualize our model, we can estimate the optimal set point.

xname = {'Distance'; 'Pitch'; 'Clearance'};
yname = {'Airflow (ft^3/min)'};

From the response surface plots, we clearly see the nonlinear relationship of airflow with pitch. While the RSTOOL can be used to interactively determine the optimum factor settings, the Optimization Toolbox can automate the task.

We find the optimal factor settings using the constrained optimization function FMINCON with the coded values (simplifies definition of boundary conditions)

f = @(x) -x2fx(x, 'quadratic')*s.beta;
lb = [-1 -1 -1]; ub = [1 1 1]; x0 = [0 0 0];
[optfactors, fval] = fmincon(f,x0,[],[],[],[],lb,ub);
maxval = -fval;
maxloc = (optfactors + 1)';
bounds = [1 1.5;15 35;1 2];
add_term = maxloc*[(bounds(:,2) - bounds(:,1))/2]';
maxloc = bounds(:,1) + diag(add_term);
disp('Optimal Values:')
disp([maxloc' maxval])
Warning: Large-scale (trust region) method does not currently solve this type of problem,
 using medium-scale (line search) instead.
Optimization terminated: magnitude of directional derivative in search
 direction less than 2*options.TolFun and maximum constraint violation
  is less than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower      upper     ineqlin   ineqnonlin
Optimal Values:
    'Distance'    'Pitch'    'Clearance'    'Airflow'

            1       27.275            1       882.26

The new fan will be placed one inch from the radiator with a one inch clearance between the tips of the fan blades and the shroud. Because pitch angle has such a significant impact on airflow, we'll perform some additional testing to verify that a 27.3 degree pitch angle is optimal.

Tests were performed using a variable pitch test fan and are shown in the figure below with our model fit to the data. The results show that a quadratic model for pitch is appropriate.

load AirflowData
plot(pitch,airflow,'.r');title('Raw Data')
xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)')
ylim([840 885]); legend('Test data','Location','se')
fcubic = fit(pitch,airflow,'poly2')
plot(pitch,airflow,'.r'); hold on
line(pitch,fcubic(pitch),'color','b');title('Fitted Model and Data')
xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)')
legend('Test data','Quadratic model','Location','se')
fcubic =

     Linear model Poly2:
       fcubic(x) = p1*x^2 + p2*x + p3
     Coefficients (with 95% confidence bounds):
       p1 =     -0.2779  (-0.2851, -0.2706)
       p2 =       15.14  (14.77, 15.5)
       p3 =       676.2  (671.8, 680.6)

At this point we have an improved cooling fan design that meets our airflow requirements. We also have a model that approximates the fan performance well based upon the factors we can modify in the design. Before we can rest assured that our problems are solved, we need to ensure that the fan performance is robust to variability in manufacturing and installation.

We define our manufacturing uncertainty based upon historical experience to be:

                         Real Values              Coded Values
 Distance from radiator: 1.00 +/- 0.05 inch      -1.00 +/- 0.20
 Blade pitch angle:      27.3 +/- 0.25  degrees   0.227 +/- 0.028
 Blade tip clearance:    1.00 +/- 0.125 inch     -1.00 +/- 0.25

We must verify that these variations in factors will allow us to maintain a robust design around the target airflow. Since we are adhering to the philosophy of Six Sigma, we also need to achieve a defect rate of no more than 3.4 per 1,000,000 fans. That is, we need to hit our 875 ft^3/min target 99.999% of the time.

We'll verify our design using Monte Carlo Simulation. Generate random numbers for three factors with the specified tolerance for 10,000 samples. We'll also include a noise variable that is proportional to the noise in our model (i.e. the RMS Error of the fitted model). Note: our model coefficients were defined in coded variables, therefore we are generating dist, pitch, and clearance using the coded definition)

dist = normrnd(optfactors(1),0.20,[10000 1]);
pitch = normrnd(optfactors(2),0.028,[10000 1]);
clearance = normrnd(optfactors(3),0.25,[10000 1]);
noise = normrnd(0,rmse,[10000 1]);

Calculate airflow using our model.

simfactor = [dist pitch clearance];
X = x2fx(simfactor, 'quadratic');

Add noise to the model (the variation in the data our model did not account for)

simflow = X*s.beta + noise;

Evaluate the model performance using a histogram and a normal distribution fit to the resulting data.

[mufan,sigmafan] = normfit(simflow);
figure; histfit(simflow); hold on
text(mufan+2,300,['Mean: ' num2str(round(mufan))])
text(mufan+2,280,['Standard deviation: ' num2str(round(sigmafan))])
h1 = findobj(gca,'Type','patch');   % let's format the plot
clr = [.9 .9 1];
xlabel('Airflow (ft^3/min)')
title('Monte Carlo Simulation Results')

Our results look promising. We have achieved an average airflow of 882 ft^3/min and appear to be better than 875 ft^3/min for most of the data. Let's determine the probability we will be at 875 ft^3/min or below.

format long
pfail = normcdf(875,mufan,sigmafan)
pass = (1-pfail)*100
pfail =


pass =


Our design appears to be able to achieve at least 875 ft^3/min of airflow 99.999% of the time. We can use our simulation, and expected variance on the manufacturing, to estimate our process capability. First, let's define the upper and lower limits for our process. The lower limit was already defined to be 875 ft^3/min. The upper limit will be defined based upon the expected reliability of the electric drive motor. We estimate that motor will become limited in life if we pull more than 890 ft^3/min of air through the radiator. We'll set 890 ft^3/min as the upper limit and estimate the process capability.

[p,Cp,Cpk] = capable(simflow,[875.0 890])
pass = (1-p)*100
p =


Cp =


Cpk =


pass =


The Cp value is 1.6 (>= 1.6 is considered a high quality process). The Cpk shows that our process is centered (Cpk ~= Cp) with respect to the process limit definitions. Now we need to implement this design and monitor it in order to verify our design process and ensure the cooling fan delivers high quality performance.

Controlling Manufacturing of the Improved Cooling Fan

We monitor and evaluate the manufacturing and installation process with statistical process control (SPC) to ensure that our design works in practice. We will evaluate the first 30 days of production of our new cooling fan. Five cooling fans per day were produced initially.

load spcdata
xbarplot(spcflow) % reshape the data into daily sets
ylabel('Daily Average Airflow (ft^3/min)')
ylabel('Daily Standard Deviation (ft^3/min)')

According to the results, our manufacturing process is in statistical control, as indicated by the absence of violations of control limits or non-random patterns in the data over time. We can also run a capability analysis on the data to evaluate our process.

[row,col] = size(spcflow);
[p,Cp,Cpk] = capable(reshape(spcflow,row*col,1),[875.0 890])
pass = (1-p)*100
p =


Cp =


Cpk =


pass =


The Cp value of 1.8 is greater than our initial assessment of 1.6, indicating that the process is operating with lower variation than our model predicted. The Cpk value of 1.6 indicates that our process is not centered within our defined upper and lower bounds. We would only become concerned with a Cpk value less than 1.33, at which point the process will be highly skewed towards the upper or lower limit and we would have a higher likelihood of violating one our limits. Our process is well within our limits and we achieve the target airflow (875 ft^3/min)over 99.999% of the time.


We demonstrated the concept of Robust Design using the Six Sigma DMAIC approach with MATLAB, the Statistics Toolbox, and the Optimization Toolbox. The engine cooling fan design was evaluated and found not to be capable of pulling enough air through the radiator to cool the engine during difficult conditions. Design of Experiments was used to evaluate the factors that had the most influence on the fan performance. A model was developed from the data and used to optimize the fan design. The optimal design was verified by follow on testing and tolerances for the fan design were evaluated. The resulting fan design was able to achieve the target airflow rate of 875 ft^3/min or greater 99.999% of the time.

Appendix A: Coded to Real Function

type coded2real
function x = coded2real(z,bounds)
%CODED2REAL Converts coded variables into real values.
% CODED2REAL converts the coded values in Z to real values defined by
% BOUNDS.  Columns of Z are different factors (variables/observations).
% Rows of BOUNDS are the defining limits in order of increasing value.
% Example:
%   A Design of Experiment - Cooling Fan Factors (3):
%   Distance from radiator:  1 to 1.5 inches
%   Pitch angle:             15 to 35 degrees
%   Blade tip clearance:     1 to 2 inches
%   z = bbdesign(3) % Box-Behnken Response Surface Design with 3 factors
%   bounds = [1 1.5;    % min and max values for each factor (Distance)
%             15 35;    % Pitch Angle
%               1 2];   % Tip Clearance
%   x = coded2real(z,bounds)    % Real values from the coded values

for i = 1:size(z,2)
    zmax = max(z(:,i));
    zmin = min(z(:,i));
    x(:,i) = interp1([zmin zmax],bounds(i,:),z(:,i));