Accelerating the pace of engineering and science

# Statistics Toolbox

## Improving an Engine Cooling Fan Using Design for Six Sigma Techniques

This example uses MATLAB, Statistics Toolbox, and Optimization Toolbox to improve the design of an engine cooling fan using Design for Six Sigma (DFSS) techniques. The popular Define, Measure, Analyze, Improve, and Control (DMAIC) Six Sigma approach is followed.

Note: In order to run this example, you will need to download the data files AirflowData.mat, OriginalFan.mat and spcdata.mat (datafiles.zip).

### Defining the Problem

This example addresses an engine cooling fan design that is unable to pull enough air through the radiator to keep the engine cool during difficult conditions (such as stop-and-go traffic or hot weather). We estimate that we need airflow of at least 875 ft^3/min to keep the engine cool during difficult conditions. We need to evaluate the current design and develop an alternative design that can achieve our target airflow.

### Measuring Cooling Fan Performance

We have a database of 10,000 measurements (historical production data) of the existing cooling fan performance. Let's evaluate the data to get an understanding of the current fan's performance.

```load OriginalFan
plot(originalfan)
xlabel('Observation')
ylabel('Max Airflow (ft^3/min)')
title('Historical Production Data')
```

The above plot shows that the data appears to be centered around 842 ft^3/min and has a peak-to-peak spread of 8-12 ft^3/min. While the plot is illuminating, it doesn't tell us much about the underlying data. Let's look into the data in more detail by viewing its histogram and fitting a normal distribution to the data.

```histfit(originalfan)
format short g
[historic_mean, historic_stdev] = normfit(originalfan)
xlabel('Airflow (ft^3/min)')
ylabel('Frequency (counts)')
title('Airflow Histogram')
```
```historic_mean =

841.65

historic_stdev =

1.8768

```

The distribution of airflow values makes it clear that our current fan does not come close to the 875 ft^3/min that we want. We need to improve our design to achieve our target airflow.

### Analyzing Factors that Affect Fan Performance

We'll evaluate the factors that affect cooling fan performance using Design of Experiments. The factors that we can modify and control are:

``` Distance from radiator:  1 to 1.5 inches     (column 1)
Pitch angle:             15 to 35 degrees    (column 2)
Blade tip clearance:     1 to 2 inches       (column 3)```

The response is the cooling fan airflow rate (ft^3/min). The factors are stored in columns and test points are stored in rows.

In general, fluid systems have nonlinear behavior. Therefore, we will use a response surface design to estimate any nonlinear interactions among our factors. We will generate the experimental runs for a Box-Behnken design in coded (normalized) variables [-1, 0, +1]:

```CodedValue = bbdesign(3)
```
```CodedValue =

-1    -1     0
-1     1     0
1    -1     0
1     1     0
-1     0    -1
-1     0     1
1     0    -1
1     0     1
0    -1    -1
0    -1     1
0     1    -1
0     1     1
0     0     0
0     0     0
0     0     0

```

Next we randomize the order of the runs, display in real-world units, and perform the experiment in the order specified.

```runorder = randperm(15);     % random permutation of the runs
bounds = [1 1.5;15 35;1 2];  % min and max values for each factor

RealValue = zeros(size(CodedValue));
for i = 1:size(CodedValue,2) % Convert coded values to real-world units
zmax = max(CodedValue(:,i));
zmin = min(CodedValue(:,i));
RealValue(:,i) = interp1([zmin zmax],bounds(i,:),CodedValue(:,i));
end

TestResult =[837 864 829 856 880 879 872 874 834 833 860 859 874 876 875]';
disp({'Run Number','Distance','Pitch','Clearance','Airflow'})
disp(sortrows([runorder' RealValue TestResult]))
```
```    'Run Number'    'Distance'    'Pitch'    'Clearance'    'Airflow'

1         1.25           25          1.5          876
2         1.25           15            2          833
3         1.25           25          1.5          875
4         1.25           15            1          834
5         1.25           35            2          859
6            1           25            2          879
7         1.25           35            1          860
8          1.5           25            1          872
9            1           15          1.5          837
10          1.5           35          1.5          856
11         1.25           25          1.5          874
12            1           25            1          880
13            1           35          1.5          864
14          1.5           15          1.5          829
15          1.5           25            2          874

```

From the experimental test results, we see that the airflow rate is quite sensitive to the factors we have changed. We also note that there are four experimental runs that meet or exceed our target airflow rate of 875 ft^3/min (runs 1,3,6, and 11). However, we do not know which, if any, of these runs is the optimal one. In addition, we do not know how robust the design is to variation in the factors. We'll create a model based upon the experimental data thus far and use the model to estimate the optimal factor settings.

### Improving the Cooling Fan Performance

The Box-Behnken design allows us to test for nonlinear (quadratic) effects. The form of our quadratic model is:

where AF is the airflow rate and Bi is the coefficeints for the various terms. We can use the regstats function from the statistics toolbox to estimate the coefficients of the model above and display the magnitudes of the coefficients (for normalized values) in a bar chart.

```coeffname = {'Distance' 'Pitch' 'Clearance' 'D*P' 'D*C' ...
'P*C' 'D^2' 'P^2' 'C^2'};
h = bar(s.beta(2:10)); set(h,'facecolor',[.8 .8 .9]);
legend('Coefficient');
set(gcf,'units','normalized','position',[.05 .4 .35 .4])
set(gca,'xticklabel',coeffname);
ylabel('Airflow (ft^3/min)')
xlabel('Normalized Coefficient')
```

From this bar chart, we see that Pitch and Pitch^2 are dominant factors. We can look at the relationship between multiple input variables and one output variable by generating a response surface plot. Statistics Toolbox includes the Response Surface Tool (rstool), which lets you interactively generate response surface plots from your data. We will generate a response surface plot of our airflow data using a quadratic model.

```xname = {'Distance'; 'Pitch'; 'Clearance'};
yname = {'Airflow (ft^3/min)'};
```

From the response surface plots, we clearly see the nonlinear relationship of airflow with pitch. We can interactively move the blue lines around and see the effect the different factors have on airflow. While the rstool can be used to interactively determine the optimum factor settings, we can use the Optimization Toolbox to automate the task.

We find the optimal factor settings using the constrained optimization function fmincon. The objective function is a quadratic response surface fit to the data. The constraints are the upper and lower limits we tested (in coded values). We'll set the initial starting point to be the center of the Design of Experiment test matrix.

```f = @(x) -x2fx(x, 'quadratic')*s.beta;              % objective function
lb = [-1 -1 -1]; ub = [1 1 1];                      % lower/upper bounds
x0 = [0 0 0];                                       % starting point
options = optimset('LargeScale','off');             % Medium Scale problem

% invoke the solver
[optfactors, fval] = fmincon(f,x0,[],[],[],[],lb,ub,[],options);

% Convert the results to a maximization problem and show in real-world
% units.
maxval = -fval;
maxloc = (optfactors + 1)';
bounds = [1 1.5;15 35;1 2];
disp('Optimal Values:')
disp({'Distance','Pitch','Clearance','Airflow'})
disp([maxloc' maxval])
```
```Optimization terminated: magnitude of directional derivative in search
direction less than 2*options.TolFun and maximum constraint violation
is less than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
lower      upper     ineqlin   ineqnonlin
1
3
Optimal Values:
'Distance'    'Pitch'    'Clearance'    'Airflow'

1       27.275            1       882.26

```

The optimization result suggests the new fan should be placed one inch from the radiator with a one inch clearance between the tips of the fan blades and the shroud.

Because pitch angle has such a significant impact on airflow, we performed some additional testing to verify that a 27.3 degree pitch angle is optimal.

Results of testing using a variable pitch test fan are shown in the figure below with our model fit to the data. The results show that a quadratic model for pitch is appropriate.

```load AirflowData
subplot(1,2,1)
plot(pitch,airflow,'.r');title('Raw Data')
xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)')
ylim([840 885]); legend('Test data','Location','se')

X = [ones(size(pitch)), pitch, pitch.^2];
[B,Bint,resid,rint,stats] = regress(airflow,X);
rmse = stats(4);
fcubic = B'*X';

subplot(1,2,2)
plot(pitch,airflow,'.r'); hold on
line(pitch,fcubic,'color','b');title('Fitted Model and Data')
xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)')
hold off
```

The additional testing confirms that a 27.3 degree pitch angle is optimal.

At this point we have an improved cooling fan design that meets our airflow requirements. We also have a model that approximates the fan performance well based on the factors we can modify in the design. Before we can rest assured that our problems are solved, we need to ensure that the fan performance is robust to variability in manufacturing and installation.

Based upon historical experience, we define our manufacturing uncertainty to be:

```                         Real Values              Coded Values
Distance from radiator: 1.00 +/- 0.05 inch      -1.00 +/- 0.20
Blade pitch angle:      27.3 +/- 0.25  degrees   0.227 +/- 0.028
Blade tip clearance:    1.00 +/- 0.125 inch     -1.00 +/- 0.25```

We must verify that these variations in factors will allow us to maintain a robust design around the target airflow. Because we are adhering to the philosophy of Six Sigma, we also need to achieve a defect rate of no more than 3.4 per 1,000,000 fans. That is, we need to hit our 875 ft^3/min target 99.999% of the time.

We'll verify our design using Monte Carlo Simulation. We generate 10,000 random numbers for three factors with the specified tolerance. We also include a noise variable that is proportional to the noise in our model (that is, the RMS Error of the fitted model).

Note: We defined our model coefficients in coded variables; therefore we are generating dist, pitch, and clearance using the coded definition. We are also treating the model parameters as known, since their uncertainty appears to be small compared to the random noise.

```% Set the state of the random number generators so results are consistent
% across different runs of this example.
rand('state',1)
randn('state',1)

% Now do the Monte Carlo Simulation
dist = normrnd(optfactors(1),0.20,[10000 1]);
pitch = normrnd(optfactors(2),0.028,[10000 1]);
clearance = normrnd(optfactors(3),0.25,[10000 1]);
noise = normrnd(0,rmse,[10000 1]);
```

Next, we calculate airflow for 10,000 random factor combinations using our model.

```simfactor = [dist pitch clearance];
```

We then add noise to the model (the variation in the data our model did not account for).

```simflow = X*s.beta + noise;
```

We evaluate the variation in the model's predicted airflow using a histogram and fit a normal distribution to estimate the mean and standard deviation.

```[mufan,sigmafan] = normfit(simflow);
subplot(1,1,1)
histfit(simflow); hold on
text(mufan+2,300,['Mean: ' num2str(round(mufan))])
text(mufan+2,280,['Standard deviation: ' num2str(round(sigmafan))])
hold off
xlabel('Airflow (ft^3/min)')
ylabel('Frequency')
title('Monte Carlo Simulation Results')
```

The results look promising. We have achieved an average airflow of 882 ft^3/min and appear to be better than 875 ft^3/min for most of the data. Let's determine the probability we will be at 875 ft^3/min or below.

```format long
pfail = normcdf(875,mufan,sigmafan)
pass = (1-pfail)*100
```
```pfail =

3.085426544316604e-008

pass =

99.999996914573458

```

Our design appears to be able to achieve at least 875 ft^3/min of airflow 99.999% of the time. We can use our simulation results to estimate our process capability. First, let's define the upper and lower limits for our process. The lower limit was already defined to be 875 ft^3/min. The upper limit will be defined based on the expected reliability of the electric drive motor. We estimate that motor will become limited in life if we pull more than 890 ft^3/min of air through the radiator. We'll set 890 ft^3/min as the upper limit and estimate the process capability.

```[p,Cp,Cpk] = capable(simflow,[875.0 890])
pass = (1-p)*100
```
```p =

3.611211862875052e-008

Cp =

1.855995417082555

Cpk =

1.804592797797869

pass =

99.999996388788134

```

The Cp value is 1.85 (>= 1.6 is considered a high quality process). The Cpk shows a similar result that our process is approximately centered (Cpk is close to Cp) with respect to the process limit definitions. Now we need to implement this design and monitor it in order to verify our design process and ensure the cooling fan delivers high quality performance.

### Controlling Manufacturing of the Improved Cooling Fan

We monitor and evaluate the manufacturing and installation process of our new fan using control charts. We will evaluate the first 30 days of production of our new cooling fan. Initially, five cooling fans per day were produced.

```load spcdata
subplot(2,1,1)
xbarplot(spcflow) % reshape the data into daily sets
xlabel('Day')
ylabel('Daily Average Airflow (ft^3/min)')

subplot(2,1,2)
schart(spcflow)
xlabel('Day')
ylabel('Daily Standard Deviation (ft^3/min)')
```

According to the results, our manufacturing process is in statistical control, as indicated by the absence of violations of control limits or non-random patterns in the data over time. We can also run a capability analysis on the data to evaluate our process.

```[row,col] = size(spcflow);
[p,Cp,Cpk] = capable(reshape(spcflow,row*col,1),[875.0 890])
pass = (1-p)*100
```
```p =

3.156838513929117e-007

Cp =

1.755756676295137

Cpk =

1.663547652525458

pass =

99.999968431614860

```

The Cp value of 1.76 is less than our model's assessment of 1.85, indicating that the process is operating with greater variation than our model predicted. The Cpk value of 1.66 is similar to the trend seen from our model. We would only become concerned with a Cpk value less than 1.33 which indicates the process has shifted significantly towards one of the process limits. Our process is well within our limits and we achieve the target airflow (875 ft^3/min) over 99.999% of the time.

### Summary

We used MATLAB, Statistics Toolbox, and Optimization Toolbox to improve the performance of an engine cooling fan through a Design for Six Sigma DMAIC approach. Our initial fan did not circulate enough air through the radiator to keep the engine cool during difficult conditions. We designed an experiment to investigate the effect of three performance factors: fan distance from the radiator, blade-tip clearance, and blade pitch angle. We used our test data to estimate optimum values for each factor, resulting in a design that produces airflows beyond our goal of 875 ft^3 per minute. Simulations verify that the new design should produce airflow according to our specifications in more than 99.999% of the fans manufactured.