Statistics Toolbox 

This example uses MATLAB, Statistics Toolbox, and Optimization Toolbox to improve the design of an engine cooling fan using Design for Six Sigma (DFSS) techniques. The popular Define, Measure, Analyze, Improve, and Control (DMAIC) Six Sigma approach is followed.
Note: In order to run this example, you will need to download the data files AirflowData.mat, OriginalFan.mat and spcdata.mat (datafiles.zip).
This example addresses an engine cooling fan design that is unable to pull enough air through the radiator to keep the engine cool during difficult conditions (such as stopandgo traffic or hot weather). We estimate that we need airflow of at least 875 ft^3/min to keep the engine cool during difficult conditions. We need to evaluate the current design and develop an alternative design that can achieve our target airflow.
We have a database of 10,000 measurements (historical production data) of the existing cooling fan performance. Let's evaluate the data to get an understanding of the current fan's performance.
load OriginalFan plot(originalfan) xlabel('Observation') ylabel('Max Airflow (ft^3/min)') title('Historical Production Data')
The above plot shows that the data appears to be centered around 842 ft^3/min and has a peaktopeak spread of 812 ft^3/min. While the plot is illuminating, it doesn't tell us much about the underlying data. Let's look into the data in more detail by viewing its histogram and fitting a normal distribution to the data.
histfit(originalfan) format short g [historic_mean, historic_stdev] = normfit(originalfan) xlabel('Airflow (ft^3/min)') ylabel('Frequency (counts)') title('Airflow Histogram')
historic_mean = 841.65 historic_stdev = 1.8768
The distribution of airflow values makes it clear that our current fan does not come close to the 875 ft^3/min that we want. We need to improve our design to achieve our target airflow.
We'll evaluate the factors that affect cooling fan performance using Design of Experiments. The factors that we can modify and control are:
Distance from radiator: 1 to 1.5 inches (column 1) Pitch angle: 15 to 35 degrees (column 2) Blade tip clearance: 1 to 2 inches (column 3)
The response is the cooling fan airflow rate (ft^3/min). The factors are stored in columns and test points are stored in rows.
In general, fluid systems have nonlinear behavior. Therefore, we will use a response surface design to estimate any nonlinear interactions among our factors. We will generate the experimental runs for a BoxBehnken design in coded (normalized) variables [1, 0, +1]:
CodedValue = bbdesign(3)
CodedValue = 1 1 0 1 1 0 1 1 0 1 1 0 1 0 1 1 0 1 1 0 1 1 0 1 0 1 1 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0
Next we randomize the order of the runs, display in realworld units, and perform the experiment in the order specified.
runorder = randperm(15); % random permutation of the runs bounds = [1 1.5;15 35;1 2]; % min and max values for each factor RealValue = zeros(size(CodedValue)); for i = 1:size(CodedValue,2) % Convert coded values to realworld units zmax = max(CodedValue(:,i)); zmin = min(CodedValue(:,i)); RealValue(:,i) = interp1([zmin zmax],bounds(i,:),CodedValue(:,i)); end TestResult =[837 864 829 856 880 879 872 874 834 833 860 859 874 876 875]'; disp({'Run Number','Distance','Pitch','Clearance','Airflow'}) disp(sortrows([runorder' RealValue TestResult]))
'Run Number' 'Distance' 'Pitch' 'Clearance' 'Airflow' 1 1.25 25 1.5 876 2 1.25 15 2 833 3 1.25 25 1.5 875 4 1.25 15 1 834 5 1.25 35 2 859 6 1 25 2 879 7 1.25 35 1 860 8 1.5 25 1 872 9 1 15 1.5 837 10 1.5 35 1.5 856 11 1.25 25 1.5 874 12 1 25 1 880 13 1 35 1.5 864 14 1.5 15 1.5 829 15 1.5 25 2 874
From the experimental test results, we see that the airflow rate is quite sensitive to the factors we have changed. We also note that there are four experimental runs that meet or exceed our target airflow rate of 875 ft^3/min (runs 1,3,6, and 11). However, we do not know which, if any, of these runs is the optimal one. In addition, we do not know how robust the design is to variation in the factors. We'll create a model based upon the experimental data thus far and use the model to estimate the optimal factor settings.
The BoxBehnken design allows us to test for nonlinear (quadratic) effects. The form of our quadratic model is:
where AF is the airflow rate and Bi is the coefficeints for the various terms. We can use the regstats function from the statistics toolbox to estimate the coefficients of the model above and display the magnitudes of the coefficients (for normalized values) in a bar chart.
coeffname = {'Distance' 'Pitch' 'Clearance' 'D*P' 'D*C' ... 'P*C' 'D^2' 'P^2' 'C^2'}; s = regstats(TestResult, CodedValue, 'quadratic'); h = bar(s.beta(2:10)); set(h,'facecolor',[.8 .8 .9]); legend('Coefficient'); set(gcf,'units','normalized','position',[.05 .4 .35 .4]) set(gca,'xticklabel',coeffname); ylabel('Airflow (ft^3/min)') xlabel('Normalized Coefficient') title('Quadratic Model Coefficients')
From this bar chart, we see that Pitch and Pitch^2 are dominant factors. We can look at the relationship between multiple input variables and one output variable by generating a response surface plot. Statistics Toolbox includes the Response Surface Tool (rstool), which lets you interactively generate response surface plots from your data. We will generate a response surface plot of our airflow data using a quadratic model.
xname = {'Distance'; 'Pitch'; 'Clearance'}; yname = {'Airflow (ft^3/min)'}; rstool(RealValue,TestResult,'quadratic',.05,xname,yname)
From the response surface plots, we clearly see the nonlinear relationship of airflow with pitch. We can interactively move the blue lines around and see the effect the different factors have on airflow. While the rstool can be used to interactively determine the optimum factor settings, we can use the Optimization Toolbox to automate the task.
We find the optimal factor settings using the constrained optimization function fmincon. The objective function is a quadratic response surface fit to the data. The constraints are the upper and lower limits we tested (in coded values). We'll set the initial starting point to be the center of the Design of Experiment test matrix.
f = @(x) x2fx(x, 'quadratic')*s.beta; % objective function lb = [1 1 1]; ub = [1 1 1]; % lower/upper bounds x0 = [0 0 0]; % starting point options = optimset('LargeScale','off'); % Medium Scale problem % invoke the solver [optfactors, fval] = fmincon(f,x0,[],[],[],[],lb,ub,[],options); % Convert the results to a maximization problem and show in realworld % units. maxval = fval; maxloc = (optfactors + 1)'; bounds = [1 1.5;15 35;1 2]; add_term = maxloc*((bounds(:,2)  bounds(:,1))/2)'; maxloc = bounds(:,1) + diag(add_term); disp('Optimal Values:') disp({'Distance','Pitch','Clearance','Airflow'}) disp([maxloc' maxval])
Optimization terminated: magnitude of directional derivative in search direction less than 2*options.TolFun and maximum constraint violation is less than options.TolCon. Active inequalities (to within options.TolCon = 1e006): lower upper ineqlin ineqnonlin 1 3 Optimal Values: 'Distance' 'Pitch' 'Clearance' 'Airflow' 1 27.275 1 882.26
The optimization result suggests the new fan should be placed one inch from the radiator with a one inch clearance between the tips of the fan blades and the shroud.
Because pitch angle has such a significant impact on airflow, we performed some additional testing to verify that a 27.3 degree pitch angle is optimal.
Results of testing using a variable pitch test fan are shown in the figure below with our model fit to the data. The results show that a quadratic model for pitch is appropriate.
load AirflowData subplot(1,2,1) plot(pitch,airflow,'.r');title('Raw Data') xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)') ylim([840 885]); legend('Test data','Location','se') X = [ones(size(pitch)), pitch, pitch.^2]; [B,Bint,resid,rint,stats] = regress(airflow,X); rmse = stats(4); fcubic = B'*X'; subplot(1,2,2) plot(pitch,airflow,'.r'); hold on line(pitch,fcubic,'color','b');title('Fitted Model and Data') xlabel('Pitch angle (degrees)');ylabel('Airflow (ft^3/min)') legend('Test data','Quadratic model','Location','se') hold off
The additional testing confirms that a 27.3 degree pitch angle is optimal.
At this point we have an improved cooling fan design that meets our airflow requirements. We also have a model that approximates the fan performance well based on the factors we can modify in the design. Before we can rest assured that our problems are solved, we need to ensure that the fan performance is robust to variability in manufacturing and installation.
Based upon historical experience, we define our manufacturing uncertainty to be:
Real Values Coded Values Distance from radiator: 1.00 +/ 0.05 inch 1.00 +/ 0.20 Blade pitch angle: 27.3 +/ 0.25 degrees 0.227 +/ 0.028 Blade tip clearance: 1.00 +/ 0.125 inch 1.00 +/ 0.25
We must verify that these variations in factors will allow us to maintain a robust design around the target airflow. Because we are adhering to the philosophy of Six Sigma, we also need to achieve a defect rate of no more than 3.4 per 1,000,000 fans. That is, we need to hit our 875 ft^3/min target 99.999% of the time.
We'll verify our design using Monte Carlo Simulation. We generate 10,000 random numbers for three factors with the specified tolerance. We also include a noise variable that is proportional to the noise in our model (that is, the RMS Error of the fitted model).
Note: We defined our model coefficients in coded variables; therefore we are generating dist, pitch, and clearance using the coded definition. We are also treating the model parameters as known, since their uncertainty appears to be small compared to the random noise.
% Set the state of the random number generators so results are consistent % across different runs of this example. rand('state',1) randn('state',1) % Now do the Monte Carlo Simulation dist = normrnd(optfactors(1),0.20,[10000 1]); pitch = normrnd(optfactors(2),0.028,[10000 1]); clearance = normrnd(optfactors(3),0.25,[10000 1]); noise = normrnd(0,rmse,[10000 1]);
Next, we calculate airflow for 10,000 random factor combinations using our model.
simfactor = [dist pitch clearance];
X = x2fx(simfactor, 'quadratic');
We then add noise to the model (the variation in the data our model did not account for).
simflow = X*s.beta + noise;
We evaluate the variation in the model's predicted airflow using a histogram and fit a normal distribution to estimate the mean and standard deviation.
[mufan,sigmafan] = normfit(simflow); subplot(1,1,1) histfit(simflow); hold on text(mufan+2,300,['Mean: ' num2str(round(mufan))]) text(mufan+2,280,['Standard deviation: ' num2str(round(sigmafan))]) hold off xlabel('Airflow (ft^3/min)') ylabel('Frequency') title('Monte Carlo Simulation Results')
The results look promising. We have achieved an average airflow of 882 ft^3/min and appear to be better than 875 ft^3/min for most of the data. Let's determine the probability we will be at 875 ft^3/min or below.
format long
pfail = normcdf(875,mufan,sigmafan)
pass = (1pfail)*100
pfail = 3.085426544316604e008 pass = 99.999996914573458
Our design appears to be able to achieve at least 875 ft^3/min of airflow 99.999% of the time. We can use our simulation results to estimate our process capability. First, let's define the upper and lower limits for our process. The lower limit was already defined to be 875 ft^3/min. The upper limit will be defined based on the expected reliability of the electric drive motor. We estimate that motor will become limited in life if we pull more than 890 ft^3/min of air through the radiator. We'll set 890 ft^3/min as the upper limit and estimate the process capability.
[p,Cp,Cpk] = capable(simflow,[875.0 890]) pass = (1p)*100
p = 3.611211862875052e008 Cp = 1.855995417082555 Cpk = 1.804592797797869 pass = 99.999996388788134
The Cp value is 1.85 (>= 1.6 is considered a high quality process). The Cpk shows a similar result that our process is approximately centered (Cpk is close to Cp) with respect to the process limit definitions. Now we need to implement this design and monitor it in order to verify our design process and ensure the cooling fan delivers high quality performance.
We monitor and evaluate the manufacturing and installation process of our new fan using control charts. We will evaluate the first 30 days of production of our new cooling fan. Initially, five cooling fans per day were produced.
load spcdata subplot(2,1,1) xbarplot(spcflow) % reshape the data into daily sets xlabel('Day') ylabel('Daily Average Airflow (ft^3/min)') subplot(2,1,2) schart(spcflow) xlabel('Day') ylabel('Daily Standard Deviation (ft^3/min)')
According to the results, our manufacturing process is in statistical control, as indicated by the absence of violations of control limits or nonrandom patterns in the data over time. We can also run a capability analysis on the data to evaluate our process.
[row,col] = size(spcflow); [p,Cp,Cpk] = capable(reshape(spcflow,row*col,1),[875.0 890]) pass = (1p)*100
p = 3.156838513929117e007 Cp = 1.755756676295137 Cpk = 1.663547652525458 pass = 99.999968431614860
The Cp value of 1.76 is less than our model's assessment of 1.85, indicating that the process is operating with greater variation than our model predicted. The Cpk value of 1.66 is similar to the trend seen from our model. We would only become concerned with a Cpk value less than 1.33 which indicates the process has shifted significantly towards one of the process limits. Our process is well within our limits and we achieve the target airflow (875 ft^3/min) over 99.999% of the time.
We used MATLAB, Statistics Toolbox, and Optimization Toolbox to improve the performance of an engine cooling fan through a Design for Six Sigma DMAIC approach. Our initial fan did not circulate enough air through the radiator to keep the engine cool during difficult conditions. We designed an experiment to investigate the effect of three performance factors: fan distance from the radiator, bladetip clearance, and blade pitch angle. We used our test data to estimate optimum values for each factor, resulting in a design that produces airflows beyond our goal of 875 ft^3 per minute. Simulations verify that the new design should produce airflow according to our specifications in more than 99.999% of the fans manufactured.