Model Suburban Commuting Using Subtractive Clustering

This example shows how to apply the genfis function to model the relationship between the number of automobile trips generated from an area and the area's demographics. Demographic and trip data are from 100 traffic analysis zones in New Castle County, Delaware. Five demographic factors are considered: population, number of dwelling units, vehicle ownership, median household income, and total employment. Hence, the model has five input variables and one output variable.

Load and plot the data.

mytripdata
subplot(2,1,1)
plot(datin)
ylabel('input')
subplot(2,1,2)
plot(datout)
ylabel('output')

The mytripdata command creates several variables in the workspace. Of the original 100 data points, 75 are used as training data (datin and datout) and 25 as checking data (chkdatin and chkdatout). In this example, the checking data also serves as the test data for validating the model.
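
How mytripdata partitions the 100 zones is not shown on this page. As a minimal sketch, assuming a simple random hold-out split, a 75/25 partition could be produced as follows. The matrix allData, the seed, and the trnIn/trnOut/chkIn/chkOut names are hypothetical placeholders and are not part of the shipped example.

```matlab
% Hypothetical stand-in for the 100 zones: 5 demographic inputs + 1 output.
rng(0);                          % fix the seed so the split is repeatable
allData = rand(100,6);           % placeholder values, NOT the real trip data

idx    = randperm(100);          % random ordering of the 100 rows
trnIdx = idx(1:75);              % first 75 shuffled rows -> training
chkIdx = idx(76:100);            % remaining 25 rows -> checking

trnIn  = allData(trnIdx,1:5);    % plays the role of datin
trnOut = allData(trnIdx,6);      % plays the role of datout
chkIn  = allData(chkIdx,1:5);    % plays the role of chkdatin
chkOut = allData(chkIdx,6);      % plays the role of chkdatout
```

Because the two index sets come from one permutation, no zone appears in both the training and the checking set.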

Generate a model from the data using the genfis command with subtractive clustering.

First, create a genfisOptions option set for subtractive clustering, specifying the 'ClusterInfluenceRange' property. This property indicates the range of influence of a cluster when you consider the data space as a unit hypercube. Specifying a small cluster radius usually yields many small clusters in the data, and results in many rules. Specifying a large cluster radius usually yields a few large clusters in the data, and results in fewer rules.
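
To give a feel for what the influence range does, the following sketch computes the point "potential" used in the first step of subtractive clustering, where each neighbor contributes exp(-4*d^2/ra^2) for squared distance d^2 and radius ra. It is an illustration on synthetic 2-D points under that assumed formula, not the toolbox implementation, and it stops after picking the first center. The data X and the two radii are made up.

```matlab
% Sketch of the potential computation in subtractive clustering.
% X is synthetic: two tight groups of points, already inside the unit square.
X = [0.10 0.10; 0.12 0.09; 0.09 0.13; 0.11 0.11; ...
     0.85 0.90; 0.88 0.87; 0.90 0.92];
n = size(X,1);

radii = [0.5 0.1];               % a large and a small influence range
for ra = radii
    P = zeros(n,1);
    for i = 1:n
        % Squared distances from point i to every point
        % (uses implicit expansion, available from R2016b).
        d2   = sum((X - X(i,:)).^2, 2);
        P(i) = sum(exp(-4*d2/ra^2));       % potential of point i
    end
    [~,best] = max(P);                     % candidate for the first center
    fprintf('ra = %.1f: first center = point %d, potential = %.2f\n', ...
            ra, best, P(best));
end
```

With a small radius, the contribution of a neighbor decays very quickly with distance, so only very close points reinforce each other and more separate centers tend to survive. With a large radius, distant points still add to the potential and nearby groups tend to merge into fewer clusters.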

opt = genfisOptions('SubtractiveClustering','ClusterInfluenceRange',0.5);

Generate the FIS model using the training data and the specified options.

fismat = genfis(datin,datout,opt);

The genfis command uses a one-pass method that does not perform any iterative optimization. The generated FIS structure is a first-order Sugeno model with three rules.
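
To make "first-order Sugeno model" concrete, here is a hand evaluation of a tiny two-input, two-rule system, assuming Gaussian input membership functions, product AND, and weighted-average defuzzification. All parameter values (c, sigma, A, b) are invented for illustration and are unrelated to the model that genfis produces for the trip data.

```matlab
% Hand evaluation of a first-order Sugeno system with Gaussian input MFs.
% All parameter values are made up for illustration.
x = [0.4 0.7];                       % one input vector (2 inputs)

c     = [0.2 0.5; 0.8 0.6];          % MF centers: one row per rule
sigma = [0.3 0.3; 0.3 0.3];          % MF widths, same layout as c
A     = [1.0 2.0; -1.0 0.5];         % linear consequent coefficients per rule
b     = [0.5; 2.0];                  % consequent constants per rule

mu = exp(-(x - c).^2 ./ (2*sigma.^2));   % membership of each input in each rule
w  = prod(mu,2);                         % rule firing strengths (product AND)
z  = A*x' + b;                           % rule outputs: z(i) = A(i,:)*x' + b(i)
y  = sum(w.*z)/sum(w);                   % weighted average of the rule outputs
```

Each rule contributes a linear function of the inputs, and the final output y is a blend of those linear functions weighted by how strongly each rule fires.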

Verify the model. Here, trnRMSE is the root mean square error (RMSE) of the model output with respect to the training data.

fuzout = evalfis(datin,fismat);
trnRMSE = norm(fuzout-datout)/sqrt(length(fuzout))
trnRMSE = 0.5276
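
The expression norm(e)/sqrt(N) used above is the usual RMSE definition, sqrt(mean(e.^2)), written differently. The short check below confirms this on a small made-up example; yModel and yTrue are hypothetical values.

```matlab
% Two equivalent ways to compute RMSE on a small made-up example.
yModel = [2.0; 4.0; 7.0; 1.0];
yTrue  = [1.0; 4.0; 5.0; 3.0];
err    = yModel - yTrue;                     % residuals: [1; 0; 2; -2]

rmseNorm = norm(err)/sqrt(length(err));      % form used in this example
rmseMean = sqrt(mean(err.^2));               % textbook definition

fprintf('%.4f  %.4f\n', rmseNorm, rmseMean); % prints 1.5000  1.5000
```

Note that length returns the largest dimension, so the norm-based form is only a valid RMSE when the output is a vector, as it is here.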

Next, apply the test data to the FIS to validate the model. In this example, the validation data is used for both checking and testing the FIS parameters. Here, chkRMSE is the root mean square error of the model output with respect to the validation data.

chkfuzout = evalfis(chkdatin,fismat);
chkRMSE = norm(chkfuzout-chkdatout)/sqrt(length(chkfuzout))
chkRMSE = 0.6179

Plot the output of the model, chkfuzout, against the validation data, chkdatout.

figure
plot(chkdatout)
hold on
plot(chkfuzout,'o')
hold off

The model output is shown as circles and the validation data as a solid blue line. The plot shows that the model does not perform well on the validation data.

At this point, you can use the optimization capability of anfis to improve the model. First, try using a relatively short anfis training (20 epochs) without using validation data, and then test the resulting FIS model against the testing data.

anfisOpt = anfisOptions('InitialFIS',fismat,'EpochNumber',20,...
                        'InitialStepSize',0.1);
fismat2 = anfis([datin datout],anfisOpt);
ANFIS info: 
	Number of nodes: 44
	Number of linear parameters: 18
	Number of nonlinear parameters: 30
	Total number of parameters: 48
	Number of training data pairs: 75
	Number of checking data pairs: 0
	Number of fuzzy rules: 3


Start training ANFIS ...

   1 	 0.527607
   2 	 0.513727
   3 	 0.492996
   4 	 0.499985
   5 	 0.490585
   6 	 0.492924
   7 	 0.48733
Step size decreases to 0.090000 after epoch 7.
   8 	 0.485036
   9 	 0.480813
  10 	 0.475097
Step size increases to 0.099000 after epoch 10.
  11 	 0.469759
  12 	 0.462516
  13 	 0.451177
  14 	 0.447856
Step size increases to 0.108900 after epoch 14.
  15 	 0.444356
  16 	 0.433904
  17 	 0.433739
  18 	 0.420408
Step size increases to 0.119790 after epoch 18.
  19 	 0.420512
  20 	 0.420275

Designated epoch number reached --> ANFIS training completed at epoch 20.

Minimal training RMSE = 0.420275
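
The parameter counts in the ANFIS info above are consistent with a model that has one Gaussian membership function per input per rule and one linear consequent per rule. The arithmetic below is a sketch under that assumption, using only the numbers printed in the log.

```matlab
% Reproduce the parameter counts reported by anfis for this model.
nInputs = 5;    % demographic input variables
nRules  = 3;    % rules found by subtractive clustering

% Assumed: each rule has one Gaussian MF per input,
% and each Gaussian has two parameters (width and center).
nNonlinear = nRules * nInputs * 2;        % 30

% Each first-order Sugeno rule has one coefficient per input plus a constant.
nLinear = nRules * (nInputs + 1);         % 18

nTotal = nNonlinear + nLinear;            % 48
fprintf('%d parameters for 75 training pairs\n', nTotal);
```

With 48 adjustable parameters and only 75 training pairs, the model has enough freedom to fit noise in the training data, which is why the checking error is worth monitoring.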

After the training is complete, validate the model.

fuzout2 = evalfis(datin,fismat2);
trnRMSE2 = norm(fuzout2-datout)/sqrt(length(fuzout2))
trnRMSE2 = 0.4203
chkfuzout2 = evalfis(chkdatin,fismat2);
chkRMSE2 = norm(chkfuzout2-chkdatout)/sqrt(length(chkfuzout2))
chkRMSE2 = 0.5894

The model has improved considerably with respect to the training data (RMSE 0.5276 to 0.4203), but only slightly with respect to the validation data (RMSE 0.6179 to 0.5894). Plot the improved model output obtained using anfis against the testing data.

figure
plot(chkdatout)
hold on
plot(chkfuzout2,'o')
hold off

The model output is shown as circles and the validation data as a solid blue line. This plot shows that subtractive clustering with genfis can be used as a standalone, fast method for generating a fuzzy model from data, or as a preprocessor to determine the initial rules for anfis training. An important advantage of using a clustering method to find rules is that the resultant rules are more tailored to the input data than they are in an FIS generated without clustering. This result reduces the problem of an excessive propagation of rules when the input data has a high dimension.

Overfitting can be detected when the checking error starts to increase while the training error continues to decrease.

To check the model for overfitting, use anfis with validation data to train the model for 200 epochs.

First, configure the ANFIS training options by modifying the existing anfisOptions option set. Specify the epoch number and validation data. Because the number of training epochs is larger, suppress the display of training information in the Command Window.

anfisOpt.EpochNumber = 200;
anfisOpt.ValidationData = [chkdatin chkdatout];
anfisOpt.DisplayANFISInformation = 0;
anfisOpt.DisplayErrorValues = 0;
anfisOpt.DisplayStepSize = 0;
anfisOpt.DisplayFinalResults = 0;

Train the FIS.

[fismat3,trnErr,stepSize,fismat4,chkErr] = anfis([datin datout],anfisOpt);

Here,

  • fismat3 is the FIS structure when the training error reaches a minimum.

  • fismat4 is the snapshot FIS structure when the validation data error reaches a minimum.

  • stepSize is a history of the training step sizes.

  • trnErr is the RMSE using the training data for each training epoch.

  • chkErr is the RMSE using the validation data for each training epoch.

After the training completes, validate the model.

fuzout4 = evalfis(datin,fismat4);
trnRMSE4 = norm(fuzout4-datout)/sqrt(length(fuzout4))
trnRMSE4 = 0.3393
chkfuzout4 = evalfis(chkdatin,fismat4);
chkRMSE4 = norm(chkfuzout4-chkdatout)/sqrt(length(chkfuzout4))
chkRMSE4 = 0.5834

The training error is the lowest so far (0.3393), while the validation error is only slightly lower than before (0.5834 compared with 0.5894). This widening gap between the two errors suggests that the system may be overfitting the training data. Overfitting occurs when you fit the fuzzy system to the training data so well that it no longer does a good job of fitting the validation data. The result is a loss of generality.
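
One simple way to see this is to tabulate the difference between the checking and training RMSE for the three models, using the values reported so far on this page. The variable names trnAll, chkAll, and gap are hypothetical and chosen so that they do not overwrite the workspace variables of the example.

```matlab
% RMSE values reported earlier in this example.
names  = {'genfis only', 'anfis, 20 epochs', 'anfis, min checking error'};
trnAll = [0.5276 0.4203 0.3393];   % trnRMSE, trnRMSE2, trnRMSE4
chkAll = [0.6179 0.5894 0.5834];   % chkRMSE, chkRMSE2, chkRMSE4

gap = chkAll - trnAll;             % checking error minus training error
for k = 1:numel(names)
    fprintf('%-26s trn = %.4f  chk = %.4f  gap = %.4f\n', ...
            names{k}, trnAll(k), chkAll(k), gap(k));
end
```

The gap grows from about 0.09 for the initial model to about 0.24 for the last one, even though the checking error itself barely changes.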

View the improved model output. Plot the model output against the checking data.

figure
plot(chkdatout)
hold on
plot(chkfuzout4,'o')
hold off

The model output is shown as circles and the validation data as a solid blue line.

Next, plot the training error, trnErr.

figure
plot(trnErr)
title('Training Error')
xlabel('Number of Epochs')
ylabel('Training Error')

This plot shows that the training error settles at about the 60th epoch point.

Plot the checking error, chkErr.

figure
plot(chkErr)
title('Checking Error')
xlabel('Number of Epochs')
ylabel('Checking Error')

The plot shows that the smallest value of the validation data error occurs at the 52nd epoch, after which it increases slightly even as anfis continues to minimize the error against the training data all the way to the 200th epoch. Depending on the specified error tolerance, the plot also indicates the ability of the model to generalize to the test data.
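
The epoch with the lowest checking error can also be located numerically rather than read off the plot. The sketch below does this on synthetic error curves, because the real trnErr and chkErr values are not listed on this page. The curve shapes and the names trnErrSyn and chkErrSyn are assumptions made for illustration only.

```matlab
% Locate the epoch with the lowest checking error on synthetic curves.
epochs    = (1:200)';
trnErrSyn = 0.34 + 0.19*exp(-epochs/30);                   % keeps decreasing
chkErrSyn = 0.58 + 0.04*exp(-epochs/15) + 0.0001*epochs;   % dips, then rises

[minChkErr, bestEpoch] = min(chkErrSyn);   % value and index of the minimum
fprintf('Lowest checking error %.4f at epoch %d\n', minChkErr, bestEpoch);

% Past bestEpoch the checking error rises while the training error still falls.
stillImproving = trnErrSyn(end) < trnErrSyn(bestEpoch);
gettingWorse   = chkErrSyn(end) > chkErrSyn(bestEpoch);
```

With the actual training run, the same min call can be applied to the chkErr vector returned by anfis.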

You can also compare the output of fismat2 and fismat4 against the validation data, chkdatout.

figure
plot(chkdatout)
hold on
plot(chkfuzout4,'ob')
plot(chkfuzout2,'+r')
