How to use RBF NN for classification?

Sir, I need to use an RBF NN for a classification problem. My input is 8x646 and my target is 1x646. My aim is that, from the 8 features given as input, the network should classify each sample as class A or class B (using the same data set for testing also). Out of the 646 samples, the first 233 belong to class A and the rest to class B, so I give the target as a row vector of zeros for the first 233 and ones for the rest: [0,0,0,...(233 nos.),1,1,1,...]. (1) Is this the right way? Then I used newrb like this: net = newrb(P,T,eg,sc); and used the same input to simulate the network and plotted the confusion matrix: Y = net(P); plotconfusion(T,Y).
(2) How do I set the eg and sc values? That is, how can I find the optimal values of eg and sc for my problem? Setting eg = 0.02 and sc = 0.1, the confusion plot gives 98% correct classification, and in the output Y I got exactly one for class B but 0.0528 for class A. (3) How can I get exact zeros for class A? (4) How can I use the LOGSIG transfer function in the second layer instead of PURELIN? I tried RBF with nntool also; we get outputs and errors after simulation. (5) What does that mean? And, as in question (3), in the output class A is not zeros but class B is ones. Why is that? Later I need to include more classes, and for a given set of the same features the network should tell which group it belongs to. (6) For my problem, which do you think is more helpful: RBF EXACT FIT or RBF FEWER NEURONS?

Accepted Answer

Greg Heath
Greg Heath on 5 Apr 2013
%How to use RBF NN for classification?
%Asked by rakesh r on 28 Mar 2013 at 11:29
%Sir, I need to use RBF NN for a classification problem. My input is 8*646 and target is 1*646. My aim is out of 8 features which i am giving as input, network should classify whether it belongs to class A or class B (Using same data set for testing also).
Network evaluation using training data is, typically, highly BIASED and UNRELIABLE for estimating performance on nontraining data.
IF POSSIBLE, NEURAL NETWORKS SHOULD ALWAYS BE EVALUATED BY THEIR ESTIMATED PERFORMANCE ON NONTRAINING DATA
If the data set is not large enough to yield adequately sized holdout validation and/or holdout test subsets, then I recommend f-fold cross-validation (XVAL).
help/doc crossval
help/doc cvpartition
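For example, a stratified f-fold partition can be generated with cvpartition (a sketch; the names x and t and the fold count are illustrative choices, not from the original post):

```matlab
% Sketch: stratified 10-fold cross-validation of the 8x646 input x
% and 1x646 0/1 target t. The fold count is an illustrative choice.
c = cvpartition(t, 'KFold', 10);       % stratifies folds by class label
for k = 1:c.NumTestSets
    trnind = training(c,k);            % logical index of training cases
    tstind = test(c,k);                % logical index of held-out cases
    % train on x(:,trnind), t(trnind); evaluate on x(:,tstind), t(tstind)
end
```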
%Out of 646 data, first 233 belong to class A and the rest belong to class B. So i give target as a row vector like: zeros for the first 233 & ones for the rest. [0,0,0.......(233 nos.),1,1,1.......]
%(1)Is it the right way?
Yes, although some effort may be needed to compensate for the ~2:1 imbalance in class sizes, which tends to favor correct class-1 classifications.
%Then i used newrb like this: Net = newrb (P,T,eg,sc);
Did you standardize P to have zero mean and unit variance?
Why not use the MATLAB variable names X, T and parameter names that reveal each parameter's role, e.g., MSEgoal and spread?
What values did you use for the latter? You cannot rely on the defaults.
%And used the same input to simulate the network and plotted the confusion matrix. Y=net(P); plotconfusion(T,Y).
Again, training data results tend to be extremely biased and unreliable for predicting performance on nontraining data. trn/val/tst data division is much preferable.
% (2)How to set this eg and sc values? Means how can i find the optimal values of eg and sc for my problem??
Trial and error:
Since the basis functions are isotropic, standardize your inputs
help/doc zscore
help/doc mapstd
and use the default spread0 = 1 to initialize a search for an acceptable value. I usually home in on a reasonable range by changing spread values by a factor of 2. Then, if finer detail is needed, I use a binary search.
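A sketch of that search (the MSEgoal value and the spread grid are illustrative; P and T are the original input and target matrices):

```matlab
% Sketch: home in on spread by factors of 2 around the default of 1.
x = mapstd(P);                          % standardize rows: zero mean, unit variance
spreads = 2.^(-3:3);                    % 0.125, 0.25, ..., 8
NMSE = zeros(size(spreads));
for i = 1:numel(spreads)
    net = newrb(x, T, MSEgoal, spreads(i));
    NMSE(i) = mse(T - net(x))/mean(var(T',1));   % normalized training MSE
end
% if finer detail is needed, binary-search between the two best spread values
```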
For reference purposes, assume a trn/val/tst data division and consider the Naive Constant Output Model (denoted by the subscript "00" )
[ O Ntrn ] = size(ttrn)        % training target matrix ttrn
Ntrneq = prod(size(ttrn))      % No. of training equations ( = Ntrn*O )
% Constant O-dimensional output, regardless of input:
meanttrn2 = mean(ttrn,2)
Nw00 = numel(meanttrn2)        % No. of estimated weights ( = O )
Ndof00 = Ntrneq - Nw00         % No. of estimation degrees of freedom ( = (Ntrn-1)*O )
ytrn00 = repmat( meanttrn2, 1, Ntrn );
SSEtrn00 = sse(ttrn-ytrn00)
MSEtrn00 = mse(ttrn-ytrn00)    % = SSEtrn00/Ntrneq = mean(var(ttrn',1)), the biased variance
MSEtrn00a = SSEtrn00/Ndof00    % DOF-"a"djusted: = mean(var(ttrn',0)) = mean(var(ttrn')), the unbiased variance
A reasonable training goal is to have BOTH
1. A positive number of estimation degrees of freedom
Ndof = Ntrneq - Nw > 0
Ntrneq = Number of training equations
Nw = Number of unknown weights to estimate
2. A low degree-of-freedom-"a"djusted (DOFA) scale-independent normalized-mean-square-error
NMSEtrna = MSEtrna/MSEtrn00a
= ( SSEtrn/Ndof ) / ( SSEtrn00/Ndof00 )
= ( Ndof00/Ndof ) * (SSEtrn/SSEtrn00)
This is equivalent to a large degree-of-freedom-"a"djusted (DOFA) coefficient of determination (see Wikipedia).
R2trna = 1 - NMSEtrna
The ordinary coefficient of determination (aka R^2)
R2 = 1- SSE/SSE00
= 1 - (MSE*Neq)/(MSE00*Neq)
= 1- MSE/MSE00
= 1 - NMSE
does not take into account the loss in the estimation Ndof caused by using the same data to train and evaluate a model. Therefore, it is valid for evaluating models using nontraining (e.g., validation and testing) data using
MSEval00 = mse(tval-meanttrn2)
MSEtst00 = mse(ttst-meanttrn2)
The training goal
MSEtrngoal = 0.01*Ndof*MSEtrn00a/Ntrneq
yields
NMSEtrna = 0.01
and
R2trna = 0.99
which is interpreted as the NN model characterizing (aka "explaining") 99% of the unbiased target variance.
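As a sketch, that goal can be computed directly from the quantities defined above; since newrb only knows the final H after training, the upper bound Hub is used here as a conservative stand-in for estimating Nw (an assumption, not part of the original recipe):

```matlab
% Sketch: MSE training goal corresponding to R2trna = 0.99.
% Hub (upper bound on the hidden-node count H) stands in for the
% final H, which newrb determines only during training.
Ntrneq     = prod(size(ttrn));
MSEtrn00a  = mean(var(ttrn'));             % unbiased target variance
Nw         = 1 + I*Hub + (Hub+1)*O;        % weight count at H = Hub
Ndof       = Ntrneq - Nw;
MSEtrngoal = 0.01*Ndof*MSEtrn00a/Ntrneq;   % yields NMSEtrna = 0.01
```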
%SETTING eg = 0.02 and sc = 0.1, when I plot confusion it gives 98% correct classification, and in the output Y I got correctly one for class B and 0.0528 for class A.
Not clear what you mean by that number.
% (3) How can i get exact zeros for class A???
Although nice to have, that is not a realistic goal. The most realistic goal is to minimize the classification error rate on nontraining data that can be assumed to come from the same source as the training data.
Stop! Think about it:
"Use TRAINING data to minimize DISCONTINUOUS classification rates on NONTRAINING data"!
The typical NN approach is to use scale-free training data specifications (e.g., NMSEtrn, NMSEtrna, R2trn or R2trna) that are converted to SSEtrn or MSEtrn goals, which are sought via continuous-function gradient descent search. If the resulting classification rates on nontraining data are unsatisfactory, the goals are changed and training is continued until the nontraining error rates are either satisfactory or minimized.
NEWRB does not use a gradient descent search. Therefore, it could have been coded to have the "Early Stopping" (aka "Stopped Training") option for directly minimizing training set classification errors while monitoring validation set classification errors. Training would stop when the validation set classification error rate stopped decreasing.
Unfortunately, NEWRB is rigidly coded to directly minimize MSEtrn without considering classification error rates or performance on nontraining data.
% (4) How can i use LOGSIG transfer fn in second layer instead of PURELIN???? I tried rbf with nntool also. We will get output and errors after simulation.
Unfortunately, NEWRB does not have that option. An alternative is to use targets of -n/+n instead of 0/1 and then apply logsig after training:
logsig(-7:7) = [ 0.0009 0.0025 0.0067 0.0180 0.0474 0.1192 0.2689 ...
                 0.5000 0.7311 0.8808 0.9526 0.9820 0.9933 0.9975 0.9991 ]
% (5) What does it mean????? And same as question (3) in o/p class A is not zeros but Class B is ones. Why is it so? Later i need to include more type of classes and for a given set of same features network should tell in which group it belongs to.
In general, for c classes, the columns of the target matrix are columns of the c-dimensional unit matrix eye(c). The transformations between unit vectors and class indices are
target = ind2vec(trueclassind)
trueclassind = vec2ind(target)
output = net(input);
assignedclassind = vec2ind(output)
Nerr = sum(assignedclassind ~= trueclassind)
However, since sum(target) = ones(1,N), one of the rows of the target and output matrices could be omitted; I only do this when c = 2, as in your case.
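A small concrete example of that coding (c = 3 and the label sequence are illustrative; ind2vec returns a sparse matrix, hence the full call):

```matlab
% Sketch: unit-vector targets for c = 3 classes and N = 5 cases.
trueclassind = [ 1 2 3 2 1 ];
target = full(ind2vec(trueclassind));  % columns of eye(3)
% target = [ 1 0 0 0 1
%            0 1 0 1 0
%            0 0 1 0 0 ]
assignedclassind = vec2ind(target);    % recovers [ 1 2 3 2 1 ]
Nerr = sum(assignedclassind ~= trueclassind)   % 0 in this noise-free case
```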
Furthermore, by keeping track of the trn/val/tst data division indices, those individual tabulations are readily obtained.
% (6) For my problem which one u think is more helpful- RBF EXACT FIT OR RBF FEWER NEURONS??????
The answer is always: the one that works best on nontraining data that are assumed to come from the same source as the training data.
Randomly divide the data into trn/val/tst subsets using the dividerand function
help dividerand
doc dividerand
With [I Ntrn] = size(xtrn), [ O Ntrn ] = size(ttrn) and H hidden nodes, the number of unknown weights to estimate is Nw = 1 + I*H + (H+1)*O. Therefore Ndof = Ntrneq-Nw is positive when H <= Hub where
Hub = -1 +ceil( (Ntrneq-O-1)/(I + O) )
With a typical MATLAB trn/val/tst division ratio of 0.7/0.15/0.15,
Hub = -1 + ceil( (0.7*646 -1-1)/ (8 + 1) ) = 50
Since H changes during training, first try
net = newrb( xtrn, ttrn, MSEgoal, spread, Hub);
with
MSEgoal = 0.01*MSEtrn00
spread = 1
Hub = 50
If unsuccessful, try looping over spread values and tabulate the training and validation set performances. Choose the net with the best validation set performance and evaluate it with the test set performance.
If unsatisfactory, you can consider repeating with a different random trn/val/tst division.
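Putting those steps together, a sketch of the selection loop (variable names, the spread grid, and the MSEgoal choice are illustrative; x and t are the standardized inputs and the 0/1 targets):

```matlab
% Sketch: pick spread by validation MSE, then report test MSE.
[trnind, valind, tstind] = dividerand(646, 0.7, 0.15, 0.15);
xtrn = x(:,trnind);  ttrn = t(trnind);
xval = x(:,valind);  tval = t(valind);
xtst = x(:,tstind);  ttst = t(tstind);
MSEgoal = 0.01*mean(var(ttrn',1));
bestval = Inf;
for spread = 2.^(-3:3)
    net = newrb(xtrn, ttrn, MSEgoal, spread, 50);   % 50 = Hub from above
    MSEval = mse(tval - net(xval));
    if MSEval < bestval, bestval = MSEval; bestnet = net; end
end
MSEtst = mse(ttst - bestnet(xtst))   % final, less biased performance estimate
```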
Hope this helps.
Thank you for formally accepting my answer
Greg
  2 Comments
rakesh r
rakesh r on 10 May 2013
Edited: rakesh r on 10 May 2013
Thank you sir. But sir, how does this newrb fix the radial centers?
Greg Heath
Greg Heath on 11 May 2013
Each epoch, the input vector with the worst performance is chosen as the center of a new hidden-layer Gaussian transfer function.


