
Rating: 3.7 (16 ratings) | 187 downloads (last 30 days) | File size: 3.15 MB | File ID: #22970

Feature Selection using Matlab


13 Feb 2009 (Updated )

Select the subset of features that maximizes the Correct Classification Rate (CCR).


File Information
Description

The DEMO includes 5 feature selection algorithms:
• Sequential Forward Selection (SFS)
• Sequential Floating Forward Selection (SFFS)
• Sequential Backward Selection (SBS)
• Sequential Floating Backward Selection (SFBS)
• ReliefF
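
As a rough illustration of the first algorithm in this list, the sketch below runs a plain greedy Sequential Forward Selection on synthetic data. It is not the demo's implementation: the toy data, the nearest-class-mean criterion `ccr`, and all variable names are assumptions made for this example.

```matlab
% Minimal SFS sketch on synthetic data (illustrative, not the demo's code).
rng(2);                                   % fixed seed so the run is repeatable
N = 200; K = 6;                           % patterns and candidate features
y = [ones(N/2,1); 2*ones(N/2,1)];         % targets in {1,2}
X = randn(N, K);
X(y==2, [2 5]) = X(y==2, [2 5]) + 3;      % only features 2 and 5 separate the classes

% Assumed criterion: resubstitution CCR of a nearest-class-mean classifier
% restricted to the feature subset S.
d2  = @(S, c) sum(bsxfun(@minus, X(:,S), mean(X(y==c,S), 1)).^2, 2);
ccr = @(S) mean((1 + (d2(S,2) < d2(S,1))) == y);

selected  = [];                           % features chosen so far
remaining = 1:K;                          % features still available
nToSelect = 2;                            % stop after this many features
for step = 1:nToSelect
    % Score every remaining candidate when added to the current subset.
    scores = arrayfun(@(f) ccr([selected f]), remaining);
    [~, pos] = max(scores);               % best single addition at this step
    selected(end+1) = remaining(pos);     %#ok<SAGROW>
    remaining(pos)  = [];
end
disp(selected)
```

The backward variant starts from the full set and removes one feature per step; the floating variants additionally allow conditional steps in the opposite direction after each inclusion or exclusion.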

Two CCR estimation methods:
• Cross-validation
• Resubstitution
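
The difference between the two estimators can be sketched as follows. Resubstitution tests on the same patterns that were used for training, so it tends to be optimistic; cross-validation tests on held-out folds. The nearest-class-mean classifier, the toy data, and the 10-fold split are assumptions for illustration, not the demo's settings.

```matlab
rng(1);                                   % fixed seed so the run is repeatable
X = [randn(50,2); randn(50,2) + 3];       % 100 patterns x 2 features
y = [ones(50,1); 2*ones(50,1)];           % targets in {1,2}
N = numel(y);

% Hypothetical classifier: assign each pattern to the nearer class mean.
classMeans = @(Xtr, ytr) [mean(Xtr(ytr==1,:), 1); mean(Xtr(ytr==2,:), 1)];
d2      = @(Xte, m) sum(bsxfun(@minus, Xte, m).^2, 2);
predict = @(M, Xte) 1 + (d2(Xte, M(2,:)) < d2(Xte, M(1,:)));

% Resubstitution: train and test on the same patterns.
ccrResub = mean(predict(classMeans(X, y), X) == y);

% k-fold cross-validation: each pattern is tested exactly once,
% by a classifier trained on the other folds.
k = 10;
foldOf = zeros(N,1);
foldOf(randperm(N)) = mod((0:N-1)', k) + 1;   % random, equally sized folds
correct = false(N,1);
for f = 1:k
    te = (foldOf == f);
    M  = classMeans(X(~te,:), y(~te));
    correct(te) = (predict(M, X(te,:)) == y(te));
end
ccrCV = mean(correct);
fprintf('Resubstitution CCR: %.3f, cross-validation CCR: %.3f\n', ccrResub, ccrCV);
```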

After selecting the best feature subset, the classifier obtained can be used for classifying any pattern.

 Figure: Upper panel is the pattern x feature matrix
             Lower panel left are the features selected
             Lower panel right is the CCR curve during feature selection steps
             Right panel is the classification results of some patterns.

This software was developed using Matlab 7.5 and Windows XP.

Copyright: D. Ververidis and C.Kotropoulos
                 AIIA Lab, Thessaloniki, Greece,
                 jimver@aiia.csd.auth.gr
                 costas@aiia.csd.auth.gr

In order to run the demo:
- A PC with Windows XP is needed.
- Use Matlab 7.5 or later to run DEMO.m.

1) Select the ‘finalvec.mat’ dataset (patterns x [features+1] matrix) from the 'PatTargMatrices' folder. The last column of ‘finalvec.mat’ contains the targets.
2) Press the run button on the panel (the second button).
3) After the optimum feature set has been selected, choose a set of patterns for classification using the open folder button (the last button). It can be the same dataset that was used for training the feature selection algorithm.

% REFERENCES:
[1] D. Ververidis and C. Kotropoulos, "Fast and accurate feature subset selection applied into speech emotion recognition," Els. Signal Process., vol. 88, issue 12, pp. 2956-2970, 2008.
[2] D. Ververidis and C. Kotropoulos, "Information loss of the Mahalanobis distance in high dimensions: Application to feature selection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 12, pp. 2275-2281, 2009.

MATLAB release: MATLAB 7.5 (R2007b)
Other requirements: Matlab 7.5
Comments and Ratings (39)
12 Aug 2014 alireza

Hi all,
thanks for the reply.
I read all the comments but I can't solve my problem.
I save my dataset this way:
save ('feature.mat','feature')
but when I open this file in the demo I get this error:
Error using load
Unable to read file 'feature': no such file or directory.

Error in DataLoadAndPreprocess (line 21)
load([DatasetToUse]);

Error in DEMO>OpenDataFile_ClickedCallback (line 64)
[Patterns, Targets] = DataLoadAndPreprocess(handles.file);

Error in gui_mainfcn (line 96)
feval(varargin{:});

Error in DEMO (line 19)
gui_mainfcn(gui_State, varargin{:});

Error while evaluating uipushtool ClickedCallback
What is the problem? What should I do?

06 Jul 2014 mehdi

After using the Feature Selection, how can I use the selected features for classification, such that I can see the performance of classification?(How can I see the output of classification in workspace?)

19 May 2014 bjut

good

10 Mar 2014 Emad

After using the Feature Selection, how can I use the selected features for classification, such that I can see the performance of classification?(How can I see the output of classification in workspace?)

09 Jul 2013 Lei Yang

Very good

15 Nov 2012 Sajjad Taghvaee

After using the Feature Selection, how can I use the selected features for classification, such that I can see the performance of classification?(How can I see the output of classification in workspace?)

25 May 2012 Dimitrios Ververidis

The problem is that Matlab is not compatible with previous versions.
The problem is within 'gui_mainfcn.m'

Solution:

Backward compatibility with Matlab 7.5 for code written with Matlab 7.7 or later.
HINT: if you have Matlab 7.5, do this:

Go to line 226 of 'gui_mainfcn.m' and replace

226 guidemfile('restoreToolbarToolPredefinedCallback',gui_hFigure)

with

%----Check Matlab Version (Original on 7.7 support for 7.5 also) ---
MatlabVersion = version;
MatlabVersion = str2double(MatlabVersion(1:3));
%-------------------------------------------------------------------
if MatlabVersion >=7.7
guidemfile('restoreToolbarToolPredefinedCallback',gui_hFigure);
elseif MatlabVersion ==7.5
guidemfile('restoreToolbarToolPredefinedCallback',get(gui_hFigure));
end
%-------------------------------------------------------------------

This way the code works on both Matlab 7.5 and Matlab 7.7 or later :)
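
One caveat with comparing the first three characters of the version string: a release such as 7.10 would be read as 7.1 and match neither branch. A possible alternative, sketched here under the assumption that only the major and minor numbers matter, parses them as integers. `parseVer` and `atLeast` are hypothetical helper names, not part of the demo or of 'gui_mainfcn.m'.

```matlab
% Parse "major.minor" as two integers so that 7.10 is not mistaken for 7.1.
parseVer = @(s) sscanf(s, '%d.%d', 2)';     % e.g. '7.10.0.499 (R2010a)' -> [7 10]
atLeast  = @(v, major, minor) v(1) > major || (v(1) == major && v(2) >= minor);

v = parseVer(version);                      % version of the running Matlab
useNewCall = atLeast(v, 7, 7);              % true for 7.7 and later, including 7.10 and 8.x
```

Matlab also provides `verLessThan` (for example `verLessThan('matlab','7.7')`), which may be simpler where it is available.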

24 May 2012 mai

Hi, I got this problem, and I found that most people ask about the same problem but there is no exact answer.
Thanks.

??? Reference to non-existent field 'file'.

Error in ==> DEMO>RunFeatSelection_ClickedCallback at 114
[FeatureWeightsOrdered, FeaturesIndexOrdered, ...

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uipushtool ClickedCallback

24 Apr 2012 zhang yue

When the number of FS steps is larger than 200, I get this error:
??? Error using ==> delete
Invalid handle object.

Error in ==> ForwSel_main at 458
delete( HYelLines(LinesToDeleteFeatInd));

Error in ==> DEMO>RunFeatSelection_ClickedCallback at 111
[ResultMat, ConfMatOpt, Tlapse, handles.OptimumFeatureSet,...

Error in ==> gui_mainfcn at 75
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uipushtool ClickedCallback.

Thank you very much.

01 Apr 2012 Yan

Dear Dimitrios,
I have been trying to run your code but it gives me error.
??? Reference to non-existent field 'FeatSelCurve'.

Error in ==> DEMO>OpenDataFile_ClickedCallback at 70
axes(handles.FeatSelCurve);cla reset;

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

Error in ==> DEMO>OpenDataMenu_Callback at 245
DEMO('OpenDataFile_ClickedCallback',gcbo,[],guidata(gcbo));

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uimenu Callback

Can you tell me how to solve it?

09 Jan 2012 Dimitrios Ververidis

Go to the options and make the confidence interval smaller. This increases the number of cross-validation repetitions, so execution takes longer, but the selected features will then be almost identical from run to run. The results differ between runs because cross-validation involves a random split into training and test sets.

09 Jan 2012 Patrick

Dear all,
I am new to the topic of feature selection. I have found this program useful for my data. However, each time I run the program it ends up with a different answer (different features). Could anybody tell me how to choose the best features for my data?

31 Dec 2011 Ahmad Azar  
12 Nov 2011 belal

hello all,

I have the same problem as TabZim: when I use the following code, it returns just one feature. Could you help me if you have solved your problem?

cc = cvpartition(y,'k',10);
opts = statset('display','iter');
[fs,history] = sequentialfs( @fc,X,y,'cv',cc,'direction','backward','options',opts)%,'nfeatures',5)

function [sm]=fc(Xtrain,Ytrain,Xtest,Ytest)
yy=svmclassify(svmtrain(Xtrain,Ytrain),Xtest);
sm=sum(~strcmp(Ytest,yy))
-----

So please help.

07 Sep 2011 Sangtae Ahn

I have an error as below
What shall I do ?

-----------------------

??? Error using ==> eval
Undefined function or variable 'data'.

Error in ==> DataLoadAndPreprocess at 26
[NPatterns, KInitialFeatures] = eval(['size(' DatasetToUse ')']);

Error in ==> DEMO>OpenDataFile_ClickedCallback at 64
[Patterns, Targets] = DataLoadAndPreprocess(handles.file);

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uipushtool ClickedCallback

14 Jul 2011 TabZim

Can anyone tell me how to implement a wrapper with support vector machines? I have been trying to use the following code snippet for this purpose, but it always returns one feature (the first one in the case of forward selection and the last one in the case of backward selection). Can anyone explain why this is happening, or give another example as a demo of the feature selection process using SVM? Many thanks in advance.

%% FISHERIRIS DATA

load fisheriris
X = randn(150,20);
X(:,1:4)= meas(:,:);
y = species(1:100,:);
groups = ismember(species,'setosa');
y= groups(:,:)
X= scaleData(X); % to scale data in range [0,1]

%% CROSS VALIDATING

cc = cvpartition(y,'k',10);

%% SVM TRAINING AND TESTING FOR FEATURE SELECTION

opts = statset('display','iter');
OPTIONS=optimset('MaxIter',1000);
fun = @(Xtrain,Ytrain,Xtest,Ytest)...
(sum(~strcmp(Ytest,svmclassify(svmtrain(Xtrain,Ytrain),Xtest))))

[fs,history] = sequentialfs(fun,X,y,'cv',cc,'options',opts,'nfeatures',3)

%% END OF CODE
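
One possible contributor to the behaviour described above, assuming the labels reach the criterion function as logical or numeric values (as `y` does in this snippet): `strcmp` is meant for text, and on non-text inputs it returns a single false rather than an element-wise result. The criterion then evaluates to the same value for every candidate subset, so the search has nothing to rank. The toy vectors below are made up to show the effect; `yhat` stands in for the classifier output.

```matlab
Ytest = [true; false; true; true];        % made-up true labels (logical)
yhat  = [true; true;  false; true];       % hypothetical classifier output

% strcmp on non-text input returns one scalar false, so this is always 1.
viaStrcmp = sum(~strcmp(Ytest, yhat));

% Element-wise comparison counts the actual misclassifications.
viaNeq = sum(Ytest ~= yhat);

fprintf('strcmp-based count: %d, element-wise count: %d\n', viaStrcmp, viaNeq);
```

If that is the cause here, replacing the `strcmp` comparison in the criterion with an element-wise `~=` would be the natural change to try.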

24 Jun 2011 Guillermo

Other implementations of FS methods in Matlab can be found at http://www.prtools.org , at http://cmp.felk.cvut.cz/cmp/software/stprtool
For a comprehensive list of alternative FS related projects as well as other resources including benchmarking data see http://fst.utia.cz/?relres

07 Apr 2011 Jaime Delgado Saa

I have not taken a look at this, but why are you posting a demo here? I don't think this is the purpose of this.

16 Jan 2011 Tiger Deng

I just read the ForwSel_main.m file
and found:

%=============== Load The Patterns ===============================

% NPatterns: The number of Patterns
% KFeatures: The number of features
% CClasses : The number of classes

% Patterns, features and Targets in a single matrix of
% NPatterns X (KFeatures + 1) dimensionality.
% The additional feature column is the Targets.
% Patterns: FLOAT numbers in [0,1]
% Targets: INTEGER in {1,2,...,C}, where C the number of classes.

So the finalvecKidsVR.mat file contains 2792 patterns and 113 features?

Is that correct?
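
The format described in those header comments can be checked, and a raw matrix brought into it, along these lines. This is a sketch with made-up numbers; the min-max scaling is one assumed way of mapping features into [0,1] and is not taken from the demo.

```matlab
Data = [2 10 1;            % 3 patterns x (2 features + 1 target column)
        4 20 2;
        6 40 1];
Patterns = Data(:, 1:end-1);
Targets  = Data(:, end);

% Min-max scale each feature column into [0,1]; constant columns map to 0.
lo   = min(Patterns, [], 1);
span = max(Patterns, [], 1) - lo;
span(span == 0) = 1;                               % avoid division by zero
Scaled = bsxfun(@rdivide, bsxfun(@minus, Patterns, lo), span);

% Targets are expected to be integers in {1,...,C}.
targetsOk = all(Targets == round(Targets)) && all(Targets >= 1);
[NPatterns, KFeatures] = size(Scaled);
CClasses = numel(unique(Targets));
```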

15 Jan 2011 Tiger Deng

Dear all,

Does anyone know the structure of the input data, say finalVecKKidsVR? I know there are 2792 objects and the feature vector has 90 dimensions.

But which columns are they?
And what do the remaining columns mean?

Thanks in advance

10 Jan 2011 chen ??

I have solved the problem. Thank you for your software, it is helpful for me.

09 Jan 2011 chen ??

When the number of FS steps is larger than 200, I get this error. Can you tell me what is wrong?
??? Error using ==> delete
Invalid handle object.

Error in ==> ForwSel_main at 458
delete( HYelLines(LinesToDeleteFeatInd));

Error in ==> DEMO>RunFeatSelection_ClickedCallback at 111
[ResultMat, ConfMatOpt, Tlapse, handles.OptimumFeatureSet,...

Error in ==> gui_mainfcn at 75
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uipushtool ClickedCallback.

Thank you.

20 Sep 2010 Dimitrios Ververidis

Response to Ly Lu:

The problem is that you used 'test' instead of 'test.mat', and therefore Matlab cannot find your file.

Check whether the string variable DatasetToUse is 'test.mat'.

18 Sep 2010 ly lu

I want to know the format of the input data.
When I choose my own input file "test.mat", the following error occurs:
??? Error using ==> eval
Undefined function or variable 'test'.

Error in ==> DataLoadAndPreprocess at 26
[NPatterns, KInitialFeatures] = eval(['size(' DatasetToUse ')']);

Error in ==> DEMO>OpenDataFile_ClickedCallback at 64
[Patterns, Targets] = DataLoadAndPreprocess(handles.file);

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

Error in ==> DEMO>OpenDataMenu_Callback at 245
DEMO('OpenDataFile_ClickedCallback',gcbo,[],guidata(gcbo));

Error in ==> gui_mainfcn at 96
feval(varargin{:});

Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uimenu Callback

Can you tell me what the problem is?

23 Jun 2010 Omar A

Dear Dimitrios,
I have been trying to run your code but it gives me an error. First I ran it using the provided dataset "finalvecKidsVR.mat"; it starts fine but gives an error at the end. Then I tried to use my own dataset, and the program did not let me load/open my data.
When I changed the file name of the dataset you provided, the program did not run either. I changed my dataset's file name to the name of your dataset, "finalvecKidsVR.mat", and I faced the same problem.
Can you help me with that?
Thanks

12 May 2010 kanaan

Hi all, I need code for SFS and SBS that returns one feature at each step, because I want to implement the IIFS algorithm; see:

"An improvement on floating search algorithms for feature subset selection" by Songyot Nakariyakul. Please, I need help, and please let me know if there is code for this paper.

12 May 2010 electronic engineering department Information school of Fudan university

Hi Ververidis,
Must we use it only through the GUI?
Could you write a simple operation manual for us? I also have the problem that lele hu described.

01 Mar 2010 Dimitrios Ververidis

Hi,

1) There is no missing end in "DataLoadAndPreprocess.m". The reduction from 114 features to 90 happens because some features in this dataset contained NaNs and were removed.

You can remove such features (if you have NaNs or Infs in your data) by adding some lines to the function DataLoadAndPreprocess.m.
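
Such a filter might look like the following. It is a sketch with a made-up matrix, not the code inside DataLoadAndPreprocess.m; the variable names are chosen for this example only.

```matlab
Data = [0.1 NaN 0.3 1;      % 3 patterns x (3 features + 1 target column)
        0.2 0.5 Inf 2;
        0.4 0.6 0.7 1];
Patterns = Data(:, 1:end-1);
Targets  = Data(:, end);

% Keep only the feature columns in which every value is finite
% (isfinite is false for both NaN and +/-Inf).
keep     = all(isfinite(Patterns), 1);
Patterns = Patterns(:, keep);
fprintf('Kept %d of %d features\n', nnz(keep), numel(keep));
```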

2) Well, there was a minor bug:

In ForwSel_main.m, replace lines 256 to 273 with the following:

if ~isempty(handles)
axes(handles.YelLinesAxes);
hold on
if (NPatterns > KFeatures)
axis([0 NPatterns 0 KFeatures]); axis manual
HYelLines(FeatureToInclude)=plot([0 NPatterns+2],...
(FeatureToInclude-0.5)*ones(1,2),'y','linewidth',3);
else
axis([0 KFeatures 0 NPatterns]); axis manual
HYelLines(FeatureToInclude)= plot(...
FeatureToInclude*ones(1,2),[0 KFeatures+2],'y','linewidth',3);
end
set(gca,'Visible','off');
drawnow
set(findobj(gcf,'Tag','ListSelFeats'), 'String', ...
sort(SelectedFeatPool));
axes(handles.FeatSelCurve);
end

3) There is no need to change the name of your data. It works with any name. For example I debugged the previous error by generating a two class problem of 200 patterns (100 per class) and 2000 features estimated on them:

>> x = [0.25+0.1*randn(100,2000); 0.35+0.25*randn(100,2000)];
>> Data = [x [ones(100,1); 2*ones(100,1)]];

This corresponds to a patterns x features matrix of size 200 x 2000
and a targets vector of size 200 x 1.

I saved the "Data" variable as "Data.mat" in the [PatTargMatrices] folder and I loaded it with the GUI and I pressed run.
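
The same steps as a script, for reference. Judging from the error traces elsewhere on this page (`load` followed by `eval(['size(' DatasetToUse ')'])`), the loader appears to expect the variable inside the .mat file to have the same name as the file, so the sketch keeps both as `Data`. That reading is an assumption, as is writing to a temporary folder instead of the real [PatTargMatrices] folder.

```matlab
rng(0);                                      % fixed seed so the data are repeatable
x = [0.25 + 0.1*randn(100,2000); 0.35 + 0.25*randn(100,2000)];
Data = [x [ones(100,1); 2*ones(100,1)]];     % last column holds the targets

% Assumed location: a scratch copy of the folder, not the demo's own folder.
outDir = fullfile(tempdir, 'PatTargMatrices');
if ~exist(outDir, 'dir'), mkdir(outDir); end
outFile = fullfile(outDir, 'Data.mat');
save(outFile, 'Data');                       % variable name matches the file name

S = load(outFile);                           % read back to confirm the layout
disp(size(S.Data))
```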

I don't have the Linux version of Matlab and I don't know the differences. I don't think there are any.

BR,
Dimitrios

01 Mar 2010 lele hu

Hi,
Thank you very much for your attention.
My first question:
The finalvecKidsVR.mat you supply is a 2792*114 matrix, but after the first step I got Patterns.mat, which is a 2792*90 matrix. Please tell me the reason. I think this is because there is an "end" missing in the file "DataLoadAndPreprocess.m".

The second question:
I rebuilt the file finalvecKidsVR.mat with size 50*114, using the first 50 rows of your original finalvecKidsVR.mat. But after I started feature selection, the error message given is
"??? Undefined function or variable 'IndexFeature'.
Error in ==> ForwSel_main at 264
HYelLines(IndexFeature)= plot((FeatureToInclude-0.5)...
Error in ==> DEMO>RunFeatSelection_ClickedCallback at 111
[ResultMat, ConfMatOpt, Tlapse, handles.OptimumFeatureSet,...
Error in ==> gui_mainfcn at 96
feval(varargin{:});
Error in ==> DEMO at 19
gui_mainfcn(gui_State, varargin{:});

??? Error while evaluating uipushtool ClickedCallback
"

The third question:
What is the filename of the test dataset I should use to ensure the program works?

The fourth question:
Do you have this program for Linux?

03 Jan 2010 gasmi karim

Hello
I am doing a project in image processing. I extract the first 14 features of each image and then I want to do feature selection with a genetic algorithm. Each individual has size 14, and each position is 0 if the feature does not participate in the classification and 1 otherwise.
The fitness function is computed from the SVM accuracy. So far I have not been able to implement this with the Matlab GA "demo", because I do not know what to change in the function bioagfit.m. Please help me with how to change its methods, and tell me whether changes are needed in other methods. Thank you.

01 Jan 2010 toto11

Hi Ververidis,
thanks for the reply.
By cross-correlation I mean eliminating the redundancy of strongly correlated coefficients.
As for the GA feature selection function, not yet. If I manage to add it, I will send it to you.

29 Dec 2009 Dimitrios Ververidis

hi toto11,

1. Which coefficients ? Do you mean those in ReliefF function ? If you mean them, cross-correlation was not exploited.

2. Yes, you can use GA algorithm for feature selection:
a) add your function GenetAlgo.m in DEMO.m (similarly as ReliefF.m), e.g.
%=============================================

elseif strcmp(FSSettings.FSMethod,'ReliefF')
[FeatureWeightsOrdered, FeaturesIndexOrdered, ...
handles.OptimumFeatureSet] = ReliefF(handles.file,...
FSSettings,handles);
elseif strcmp(FSSettings.FSMethod,'GenetAlgo')
[FeatureWeightsOrdered, FeaturesIndexOrdered, ...
handles.OptimumFeatureSet] = GenetAlgo(handles.file,...
FSSettings,handles);
%=================================================

Do not forget to add 'GenetAlgo' as a string option in the figure's menu. The default is 'SFS'.

Structure FSSettings.YourSettings allows the use of any variables for GA (mutation vars etc.).

NOTE: If you have managed to add it, please send it to me, I will acknowledge you (is there any more formal name than toto11?)

21 Dec 2009 toto11

Hello Ververidis,
my questions:
Does your code take into account the cross-correlation between coefficients?
And if I use a GA (genetic algorithm) for feature selection,
how can I formulate the objective function?
Any ideas are welcome.

17 Dec 2009 Dimitrios Ververidis

User can select between SFS and SFFS. See menu options. They are not combined.

They both belong to the general forward selection function. SFS is actually a part of SFFS.

With this move I avoided writing the same code twice.

15 Dec 2009 Autumn

Hi, why did you combine the SFS and SFFS in a single program? Where do u separate those two methods?

26 Feb 2009 Dimitrios Ververidis

The 'InvDet' function was not used at all.

You should have seen that I preferred to use the Matlab functions 'det' and 'inv' instead of my own 'InvDet' function.

There is a variety of methods for computing the inverse and the determinant; however, I consider Matlab's 'det' and 'inv' the best ones.

The 'InvDet' function is an implementation of the Gauss-Jordan method, which may not be the best.

Check whether a function is used before commenting on it!

18 Feb 2009 John D'Errico

Spam. Only a demo anyway. Nothing of true value here except for an advertisement.

(Do not rate your own work. This defeats the purpose of a rating system.)

This author has an inflated opinion of his own work anyway. There is no help. Absolutely none. Even the low level functions have ZERO comments. Ok, not exactly zero. Here is one of the very few comments in the invdet function:

detM = abs(detM); % Rem: Why ?

Yeah, right. A truly valuable comment.

Mlint points out multiple problems. At the very least, it shows deadwood in the code, often a sign of potentially buggy code.

18 Feb 2009 Dimitrios Ververidis

OK, it is the best. What should I say, it is mine. However, please send me any bug reports when something goes wrong.

Updates
23 Feb 2009

Latest version 5.1.5 includes also
a) an executable version.
Requirements: MSIInstaller.exe of Matlab7.5
b) A pdf describing the method

26 Feb 2009

News: Version 5.1.6 supports up to 7 classes.

07 Mar 2009

5.1.8 Help menu added
        Settings menu added

29 Aug 2010

PAMI paper accepted
