One of the fundamental motivations for feature selection is to overcome the curse of dimensionality. This code presents a novel feature selection method combining the differential evolution (DE) optimization method with a proposed repair mechanism based on feature distribution measures. For more information, please refer to http://dx.doi.org/10.1016/j.eswa.2011.03.028
Rami Khushaba (2020). Differential Evolution Based Channel and Feature Selection (https://www.mathworks.com/matlabcentral/fileexchange/30877-differential-evolution-based-channel-and-feature-selection), MATLAB Central File Exchange. Retrieved .
1.4.0.0  Code revision to run on different versions of Matlab.


1.3.0.0  Fixed a small issue preventing the code from running on older versions of Matlab 

1.1.0.0  Dependency on two files from the optimization toolbox corrected. 
In your code's calling format [Err,Subset] = DEFS(data_tr,data_ts,DNF,PSIZE,Ld,classif,GEN), I did not understand what you are referring to by PSIZE (population size). Clarifying it would help me use the code further.
When I run the example code I get the following error:
[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,0,100)
Iter: 1 Acc: 0.0000 Subset Selected: 2 3 4
Undefined function or variable 'f'.
Error in DEFS (line 232)
if f <= Fit(j)
I used the code; it worked and greatly helped me identify the best feature subset. Thanks.
@Dr Rami
Thank you for the response, Dr Rami.
From my extracted features I want to classify four classes (left hand, right hand, foot, and tongue), but I didn't add these responses because I am not sure which class label must go in which row of the last column.
Can you please advise or assist on how to add these responses to my extracted features?
@clifford
Did you put your class label as the last feature in the FeatureEx matrix? Please read the comments inside the code first.
Dear Dr Rami
I am new to EEG. I currently have a 672528x22 dataset from a publicly available database (the BCI competition), and I used the code from your previous post, Feature Extraction Using Multisignal Wavelet Packet Decomposition (getmswpfeat.m and mswpd.m), to extract features. Below is how I called the function:
features=getmswpfeat(x',51,32,5,'matlab');
The extracted feature matrix is 21015x1386, but when I pass these features to the function below:
[Err,Subset] = DEFS(FeatureEx(1:2:end,1:end),FeatureEx(2:2:end,:),3,50,0,'KNN',100);
I get the following error message:
Error using internal.stats.parseArgs (line 42)
Parameter name must be text.
Error in classreg.learning.paramoptim.parseOptimizationArgs (line 5)
[OptimizeHyperparameters,~,~,RemainingArgs] = internal.stats.parseArgs(...
Error in fitcknn (line 257)
[IsOptimizing, RemainingArgs] = classreg.learning.paramoptim.parseOptimizationArgs(varargin);
Error in DEFS (line 95)
Ac1 = fitcknn(data_ts(:,val),data_tr(:,val),data_tr(:,end),3);
Please kindly assist, Dr Rami.
@AliSameer
The second-to-last input is the classifier name, not a number; please read the comments inside the code. So you should run it like below:
[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'LDA',100)
@Abhishek
The iris example has only 4 features, and you ran it correctly. Obviously, with the NB classifier, the subset made of features 1, 3, and 4 is no different from the subset of 1, 2, and 4 in terms of classification accuracy.
hello sir
The code worked well for a few minutes and then I faced some errors.
The errors were:
Out of memory. The likely cause is an infinite recursion within the program.
Error in DEFS (line 36)
[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,0,100)
When I run this algorithm it gives the outcome below. Is this output correct, or am I missing something?
>> [Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'NB',50)
Iter: 1 Acc: 4.0000 Subset Selected: 1 2 4
Iter: 2 Err: 4.0000 Subset Selected: 2 3 4
Iter: 3 Err: 4.0000 Subset Selected: 2 3 4
Iter: 4 Err: 4.0000 Subset Selected: 1 3 4
Iter: 5 Err: 4.0000 Subset Selected: 2 3 4
Iter: 6 Err: 4.0000 Subset Selected: 1 2 4
Iter: 7 Err: 4.0000 Subset Selected: 1 2 4
Iter: 8 Err: 4.0000 Subset Selected: 2 3 4
Iter: 9 Err: 4.0000 Subset Selected: 1 3 4
Iter: 10 Err: 4.0000 Subset Selected: 2 3 4
Iter: 11 Err: 4.0000 Subset Selected: 2 3 4
Iter: 12 Err: 4.0000 Subset Selected: 1 2 4
Iter: 13 Err: 4.0000 Subset Selected: 1 3 4
Iter: 14 Err: 4.0000 Subset Selected: 1 3 4
Iter: 15 Err: 4.0000 Subset Selected: 1 2 4
Iter: 16 Err: 4.0000 Subset Selected: 1 2 4
Iter: 17 Err: 4.0000 Subset Selected: 1 2 4
Iter: 18 Err: 4.0000 Subset Selected: 1 2 4
Iter: 19 Err: 4.0000 Subset Selected: 2 3 4
Iter: 20 Err: 4.0000 Subset Selected: 1 3 4
Iter: 21 Err: 4.0000 Subset Selected: 1 3 4
Iter: 22 Err: 4.0000 Subset Selected: 1 3 4
Iter: 23 Err: 4.0000 Subset Selected: 1 2 4
Iter: 24 Err: 4.0000 Subset Selected: 1 2 4
Iter: 25 Err: 4.0000 Subset Selected: 1 2 4
Iter: 26 Err: 4.0000 Subset Selected: 1 3 4
Iter: 27 Err: 4.0000 Subset Selected: 2 3 4
Iter: 28 Err: 4.0000 Subset Selected: 1 2 4
Iter: 29 Err: 4.0000 Subset Selected: 1 2 4
Iter: 30 Err: 4.0000 Subset Selected: 2 3 4
Iter: 31 Err: 4.0000 Subset Selected: 1 2 4
Iter: 32 Err: 4.0000 Subset Selected: 1 2 4
Iter: 33 Err: 4.0000 Subset Selected: 1 2 4
Iter: 34 Err: 4.0000 Subset Selected: 1 2 4
Iter: 35 Err: 4.0000 Subset Selected: 1 2 4
Iter: 36 Err: 4.0000 Subset Selected: 1 2 4
Iter: 37 Err: 4.0000 Subset Selected: 1 2 4
Iter: 38 Err: 4.0000 Subset Selected: 1 2 4
Iter: 39 Err: 4.0000 Subset Selected: 1 2 4
Iter: 40 Err: 4.0000 Subset Selected: 2 3 4
Iter: 41 Err: 4.0000 Subset Selected: 1 3 4
Iter: 42 Err: 4.0000 Subset Selected: 1 3 4
Iter: 43 Err: 4.0000 Subset Selected: 1 2 4
Iter: 44 Err: 4.0000 Subset Selected: 1 2 4
Iter: 45 Err: 4.0000 Subset Selected: 1 2 4
Iter: 46 Err: 4.0000 Subset Selected: 1 2 4
Iter: 47 Err: 4.0000 Subset Selected: 2 3 4
Iter: 48 Err: 4.0000 Subset Selected: 2 3 4
Iter: 49 Err: 4.0000 Subset Selected: 1 3 4
Iter: 50 Err: 4.0000 Subset Selected: 1 2 4
Err =
Columns 1 through 15
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Columns 16 through 30
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Columns 31 through 45
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Columns 46 through 50
4 4 4 4 4
Subset =
1 2 4
Has anybody tried iris.m for DEFS_chs? I wonder how we can do channel selection.
I used the Emotiv EPOC headset, which has 14 channels. I selected 7 features for each channel. I have 4 classes and the number of trials is 10. For each class I have 140x8 data; in total I have 560x8 data, with the last column being the class. Can anybody explain how I can pick the best 5 channels?
Thank you, sir, for your quick response. I have organized the '9_Tumors' data by shifting the class label, i.e., the 1st column, to the last column, and it works fine when I call DEFS with 'KNN'. But when I call DEFS with 'NB', an error occurs:
Error using NaiveBayes.fit>gaussianFit (line 371)
The within-class variance in each feature of TRAINING must be positive. The within-class variance in feature 17 in class 7 are
not positive.
Error in NaiveBayes.fit (line 335)
obj = gaussianFit(obj, training, gindex);
Error in DEFS (line 115)
O1 = NaiveBayes.fit(data_tr(:,val),data_tr(:,end));%,'dist',F);
Can you please explain whether the problem belongs to Naive Bayes, and give some instructions on how to solve it, sir?
with regards
vijoy
Vijoy, thanks for your inquiry.
Did you actually run the iris examples? If so, you will find that the data is organized as features in columns with the class label in the last column. Compared to iris, I believe the 9_Tumors and Brain_Tumor2 datasets have the class label in the first column, not the last!
There is a note in the code saying "with NP1 patterns x NF+1 features with last column being the training class label"
If you are still unsure, just reorganize your data in the same format as iris and that's it.
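For example, assuming the class label sits in the first column of your data matrix (e.g., the tumar matrix from your snippet), one line moves it to the last column:

```matlab
% Move the class label from the first column to the last column,
% matching the layout DEFS expects (features first, label last).
tumar = [tumar(:, 2:end), tumar(:, 1)];
```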
thank you sir,it's a great work.
When I call the function DEFS on the iris dataset with:
iris = xlsread('iris_file_edited.xlsx');
[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'NB',150)
the code shows some output. But when I call it with a large dataset (like 9_Tumors or Brain_Tumor2) using
tumar = xlsread('9_Tumors.mat');
[Err,Subset] = DEFS(tumar(1:2:end,1:end),tumar(2:2:end,:),30,100,0,'NB',150)
it shows an error:
Index exceeds matrix dimensions.
Error in DEFS (line 83)
Pop(:,j) = FF(1:D)'; % within b.constraints
So how can I apply this code to large datasets like Colon, Lymphoma, NCI, 9_Tumors, etc., with the naive Bayes classifier?
Thank you
Bilas
Obviously you need to learn to code with Matlab.
The code I provided for DEFS already has an option to classify with the Naive Bayes classifier, and I previously mentioned how you can format your matrices as inputs; I also provided an example inside the code!
Please read inside the code and learn how to program in Matlab; it's very easy.
Regarding your algorithm description: you talk about binary DE, and the one I posted here is not binary, as at the time I developed DEFS there were not many DE-based feature selection methods available. However, your algorithm description is similar to DEFS, though I didn't get why you needed the probabilities.
Learn Matlab and read the code before asking for what's already there.
Thanks
Here is the algorithm that I want to implement. Can you please help me out by giving some suggestions?
DIFFERENTIAL EVOLUTION-NAIVE BAYES (DENB):
DENB selects an attribute subset from the whole space of attributes by carrying out a differential evolution search process. It uses the naive Bayes classification accuracy as the fitness function to evaluate alternative subsets of attributes, and selects the individual with the maximum classification accuracy after a fixed number of generations.
The outline of the proposed algorithm is as follows:
Step 1. Initialize population X with binary encoding; each individual is a string of feature-selection bits.
Step 2. According to each individual's selection bits, remove the attributes that are not selected from the training samples to get the training data T.
Step 3. Calculate the prior probability P(yi) of each class of the training data.
Step 4. Calculate the conditional probability P(x|yi) for each attribute's division.
Step 5. Calculate P(yi)*P(x|yi) for each class.
Step 6. Select the class with the maximum P(yi)*P(x|yi) as the class x belongs to.
Step 7. Calculate the classification accuracy over the entire sample as BestAccuracy, and record its corresponding feature selection Bestf.
Step 8. Determine whether the current accuracy and number of iterations reach the termination condition; if so, go to Step 13, otherwise go into the next generation of the iterative process.
Step 9. Choose two individuals from population X. Execute differential mutation on each individual in population X to generate population Y.
Step 10. Execute crossover between individuals in population X and individuals in population Y to generate population Z.
Step 11. Repeat Steps 2 to 7 to get the current best accuracy BestAccuracy_temp and its corresponding feature subset. If BestAccuracy_temp > BestAccuracy, then BestAccuracy = BestAccuracy_temp and Bestf = f_temp.
Step 12. Choose the better of each individual in population X and the corresponding individual in population Z into the next-generation population X, in a one-to-one selection according to classification accuracy.
Step 13. Repeat Steps 2 to 7 with the training data selected by the Bestf attributes to get the final classification accuracy.
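To make the question concrete, here is a rough, untested sketch of Steps 1-13 in Matlab. The function names, the binary-mutation scheme, and the F and CR values are my own assumptions, and fitcnb from the Statistics Toolbox stands in for the naive Bayes computations in Steps 3-6:

```matlab
% Sketch of binary DE + naive Bayes wrapper feature selection (DENB idea).
% Assumes rows are samples, columns are features, and the last column of
% data_tr/data_ts holds integer class labels 1,2,3,...
function [Bestf, BestAccuracy] = denb_sketch(data_tr, data_ts, NP, GEN)
    D = size(data_tr, 2) - 1;           % number of candidate features
    X = rand(NP, D) > 0.5;              % Step 1: binary population
    X(~any(X, 2), 1) = true;            % every individual keeps >= 1 feature
    F = 0.8; CR = 0.9;                  % DE control parameters (assumed)
    acc = zeros(NP, 1);
    for i = 1:NP, acc(i) = evalSubset(X(i,:), data_tr, data_ts); end
    for g = 1:GEN
        for i = 1:NP
            % Steps 9-10: xor-based binary mutation, then crossover
            r = randperm(NP, 3);
            diffBits = rand(1, D) < F & xor(X(r(2),:), X(r(3),:));
            mutant   = xor(X(r(1),:), diffBits);
            cross    = rand(1, D) < CR;
            trial    = X(i,:); trial(cross) = mutant(cross);
            if ~any(trial), trial(randi(D)) = true; end
            % Step 12: one-to-one selection by classification accuracy
            a = evalSubset(trial, data_tr, data_ts);
            if a >= acc(i), X(i,:) = trial; acc(i) = a; end
        end
    end
    [BestAccuracy, b] = max(acc); Bestf = find(X(b,:));
end

function a = evalSubset(mask, data_tr, data_ts)
    % Steps 2-7: train naive Bayes on the selected columns and score it
    cols = find(mask);
    mdl  = fitcnb(data_tr(:, cols), data_tr(:, end));
    a    = mean(predict(mdl, data_ts(:, cols)) == data_ts(:, end));
end
```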
Sir, if I want to use differential evolution to select an attribute subset from the whole set of attributes, and use the naive Bayes classifier [P(yi)*P(x|yi)] as the objective function with classification accuracy as the fitness function, what should the format of the code be, or how can your code help me?
Regards
Bilas
Hello,
Please read the example provided in the code.
If you have done that and are still not sure, then please note that the rows of the feature matrices are samples and the columns are the features to be selected. The class label takes values 1, 2, 3, ..., which represent the different classes, i.e., each class is given a number.
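As a hypothetical illustration (the sizes and values below are made up, not from any real dataset), a correctly formatted input matrix could be built like this:

```matlab
% Toy example of the expected layout: rows are samples, the first
% columns are features, and the last column holds the integer class
% labels 1, 2, 3, ...
features = [rand(10, 4); rand(10, 4) + 1];   % 20 samples x 4 features
labels   = [ones(10, 1); 2 * ones(10, 1)];   % two classes, labeled 1 and 2
data     = [features, labels];               % 20 x 5 matrix, label last
```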
Hope that helps.
Rami
Sir, I'm an undergraduate student working on a topic related to this one. Please help me by providing the dataset format that you used in your Matlab code, as I couldn't run the code properly. My email address: sagarbilas@yahoo.com
very good
I have another question. I think you just use the default parameters for the classifiers. But how and when should I select the best parameters for the classifier? Should I first run your algorithm with default parameters and then do parameter selection (e.g., via cross-validation) on the selected features?
Thank you very much for your great work. Can it also be used for fMRI data, i.e., where each feature corresponds to, e.g., a voxel? And if yes, what should I choose for the population size and the number of patterns (I don't know what they mean)? Is the number of patterns just the number of data points?
Dear Mayank
As the title of this post suggests, it is a "feature selection" technique, not really "feature extraction". Feature extraction includes feature construction and possibly selection. In this post only feature selection is done, i.e., it is assumed that you have already extracted the features of interest.
Respected sir,
For my project on the diagnosis of Alzheimer's, can I use this code for feature extraction and for training on my dataset?
Please reply ASAP.
Dear subbmdee
I am really sorry for the late reply; I hadn't seen your comment until today.
Please use LIBSVM (multiclass SVM) as it is very fast and easy to include in DEFS.
LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
LIBLINEAR: http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Kind Regards
Rami
Dear Dr Rami,
I would like to use a multiclass SVM as a classifier, but it seems to take a long time.
Any input in that regard?
Hi Krishna
The iris dataset is made of 4 features only, so there is really no point in using feature selection on it. However, most probably you set the parameters in a wrong way. See the example below and let me know if it is still not working for you:
>> load iris.dat
>> [Err,Subset] = DEFS(iris(1:2:end,:),iris(2:2:end,:),3,50,0,'NB',100)
Dear Dr Rami,
I downloaded your code and tried to run it on the iris data.
I have installed the Optimization Toolbox, but I am getting the error message
"Undefined function or variable 'f'" at line 232 in DEFS.m.
Please help me run this code on a sample dataset.
Regards
krishna
Kindly note that if the LDA classifier option does not work on your own dataset (for which you usually get the error: pooled covariance matrix ....), this doesn't mean there is a problem with the algorithm, but rather with the LDA classifier itself. When there is a high degree of correlation among the features, LDA usually fails, giving the above error message. In such a case, if you have to use LDA, you can get an alternative implementation of LDA, like the one available in the MatlabArsenal toolbox.
Enjoy your research.
You're welcome, Ali.
Thanks a lot for sharing your code
Hello There,
These are part of the Optimization Toolbox, and we are not allowed to share original Matlab files. Please get a copy of the Optimization Toolbox from Mathworks. I will write my own modified versions of these soon and will update you.
Dr. Rami Khushaba
Some functions are missing:
fitscalingrank
selectionstochunif
...
A fantastic algorithm. It selects features effectively and efficiently without trying all combinations randomly, and it has been used in many of my applications.
Dear All
This might not be coded in the optimal way, but I mainly posted the code so that other researchers can compare the performance of their methods against it. This was requested many times.
If you have any suggestions or comments, I am more than happy to communicate.
Kind Regards
Dr. Rami Khushaba