version 1.4.0.0 (7.67 KB) by Rami Khushaba

http://dx.doi.org/10.1016/j.eswa.2011.03.028

One of the fundamental motivations for feature selection is to overcome the curse of dimensionality. This code presents a novel feature selection method combining the differential evolution (DE) optimization method with a proposed repair mechanism based on feature distribution measures. For more information, please refer to http://dx.doi.org/10.1016/j.eswa.2011.03.028
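A typical call, as used repeatedly in the comments below (features in columns, class label in the last column; odd rows train, even rows test). The parameter meanings are taken from the calling format quoted in the comments; the meaning of Ld is not spelled out in this thread:

```matlab
% Calling convention (from the comments below):
%   [Err,Subset] = DEFS(data_tr, data_ts, DNF, PSIZE, Ld, classif, GEN)
load iris.dat                            % 150 samples x (4 features + class label)
[Err,Subset] = DEFS(iris(1:2:end,:), ... % odd rows: training set
                    iris(2:2:end,:), ... % even rows: testing set
                    3,    ...            % DNF: number of features to select
                    50,   ...            % PSIZE: population size
                    0,    ...            % Ld (see comments inside DEFS.m)
                    'NB', ...            % classifier name ('NB', 'KNN', 'LDA')
                    100);                % GEN: number of generations
```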

Rami Khushaba (2021). Differential Evolution Based Channel and Feature Selection (https://www.mathworks.com/matlabcentral/fileexchange/30877-differential-evolution-based-channel-and-feature-selection), MATLAB Central File Exchange. Retrieved .

Created with R2009a

Compatible with any release


Kannadasan K

Tohfa Haque: In your code's calling format, [Err,Subset] = DEFS(data_tr,data_ts,DNF,PSIZE,Ld,classif,GEN), I did not get what you are referring to by PSIZE (population size). It would help me to use it further.

SATISH KUMAR: When I run the example code I get the following error:

[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,0,100)

Iter: 1 Acc: 0.0000 Subset Selected: 2 3 4

Undefined function or variable 'f'.

Error in DEFS (line 232)

if f <= Fit(j)

NAYANA BR: I used the code; it worked and helped me greatly in identifying the best feature subset. Thanks.

clifford maswanganyi: @Dr Rami

Thank you for the response, Dr Rami.

From my extracted features I want to classify four classes (left hand, right hand, foot and tongue), but I didn't put these responses because I am also not sure which class label must go to which row in the last column.

Can you please advise or assist on how to add these responses to my extracted features.

Rami Khushaba: @clifford

Did you put your class label as the last feature (column) in the FeatureEX matrix? Please read the comments inside the code first.

clifford maswanganyi: Dear Dr Rami,

I am new to EEG. I currently have a 672528x22 dataset from a publicly available database (BCI competition), and I used the code from your previous post, Feature Extraction Using Multisignal Wavelet Packet Decomposition (getmswpfeat.m and mswpd.m), to extract features. Below is how I called the function:

features=getmswpfeat(x',51,32,5,'matlab');

The extracted features I am getting are 21015x1386, but when I take these features and use them in the function below:

[Err,Subset] = DEFS(FeatureEx(1:2:end,1:end),FeatureEx(2:2:end,:),3,50,0,'KNN',100);

I get the following error message:

Error using internal.stats.parseArgs (line 42)

Parameter name must be text.

Error in classreg.learning.paramoptim.parseOptimizationArgs (line 5)

[OptimizeHyperparameters,~,~,RemainingArgs] = internal.stats.parseArgs(...

Error in fitcknn (line 257)

[IsOptimizing, RemainingArgs] = classreg.learning.paramoptim.parseOptimizationArgs(varargin);

Error in DEFS (line 95)

Ac1 = fitcknn(data_ts(:,val),data_tr(:,val),data_tr(:,end),3);

Please kindly assist Dr Rami.

Rami Khushaba: @AliSameer

The second-to-last input is the classifier name, not a number; please read the comments inside the code. So you should run it like below:

[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'LDA',100)

Rami Khushaba: @Abhishek

The iris example has only 4 features, and you ran it correctly. Obviously, with the NB classifier, the subset made of features 1, 3, and 4 is no different from the subset of 1, 2, and 4 in terms of classification accuracy.

Ali sameer: Hello sir,

The code worked well for a few minutes and then I faced some errors.

The errors were:

Out of memory. The likely cause is an infinite recursion within the program.

Error in DEFS (line 36)

[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,0,100)

Abhishek Dixit: When I run this algorithm it gives the outcome below. Is this output correct, or am I missing something?

>> [Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'NB',50)

Iter: 1 Acc: 4.0000 Subset Selected: 1 2 4

Iter: 2 Err: 4.0000 Subset Selected: 2 3 4

Iter: 3 Err: 4.0000 Subset Selected: 2 3 4

Iter: 4 Err: 4.0000 Subset Selected: 1 3 4

Iter: 5 Err: 4.0000 Subset Selected: 2 3 4

Iter: 6 Err: 4.0000 Subset Selected: 1 2 4

Iter: 7 Err: 4.0000 Subset Selected: 1 2 4

Iter: 8 Err: 4.0000 Subset Selected: 2 3 4

Iter: 9 Err: 4.0000 Subset Selected: 1 3 4

Iter: 10 Err: 4.0000 Subset Selected: 2 3 4

Iter: 11 Err: 4.0000 Subset Selected: 2 3 4

Iter: 12 Err: 4.0000 Subset Selected: 1 2 4

Iter: 13 Err: 4.0000 Subset Selected: 1 3 4

Iter: 14 Err: 4.0000 Subset Selected: 1 3 4

Iter: 15 Err: 4.0000 Subset Selected: 1 2 4

Iter: 16 Err: 4.0000 Subset Selected: 1 2 4

Iter: 17 Err: 4.0000 Subset Selected: 1 2 4

Iter: 18 Err: 4.0000 Subset Selected: 1 2 4

Iter: 19 Err: 4.0000 Subset Selected: 2 3 4

Iter: 20 Err: 4.0000 Subset Selected: 1 3 4

Iter: 21 Err: 4.0000 Subset Selected: 1 3 4

Iter: 22 Err: 4.0000 Subset Selected: 1 3 4

Iter: 23 Err: 4.0000 Subset Selected: 1 2 4

Iter: 24 Err: 4.0000 Subset Selected: 1 2 4

Iter: 25 Err: 4.0000 Subset Selected: 1 2 4

Iter: 26 Err: 4.0000 Subset Selected: 1 3 4

Iter: 27 Err: 4.0000 Subset Selected: 2 3 4

Iter: 28 Err: 4.0000 Subset Selected: 1 2 4

Iter: 29 Err: 4.0000 Subset Selected: 1 2 4

Iter: 30 Err: 4.0000 Subset Selected: 2 3 4

Iter: 31 Err: 4.0000 Subset Selected: 1 2 4

Iter: 32 Err: 4.0000 Subset Selected: 1 2 4

Iter: 33 Err: 4.0000 Subset Selected: 1 2 4

Iter: 34 Err: 4.0000 Subset Selected: 1 2 4

Iter: 35 Err: 4.0000 Subset Selected: 1 2 4

Iter: 36 Err: 4.0000 Subset Selected: 1 2 4

Iter: 37 Err: 4.0000 Subset Selected: 1 2 4

Iter: 38 Err: 4.0000 Subset Selected: 1 2 4

Iter: 39 Err: 4.0000 Subset Selected: 1 2 4

Iter: 40 Err: 4.0000 Subset Selected: 2 3 4

Iter: 41 Err: 4.0000 Subset Selected: 1 3 4

Iter: 42 Err: 4.0000 Subset Selected: 1 3 4

Iter: 43 Err: 4.0000 Subset Selected: 1 2 4

Iter: 44 Err: 4.0000 Subset Selected: 1 2 4

Iter: 45 Err: 4.0000 Subset Selected: 1 2 4

Iter: 46 Err: 4.0000 Subset Selected: 1 2 4

Iter: 47 Err: 4.0000 Subset Selected: 2 3 4

Iter: 48 Err: 4.0000 Subset Selected: 2 3 4

Iter: 49 Err: 4.0000 Subset Selected: 1 3 4

Iter: 50 Err: 4.0000 Subset Selected: 1 2 4

Err =

Columns 1 through 15

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

Columns 16 through 30

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

Columns 31 through 45

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

Columns 46 through 50

4 4 4 4 4

Subset =

1 2 4

Abhishek Dixit: Is this output correct or am I missing something?

mesut seker: Has anybody tried iris.m for DEFS_chs? I wondered, how can we do channel selection?

I used an Emotiv EPOC headset, which has 14 channels. I selected 7 features for each channel. I have 4 classes and the number of trials is 10. For each class I have 140x8 data; in total I have 560x8 data. The last column is the class. Can anybody explain how I can pick the best 5 channels?
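One possible way to set this up for channel selection (an assumption on my part, not taken from the posted code): reorganize the data so each trial is one row with 14 x 7 = 98 feature columns plus the label; then channel k owns feature columns (k-1)*7+1 through k*7, and a channel subset expands to column indices like this:

```matlab
% Hypothetical mapping: 14 channels x 7 features each; channel k owns
% feature columns (k-1)*7+1 : k*7. Uses implicit expansion (R2016b+).
chans = [1 3 5 8 12];                          % an example channel subset
cols  = reshape((1:7)' + (chans-1)*7, 1, []); % feature columns for those channels
```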

Vijoy Raj: Thank you sir for your quick response. I have organized the '9_Tumors' data by shifting the class label (i.e., the 1st column) to the last column, and it works fine when I call DEFS with 'KNN'. But when I call DEFS with 'NB', an error occurs:

Error using NaiveBayes.fit>gaussianFit (line 371)

The within-class variance in each feature of TRAINING must be positive. The within-class variance in feature 17 in class 7 are

not positive.

Error in NaiveBayes.fit (line 335)

obj = gaussianFit(obj, training, gindex);

Error in DEFS (line 115)

O1 = NaiveBayes.fit(data_tr(:,val),data_tr(:,end));%,'dist',F);

Can you please explain, sir, why this problem occurs with Naive Bayes, and give some instructions to solve it?
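The NaiveBayes.fit error above means some feature is constant within one class. One possible workaround (my suggestion, not from the thread; `data` is the iris-style matrix with the label in the last column) is to drop such features before calling DEFS:

```matlab
% Drop features whose within-class variance is zero for any class,
% since Gaussian naive Bayes cannot fit them.
labels = data(:, end);
keep   = true(1, size(data, 2) - 1);
for c = unique(labels)'                           % loop over class labels
    v    = var(data(labels == c, 1:end-1), 0, 1); % within-class variances
    keep = keep & (v > 0);                        % keep only positive ones
end
data = data(:, [keep true]);                      % kept features + label column
```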

Vijoy Raj: Can you please explain the reason for and solution to the problem, sir?

with regards

vijoy

Rami Khushaba: Vijoy, thanks for your inquiry.

Did you actually run the iris examples? If so, you will find that the data is organized with features in columns and the class label in the last column. Comparing with 9_Tumors or Brain_Tumor2, I believe those datasets have the class label in the first column, not the last!

There is a note in the code saying "with NP1 patterns x NF+1 features with last column being the training class label"

If you didn't get it, just reorganize your data in the same format as iris and that's it.
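As a concrete illustration of that reorganization (the variable names are assumed; adjust them to however your .mat file actually loads):

```matlab
% Hypothetical example: a 9_Tumors-style matrix with the class label in the
% FIRST column; DEFS expects it in the LAST column.
S    = load('9_Tumors.mat');        % assumed to contain one data matrix
fn   = fieldnames(S);
raw  = S.(fn{1});                   % the matrix, whatever its variable name
data = [raw(:, 2:end), raw(:, 1)];  % move the class label column to the end
[Err,Subset] = DEFS(data(1:2:end,:), data(2:2:end,:), 30, 100, 0, 'NB', 150);
```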

Vijoy Raj: Thank you sir, it's great work.

When I call the function DEFS on the iris dataset with:

iris = xlsread('iris_file_edited.xlsx');

[Err,Subset] = DEFS(iris(1:2:end,1:end),iris(2:2:end,:),3,50,0,'NB',150)

the code shows some output. But when I call it with a large dataset (like 9_Tumors or Brain_Tumor2) using

tumar = xlsread('9_Tumors.mat');

[Err,Subset] = DEFS(tumar(1:2:end,1:end),tumar(2:2:end,:),30,100,0,'NB',150)

it shows the error:

Index exceeds matrix dimensions.

Error in DEFS (line 83)

Pop(:,j) = FF(1:D)'; % within b.constraints

So how can I apply this code to large datasets like Colon, Lymphoma, NCI, 9_Tumors, etc., and apply the naive Bayes classifier?

Thank you

Rami Khushaba: Bilas,

Obviously you need to learn to code with Matlab.

The code I provided for DEFS already has an option to classify with the NaiveBayes classifier, and I previously mentioned how you can format your matrices as inputs; I also provided an example inside the code!

Please read inside the code and learn how to program in matlab, it's very easy.

Regarding your algorithm description: you talk about binary DE, and the one I posted here is not binary, as at the time I developed DEFS there were not many DE-based feature selection methods available. However, your algorithm description is similar to DEFS, though I didn't get why you needed the probabilities.

Learn Matlab and read the code before asking for what's already there.

Thanks

bilas talukdar: Here is the algorithm that I want to implement. Can you please help me out by giving some suggestions?

DIFFERENTIAL EVOLUTION-NAIVE BAYES (DE-NB):

DE-NB selects an attribute subset from the whole space of attributes by carrying out a differential evolution search process. It uses naive Bayes' classification accuracy as the fitness function to evaluate alternative subsets of attributes, and selects the individual with the maximum classification accuracy after a fixed number of generations.

The outline of the proposed algorithm is as follows:

Step 1. Initialize population X with binary coding; each individual is composed of a string of feature selection bits.

Step 2. Remove the attributes that are not selected from the training samples, according to each individual's feature selection bits, to get the training data T.

Step 3. Calculate the prior probability P(yi) of each class in the training data.

Step 4. Calculate the conditional probability P(x|yi) for each attribute.

Step 5. Calculate P(yi)*P(x|yi) for each class.

Step 6. Select the class with the maximum P(yi)*P(x|yi) as the class x belongs to.

Step 7. Calculate the classification accuracy over the entire sample as BestAccuracy, and record its corresponding feature selection Bestf.

Step 8. Determine whether the current accuracy or the number of iterations has reached the stopping condition; if so, go to Step 13, otherwise proceed with the next generation's iterative process.

Step 9. Choose individuals from population X and execute differential mutation on each individual in population X to generate population Y.

Step 10. Execute crossover between individuals in population X and individuals in population Y to generate population Z.

Step 11. Repeat Steps 2 to 7 to get the current best accuracy BestAccuracy_temp and its corresponding feature subset f_temp. If BestAccuracy_temp > BestAccuracy, then BestAccuracy = BestAccuracy_temp and Bestf = f_temp.

Step 12. For the next generation's population X, choose between each individual in population X and the corresponding individual in population Z by one-to-one selection according to classification accuracy.

Step 13. Repeat Steps 2 to 7 on the training data restricted to the Bestf attributes to get the final classification accuracy.
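The steps above can be sketched in Matlab roughly as below. This is an illustrative sketch of DE-NB, not the posted DEFS code; all function and variable names are made up, and it assumes fitcnb and resubLoss from the Statistics and Machine Learning Toolbox:

```matlab
function [bestF, bestAcc] = denb_sketch(X, y, NP, GEN, CR)
% Rough sketch of the DE-NB outline. X: samples x features, y: numeric
% class labels, NP: population size, GEN: generations, CR: crossover rate.
D   = size(X, 2);
Pop = rand(NP, D) > 0.5;                          % Step 1: binary bit strings
Fit = zeros(1, NP);
for i = 1:NP, Fit(i) = nbAcc(X, y, Pop(i,:)); end % Steps 2-7: initial fitness
for g = 1:GEN                                     % Step 8: generation loop
    for i = 1:NP
        r   = randperm(NP, 3);                    % Step 9: pick donor individuals
        mut = xor(Pop(r(1),:), xor(Pop(r(2),:), Pop(r(3),:))); % binary "mutation"
        cr  = rand(1, D) < CR;                    % Step 10: crossover mask
        trial = Pop(i,:); trial(cr) = mut(cr);
        if ~any(trial), trial(randi(D)) = true; end % keep at least one feature
        acc = nbAcc(X, y, trial);                 % Step 11: evaluate the trial
        if acc >= Fit(i)                          % Step 12: one-to-one selection
            Pop(i,:) = trial; Fit(i) = acc;
        end
    end
end
[bestAcc, idx] = max(Fit);                        % Step 13: report the best
bestF = find(Pop(idx,:));
end

function acc = nbAcc(X, y, bits)                  % Steps 2-7: NB fitness
mdl = fitcnb(X(:, bits), y);                      % Gaussian naive Bayes fit
acc = 1 - resubLoss(mdl);                         % training-set accuracy
end
```

Note that DEFS itself is real-coded with a repair mechanism rather than binary-coded, so this sketch matches the DE-NB description above, not DEFS.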

bilas talukdar: Sir, if I want to use differential evolution for selecting an attribute subset from the whole set of attributes, use the naive Bayes classifier [P(yi)*P(x|yi)] as the objective function, and use classification accuracy as the fitness function to improve the classification accuracy, what should the format of the code be, and how can your code help me?

Regards

Bilas

Rami Khushaba: Hello,

Please read the example provided in the code.

If you have done that and still not sure, then please note that the rows of the feature matrices are samples and columns are features to be selected. On the other hand, the class label has values of 1,2,3,... which represent the different classes, i.e., each class is given a number.

Hope that helps.

Rami

bilas talukdar: Sir, I'm an undergraduate student working on a topic related to this one. Please help me by providing the dataset format that you used in your Matlab code, as I couldn't run this code properly. My email address: sagarbilas@yahoo.com

sahar Abboud: Very good.

Sepp: I have another question. I think you just use default parameters for the classifiers. But how and when should I train (select) the best parameters for the classifier? Should I first run your algorithm with default parameters and then do parameter selection on the selected features (e.g. via cross-validation)?

Sepp: Thank you very much for your great work. Can it also be used for fMRI data, i.e. where each feature corresponds to, e.g., a voxel? And if yes, what should I choose for the population size and number of patterns (I don't know what they mean)? Is the number of patterns just the number of data points?

Rami Khushaba: Dear Mayank,

As the title of this post suggests, it is a "feature selection" technique, not really "feature extraction". Feature extraction includes feature construction and maybe selection. In this post only feature selection is done, i.e., it is assumed that you have already extracted the features of interest.

Mayank T: Respected sir,

For my project based on diagnosis of Alzheimer's, can I use this code for feature extraction and training my dataset? Please reply ASAP.

Shahab

Rami Khushaba: Dear subbmdee,

I am really sorry for the late reply, I hadn't seen your comment till today.

Please use LIBSVM (multiclass SVM) as it is very fast and easy to include in DEFS.

LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

LIBLINEAR: http://www.csie.ntu.edu.tw/~cjlin/liblinear/

Kind Regards

Rami

subbmdee: Dear Dr Rami,

I would like to use MultiSVM as a classifier, but it seems to take a lot of time.

Any input in that regard?

Qasem

Rami Khushaba: Hi Krishna,

The iris dataset is made of 4 features only, so there is really no point in using feature selection on it. However, most probably you set the parameters in a wrong way. See the example below and let me know if it is still not working for you:

>> load iris.dat

>> [Err,Subset] = DEFS(iris(1:2:end,:),iris(2:2:end,:),3,50,0,'NB',100)

krishna: Dear Dr Rami,

I downloaded your code and tried to run it on the iris data.

I have installed the Optimization Toolbox, but I am getting the error message "Undefined function or variable 'f'" at line 232 in DEFS.m.

Please help me run this code on a sample dataset.

Regards

krishna

Rami Khushaba: Kindly note that if the LDA classifier option did not work on your own dataset (for which you usually get the error: pooled covariance matrix ....), this doesn't mean there is a problem with the algorithm, but with the LDA classifier itself. When there is a high degree of correlation among the features, LDA usually fails, giving the above error message. In such a case, and if you have to use LDA, you can get an alternative implementation of LDA, like the one available in the MatlabArsenal toolbox.
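In newer Matlab releases, a regularized discriminant is another way around this failure (illustrative only; Xtrain and ytrain are placeholder names, and fitcdiscr comes from the Statistics and Machine Learning Toolbox, not from DEFS):

```matlab
% Pseudo-linear LDA tolerates a singular pooled covariance matrix.
mdl = fitcdiscr(Xtrain, ytrain, 'DiscrimType', 'pseudoLinear');

% Alternatively, shrink the covariance estimate toward the diagonal.
mdl = fitcdiscr(Xtrain, ytrain, 'Gamma', 0.1);
```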

Enjoy your research.

Rami Khushaba: You're welcome, Ali.

Ali: Thanks a lot for sharing your code.

Rami Khushaba: Hello there,

These are part of the Optimization Toolbox. We are not allowed to share original Matlab files. Please get a copy of the Optimization Toolbox from MathWorks. I will write my own modified version of these soon and will update you.

Dr. Rami Khushaba

Wolfgang Schellenberger: Some functions are missing:

fitscalingrank

selectionstochunif

...

Leo: A fantastic algorithm. It selects features effectively and efficiently without randomly trying all combinations. It has been used in many of my applications.

Rami Khushaba: Dear All,

This might not be coded in the optimal way, but I mainly put the code so that other researchers can compare the performance of their code with this method. This was requested many times by others.

If you have any suggestions or comments, I am more than happy to communicate.

Kind Regards

Dr. Rami Khushaba