Why do different feature selection functions give such different results?

Hello all,
I am using three different filter-type feature selection functions for a preliminary assessment of my features: fscmrmr, fscchi2, and relieff. My problem is a multi-class classification (3 classes) with all continuous features. So, as explained in MATLAB's introduction to feature selection, all of these functions apply to my problem.
However, when I run the functions, I get very different results. I wasn't expecting identical results, but some features are ranked among the most important by one method and among the least important by another. Is this normal, given the underlying algorithms of these functions? If so, which one would you recommend? Or am I doing something wrong?
P.S.: I have normalized my features with z-score, and I am averaging the scores across the folds of my cross-validation.
Thanks in advance! :)

Accepted Answer

Aditya Patil on 15 Jul 2020
The selected features will differ because of both algorithmic differences and the nature of the data. To give a hypothetical example, suppose there are two features, foo and bar, where bar = 2 * foo + noise. The two features carry almost the same information, so one algorithm might select bar and drop foo, while another might select foo and rank bar as the least important.
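To see this on data, here is a minimal sketch of the foo/bar situation. The synthetic data, the extra junk feature, the noise levels, and the choice of k = 10 for relieff are all assumptions made for illustration, not values from the question. The labels are converted to categorical so that relieff treats the problem as classification rather than regression.

```matlab
% Hypothetical synthetic data: bar is a noisy copy of foo, junk is irrelevant.
rng(0);                                   % fix the seed for reproducibility
n    = 300;
y    = randi(3, n, 1);                    % three class labels: 1, 2, 3
foo  = y + 0.5*randn(n, 1);               % informative feature
bar  = 2*foo + 0.1*randn(n, 1);           % nearly redundant with foo
junk = randn(n, 1);                       % unrelated to the class
X    = zscore([foo, bar, junk]);          % columns: 1 = foo, 2 = bar, 3 = junk
Y    = categorical(y);                    % categorical labels => classification

[idxMRMR,   scoreMRMR] = fscmrmr(X, Y);   % relevance traded off against redundancy
[idxChi2,   scoreChi2] = fscchi2(X, Y);   % each feature tested on its own
[idxRelief, wRelief]   = relieff(X, Y, 10); % nearest-neighbor weights, k = 10

% Each row lists the feature indices from most to least important.
ranks = array2table([idxMRMR; idxChi2; idxRelief], ...
    'RowNames', {'fscmrmr', 'fscchi2', 'relieff'}, ...
    'VariableNames', {'Rank1', 'Rank2', 'Rank3'});
disp(ranks)
```

fscchi2 scores each feature independently, so it has no reason to separate foo from bar, whereas fscmrmr takes redundancy with already selected features into account. That difference in what each method measures is one way the rankings can diverge.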
You can decide which one to use by comparing the accuracy of the models trained on the features each method selects.
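One way to make that comparison is sketched below. It uses the fisheriris sample data (3 classes, continuous features) as a stand-in for the real data. The number of kept features k, the classifier (fitcecoc), and the 5-fold partition are arbitrary choices for illustration.

```matlab
load fisheriris                           % meas: 150-by-4 features, species: labels
X = zscore(meas);
Y = categorical(species);
k = 2;                                    % hypothetical number of features to keep

% First output of each function is the ranked list of feature indices.
rankings = struct( ...
    'fscmrmr', fscmrmr(X, Y), ...
    'fscchi2', fscchi2(X, Y), ...
    'relieff', relieff(X, Y, 10));

rng(0);                                   % reproducible folds
cvp   = cvpartition(Y, 'KFold', 5);       % same stratified folds for every method
names = fieldnames(rankings);
for i = 1:numel(names)
    topK = rankings.(names{i})(1:k);      % top-k features for this method
    mdl  = fitcecoc(X(:, topK), Y, 'CVPartition', cvp);
    acc  = 1 - kfoldLoss(mdl);            % kfoldLoss returns classification error
    fprintf('%s: features %s, CV accuracy %.3f\n', names{i}, mat2str(topK), acc);
end
```

For simplicity the features are ranked once on the full data set. For an unbiased accuracy estimate, the ranking should be repeated inside each training fold so that the test fold does not influence which features are chosen.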

Release

R2020a
