Suppose you have a very large feature vector X, used to predict a vector of expected values y.
Is linear regression,
e.g.: coeff=regress(y, X);
followed by sequential feature reduction,
e.g. [coeff_subset] = sequentialfs(fun, X, y, 'direction', 'backward'); % where: fun = @(XT,yT,Xt,yt) rmse((regress(yT, XT)'*Xt')', yt);
the easiest/best approach to get a reasonably sized feature vector when no other information is known?
It seems that, from my testing, this method rarely captures the features that matter the most, and I obtained better results by randomly selecting some of the features.
If you prefer linear regression, use the function stepwisefit or its newer incarnation, LinearModel.stepwise. For example, for backward elimination with an intercept term you can do:
load carsmall
X = [Acceleration Cylinders Displacement Horsepower];
y = MPG;
% stepwisefit adds the intercept itself, so X carries no column of ones
stepwisefit(X,y,'inmodel',true(1,4))
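For the newer interface, something along these lines should work. This is only a sketch: it assumes a release of the Statistics and Machine Learning Toolbox that provides stepwiselm (the function form of LinearModel.stepwise), and the variable names passed to 'VarNames' are chosen here purely for readable output. Note that stepwiselm started from the full linear model can re-add a term it removed earlier, so it is backward stepwise rather than strict backward elimination.

```matlab
% Sketch: backward stepwise regression with stepwiselm on the carsmall data.
load carsmall
X = [Acceleration Cylinders Displacement Horsepower];
y = MPG;

% Start from the full linear model. 'Upper','linear' stops the search from
% adding interaction terms; 'Lower','constant' always keeps the intercept.
mdl = stepwiselm(X, y, 'linear', ...
    'Upper', 'linear', 'Lower', 'constant', ...
    'VarNames', {'Acceleration','Cylinders','Displacement','Horsepower','MPG'}, ...
    'Verbose', 0);

% Terms that remain in the final model (the intercept is listed too).
disp(mdl.CoefficientNames)
```

The returned object also exposes the fitted coefficients and their p-values through mdl.Coefficients, which makes it easier to see why a term was kept.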
In general, there is no "best" approach to feature selection. What you can do depends on what assumptions you are willing to make (such as a linear model), how many features you have, and how much effort you want to invest.
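If you want to stay with sequentialfs, it may help to check the criterion function first. As far as I know, sequentialfs sums the criterion values over the cross-validation folds and divides by the total number of test observations, so the function should return a sum of squared errors on the held-out fold rather than an RMSE. Below is a minimal sketch on synthetic data; the data, the sizes n and p, and the helper name sse are all made up for illustration, and an intercept column is added explicitly because regress does not add one.

```matlab
rng(0);                                   % reproducible synthetic data
n = 200; p = 20;
X = randn(n, p);
y = 3*X(:,2) - 2*X(:,7) + 0.5*randn(n, 1);   % only columns 2 and 7 matter

% Criterion: fit on the training fold (XT, yT), then return the sum of
% squared prediction errors on the test fold (Xt, yt).
sse = @(XT, yT, Xt, yt) sum((yt - [ones(size(Xt,1),1) Xt] * ...
        regress(yT, [ones(size(XT,1),1) XT])).^2);

cvp  = cvpartition(n, 'KFold', 5);        % 5-fold cross-validation
opts = statset('Display', 'off');
inmodel = sequentialfs(sse, X, y, 'cv', cvp, ...
        'direction', 'backward', 'options', opts);

disp(find(inmodel))                       % indices of the retained columns
```

Backward selection can still keep a few noise columns, so it is worth comparing the result against the 'forward' direction on the same partition.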