sequentialfs
Sequential feature selection using custom criterion
Syntax
inmodel = sequentialfs(fun,X,y)
inmodel = sequentialfs(fun,X,Y,Z,...)
[inmodel,history] = sequentialfs(fun,X,...)
[] = sequentialfs(...,param1,val1,param2,val2,...)
Description
inmodel = sequentialfs(fun,X,y) selects a subset of features from the data matrix X that best predict the data in y by sequentially selecting features until there is no improvement in prediction. Rows of X correspond to observations; columns correspond to variables or features. y is a column vector of response values or class labels for each observation in X. X and y must have the same number of rows. fun is a function handle to a function that defines the criterion used to select features and to determine when to stop. The output inmodel is a logical vector indicating which features are finally chosen.
Starting from an empty feature set, sequentialfs creates candidate feature subsets by sequentially adding each of the features not yet selected. For each candidate feature subset, sequentialfs performs 10-fold cross-validation by repeatedly calling fun with different training subsets of X and y, XTRAIN and ytrain, and test subsets of X and y, XTEST and ytest, as follows:
criterion = fun(XTRAIN,ytrain,XTEST,ytest)
XTRAIN and ytrain contain the same subset of rows of X and y, while XTEST and ytest contain the complementary subset of rows. XTRAIN and XTEST contain the data taken from the columns of X that correspond to the current candidate feature set.
Each time it is called, fun must return a scalar value criterion. Typically, fun uses XTRAIN and ytrain to train or fit a model, then predicts values for XTEST using that model, and finally returns some measure of distance, or loss, of those predicted values from ytest. In the cross-validation calculation for a given candidate feature set, sequentialfs sums the values returned by fun and divides that sum by the total number of test observations. It then uses that mean value to evaluate each candidate feature subset.
Typical loss measures include the sum of squared errors for regression models (sequentialfs computes the mean squared error in this case), and the number of misclassified observations for classification models (sequentialfs computes the misclassification rate in this case).
Note
sequentialfs divides the sum of the values returned by fun across all test sets by the total number of test observations. Accordingly, fun should not divide its output value by the number of test observations.
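As a concrete illustration of this rule, the following sketch (written in Python rather than MATLAB, with the name misclass_count and its toy 1-nearest-neighbor model invented for the example) shows a classification criterion in the shape fun(XTRAIN,ytrain,XTEST,ytest) that returns a misclassification count rather than a rate:

```python
# Hypothetical criterion in the shape fun(XTRAIN, ytrain, XTEST, ytest).
# Per the Note above: return the COUNT of misclassified test observations
# (a sum), not the rate; sequentialfs itself divides the summed criterion
# by the total number of test observations.
def misclass_count(XTRAIN, ytrain, XTEST, ytest):
    # Toy 1-nearest-neighbor classifier, for illustration only.
    def predict(row):
        dists = [(sum((a - b) ** 2 for a, b in zip(row, train_row)), label)
                 for train_row, label in zip(XTRAIN, ytrain)]
        return min(dists)[1]  # label of the closest training row
    return sum(predict(row) != label for row, label in zip(XTEST, ytest))
```

If the criterion instead returned the misclassification rate, the final division by the total number of test observations would be applied twice, biasing the criterion toward folds with more test rows.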
After computing the mean criterion values for each candidate feature subset, sequentialfs chooses the candidate feature subset that minimizes the mean criterion value. This process continues until adding more features does not decrease the criterion.
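The search described above can be sketched in a few lines. The following is a Python illustration of the algorithm, not MATLAB and not the toolbox implementation; the names forward_select, cv_criterion, and make_folds are invented for this sketch, and the fold splitting is deliberately naive:

```python
def make_folds(n, k):
    """Split row indices 0..n-1 into k roughly equal test folds (naive)."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def cv_criterion(fun, X, y, cols, k=10):
    """Mean criterion over k folds: sum fun's return values, then divide
    by the total number of test observations (as the Note prescribes)."""
    n = len(X)
    total, n_test = 0.0, 0
    for test in make_folds(n, k):
        test_set = set(test)
        train = [i for i in range(n) if i not in test_set]
        sub = lambda rows: [[X[i][j] for j in cols] for i in rows]
        total += fun(sub(train), [y[i] for i in train],
                     sub(test), [y[i] for i in test])
        n_test += len(test)
    return total / n_test

def forward_select(fun, X, y, k=10):
    """Grow the feature set greedily; stop when no remaining candidate
    lowers the mean criterion. Returns a logical mask like inmodel."""
    p = len(X[0])
    chosen, best = [], float('inf')
    while True:
        candidates = [(cv_criterion(fun, X, y, chosen + [j], k), j)
                      for j in range(p) if j not in chosen]
        if not candidates:
            break
        crit, j = min(candidates)
        if crit >= best:
            break  # adding more features no longer decreases the criterion
        chosen.append(j)
        best = crit
    return [j in chosen for j in range(p)]
```

For example, with a sum-of-squared-errors criterion and a response equal to one column of X, the sketch selects exactly that column and stops.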
inmodel = sequentialfs(fun,X,Y,Z,...) allows any number of input variables X, Y, Z, ... . sequentialfs chooses features (columns) only from X, but otherwise imposes no interpretation on X, Y, Z, ... . All data inputs, whether column vectors or matrices, must have the same number of rows. sequentialfs calls fun with training and test subsets of X, Y, Z, ... as follows:
criterion = fun(XTRAIN,YTRAIN,ZTRAIN,...,XTEST,YTEST,ZTEST,...)
sequentialfs creates XTRAIN, YTRAIN, ZTRAIN, ..., XTEST, YTEST, ZTEST, ... by selecting subsets of the rows of X, Y, Z, ... . fun must return a scalar value criterion, but may compute that value in any way. Elements of the logical vector inmodel correspond to columns of X and indicate which features are finally chosen.
[inmodel,history] = sequentialfs(fun,X,...) returns information on which feature is chosen at each step. history is a scalar structure with the following fields:
Crit — A vector containing the criterion values computed at each step.
In — A logical matrix in which row i indicates the features selected at step i.
[] = sequentialfs(...,param1,val1,param2,val2,...) specifies optional parameter name/value pairs from the following table.
Parameter  Value

'cv'  The validation method used to compute the criterion for each candidate feature subset. The value can be a cvpartition object, a positive integer k (k-fold cross-validation), 'resubstitution', or 'none'. The default value is 10, that is, 10-fold cross-validation. So-called wrapper methods use a function fun that implements a learning algorithm; these methods usually apply cross-validation. So-called filter methods use a measure such as correlation to select features; to compute the criterion on the full data without validation, set 'cv' to 'none'.

'mcreps'  A positive integer indicating the number of Monte-Carlo repetitions for cross-validation. The default value is 1.

'direction'  The direction of the sequential search, either 'forward' or 'backward'. The default is 'forward'.

'keepin'  A logical vector or a vector of column numbers specifying features that must be included. The default is empty.

'keepout'  A logical vector or a vector of column numbers specifying features that must be excluded. The default is empty.

'nfeatures'  The number of features at which sequentialfs should stop. The default is empty, meaning the search stops when a local minimum of the criterion is found.

'nullmodel'  A logical value, indicating whether or not the null model (containing no features from X) should be included in feature selection and in the history output. The default is false.

'options'  Options structure for the iterative sequential search algorithm, as created by statset. To compute in parallel, you need Parallel Computing Toolbox™.
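To illustrate the 'backward' direction, here is a Python sketch (not MATLAB): start from the full feature set and drop one feature at a time while doing so lowers the criterion. backward_select and its criterion(cols) argument are invented names; in sequentialfs itself the criterion is the cross-validated mean of fun described above.

```python
def backward_select(criterion, p):
    """Sketch of 'backward' search: criterion(cols) -> scalar for a list
    of column indices; p = total number of features. Returns a logical
    mask like inmodel."""
    chosen = list(range(p))        # start from the full feature set
    best = criterion(chosen)
    while len(chosen) > 1:
        trials = [(criterion([c for c in chosen if c != j]), j)
                  for j in chosen]
        crit, j = min(trials)
        if crit >= best:
            break  # removing any feature would raise the criterion
        chosen.remove(j)
        best = crit
    return [j in chosen for j in range(p)]
```

Forward search tends to be cheaper when few features are relevant; backward search can retain features that are only useful jointly, since it evaluates them together from the start.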
Examples
Perform sequential feature selection for classification of noisy features:
load fisheriris
rng('default') % For reproducibility
X = randn(150,10);
X(:,[1 3 5 7]) = meas;
y = species;
c = cvpartition(y,'k',10);
opts = statset('Display','iter');
fun = @(XT,yT,Xt,yt)loss(fitcecoc(XT,yT),Xt,yt);
[fs,history] = sequentialfs(fun,X,y,'cv',c,'options',opts)

Start forward sequential feature selection:
Initial columns included: none
Columns that can not be included: none
Step 1, added column 5, criterion value 0.00266667
Step 2, added column 7, criterion value 0.00222222
Step 3, added column 1, criterion value 0.00177778
Step 4, added column 3, criterion value 0.000888889
Final columns included: 1 3 5 7

fs =
  1×10 logical array
   1   0   1   0   1   0   1   0   0   0

history =
  struct with fields:
      In: [4×10 logical]
    Crit: [0.0027 0.0022 0.0018 8.8889e-04]

history.In

ans =
  4×10 logical array
   0   0   0   0   1   0   0   0   0   0
   0   0   0   0   1   0   1   0   0   0
   1   0   0   0   1   0   1   0   0   0
   1   0   1   0   1   0   1   0   0   0