Refit neighborhood component analysis (NCA) model for regression
mdlrefit = refit(mdl,Name,Value)
mdl — Neighborhood component analysis model for regression
Neighborhood component analysis model for regression, specified as a FeatureSelectionNCARegression object.
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name in quotes.
FitMethod — Method for fitting the model
mdl.FitMethod (default) | 'exact' | 'none' | 'average'
Method for fitting the model, specified as the comma-separated pair consisting of 'FitMethod' and one of the following.
'exact' - Performs fitting using all of the data.
'none' - No fitting. Use this option to evaluate the generalization error of the NCA model using the initial feature weights supplied in the call to fsrnca.
'average' - The function divides the data into partitions (subsets), fits each partition using the exact method, and returns the average of the feature weights. You can specify the number of partitions using the NumPartitions name-value pair argument.
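As a minimal sketch of the three options, the following code creates a model without fitting and then refits it with each method. The synthetic data and variable names are assumptions made only for illustration, and the code requires Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data (illustrative assumption): only features 1 and 2 drive y.
rng(0);                                   % reproducible random numbers
X = randn(200,5);
y = X(:,1) + 2*X(:,2) + 0.1*randn(200,1);

% 'none': create the model but keep the initial feature weights.
mdl0 = fsrnca(X,y,'FitMethod','none');

% 'exact': refit using all of the data at once.
mdlExact = refit(mdl0,'FitMethod','exact');

% 'average': refit on partitions of the data and average the weights.
mdlAvg = refit(mdl0,'FitMethod','average');

% Show the three weight vectors side by side (one column per method).
disp([mdl0.FeatureWeights, mdlExact.FeatureWeights, mdlAvg.FeatureWeights])
```

Comparing the columns gives a quick feel for how much the fitted weights move away from the initial ones under each method.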
mdlrefit — Neighborhood component analysis model for regression
Neighborhood component analysis model for regression, returned as a FeatureSelectionNCARegression object. You can either save the results as a new model or update the existing model as mdl = refit(mdl,Name,Value).
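The two ways of handling the output can be sketched as follows. The data here are synthetic and purely illustrative (an assumption, not part of this page); only the assignment pattern matters.

```matlab
% Synthetic data (illustrative assumption).
rng(0);
X = randn(150,4);
y = 3*X(:,2) + 0.1*randn(150,1);
mdl = fsrnca(X,y,'FitMethod','none');

% Option 1: keep the original model and store the refit result separately.
mdlrefit = refit(mdl,'FitMethod','exact');

% Option 2: overwrite the existing variable with the refit model.
mdl = refit(mdl,'FitMethod','exact');
```

Keeping a separate variable is convenient when you want to compare losses before and after refitting; overwriting saves memory when the original model is no longer needed.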
Refit NCA Model for Regression with Modified Settings
Load the sample data.
The robotarm (pumadyn32nm) data set is created using a robot arm simulator, with 7168 training and 1024 test observations and 32 features [1]. This is a preprocessed version of the original data set. The data are preprocessed by subtracting off a linear regression fit and then normalizing all features to unit variance.
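The load command itself does not appear in this text. A plausible sketch, assuming the data ship as a MAT-file named robotarm.mat on the MATLAB path and that it defines the variables Xtrain, ytrain, Xtest, and ytest used in the calls that follow:

```matlab
% Assumption: robotarm.mat is on the MATLAB path and defines
% Xtrain, ytrain, Xtest, and ytest.
load('robotarm.mat')

% Sanity check against the description above (7168 training and
% 1024 test observations with 32 features).
size(Xtrain)
size(Xtest)
```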
Compute the generalization error without feature selection.
nca = fsrnca(Xtrain,ytrain,'FitMethod','none','Standardize',1);
L = loss(nca,Xtest,ytest)
L = 0.9017
Now, refit the model with λ = 0 (no regularization term) and compute the prediction loss with feature selection. Compare it to the previous loss value to determine whether feature selection seems necessary for this problem. For the settings that you do not change, refit uses the settings of the initial model nca. For example, it uses the feature weights found in nca as the initial feature weights.
nca2 = refit(nca,'FitMethod','exact','Lambda',0);
L2 = loss(nca2,Xtest,ytest)
L2 = 0.1088
The decrease in the loss suggests that feature selection is necessary.
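To see how unchanged settings carry over, the sketch below refits a model while changing only the solver and then prints a few properties of both models. The synthetic data and the Lambda value of 0.01 are arbitrary assumptions for illustration.

```matlab
% Synthetic data (illustrative assumption).
rng(0);
X = randn(200,5);
y = X(:,1) + 2*X(:,2) + 0.1*randn(200,1);

% Initial model with an explicit, non-default Lambda (arbitrary value).
mdl1 = fsrnca(X,y,'FitMethod','exact','Lambda',0.01,'Solver','lbfgs');

% Change only the solver; Lambda and FitMethod are not specified here.
mdl2 = refit(mdl1,'Solver','sgd');

% Print the settings before and after the refit.
fprintf('Lambda:    %g -> %g\n', mdl1.Lambda, mdl2.Lambda);
fprintf('FitMethod: %s -> %s\n', mdl1.FitMethod, mdl2.FitMethod);
fprintf('Solver:    %s -> %s\n', mdl1.Solver, mdl2.Solver);
```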
Plot the feature weights.
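The plotting commands are not shown in this text. One way to draw such a plot is sketched below; it builds a small stand-in model from synthetic data (an assumption) so that it runs on its own. In the walkthrough, you would plot the FeatureWeights property of the refit model instead.

```matlab
% Stand-in model on synthetic data (illustrative assumption).
rng(0);
X = randn(200,8);
y = X(:,1) + 2*X(:,4) + 0.1*randn(200,1);
mdl = fsrnca(X,y,'FitMethod','exact','Lambda',0.01);

% One marker per feature; larger weights indicate more relevant features.
figure
plot(mdl.FeatureWeights,'ro')
xlabel('Feature index')
ylabel('Feature weight')
grid on
```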
Tuning the regularization parameter usually improves the results. Suppose that, after tuning using cross-validation as in Tune Regularization Parameter in NCA for Regression, the best λ value found is 0.0035. Refit the nca model using this value and stochastic gradient descent as the solver. Compute the prediction loss.
nca3 = refit(nca2,'FitMethod','exact','Lambda',0.0035,...
    'Solver','sgd');
L3 = loss(nca3,Xtest,ytest)
L3 = 0.0573
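The cross-validation step mentioned above is not shown on this page. A rough sketch of how such tuning might look with refit follows. The synthetic data, the candidate grid lambdas, and the choice of five folds are all assumptions for illustration, not the procedure that produced 0.0035.

```matlab
% Synthetic data (illustrative assumption).
rng(1);
n = 300;
X = randn(n,6);
y = X(:,1) - 2*X(:,3) + 0.1*randn(n,1);

lambdas = [0 0.001 0.01 0.1];        % hypothetical candidate values
k = 5;                               % number of folds (assumption)
cvp = cvpartition(n,'KFold',k);
cvLoss = zeros(numel(lambdas),k);

for j = 1:k
    Xtr = X(training(cvp,j),:);  ytr = y(training(cvp,j));
    Xte = X(test(cvp,j),:);      yte = y(test(cvp,j));

    % Create the model once per fold without fitting, then refit per lambda.
    base = fsrnca(Xtr,ytr,'FitMethod','none');
    for i = 1:numel(lambdas)
        mdl = refit(base,'FitMethod','exact','Lambda',lambdas(i));
        cvLoss(i,j) = loss(mdl,Xte,yte);   % held-out loss for this fold
    end
end

% Pick the lambda with the smallest mean held-out loss across folds.
[~,best] = min(mean(cvLoss,2));
bestLambda = lambdas(best)
```

Creating the model once per fold and calling refit inside the inner loop avoids repeating the model setup for every candidate value.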
Plot the feature weights.
After tuning the regularization parameter, the loss decreased even more and the software identified four of the features as relevant.
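Relevant features can also be listed programmatically by thresholding the feature weights rather than reading them off a plot. The sketch below uses a stand-in model on synthetic data, and the relative threshold tol is a hypothetical choice; both are assumptions for illustration.

```matlab
% Stand-in model on synthetic data (illustrative assumption).
rng(0);
X = randn(300,8);
y = X(:,1) + 2*X(:,4) + 0.1*randn(300,1);
mdl = fsrnca(X,y,'FitMethod','exact','Lambda',0.01);

tol = 0.02;                          % hypothetical relative threshold
w = mdl.FeatureWeights;

% Indices of features whose weight exceeds the threshold.
selected = find(w > tol*max(1,max(w)))
```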
[1] Rasmussen, C. E., R. M. Neal, G. E. Hinton, D. van Camp, M. Revow, Z. Ghahramani, R. Kustra, and R. Tibshirani. The DELVE Manual, 1996, https://mlg.eng.cam.ac.uk/pub/pdf/RasNeaHinetal96.pdf