How are important variables identified in the Partial Least Squares Regression function PLSREGRESS?

I am using the PLSREGRESS function in one of my applications to identify important variables in my data sets.
For another program, I need to know how this function identifies important variables.

 Accepted Answer

In Partial Least Squares (PLS) regression, the projection finds the components that maximize the covariance between X and Y. For NCOMP components, PLSREGRESS first computes the covariance between X and Y, then decomposes that covariance, and finally uses the resulting matrices to project X and Y.
Let the singular value decomposition of the covariance result in
[U,S,V] = svd(cov)
where U is the matrix of left singular vectors and V is the matrix of right singular vectors. PLSREGRESS performs the following pseudocode iteratively:
for each of the NCOMP components
    project X onto the vector in U that corresponds to the largest singular value
    project Y onto the vector in V that corresponds to the largest singular value
    keep the resulting X and Y components, which maximize the covariance
end
There are some additional steps for centering and orthogonalization, but the core is the SIMPLS algorithm, which is cited in the References section of the PLSREGRESS documentation.
Note that the implementation of the “simpls” function can be found inside PLSREGRESS.m.
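The steps above can be sketched roughly as follows. This is a simplified, SIMPLS-style loop written for illustration, not a copy of the code in PLSREGRESS.m. The synthetic data and the variable names (T, P, Q, B) are assumptions made for this sketch, and it uses only base MATLAB functions.

```matlab
% Simplified SIMPLS-style sketch (illustrative; the shipped simpls function differs in detail).
rng(0);                                   % reproducible synthetic data (assumption)
n = 50; p = 6; ncomp = 3;
X = randn(n,p);
Y = X(:,1:2)*[2; -1] + 0.1*randn(n,1);

% Center X and Y
X0 = bsxfun(@minus,X,mean(X,1));
Y0 = bsxfun(@minus,Y,mean(Y,1));

Cov = X0'*Y0;                             % cross-product (unnormalized covariance) of X and Y
T = zeros(n,ncomp);                       % X scores
P = zeros(p,ncomp);                       % X loadings
Q = zeros(size(Y0,2),ncomp);              % Y loadings
B = zeros(p,ncomp);                       % orthonormal basis spanning the X loadings

for a = 1:ncomp
    % Singular vectors that belong to the largest singular value
    [U,S,V] = svd(Cov,'econ');
    r = U(:,1); c = V(:,1); s = S(1,1);

    % Project X onto the dominant left singular vector and normalize the score
    t = X0*r;
    normt = norm(t);
    t = t/normt;

    T(:,a) = t;
    P(:,a) = X0'*t;
    Q(:,a) = s*c/normt;

    % Orthogonalize the new loading against the earlier ones, then
    % deflate Cov so the next component is orthogonal to this one
    v = P(:,a);
    v = v - B(:,1:a-1)*(B(:,1:a-1)'*v);
    v = v/norm(v);
    B(:,a) = v;
    Cov = Cov - v*(v'*Cov);
end
```

In this sketch the columns of T come out mutually orthogonal with unit length, so T'*T should be close to the identity matrix, which is a quick sanity check on the deflation step.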
As for your other program, you might be looking for the calculation of the "Variable Importance in Projection" (VIP) scores, which estimate the importance of each variable. They can be easily obtained from the outputs of PLSREGRESS as this example illustrates:
% Load data on near infrared (NIR) spectral intensities of 60 samples of gasoline at 401 wavelengths, and their octane ratings.
load spectra
X = NIR;
Y = octane;
% Perform PLS regression with ten components.
NCOMP = 10;
[XL,YL,XS,YS,beta,pctvar,mse,stats] = plsregress(X,Y,NCOMP);
% Calculate normalized PLS weights
W0 = bsxfun(@rdivide,stats.W,sqrt(sum(stats.W.^2,1)));
% Calculate the product of summed squares of XS and YL
sumSq = sum(XS.^2,1).*sum(YL.^2,1);
% Calculate VIP scores for NCOMP components
vipScores = sqrt(size(XL,1) * sum(bsxfun(@times,sumSq,W0.^2),2) ./ sum(sumSq,2));
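Once the scores are available, one common way to use them is to flag predictors whose VIP score is at least 1. With the formula above, the squared VIP scores average to 1 across predictors, which is why 1 is a popular rule-of-thumb cutoff. The cutoff is a convention, not something PLSREGRESS enforces. The name vipThreshold and the stand-in values below are assumptions for this sketch.

```matlab
% Stand-in values so the snippet runs on its own (hypothetical, not real output).
% In practice, use the vipScores vector computed in the example above.
if ~exist('vipScores','var')
    vipScores = [0.4; 1.3; 0.9; 1.1];
end

vipThreshold = 1;                           % rule-of-thumb cutoff (assumption)
indVIP = find(vipScores >= vipThreshold);   % indices of candidate important predictors
[~,order] = sort(vipScores,'descend');      % predictors ranked from highest to lowest VIP
```

For the spectra example, indVIP would index into the 401 wavelengths in X, so X(:,indVIP) would select the flagged predictors.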
 

1 Comment

If you are still experiencing this issue, please consider submitting a Technical Support case. We will be happy to help you out. You can do so at the following location:

Release

R2017a
