PLS using plsregress and matrices

Hello there community
I'm currently studying Partial Least Squares (PLS) regression and looking at the implementation of PLS in MATLAB.
One of the tasks has been to compare the loadings, weights and scores computed by hand with matrices to the ones returned by the built-in function plsregress. I expected them to come out the same in the end.
Using matrices I simply followed the PLS1 routine and the code is as follows:
% Perform PLS using matrices
X1 = X;
y1 = y;
w1 = X1'*y1/(norm(X1'*y1));
t1 = X1*w1;
c1 = t1'*y1/(t1'*t1);
p1 = X1'*t1/(t1'*t1);
X2 = X1 - t1*p1';
y2 = y1 - t1*c1;
w2 = X2'*y2/(norm(X2'*y2));
t2 = X2*w2;
c2 = t2'*y2/(t2'*t2);
p2 = X2'*t2/(t2'*t2);
W = [w1 w2];
P = [p1 p2];
T = [t1 t2];
% PLS Using plsregress
[XL, YL, XS, YS, BETA, PCTVAR,MSE,stats] = plsregress(X, y, 2);
To explain: X is my X block of data, a 25x27 matrix (centered). The vector y is my y data, a 25x1 vector.
I go through the PLS algorithm twice. First I calculate the w1 vector, the t1 vector, the regression coefficient c1 and the loadings vector p1. I then subtract the explained portion and put the residuals in X2 and y2. Then go through the steps once more. In the end I create the weights matrix W, the loadings matrix P and the scores matrix T.
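The two passes described above follow the same pattern, so they can be written as a loop over components. This is a minimal sketch rather than the code from the post: the data are synthetic stand-ins with the sizes mentioned above, and the names ncomp, Xk, yk and c are chosen here for illustration.

```matlab
% Synthetic stand-in data (assumption): same sizes as in the post.
rng(0);
X = randn(25, 27);
X = bsxfun(@minus, X, mean(X, 1));   % center each column of X
y = randn(25, 1);
y = y - mean(y);                      % center y

ncomp = 2;                            % number of PLS components
[n, p] = size(X);
W = zeros(p, ncomp);                  % weights
P = zeros(p, ncomp);                  % X loadings
T = zeros(n, ncomp);                  % scores
c = zeros(ncomp, 1);                  % regression coefficients of y on each score

Xk = X;                               % deflated X, starts as X itself
yk = y;                               % deflated y, starts as y itself
for k = 1:ncomp
    w = Xk' * yk;
    w = w / norm(w);                  % unit-length weight vector
    t = Xk * w;                       % score vector
    tt = t' * t;
    c(k) = (t' * yk) / tt;            % coefficient of yk on t
    pk = (Xk' * t) / tt;              % loading vector
    Xk = Xk - t * pk';                % remove the explained part of X
    yk = yk - t * c(k);               % remove the explained part of y
    W(:, k) = w;
    P(:, k) = pk;
    T(:, k) = t;
end
```

With ncomp = 2 this should produce the same W, P and T as the unrolled code, and changing ncomp extends it to more components.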
However, when I compare my results with what I get from the plsregress() function, they are not the same. Below I show stats.W (the weights from plsregress, as far as I understand) in the first two columns and W (the weights calculated using matrices) in the last two.
% Display stats.W next to W
[stats.W W]
ans =
3.6239e-02 -1.7838e-01 6.0033e-02 -2.6785e-01
9.5784e-02 3.0986e-02 1.5868e-01 -1.8624e-01
1.7037e-01 3.0955e-01 2.8223e-01 -6.7067e-02
2.1082e-01 4.9946e-01 3.4924e-01 3.7886e-02
3.2194e-01 8.6533e-01 5.3333e-01 1.6439e-01
1.7337e-01 3.3531e-01 2.8721e-01 -4.7183e-02
2.7217e-02 -1.9484e-01 4.5087e-02 -2.6437e-01
4.6464e-02 -1.5543e-01 7.6972e-02 -2.6734e-01
7.8943e-02 3.9884e-04 1.3078e-01 -1.7960e-01
9.1123e-02 3.1160e-02 1.5096e-01 -1.7543e-01
1.2736e-01 1.1261e-01 2.1099e-01 -1.7350e-01
1.2120e-01 2.6840e-02 2.0078e-01 -2.4849e-01
-1.3984e-02 -4.2193e-01 -2.3166e-02 -4.0622e-01
-1.8946e-02 -4.1855e-01 -3.1387e-02 -3.9139e-01
-5.8644e-03 -4.0043e-01 -9.7149e-03 -4.0241e-01
-2.5563e-03 -1.3194e-01 -4.2347e-03 -1.3117e-01
8.5710e-03 -4.2801e-02 1.4199e-02 -6.3986e-02
3.7592e-02 7.8691e-02 6.2275e-02 -4.0133e-03
4.2253e-02 9.8045e-02 6.9997e-02 5.4530e-03
7.7289e-02 2.1046e-01 1.2804e-01 4.2280e-02
1.3653e-01 4.0125e-01 2.2618e-01 1.0530e-01
7.4131e-02 1.9662e-01 1.2281e-01 3.5118e-02
1.1368e-01 3.2715e-01 1.8832e-01 8.0472e-02
2.1683e-01 6.6354e-01 3.5920e-01 1.9454e-01
6.6012e-02 1.7034e-01 1.0936e-01 2.6346e-02
8.8717e-03 -1.0416e-02 1.4697e-02 -3.1045e-02
5.2629e-03 -1.1638e-02 8.7185e-03 -2.4085e-02
Am I misunderstanding something here? Can anybody shed some light on what is happening?
Thanks in advance!
/Rune

Accepted Answer

the cyclist on 28 Nov 2015
I notice that the first and third columns are proportional to each other, within expected error. See the plot below that shows their ratio. (It might not look constant, but notice the tiny variations in the y-axis.)
According to the documentation, stats.W is a "... matrix of PLS weights so that XS = X0*W". Have you taken into account the same normalization in your data?
I expect something similar is going on with the other columns, but I'm not sure.
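One way to check the normalization idea is to convert the hand-computed weights into weights that act on the original X and then rescale them so that each score vector has unit length. The sketch below rests on two assumptions, both drawn from the plsregress documentation: that plsregress uses the SIMPLS algorithm, and that it returns orthonormal scores XS. The data are synthetic stand-ins, and the names Rstar, tnorm, Wn and Tn are made up for illustration. The identity Rstar = W*inv(P'*W) is the standard link between deflation-based weights and weights that apply to the undeflated X.

```matlab
% Synthetic stand-in data (assumption, not the data from the question).
rng(1);
X = randn(25, 27);
X = bsxfun(@minus, X, mean(X, 1));   % center each column of X
y = randn(25, 1);
y = y - mean(y);                      % center y

% Deflation-based PLS1 with two components, as in the question.
ncomp = 2;
W = zeros(size(X, 2), ncomp);
P = zeros(size(X, 2), ncomp);
T = zeros(size(X, 1), ncomp);
Xk = X;
yk = y;
for k = 1:ncomp
    w = Xk' * yk;
    w = w / norm(w);                  % unit-length weight vector
    t = Xk * w;                       % score computed from the deflated X
    pk = (Xk' * t) / (t' * t);        % X loading
    yk = yk - t * ((t' * yk) / (t' * t));
    Xk = Xk - t * pk';
    W(:, k) = w;
    P(:, k) = pk;
    T(:, k) = t;
end

% Weights that act on the original X: X * Rstar reproduces T.
Rstar = W / (P' * W);

% Rescale each column so that the corresponding score vector has unit norm.
tnorm = sqrt(sum((X * Rstar).^2, 1));
Wn = bsxfun(@rdivide, Rstar, tnorm);
Tn = X * Wn;

% Optional comparison; plsregress requires the Statistics and Machine
% Learning Toolbox, so it is only called when it is on the path.
if exist('plsregress', 'file')
    [~, ~, XS, ~, ~, ~, ~, stats] = plsregress(X, y, ncomp);
    s = sign(sum(Wn .* stats.W, 1));                        % align column signs
    disp(max(max(abs(bsxfun(@times, Wn, s) - stats.W))));   % weight discrepancy
    disp(max(max(abs(bsxfun(@times, Tn, s) - XS))));        % score discrepancy
end
```

If the assumptions hold, both displayed discrepancies should be close to machine precision. That would mean the two sets of weights describe the same model and differ only in scaling and in whether they apply to the deflated or the original X.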

