Partial Least-Squares (PLS) is a widely used technique in many areas. This package provides a function to perform PLS regression using the Nonlinear Iterative Partial Least-Squares (NIPALS) algorithm. It also includes a tutorial function that explains the NIPALS algorithm and shows how to perform discriminant analysis with the PLS function.
The difference between ordinary least-squares regression and partial least-squares regression can be explained as follows:
For given independent data X and dependent data Y, to fit a model
Y = X*B + E
ordinary least-squares regression finds the B that minimizes the error in the least-squares sense:
J = E'*E
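The fit that minimizes J = E'*E can be sketched directly with NumPy (an illustrative example on synthetic data, not part of this package; the names X, Y, and B follow the notation above, and the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                      # independent data: 50 samples x 3 variables
B_true = np.array([[1.0], [-2.0], [0.5]])         # hypothetical true coefficients
Y = X @ B_true + 0.01 * rng.normal(size=(50, 1))  # dependent data with a small noise term E

# Solve min_B (Y - X*B)'(Y - X*B), i.e. minimize J = E'*E
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Because the noise is small, the estimated B is close to B_true here; with noisier data the estimate degrades, which motivates the latent-variable approach described next.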
Instead of directly fitting a model between X and Y, PLS first decomposes X and Y into a low-dimensional space (the so-called latent variable space):
X = T*P' + E0, and
Y = U*Q' + F0
where P and Q have orthonormal columns, i.e. P'*P=I and Q'*Q=I, and T and U have the same number of columns, a, which is much smaller than the number of columns of X. Then, a least-squares regression is performed between T and U:
U = T*B + F1
At the end, the overall regression model is
Y = X*(P*B*Q') + F
i.e. the overall regression coefficient is P*B*Q'.
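The decomposition, inner regression, and recombination steps above can be sketched with a minimal NIPALS implementation (illustrative Python/NumPy code, not the package's own function; the name `pls_nipals` and its arguments are hypothetical):

```python
import numpy as np

def pls_nipals(X, Y, a, tol=1e-10, max_iter=500):
    """Minimal NIPALS sketch: X = T*P' + E0, Y = U*Q' + F0, then U = T*B + F1."""
    X = X.copy(); Y = Y.copy()
    n, nx = X.shape
    ny = Y.shape[1]
    T = np.zeros((n, a));  U = np.zeros((n, a))
    P = np.zeros((nx, a)); Q = np.zeros((ny, a)); W = np.zeros((nx, a))
    b = np.zeros(a)                              # inner-regression coefficients
    for i in range(a):
        u = Y[:, [0]]                            # initialize score with a column of Y
        for _ in range(max_iter):
            w = X.T @ u; w /= np.linalg.norm(w)  # X weight vector
            t = X @ w                            # X score
            q = Y.T @ t; q /= np.linalg.norm(q)  # Y loading
            u_new = Y @ q                        # Y score
            if np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new):
                u = u_new; break
            u = u_new
        p_i = X.T @ t / (t.T @ t)                # X loading
        b_i = float(u.T @ t / (t.T @ t))         # inner regression u = t*b_i
        X -= t @ p_i.T                           # deflate X
        Y -= b_i * (t @ q.T)                     # deflate Y
        T[:, [i]] = t;   U[:, [i]] = u
        P[:, [i]] = p_i; Q[:, [i]] = q; W[:, [i]] = w
        b[i] = b_i
    # Overall coefficient via the weights W; this reduces to P*diag(b)*Q'
    # when P'*W = I, matching the combined model stated above.
    B = W @ np.linalg.inv(P.T @ W) @ np.diag(b) @ Q.T
    return B, T, U, P, Q
```

When a equals the rank of X, the PLS fit essentially coincides with the least-squares fit; choosing a smaller a is what discards the low-covariance directions that mostly carry noise.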
The reason to perform PLS instead of ordinary LS regression is that the data sets X and Y may contain random noise, which should be excluded from the regression. Decomposing X and Y into the latent space ensures that the regression is performed on the most reliable variation.