Thread Subject:
PLSREGRESS

Subject: PLSREGRESS

From: Frank Sabouri

Date: 24 Aug, 2010 19:37:27

Message: 1 of 2

Hi all,

With partial least squares regression (plsregress), we need to specify the number of components in the function call to avoid over-fitting. I have noticed that people plot either PCTVAR or MSE against the number of components. If we request “n” components, PCTVAR is 2-by-n and MSE is 2-by-(n+1). I have some questions:

1. Should PCTVAR or MSE be used to choose “n” in the “plsregress” function? Could you please explain why?

2. If I want to plot MSE against the number of components, using “n” points rather than “n+1”, which column of MSE should be removed: #1 or #n+1?

3. To choose the number of components, should we look at the first row (predictors) of PCTVAR or MSE, or should we plot the second row (responses)?

Regards,
Frank

Subject: PLSREGRESS

From: Cagri

Date: 24 Aug, 2010 20:30:24

Message: 2 of 2

The validation method for choosing the number of PLS components is controversial. However, there are several metrics you can use: root mean squared error of cross-validation (RMSECV), predicted residual error sum of squares (PRESS), R^2, and Q^2. Some references on choosing the number of PLS components are as follows:

http://scholar.google.com/scholar?q=A+comparison+of+partial+least+squares+regression+with+other+prediction+methods&hl=en&as_sdt=0&as_vis=1&oi=scholart

Multi-way Analysis with Applications in the Chemical Sciences, by Age Smilde.

Some people use classification accuracy to determine the number of PLS components, but this approach has not been verified.
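
As a rough illustration of the cross-validation route, here is a minimal sketch that picks the number of components from the cross-validated MSE returned by plsregress. The data are synthetic and the variable names (X, y, ncomp, bestNcomp, and so on) are made up for the example; the choice of 10 folds and a maximum of 8 components is also just an assumption. It requires the Statistics Toolbox.

```matlab
% Synthetic data (an assumption for illustration): 100 samples, 10 predictors,
% with the response depending on the first three predictors only.
rng(0);                                   % reproducible random numbers
X = randn(100, 10);
y = X(:, 1:3) * [2; -1; 0.5] + 0.1 * randn(100, 1);

ncomp = 8;                                % largest number of components to try

% Request PCTVAR and MSE; 'CV', 10 asks for 10-fold cross-validated MSE.
[~, ~, ~, ~, ~, PCTVAR, MSE] = plsregress(X, y, ncomp, 'CV', 10);

% MSE is 2-by-(ncomp+1): column j is the error with j-1 components,
% so column 1 is the zero-component model.
% Row 1 refers to the predictors X, row 2 to the response y.
rmsecv = sqrt(MSE(2, :));                 % cross-validated RMSE of the response

% Pick the component count with the smallest cross-validated response error.
[~, idx] = min(MSE(2, :));
bestNcomp = idx - 1;                      % column index -> number of components

plot(0:ncomp, rmsecv, '-o');
xlabel('Number of PLS components');
ylabel('RMSECV (response)');

% PCTVAR is 2-by-ncomp; row 2 is the fraction of variance in y explained
% by each component, so the cumulative sum shows where the curve flattens.
cumVarY = cumsum(PCTVAR(2, :));
```

In this sketch the second row (response) of MSE drives the choice, since that is the quantity being predicted. To plot against 1..ncomp only, drop the first column, e.g. `MSE(2, 2:end)`, because that column is the zero-component model. The minimum of the cross-validated curve is only one possible rule; some people prefer the smallest model whose error is close to the minimum.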

Hope this helps.
