
Thread Subject:
deleting observations in linear regression

Subject: deleting observations in linear regression

From: Mike

Date: 18 May, 2011 20:03:03

Message: 1 of 4


I seem to remember that there is an efficient way to recompute multiple linear regression coefficients when each observation in the dataset is left out in turn. That is, the regression is implicitly performed N times with a unique subset of N-1 observations each time.

Does anybody know how to do this efficiently? Can MATLAB do it?
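
For concreteness, the brute-force version looks something like this (a minimal sketch; X as the N-by-p design matrix and y as the N-by-1 response are placeholder names):

N = size(X,1);
B = zeros(size(X,2), N);      % column k holds the fit with observation k deleted
for k = 1:N
    keep = true(N,1);
    keep(k) = false;          % drop observation k
    B(:,k) = X(keep,:)\y(keep);
end

That's N full regressions, which is what I'm hoping to avoid.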

Thanks much

Subject: deleting observations in linear regression

From: Darren Rowland

Date: 19 May, 2011 09:13:03

Message: 2 of 4

Sounds like the PRESS method (predicted residual sum of squares). I can't seem to find a decent reference, but there is a file on the FEX by Antonio Trujillo-Ortiz
http://www.mathworks.com/matlabcentral/fileexchange/14564

A similar procedure is Leave One Out Cross Validation. The MATLAB implementation appears to require the Bioinformatics Toolbox (function crossvalind).
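
If the PRESS residuals themselves are all that's needed, they can be had from a single fit via the hat matrix, with no refitting (a minimal sketch; X is the design matrix and y the response, names of my choosing, not the FEX file's code):

b = X\y;                     % full-sample fit
h = diag(X*((X'*X)\X'));     % leverages (diagonal of the hat matrix)
e = y - X*b;                 % ordinary residuals
e_loo = e./(1 - h);          % leave-one-out (predicted) residuals
PRESS = sum(e_loo.^2);

Note this needs only the one full-sample fit.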

Hth
Darren

Subject: deleting observations in linear regression

From: Mike

Date: 19 May, 2011 16:12:03

Message: 3 of 4

"Darren Rowland" wrote in message <ir2muv$afr$1@newscl01ah.mathworks.com>...
> Sounds like the PRESS method (predicted residual sum of squares). I can't seem to find a decent reference, but there is a file on the FEX by Antonio Trujillo-Ortiz
> http://www.mathworks.com/matlabcentral/fileexchange/14564
>
> A similar procedure is Leave One Out Cross Validation. The MATLAB implementation appears to require the Bioinformatics Toolbox (function crossvalind).
>
> Hth
> Darren

I've done some googling and it appears that there are methods - and PRESS seems to be one - to compute the leave-one-out residuals, but I need the actual regression coefficients (or decent estimates thereof) for all N regressions.

If the criterion were mean absolute deviation instead of sum of squares, the fit could be done with linear programming, and maybe the output sensitivities could be used to get the coefficients. Does anybody know of such a method?
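
(For what it's worth, the mean-absolute-deviation fit itself is a standard linear program. A minimal sketch using linprog from the Optimization Toolbox, with auxiliary variables t bounding the absolute residuals; the variable names are mine:)

[N,p] = size(X);
f = [zeros(p,1); ones(N,1)];   % minimize sum(t) over variables [b; t]
A = [ X, -eye(N);              %  X*b - t <=  y
     -X, -eye(N)];             % -X*b - t <= -y, i.e. |y - X*b| <= t
rhs = [y; -y];
sol = linprog(f, A, rhs);
b_lad = sol(1:p);              % least-absolute-deviations coefficients

Whether the LP sensitivities then give usable leave-one-out coefficients, I don't know.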

thanks again

Subject: deleting observations in linear regression

From: Tom Lane

Date: 19 May, 2011 21:28:05

Message: 4 of 4

> I've done some googling and it appears that there are methods - and PRESS
> seems to be one - to compute the leave-one-out residuals, but I need the
> actual regression coefficients (or decent estimates thereof) for all N
> regressions.

The technique for computing this is described in the regression book by
Belsley, Kuh, and Welsch (BKW). It's probably described in lots of other
places as well. If memory serves, it's based on the Sherman-Morrison-Woodbury
formula for performing rank-one updates to a matrix inverse. It's part of
the process of computing the dfbeta or dfbetas statistics that BKW propose.
You may get some useful search terms out of this.
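
In case it helps, here is a minimal sketch of that update (my variable names, not the ones regstats uses; X is the design matrix with an intercept column, y the response). With full-fit coefficients b, leverages h, and residuals e, the coefficients with observation i deleted are b - inv(X'*X)*X(i,:)'*e(i)/(1-h(i)):

b = X\y;                          % full-sample fit
XtXinv = inv(X'*X);               % fine for a sketch; use a QR factor for stability
h = sum((X*XtXinv).*X, 2);        % leverages (diagonal of the hat matrix)
e = y - X*b;                      % full-sample residuals
% Column i of b_i holds the coefficients with observation i left out:
b_i = bsxfun(@minus, b, bsxfun(@times, XtXinv*X', (e./(1-h))'));

Column i of b_i should match X([1:i-1,i+1:end],:)\y([1:i-1,i+1:end]), which is essentially the check below.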

If you have the Statistics Toolbox, you might take a look inside the
regstats function. The variable b_i is what you want. You could probably
extract the lines you need from that file to compute just this thing.

To test this, I set a break point in regstats and examined the b_i values
with a separate regression.

>> load hald
>> s = regstats(heat,ingredients,'linear','dfbetas')
   (stop in the debugger right after b_i is created)
K>> b_i(:,1:3)
ans =
       62.485 48.008 138.59
       1.5505 1.7392 0.78739
       0.5093 0.65465 -0.26777
      0.10135 0.2682 -0.70993
     -0.14498 -0.013866 -0.91617
K>> dbquit
>> X = [ones(size(ingredients,1),1) ingredients];   % design matrix with intercept
>> X(2:end,:)\heat(2:end)
ans =
       62.485
       1.5505
       0.5093
      0.10135
     -0.14498
>>

-- Tom
