Fastest way of computing standard errors / inverting X'X

Hello all,
I'm running a Monte Carlo study and need to evaluate many linear regressions y = X b + u very efficiently. Specifically, I need to estimate a regression and compute the standard errors of the estimates many times, so speed is very important while high accuracy is not. MATLAB's built-in functions such as lscov and regress take a fair amount of time to run, so I wrote a little function to do these tasks myself.
However, as I need to calculate the standard errors, the function is still quite slow. I use inv(X' * X) to get the inverse, since it appears in the standard-error formula (and it is noticeably faster than X' * X \ eye(size(X, 2))). Is there a faster way of doing this, e.g. via some smart factorization? I saw a suggestion to compute a QR decomposition and then use inv(R) to obtain inv(X' * X), but this is even slower.
All the help is greatly appreciated!

Accepted Answer

Teja Muppirala
Teja Muppirala on 4 Jan 2012
You can use symbolic math to find a closed-form expression for the diagonal of the inverse of a symmetric 3x3 matrix:
syms a b c d e f real
M = diag(inv([a b c; b d e; c e f]))
And then use that expression directly. I find that this is several times faster than calling INV.
X = rand(100,3);
XX = X'*X;
denom = (XX(9)*XX(2)^2 - 2*XX(2)*XX(3)*XX(6) + XX(5)*XX(3)^2 + XX(1)*XX(6)^2 - XX(1)*XX(5)*XX(9));
invXX = [(XX(6)^2 - XX(5)*XX(9))/denom;
(XX(3)^2 - XX(1)*XX(9))/denom;
(XX(2)^2 - XX(1)*XX(5))/denom]
You can confirm that this is the same as diag(inv(X'*X)).
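The closed-form expressions above are language-agnostic and easy to verify outside MATLAB. A hedged NumPy sketch (an illustration of the same algebra, not from the thread; note MATLAB's linear indices XX(1), XX(2), ... map to the entries a, b, c, d, e, f of the symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 3))
XX = X.T @ X  # symmetric 3x3, as in the MATLAB code

# Entries of the symmetric matrix [[a, b, c], [b, d, e], [c, e, f]]
a, b, c = XX[0, 0], XX[1, 0], XX[2, 0]
d, e, f = XX[1, 1], XX[2, 1], XX[2, 2]

# This denominator equals -det(XX), matching Teja's `denom`
denom = f * b**2 - 2 * b * c * e + d * c**2 + a * e**2 - a * d * f

# Closed-form diagonal of inv(XX)
diag_inv = np.array([(e**2 - d * f) / denom,
                     (c**2 - a * f) / denom,
                     (b**2 - a * d) / denom])

# Agrees with the generic inverse to roundoff
print(np.allclose(diag_inv, np.diag(np.linalg.inv(XX))))  # True
```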
  1 Comment
mickey
mickey on 4 Jan 2012
Hey! Thanks so much for the answer. This indeed is faster, although not by several times, at least on my machine. It may be interesting to note that for finding the estimates it still seems better to compute X \ y directly, rather than using the full inverse (X'X)^{-1} calculated as above, if I have not made any mistakes.


More Answers (1)

Matt Tearle
Matt Tearle on 3 Jan 2012
inv is evil -- avoid! The quickest way to do a regression is
c = X\y;
then you can calculate errors yourself
resid = X*c - y;
% etc
but are you doing this numerous times for different y but the same X? If so, maybe an LU decomposition would help.
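The "% etc" above leaves out the error computation. A hedged NumPy sketch of the standard OLS standard-error formula (assuming homoskedastic errors; the data and coefficients here are made up for illustration, and np.linalg.lstsq plays the role of MATLAB's X \ y):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 3
X = rng.random((n, k))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + 0.1 * rng.standard_normal(n)

# OLS fit via least squares (analogue of c = X \ y)
c, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ c
s2 = resid @ resid / (n - k)        # residual variance estimate
vcov = s2 * np.linalg.inv(X.T @ X)  # variance-covariance matrix of c
se = np.sqrt(np.diag(vcov))         # standard errors of the estimates
```

With the same X fixed across replications, everything except resid, s2, vcov, and se can be precomputed once, which is the point of Matt's LU/factorization suggestion.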
  10 Comments
Matt Tearle
Matt Tearle on 3 Jan 2012
When I test it, I get the same result for rinv*rinv' as inv(x'*x), to within roundoff (which is what should happen, from theory). I thought extracting the non-zero rows of r might eliminate a few redundant calculations. But if inv() is the bottleneck... maybe a brute-force inverse calculation would help? This seems to be a tiny bit faster than inv:
[~,r] = qr(x,0); % economy-size QR: r is 3x3 upper triangular
rinv = diag(1./diag(r));
rinv(2,3) = -r(2,3)*rinv(3,3)/r(2,2);
rinv(1,2) = -r(1,2)*rinv(2,2)/r(1,1);
rinv(1,3) = (-r(1,2)*rinv(2,3)-r(1,3)*rinv(3,3))/r(1,1);
vcov = rinv*rinv';
But, after all that, I must have been doing something wrong before, because I'm now getting that inv(x'*x) is faster after all.
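The QR route above can also be checked numerically. A hedged NumPy sketch of the same idea, using a generic small-matrix inverse in place of the hand-unrolled triangular back-substitution (an illustration, not the thread's code):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((100, 3))

# Economy-size QR: X = Q R with R 3x3 upper triangular, so X'X = R'R
R = np.linalg.qr(X, mode='r')

Rinv = np.linalg.inv(R)   # tiny triangular inverse
vcov = Rinv @ Rinv.T      # equals inv(X'X), since inv(R'R) = inv(R) inv(R)'

print(np.allclose(vcov, np.linalg.inv(X.T @ X)))  # True
```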
mickey
mickey on 4 Jan 2012
Hey! Thanks again for this answer. In the end I opted for Teja's approach, as it seems faster for the small problem I have. For bigger problems, though, I think something like this should be preferred. :) Thanks!!

