MATLAB Answers

Why do fitlm (or regress) and estimation using mathematical equations give different results?

Asked by Prashanth Ravindran on 23 Feb 2016
Latest activity: Answered by Tom Lane on 23 Feb 2016
I am trying to estimate the linear regression coefficients directly from the mathematical equation β = inverse(X'X)X'Y. However, the result differs from what the standard functions (fitlm and regress) return. Why does that happen?
Here is the code:
% X = input data (predictors)
% Y = outcome (response)
% Using the fitlm command to estimate the multiple linear regression model
lin_mdl = fitlm(X,Y);
b1 = lin_mdl.Coefficients.Estimate;
% Using the regress command to estimate the multiple linear regression model
X1 = [ones(size(X,1),1) X];   % prepend a column of ones for the intercept
b2 = regress(Y,X1);
% Using the mathematical equation
b3 = inv(X1'*X1)*X1'*Y;
% Comparing the coefficients
[b1 b2 b3]
And the output is:
ans =
1.0e+05 *
0.0002 0.0002 -5.6828
-0.0000 -0.0000 -0.0758
0.0000 0.0000 -0.0092
-0.0001 -0.0001 -0.1538
-0.0000 -0.0000 -0.0023
-0.0000 -0.0000 -0.2201
0.0000 0.0000 0.4286
0.0000 0.0000 -0.0009
0.0000 0.0000 0.1575
-0.0000 -0.0000 -0.3488
0.0000 0.0000 0.0040
-0.0000 -0.0000 -0.0057
0 0 -7.1398
0.0014 0.0014 -0.5267
0.0004 0.0004 0.0004
-0.0001 -0.0001 -0.0001
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
-0.0000 -0.0000 -0.0000
-0.0000 -0.0000 -0.0000
0.0000 0.0000 0.0000
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
-0.0002 -0.0002 -0.0002
-0.0003 -0.0003 -0.0003
-0.0002 -0.0002 -0.0002
-0.0003 -0.0003 -0.0003
-0.0003 -0.0003 -0.0003
-0.0003 -0.0003 -0.0003
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0001 0.0001 0.0001
0.0001 0.0001 0.0001
0.0001 0.0001 0.0001
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0003 -0.0003 -0.0003
The output of the mathematical equation differs from that of the fitlm (or regress) function, most visibly in the first 14 rows. Why is that? The correlation matrix obtained with the command corr(X) can be visualized as follows:


1 Answer

Answer by Tom Lane on 23 Feb 2016

The first two columns of coefficients have what appear to be exact zeros in row 13. Because the first row is the constant (intercept) term, row 13 corresponds to column 12 of X. I suggest you try fitting a model with column 12 of X as the output (response) variable and the remaining columns of X as the input (predictor) variables. I suspect you will find that column 12 is very close to an exact linear function of some set of the other columns.
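A minimal sketch of that check is given here. It runs on synthetic data, since the original X is not available: the 15-column matrix, the dependency of column 12 on columns 3 and 7, and the variable names j, others, and r2 are all assumptions made for illustration. fitlm requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical data standing in for the asker's X (not the real data set).
rng(0);                              % reproducible random numbers
n = 200;
X = randn(n, 15);
X(:,12) = X(:,3) - 2*X(:,7);         % assumption: column 12 is a linear function of columns 3 and 7

% Regress the suspect column on all the other columns.
j = 12;
others = setdiff(1:size(X,2), j);    % indices of every column except j
mdl = fitlm(X(:,others), X(:,j));
r2 = mdl.Rsquared.Ordinary;          % a value near 1 signals (near-)collinearity

% Compare the numerical rank of the design matrix with its column count.
X1 = [ones(n,1) X];
fprintf('R^2 = %.6f, rank = %d of %d columns\n', r2, rank(X1), size(X1,2));
```

With real data, replacing the synthetic X by the actual predictor matrix should be enough. An R^2 very close to 1, or a rank smaller than the number of columns, would support the suspicion that column 12 is redundant.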
Inverting X'*X is notoriously ill-conditioned, because forming X'*X squares the condition number of X. Another way to do this is b = X1\Y, which is in principle the same thing but better conditioned.
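The difference between the two approaches can be illustrated with a small sketch. The data are synthetic and the nearly collinear fifth column is an assumption chosen to provoke the problem; none of it comes from the original data set. Only base MATLAB functions are used.

```matlab
% Hypothetical, nearly collinear design (not the asker's data).
rng(1);                                        % reproducible random numbers
n = 200;
X = randn(n, 5);
X(:,5) = X(:,1) + X(:,2) + 1e-6*randn(n,1);    % assumption: column 5 is almost columns 1 + 2
Y = X(:,1:4)*[1; -2; 0.5; 3] + 0.1*randn(n,1);
X1 = [ones(n,1) X];                            % prepend the intercept column

% Condition numbers: forming X1'*X1 roughly squares cond(X1).
cX   = cond(X1);
cXtX = cond(X1'*X1);

% Two ways to compute the least-squares coefficients.
b_inv = inv(X1'*X1)*X1'*Y;                     % explicit inverse, as in the question
b_bs  = X1\Y;                                  % backslash least-squares solve

% Residual norms show how well each solution fits the data.
res_inv = norm(Y - X1*b_inv);
res_bs  = norm(Y - X1*b_bs);
fprintf('cond(X1) = %.3g, cond(X1''*X1) = %.3g\n', cX, cXtX);
fprintf('residual norm: inv = %.6g, backslash = %.6g\n', res_inv, res_bs);
```

When the columns are nearly dependent, the individual coefficients are poorly determined in either case, but the backslash solution should achieve a residual at least as small as the explicit-inverse solution. Removing the redundant column is usually the more fundamental fix.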
