Asked by Prashanth Ravindran
on 23 Feb 2016

I am trying to estimate linear regression coefficients directly from the mathematical equation β = inverse(X'X)X'Y, but the results differ from those returned by the standard functions. Why does that happen?

Here is the code:

% X = input data

% Y = outcome

% Using the fitlm command to estimate the multiple linear regression model

lin_mdl = fitlm(X,Y);

b1 = lin_mdl.Coefficients.Estimate;

% Using the regress command to estimate the multiple linear regression model

X1 = [ones(size(X,1),1) X];   % prepend a column of ones for the intercept

b2 = regress(Y,X1);

% Using mathematical equation

b3 = inv(X1'*X1)*X1'*Y;

% Comparing the coefficients

[b1 b2 b3]

And the output is:

ans =

1.0e+05 *

0.0002 0.0002 -5.6828

-0.0000 -0.0000 -0.0758

0.0000 0.0000 -0.0092

-0.0001 -0.0001 -0.1538

-0.0000 -0.0000 -0.0023

-0.0000 -0.0000 -0.2201

0.0000 0.0000 0.4286

0.0000 0.0000 -0.0009

0.0000 0.0000 0.1575

-0.0000 -0.0000 -0.3488

0.0000 0.0000 0.0040

-0.0000 -0.0000 -0.0057

0 0 -7.1398

0.0014 0.0014 -0.5267

0.0004 0.0004 0.0004

-0.0001 -0.0001 -0.0001

0.0000 0.0000 0.0000

0.0000 0.0000 0.0000

0.0000 0.0000 0.0000

0.0000 0.0000 0.0000

0.0000 0.0000 0.0000

-0.0000 -0.0000 -0.0000

-0.0000 -0.0000 -0.0000

0.0000 0.0000 0.0000

-0.0002 -0.0002 -0.0002

-0.0002 -0.0002 -0.0002

-0.0002 -0.0002 -0.0002

-0.0001 -0.0001 -0.0001

-0.0001 -0.0001 -0.0001

-0.0002 -0.0002 -0.0002

-0.0003 -0.0003 -0.0003

-0.0002 -0.0002 -0.0002

-0.0003 -0.0003 -0.0003

-0.0003 -0.0003 -0.0003

-0.0003 -0.0003 -0.0003

-0.0001 -0.0001 -0.0001

-0.0001 -0.0001 -0.0001

0.0000 0.0000 0.0000

0.0000 0.0000 0.0000

0.0001 0.0001 0.0001

0.0001 0.0001 0.0001

0.0001 0.0001 0.0001

-0.0002 -0.0002 -0.0002

-0.0001 -0.0001 -0.0001

-0.0001 -0.0001 -0.0001

-0.0002 -0.0002 -0.0002

-0.0002 -0.0002 -0.0002

-0.0001 -0.0001 -0.0001

-0.0003 -0.0003 -0.0003

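The output of the mathematical equation (third column) differs from that of fitlm and regress (first two columns), but only in the first 14 rows. Why is that? The correlation matrix obtained with corr(X) can be visualized as follows:

[Figure: visualization of the correlation matrix corr(X)]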

Answer by Tom Lane
on 23 Feb 2016

The first two columns of coefficients have what appear to be exact zeros in row 13, which corresponds to column 12 of X because the first row is the constant (intercept) term. I suggest you try fitting a model with column 12 of X as the output (response) variable and the rest of X as the input (predictor) variables. I suspect you will find that column 12 is very close to an exact linear function of some set of other columns.
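The check suggested above can be sketched as follows. This is only an illustration: the original X is not shown in the thread, so the data here are synthetic, with column 4 built as an exact linear combination of two others to stand in for the suspect column 12. The variable names (k, others, A, R2) are assumptions made for the example.

```matlab
% Synthetic stand-in for the poster's data (the real X is not available).
rng(0);                                  % reproducible random numbers
n = 200;
X = randn(n, 5);
X(:,4) = 2*X(:,1) - 3*X(:,2);            % column 4 depends exactly on columns 1 and 2

k = 4;                                   % suspect column (12 in the thread)
others = X(:, [1:k-1, k+1:end]);         % all remaining predictors
A = [ones(n,1) others];                  % add the intercept column

c   = A \ X(:,k);                        % least-squares fit of column k on the rest
res = X(:,k) - A*c;                      % residuals of that fit
R2  = 1 - sum(res.^2) / sum((X(:,k) - mean(X(:,k))).^2);

designRank = rank([ones(n,1) X]);        % numerical rank of the full design matrix

fprintf('R^2 of column %d on the others: %.12f\n', k, R2);
fprintf('rank of design matrix: %d of %d columns\n', designRank, size(X,2) + 1);
```

An R^2 very close to 1, or a rank smaller than the number of columns, would indicate that the suspect column carries essentially no information beyond the other predictors.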

Explicitly inverting X'*X is notoriously ill-conditioned. Another way to do this is b = X1\Y, which in principle solves the same least-squares problem but is better conditioned.
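A small sketch of the difference, under stated assumptions: the design matrix below is a made-up polynomial basis chosen only because its columns are nearly collinear, and Y is generated without noise so that the true coefficients (bTrue, a hypothetical name) are known. For a full-column-rank matrix, the 2-norm condition number of X1'*X1 is the square of that of X1, which is why the explicit inverse tends to lose more accuracy.

```matlab
% Nearly collinear design matrix: powers of t on [0,1] (illustrative only).
n  = 100;
t  = linspace(0, 1, n)';
X1 = [ones(n,1) t t.^2 t.^3 t.^4 t.^5 t.^6 t.^7];

bTrue = ones(8,1);                       % known coefficients for the test problem
Y     = X1 * bTrue;                      % noise-free response

bInv       = inv(X1'*X1) * X1' * Y;      % normal equations with explicit inverse
bBackslash = X1 \ Y;                     % least-squares solve via backslash

errInv       = norm(bInv - bTrue);
errBackslash = norm(bBackslash - bTrue);

fprintf('cond(X1)     = %.3g\n', cond(X1));
fprintf('cond(X1''*X1) = %.3g\n', cond(X1'*X1));
fprintf('error with inv       = %.3g\n', errInv);
fprintf('error with backslash = %.3g\n', errBackslash);
```

On a problem like this the backslash solution should sit much closer to bTrue than the explicit-inverse solution. With exactly dependent columns, as suspected in the question, the explicit inverse is not meaningful at all.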
