# Multiple Linear Regression creates same results as the Target values

mr mo on 12 Feb 2018
Commented: Jelle on 13 Feb 2018
Hi. I have to use multiple linear regression in MATLAB. I have an X matrix of size 10×1000 and a Y vector of size 10×1. I use this code:
b = regress(Y,X);
and then I use this
s=find(b~=0)
to find the entries of vector b that are not equal to zero. Assume that this is the result of the above code:
s= 2 7 95 172 290 333 471 560 680 890
then I use this
for i=1:size(X,1)
Z(i,1)=X(i,2)*b(2,1)+X(i,7)*b(7,1)+X(i,95)*b(95,1)+X(i,172)*b(172,1)+ ...
X(i,290)*b(290,1)+X(i,333)*b(333,1)+X(i,471)*b(471,1)+X(i,560)*b(560,1)+ ...
X(i,680)*b(680,1)+X(i,890)*b(890,1);
end
to create the output values.
But the output values are the same as the Y values.
Did I make a mistake somewhere, or is the code correct?
Thanks a lot.

Jelle on 13 Feb 2018
If you have more predictors in your regression than you have values to predict, you will get a model that predicts those values perfectly.
In other words: if you have 10 measurements and 10 variables that could be related to these measurements, you can fit a model that perfectly explains the variation in the measured values. This is called overfitting. In practice, often only 1 or 2 variables really explain what is going on; the rest are just 'filling the gaps'.
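To illustrate the point, here is a minimal sketch on made-up random data of the same shape as in the question. It uses the base MATLAB backslash operator as a stand-in for `regress` (an assumption, chosen so the sketch does not need the Statistics Toolbox); for an underdetermined system it returns a solution with at most rank(X) nonzero coefficients, much like the 10 nonzero entries reported in the question. All variable names and data here are illustrative.

```matlab
rng(0);                        % fixed seed so the run is repeatable
n = 10;  p = 1000;             % same shape as in the question
X = randn(n, p);               % made-up predictors (pure noise)
Y = randn(n, 1);               % made-up targets, unrelated to X

b = X \ Y;                     % basic least-squares solution (stand-in for regress)
s = find(b ~= 0);              % indices of the nonzero coefficients
Z = X(:, s) * b(s);            % vectorised form of the loop in the question

maxAbsErr = max(abs(Z - Y));   % in-sample error of the fitted values

% Apply the same coefficients to fresh, equally random data
Xnew = randn(n, p);
Ynew = randn(n, 1);
newErr = max(abs(Xnew(:, s) * b(s) - Ynew));

fprintf('nonzero coefficients: %d\n', numel(s));
fprintf('in-sample max error:  %g\n', maxAbsErr);
fprintf('new-data max error:   %g\n', newErr);
```

Even though Y is pure noise here, Z should match Y to within rounding error, while the same coefficients should do poorly on the fresh data. A perfect match on the data used for fitting therefore says nothing about whether the model is useful.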
I would recommend looking at stepwise regression if you do not know which of the 1000 predictors in X are the ones you need. Then check the p-value for each predictor to see how likely it is that the predictor is just a random variable.
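As a rough sketch of what that could look like, the example below runs `stepwisefit` (Statistics and Machine Learning Toolbox) on synthetic data. The data, the choice of columns 3 and 7 as the "real" predictors, and the 0.05/0.10 thresholds are all assumptions made for illustration, not values from the question.

```matlab
rng(1);                                   % fixed seed so the run is repeatable
n = 100;  p = 20;                         % hypothetical sizes: more rows than columns
X = randn(n, p);                          % made-up predictors
y = 2*X(:,3) - 3*X(:,7) + 0.5*randn(n,1); % only columns 3 and 7 carry signal

% Add or remove terms one at a time based on their p-values
[b, se, pval, inmodel] = stepwisefit(X, y, ...
    'penter', 0.05, 'premove', 0.10, 'display', 'off');

selected = find(inmodel);                 % columns kept in the final model
disp(selected);
disp(pval(selected).');                   % p-values of the kept predictors
```

With only 10 observations and 1000 candidate predictors, as in the question, some predictors can still pass the p-value threshold purely by chance, so the selected set should be treated with caution.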
##### 2 Comments
Jelle on 13 Feb 2018
I cannot help you with the actual implementation: I have not used the stepwise model in MATLAB. It is documented here and seems straightforward, though: https://au.mathworks.com/help/stats/stepwisefit.html
Instead, I built a stepwise fitter myself, as my goal was not the best-fitting regression but an estimate of the reliability of the fit (https://au.mathworks.com/matlabcentral/answers/379468-vectorizing-a-repeated-regression).