polyfit doesn't fit the data

>> mdl=fitlm(X,Y)
mdl =
Linear regression model:
y ~ 1 + x1
Estimated Coefficients:
                   Estimate      SE        tStat       pValue
                   ________    _______    _______    __________
    (Intercept)     16.325      1.8303     8.9194     1.054e-16
    x1             -1.1809     0.14751    -8.0059    4.5928e-14
Number of observations: 250, Error degrees of freedom: 248
Root Mean Squared Error: 1.19
R-squared: 0.205, Adjusted R-Squared 0.202
F-statistic vs. constant model: 64.1, p-value = 4.59e-14
>> plot(X,Y,'.')
hold on
plot(X,polyval(p,X),'r.')
hold on
f=16.3250-1.1809*X;
plot(X,f,'.g')
Attached figure:
in blue: data
in red: polyfit regression
in green: fitlm model
polyfit does not fit the data, whereas fitlm does. Is there anything I can do to fix that? I would rather not use fitlm, because I have to run thousands of regressions and it seems more complex and uses more memory.
  5 Comments
John D'Errico on 19 May 2015
Usually, when someone says something like this, they did not really do what they think they did. The fact is, polyfit WILL generate the same model as does fitlm. Here for example, you don't show where you did the polyfit call. Did you use the same data? What were the coefficients produced by polyfit?
florence briton on 19 May 2015
Edited: Matt J on 19 May 2015
Yes, I called polyfit on the same data. I just did it again (the variables have different names, but they hold the same data):
>> [p,s,mu]=polyfit(logX,logY,1);
>> plot(logX,logY,'.')
hold on
plot(logX,polyval(p,logX),'r.')
polyfit produces:
p=[-0.6039,1.6844]
mu=[12.3973;0.5114]
s.R=[15.7797,-5.7510e-14;0,-15.8114]
s.df=248
s.normr=18.7448


Accepted Answer

Titus Edelhofer on 19 May 2015
Hi,
I think the problem lies elsewhere. Note that if you call polyfit with
[p,s,mu] = polyfit ...
you get the coefficients of the polynomial in the centered and scaled variable. To evaluate it, you therefore need mu to transform logX into (logX - mu(1))/mu(2) (see the polyfit documentation).
If you call polyfit with only the output p, you get coefficients in the original variable, and these coincide with the fitlm estimates.
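A minimal sketch of this point, using made-up data (xd, yd and the noise level are assumptions for illustration, not the poster's data). When polyfit is called with three outputs, passing mu back to polyval makes it apply the same centering and scaling, so both calling styles should describe the same line:

```matlab
% Synthetic data, for illustration only (not the data from the question)
rng(0);
xd = 10 + 5*rand(250,1);
yd = 16 - 1.2*xd + randn(250,1);

% One output: the coefficients refer to xd itself
p1 = polyfit(xd, yd, 1);
y1 = polyval(p1, xd);

% Three outputs: the coefficients refer to (xd - mu(1))/mu(2)
[p3, S, mu] = polyfit(xd, yd, 1);
y3 = polyval(p3, xd, [], mu);    % polyval centers and scales xd using mu

% Both calls give the same fitted values, up to rounding error
maxDiff = max(abs(y1 - y3));

% The mistake in the question: evaluating p3 on the raw xd
yWrong = polyval(p3, xd);        % this is not the fitted line
```

Plotting yWrong against xd would reproduce the kind of mismatch shown in the question's figure, while y1 and y3 overlap.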
Titus
  2 Comments
florence briton on 19 May 2015
Thank you for your answer Titus, I didn't notice that point
Geoff on 21 May 2018
I realize this post is now three years old, but as of R2018a the help text for polyfit is still unclear.
In the help, p is described as the coefficients for the unscaled and uncentered data. The description of the [p,S,mu] syntax, however, does not mention that the returned p now refers to the scaled and centered data.
Perhaps MathWorks should change the description to use [phat,S,mu] rather than p, or automatically transform the polynomial coefficients back to the original space when scaling and centering are used.


More Answers (3)

John D'Errico on 19 May 2015
Edited: John D'Errico on 19 May 2015
GIVE US A BREAK!
You used polyfit on the log of the data! But you used fitlm on the unlogged data! Here are the calls you yourself showed:
mdl=fitlm(X,Y)
[p,s,mu]=polyfit(logX,logY,1);
I've just copied what you yourself typed. Don't tell us that the variables are the same, but they just have a different name. Show us what happens when you use polyfit like this:
p = polyfit(X,Y,1)
Please tell us why we should be surprised that there is a difference. If you change the data, then expect to get a different answer.
Next, READ THE HELP! From the help for polyfit, it tells us that when you call it with THREE output arguments, it performs a centered and scaled regression.
[P,S,MU] = polyfit(X,Y,N) finds the coefficients of a polynomial in
XHAT = (X-MU(1))/MU(2) where MU(1) = MEAN(X) and MU(2) = STD(X). This
centering and scaling transformation improves the numerical properties
of both the polynomial and the fitting algorithm.
So in order to predict the result, you need to use the centered and scaled variable.
XHAT = (X-MU(1))/MU(2);
where MU(1) = MEAN(X) and MU(2) = STD(X). For example...
X = 10 + 100*rand(10,1);
Y = rand(size(X));
p = polyfit(X,Y,1)
p =
-0.0047359 0.94733
[p,S,mu] = polyfit(X,Y,1)
p =
-0.15668 0.58731
S =
R: [2x2 double]
df: 8
normr: 0.7445
mu =
76.021
33.084
See that there IS a difference in the coefficients produced. READ THE HELP. What you actually did wrong is only for you to know, since you have not proved to us that logX and X are truly the same thing, as with logY and Y.
OK. Since you actually gave us the results from polyfit, let's try something:
p=[-0.6039,1.6844];
mu=[12.3973;0.5114];
syms X Y
xhat = (X - mu(1))/mu(2);
yhat = p(1)*xhat + p(2);
vpa(yhat,10)
ans =
16.32407436 - 1.180876027*X
AGAIN, IF you use polyfit with THREE output arguments, it produces a DIFFERENT model. You can recover the untransformed model as I did, but if you can't bother to read the help, what do you expect?
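The same recovery can be sketched numerically, without the Symbolic Math Toolbox. For a degree-1 fit, expanding p(1)*(x - mu(1))/mu(2) + p(2) in powers of x gives the slope and intercept directly. The p and mu values are the ones posted in the question; the names slope and intercept are introduced here only for illustration:

```matlab
% Values reported by the poster from [p,s,mu] = polyfit(logX,logY,1)
p  = [-0.6039, 1.6844];
mu = [12.3973; 0.5114];

% yhat = p(1)*(x - mu(1))/mu(2) + p(2), rewritten as slope*x + intercept
slope     = p(1)/mu(2);
intercept = p(2) - p(1)*mu(1)/mu(2);

fprintf('slope = %.4f, intercept = %.4f\n', slope, intercept);
% prints: slope = -1.1809, intercept = 16.3241
```

These agree with the fitlm estimates (-1.1809 and 16.325) to within the rounding of the posted p and mu. For higher-degree polynomials the expansion has more terms, so this two-line shortcut applies only to the linear case.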

florence briton on 19 May 2015
No, no, I only changed the variable names, which I admit is confusing. I am sorry. Here is the whole thing again:
>> [p,s,mu]=polyfit(logX,logY,1);
polyfit produces:
p=[-0.6039,1.6844]
mu=[12.3973;0.5114]
s.R=[15.7797,-5.7510e-14;0,-15.8114]
s.df=248
s.normr=18.7448
>> mdl=fitlm(logX,logY)
mdl =
Linear regression model:
y ~ 1 + x1
Estimated Coefficients:
                   Estimate      SE        tStat       pValue
                   ________    _______    _______    __________
    (Intercept)     16.325      1.8303     8.9194     1.054e-16
    x1             -1.1809     0.14751    -8.0059    4.5928e-14
Number of observations: 250, Error degrees of freedom: 248
Root Mean Squared Error: 1.19
R-squared: 0.205, Adjusted R-Squared 0.202
F-statistic vs. constant model: 64.1, p-value = 4.59e-14
>> plot(logX,logY,'.')
hold on
plot(logX,polyval(p,logX),'r.')
hold on
f=16.3250-1.1809*logX;
plot(logX,f,'.g')
  2 Comments
John D'Errico on 19 May 2015
Edited: John D'Errico on 19 May 2015
ARGH!
READ THE ANSWERS!
First, PROVE TO US THAT X AND logX ARE THE SAME. For example:
min(X-logX)
max(X-logX)
should both be essentially zero.
Then call polyfit using a call that WILL PRODUCE the same result!
p = polyfit(X,Y,1)
READ THE HELP FOR POLYFIT! READ THE ANSWERS. Don't just keep repeating the same nonsense. (See my edit to my answer, where I show how the coefficients that you got ARE the correct coefficients but for a different model.)
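One way to sketch the "same data" check with an explicit tolerance is shown below. The vectors are hypothetical stand-ins for the two workspace variables, and the tolerance 1e-12 is an assumption, not a value from this thread:

```matlab
% Hypothetical stand-ins for two workspace variables holding the same data
X    = log([1; 10; 100; 1000]);
logX = X;

tol = 1e-12;   % assumed tolerance for "essentially zero"

% Compare sizes first, then the largest absolute elementwise difference
sameData = isequal(size(X), size(logX)) && ...
           max(abs(X(:) - logX(:))) <= tol;
```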
florence briton on 19 May 2015
Well, you edited your answer while I was writing, so I couldn't see it... Anyway, the problem is solved: I hadn't noticed that the regression was centered and scaled.
Thank you for everything, and most of all for being so polite.



Augusto Samussone on 17 Mar 2021
slime
