Thread Subject: Polyfit

Subject: Polyfit

From: Karan

Date: 1 Nov, 2007 02:27:26

Message: 1 of 13

How to force y intercept to be zero using function PolyFit
(x, y, 1)

Subject: Polyfit

From: Daniel Sutoyo

Date: 1 Nov, 2007 07:11:10

Message: 2 of 13

Check this post

http://www.mathworks.com/support/solutions/data/1-12BBUC.html?product=OP&solution=1-12BBUC

Daniel Sutoyo

"Karan " <ksumbaly@hotmail.com> wrote in message
<fgbdie$5mp$1@fred.mathworks.com>...
> How to force y intercept to be zero using function PolyFit
> (x, y, 1)

Subject: Polyfit

From: John D'Errico

Date: 1 Nov, 2007 07:32:39

Message: 3 of 13

"Daniel Sutoyo" <dsutoyo@gmail.com> wrote in message <fgbu6e$bve
$1@fred.mathworks.com>...
> Check this post
>
> http://www.mathworks.com/support/solutions/data/1-12BBUC.html?
product=OP&solution=1-12BBUC
>
> Daniel Sutoyo
>
> "Karan " <ksumbaly@hotmail.com> wrote in message
> <fgbdie$5mp$1@fred.mathworks.com>...
> > How to force y intercept to be zero using function PolyFit
> > (x, y, 1)

EXCEPT!

The link posted suggests the user must use
fmincon from the optimization toolbox! This
is the Matlab equivalent of using a Mack truck
to carry a pea to Boston. No toolboxes are even
remotely necessary to solve this problem.

In the simple case of fitting a straight line with
a zero intercept, the constant term will be zero
in the model. So we can solve for the slope of
the fitted line by simply

  slope = x\y;

where x and y are column vectors. If you wish
it to be a polynomial as polyfit would return,
just do this:

  P = [slope,0];

Even in the far more general case, where the
asker might have wanted to fit a higher order
polynomial through some general point, the
use of fmincon is still not necessary. There
I would have suggested using lsqlin from the
optimization toolbox, or even one of my own
tools from the file exchange. The use of
fmincon is wild overkill for this problem.

John

Subject: Polyfit

From: Phil Wallhead

Date: 20 Dec, 2007 14:44:56

Message: 4 of 13


John's method is quick but dirty in the sense that it will
weight the fit preferentially towards low x values. This
is because you have scaled the errors in the data by x, so
that the homoscedasticity assumption (constant error
variance) does not apply if it applies to the original
data (polyfit uses least squares estimation which makes
use the Guass-Markov assumptions). To avoid this why not
try the good-old general formula for multiple linear
regression (see e.g. Wikipedia):

beta = inv(X'*X)*X'*y

where X is the matrix of n data by p explanatory function
values (=x for the simple slope fit), y is your data
vector, and beta is the vector of p least-squares
parameter estimates (=slope for slope fit).

Good luck!
Phil



"John D'Errico" <woodchips@rochester.rr.com> wrote in
message <fgbven$rf3$1@fred.mathworks.com>...
> "Daniel Sutoyo" <dsutoyo@gmail.com> wrote in message
<fgbu6e$bve
> $1@fred.mathworks.com>...
> > Check this post
> >
> > http://www.mathworks.com/support/solutions/data/1-
12BBUC.html?
> product=OP&solution=1-12BBUC
> >
> > Daniel Sutoyo
> >
> > "Karan " <ksumbaly@hotmail.com> wrote in message
> > <fgbdie$5mp$1@fred.mathworks.com>...
> > > How to force y intercept to be zero using function
PolyFit
> > > (x, y, 1)
>
> EXCEPT!
>
> The link posted suggests the user must use
> fmincon from the optimization toolbox! This
> is the Matlab equivalent of using a Mack truck
> to carry a pea to Boston. No toolboxes are even
> remotely necessary to solve this problem.
>
> In the simple case of fitting a straight line with
> a zero intercept, the constant term will be zero
> in the model. So we can solve for the slope of
> the fitted line by simply
>
> slope = x\y;
>
> where x and y are column vectors. If you wish
> it to be a polynomial as polyfit would return,
> just do this:
>
> P = [slope,0];
>
> Even in the far more general case, where the
> asker might have wanted to fit a higher order
> polynomial through some general point, the
> use of fmincon is still not necessary. There
> I would have suggested using lsqlin from the
> optimization toolbox, or even one of my own
> tools from the file exchange. The use of
> fmincon is wild overkill for this problem.
>
> John

Subject: Polyfit

From: John D'Errico

Date: 20 Dec, 2007 15:16:03

Message: 5 of 13

"Phil Wallhead" <pjw5@noc.soton.ac.uk> wrote in message
<fkdv58$psb$1@fred.mathworks.com>...
>
> John's method is quick but dirty in the sense that it will
> weight the fit preferentially towards low x values.

WRONG!!!!!!!

The method that I suggested is to use

x\y

This computes the linear regression both
accurately and more efficiently than does
the normal equations method that you
later recommend.

You are perhaps thinking of a transformation
to linearity using a log transformation in
your response, but x\y does nothing of the
sort.


> This
> is because you have scaled the errors in the data by x, so
> that the homoscedasticity assumption (constant error
> variance) does not apply if it applies to the original
> data (polyfit uses least squares estimation which makes
> use the Guass-Markov assumptions).

WRONG WRONG WRONG!

x\y does nothing of the sort!


> To avoid this why not
> try the good-old general formula for multiple linear
> regression (see e.g. Wikipedia):
>
> beta = inv(X'*X)*X'*y

BAD!!!!!!!

This is a very POOR way to implement this
computation. Please get your mathematics
right if you are going to correct them.

Why is this bad? It uses a matrix inverse,
a less efficient way to solve the system.

It also squares the condition number of your
system, a bad thing to do for accuracy.

 
> where X is the matrix of n data by p explanatory function
> values (=x for the simple slope fit), y is your data
> vector, and beta is the vector of p least-squares
> parameter estimates (=slope for slope fit).

Please get your numerical analysis right.

Just because you find something on
Wikipedia, does NOT mean that it is any
good.

John

Subject: Polyfit

From: Bruno Luong

Date: 20 Dec, 2007 19:11:35

Message: 6 of 13

"Phil Wallhead" <pjw5@noc.soton.ac.uk> wrote in message
<fkdv58$psb$1@fred.mathworks.com>...
>
> John's method is quick but dirty

Quick but certainly NOT dirty. It gives a least-square
estimation without "weighting" effect as you wrote Phil.
Please revise your algebra.

Bruno

Subject: Polyfit

From: Phil Wallhead

Date: 5 Jan, 2008 11:32:58

Message: 7 of 13


Apologies John, I wasn't aware of this use of \ and
thought you meant: beta = polyfit(x,x.\y,n).
So, for a polynomial regression:

beta = X\y

is the numerically hygeinic solution in my notation.

Phil


"John D'Errico" <woodchips@rochester.rr.com> wrote in
message <fke0vj$miq$1@fred.mathworks.com>...
> "Phil Wallhead" <pjw5@noc.soton.ac.uk> wrote in message
> <fkdv58$psb$1@fred.mathworks.com>...
> >
> > John's method is quick but dirty in the sense that it
will
> > weight the fit preferentially towards low x values.
>
> WRONG!!!!!!!
>
> The method that I suggested is to use
>
> x\y
>
> This computes the linear regression both
> accurately and more efficiently than does
> the normal equations method that you
> later recommend.
>
> You are perhaps thinking of a transformation
> to linearity using a log transformation in
> your response, but x\y does nothing of the
> sort.
>
>
> > This
> > is because you have scaled the errors in the data by
x, so
> > that the homoscedasticity assumption (constant error
> > variance) does not apply if it applies to the original
> > data (polyfit uses least squares estimation which
makes
> > use the Guass-Markov assumptions).
>
> WRONG WRONG WRONG!
>
> x\y does nothing of the sort!
>
>
> > To avoid this why not
> > try the good-old general formula for multiple linear
> > regression (see e.g. Wikipedia):
> >
> > beta = inv(X'*X)*X'*y
>
> BAD!!!!!!!
>
> This is a very POOR way to implement this
> computation. Please get your mathematics
> right if you are going to correct them.
>
> Why is this bad? It uses a matrix inverse,
> a less efficient way to solve the system.
>
> It also squares the condition number of your
> system, a bad thing to do for accuracy.
>
>
> > where X is the matrix of n data by p explanatory
function
> > values (=x for the simple slope fit), y is your data
> > vector, and beta is the vector of p least-squares
> > parameter estimates (=slope for slope fit).
>
> Please get your numerical analysis right.
>
> Just because you find something on
> Wikipedia, does NOT mean that it is any
> good.
>
> John
>
>

Subject: Polyfit

From: Tom

Date: 12 Jan, 2010 04:47:03

Message: 8 of 13

A late comer to this party apologizes if the following question is trivial, but....
S = x\y gives an n-over-n matrix when x is a vector of length n. How do I extract the scalar fit parameter?

Subject: Polyfit

From: Matt J

Date: 12 Jan, 2010 06:26:02

Message: 9 of 13

"Tom" <tomdoe.nospam@mathworks.com> wrote in message <higus7$19a$1@fred.mathworks.com>...
> A late comer to this party apologizes if the following question is trivial, but....
> S = x\y gives an n-over-n matrix when x is a vector of length n. How do I extract the scalar fit parameter?

The discussion above assumes that x and y are column vectors, whereas you clearly have them shaped as row vectors. So, what you really want is S=x(:)\y(:)

Subject: Polyfit

From: Greg Heath

Date: 12 Jan, 2010 07:36:36

Message: 10 of 13

On Jan 11, 11:47 pm, "Tom" <tomdoe.nos...@mathworks.com> wrote:
> A late comer to this party apologizes if the following question is trivial, but....
> S = x\y gives an n-over-n matrix when x is a vector of length n.

No, it gives a constant.

>How do I extract the scalar fit parameter?

The constant S is the scalar fit parameter.

clear all, clc

randn('state',4151941)
x = randn(10,1);
y = pi*x + 0.4*std(pi*x).*randn(10,1);
EVR = var(pi*x)/var(y) % 0.74167 "Explainable" Variance Ratio
figure(1),plot(x,y,'o'),hold on

% No intercept Linear Model
S = x\y % 3.1777
yhat = x*S % model
EVR = var(yhat)/var(y) % 0.75883 "Explained" Variance Ratio
plot(x,yhat,'r')
e = y-yhat % error
MSE = mse(e) % 1.9033 mean-squared error

%All intercept Constant Model
y00 = mean(y) % 0.022617
e00 = y - mean(y) % error
MSE00 = mse(e00) % 8.0045

R2 = 1 - MSE/MSE00 % R^2 = 0.76222 (So-called EVR)

However, does it make sense to compare a no-intercept
model with an all intercept model???

Hope this helps.

Greg

Subject: Polyfit

From: Matt J

Date: 12 Jan, 2010 08:05:21

Message: 11 of 13

Greg Heath <heath@alumni.brown.edu> wrote in message <ad61b94c-4d06-499f-be65-62ee11351cdc@m3g2000yqf.googlegroups.com>...
> On Jan 11, 11:47 pm, "Tom" <tomdoe.nos...@mathworks.com> wrote:
> > A late comer to this party apologizes if the following question is trivial, but....
> > S = x\y gives an n-over-n matrix when x is a vector of length n.
>
> No, it gives a constant.

Not if x and y are row vectors:

>> x=[1 2 3]; y=x;

>> S=x\y

S =

         0 0 0
         0 0 0
    0.3333 0.6667 1.0000

>> S=x(:)\y(:)

S =

    1.0000

Subject: Polyfit

From: Tom

Date: 12 Jan, 2010 11:26:04

Message: 12 of 13

I apologize for the confusion; thank you for the clear reply, Matt J.

Subject: Polyfit

From: Mark

Date: 29 Jun, 2011 23:14:55

Message: 13 of 13

I have always admired John D'Errico's extremely elegant and erudite solutions. This case is no different. The use of the backslash operator to solve what is essentially a linear least-squares fit problem can be generalized to find the coefficients of nearly any linear combination of terms.

In this particular case (forcing the y-intercept to zero) there is an even simpler solution. If you assume that the data [X,Y] can be fit to a polynomial P=POLYFIT(X,Y,N) which passes through the origin, then P(N+1)=0, and the polynomial can be recast as the product of X and the polynomial Q=POLYFIT(X,Y./X,N-1), that is P=CONV([1,0],Q), which is polynomial multiplication of X and the polynomial Q (n.b.: [1 0] is X written as a MATLAB polynomial).

e.g.: Say N=2, then Y=P(1)*X^2+P(2)*X+P(3), but P(3)=0 since we want the polynomial P to go through the origin, so Y=P(1)*X^2+P(2)*X, or factoring out X, Y=X*(P(1)*X+P(2)) = X*(Q(1)*X+Q(2)) = X*Z. Solve this equation for Z and Z = Y./X. Then you can find the polynomial Q that fits the data [X,Z]. Finally convolve the two polynomials back together using the polynomial [1 0] for X.

The one caveat would be if X=0 for any data, which will yield Z=NaN if Y=0 for the same ordered pair, or Z=Inf otherwise. Probably best to leave those out when fitting Q as POLYFIT will return all NaN if any data are NaN

The function below performs the tasks described above. It could be adapted to a script easily as it is only 5 commands. Even though it's not as cool and general as using the backslash operator, for this particular task using CONV seems easier to me.

function pZero = polyfitZero(x,y,n)
z = y./x;
znozero = z(x~=0);xnozero = x(x~=0);
q = polyfit(xnozero,znozero,n-1);
pZero = conv([1 0],q);
end

Happy hunting!

"John D'Errico" <woodchips@rochester.rr.com> wrote in message <fgbven$rf3$1@fred.mathworks.com>...
> "Daniel Sutoyo" <dsutoyo@gmail.com> wrote in message <fgbu6e$bve
> $1@fred.mathworks.com>...
> > Check this post
> >
> > http://www.mathworks.com/support/solutions/data/1-12BBUC.html?
> product=OP&solution=1-12BBUC
> >
> > Daniel Sutoyo
> >
> > "Karan " <ksumbaly@hotmail.com> wrote in message
> > <fgbdie$5mp$1@fred.mathworks.com>...
> > > How to force y intercept to be zero using function PolyFit
> > > (x, y, 1)
>
> EXCEPT!
>
> The link posted suggests the user must use
> fmincon from the optimization toolbox! This
> is the Matlab equivalent of using a Mack truck
> to carry a pea to Boston. No toolboxes are even
> remotely necessary to solve this problem.
>
> In the simple case of fitting a straight line with
> a zero intercept, the constant term will be zero
> in the model. So we can solve for the slope of
> the fitted line by simply
>
> slope = x\y;
>
> where x and y are column vectors. If you wish
> it to be a polynomial as polyfit would return,
> just do this:
>
> P = [slope,0];
>
> Even in the far more general case, where the
> asker might have wanted to fit a higher order
> polynomial through some general point, the
> use of fmincon is still not necessary. There
> I would have suggested using lsqlin from the
> optimization toolbox, or even one of my own
> tools from the file exchange. The use of
> fmincon is wild overkill for this problem.
>
> John

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
polyfit Mark Mikofski 29 Jun, 2011 19:19:14
yintercept Mark Mikofski 29 Jun, 2011 19:19:14
backslash Mark Mikofski 29 Jun, 2011 19:19:14
mldivide Mark Mikofski 29 Jun, 2011 19:19:14
linear algebra Mark Mikofski 29 Jun, 2011 19:19:14
leastsquares fit Mark Mikofski 29 Jun, 2011 19:19:14
rssFeed for this Thread

Contact us at files@mathworks.com