Thread Subject: forcing to pass origin in linear regression

Subject: forcing to pass origin in linear regression

From: Young Ryu

Date: 12 Jul, 2008 00:53:24

Message: 1 of 11

Hi

I have two arrays:

A=[1, 2, 3, 4, 5]
B=[2, 4, 6, 7, 11]

I'd like to fit a linear regression between A and B, but I
want to force the regression line to pass through the origin.
Can you help me do this?

Thanks!

Subject: forcing to pass origin in linear regression

From: Roger Stafford

Date: 12 Jul, 2008 01:32:04

Message: 2 of 11

"Young Ryu" <ryuyr77@gmail.com> wrote in message <g58va4$fep
$1@fred.mathworks.com>...
> Hi
>
> I have two arrays:
>
> A=[1, 2, 3, 4, 5]
> B=[2, 4, 6, 7, 11]
>
> I'd like to make a linear regression between A and B, but
> want to force the regression line to pass the origin. Can
> you help me how to do this?
>
> Thanks!

  Which is to be the independent variable and which the dependent variable?
I'll assume A is the independent one. If so, to find the slope m so as to
minimize the L2 norm(B-m*A), do this:

 m = sum(A.*B)/sum(A.^2);

  Your line of regression is then b = m*a. You can demonstrate that this is
correct by setting the derivative with respect to m of the above norm to zero
and solving for m.
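
As a quick numerical check of this formula on the arrays from the original question, here is a minimal sketch. The names ssr, dssr and the step size h are introduced only for illustration. The slope works out to 111/55, and a finite-difference estimate of the derivative of the squared norm is essentially zero there:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];

% Closed-form slope of the least squares line through the origin
m = sum(A.*B)/sum(A.^2);      % equals 111/55 for this data

% Squared L2 norm of the residual as a function of the slope s
ssr = @(s) sum((B - s*A).^2);

% Central-difference estimate of the derivative at m (should be close to 0)
h = 1e-6;
dssr = (ssr(m + h) - ssr(m - h))/(2*h);
```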

Roger Stafford

Subject: forcing to pass origin in linear regression

From: Greg Heath

Date: 12 Jul, 2008 04:47:24

Message: 3 of 11

On Jul 11, 8:53 pm, "Young Ryu" <ryuy...@gmail.com> wrote:
> Hi
>
> I have two arrays:
>
> A=[1, 2, 3, 4, 5]
> B=[2, 4, 6, 7, 11]
>
> I'd like to make a linear regression between A and B, but
> want to force the regression line to pass the origin. Can
> you help me how to do this?

m = B/A % B = m*A

Hope this helps.

Greg

Subject: forcing to pass origin in linear regression

From: Per Sundqvist

Date: 12 Jul, 2008 11:04:02

Message: 4 of 11

Greg Heath <heath@alumni.brown.edu> wrote in message
<9781e69c-6905-4cec-8942-26eefecc4ca6@m36g2000hse.googlegroups.com>...
> On Jul 11, 8:53 pm, "Young Ryu" <ryuy...@gmail.com> wrote:
> > Hi
> >
> > I have two arrays:
> >
> > A=[1, 2, 3, 4, 5]
> > B=[2, 4, 6, 7, 11]
> >
> > I'd like to make a linear regression between A and B, but
> > want to force the regression line to pass the origin. Can
> > you help me how to do this?
>
> m = B/A % B = m*A
>
> Hope this helps.
>
> Greg

In this case I think the solution is simply:
k=mean(B./A)

where y=k*x is the line through the origin.

Subject: forcing to pass origin in linear regression

From: John D'Errico

Date: 12 Jul, 2008 12:02:01

Message: 5 of 11

"Per Sundqvist" <sunkan@fy.chalmers.se> wrote in message
<g5a332$kth$1@fred.mathworks.com>...
> Greg Heath <heath@alumni.brown.edu> wrote in message
> <9781e69c-6905-4cec-8942-26eefecc4ca6@m36g2000hse.googlegroups.com>...
> > On Jul 11, 8:53 pm, "Young Ryu" <ryuy...@gmail.com> wrote:
> > > Hi
> > >
> > > I have two arrays:
> > >
> > > A=[1, 2, 3, 4, 5]
> > > B=[2, 4, 6, 7, 11]
> > >
> > > I'd like to make a linear regression between A and B, but
> > > want to force the regression line to pass the origin. Can
> > > you help me how to do this?
> >
> > m = B/A % B = m*A
> >
> > Hope this helps.
> >
> > Greg
> in this case i think the solution is just simply:
> k=mean(B./A)
>
> where y=k*x is the linear line through origin.

Sorry, but this is a POOR way to solve the
problem.

What happens if any of the elements of A
is exactly zero? Yes, you get a divide by
zero. The resulting inf or NaN will be
detrimental to computing the mean.

Almost as bad is that this is NOT the least
squares solution for the problem. You will
get a weighted solution, where the weights
are inversely proportional to the value of
x at any point. So data points with small
values for x will be weighted very highly.
But, why should those particular points be
important to compute the slope?

The least squares solution to the problem

B=m*A

is just

m = A(:)\B(:);

I chose this form to be insensitive to
whether A and B are row or column
vectors.
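
A minimal sketch on the arrays from the original question illustrates both points. It uses only base MATLAB, and the names mLS, mRow, k, w, kW, ssrLS and ssrK are introduced here for illustration. Averaging the ratios gives the same slope as a weighted least squares fit whose squared residuals carry weights 1./A.^2, which is why points with small A dominate:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];

mLS  = A(:)\B(:);       % least squares slope, works for rows or columns
mRow = B/A;             % same slope when A and B are both row vectors

k  = mean(B./A);        % mean of the ratios
w  = 1./A.^2;           % weights implied by averaging the ratios
kW = sum(w.*A.*B)/sum(w.*A.^2);   % weighted least squares slope, equals k

% Sum of squared residuals for each slope
ssrLS = sum((B - mLS*A).^2);
ssrK  = sum((B - k*A).^2);        % never smaller than ssrLS
```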

HTH,
John

Subject: forcing to pass origin in linear regression

From: Greg Heath

Date: 12 Jul, 2008 13:54:46

Message: 6 of 11

On Jul 12, 7:04 am, "Per Sundqvist" <sun...@fy.chalmers.se> wrote:
> Greg Heath <he...@alumni.brown.edu> wrote in message
>
> <9781e69c-6905-4cec-8942-26eefecc4...@m36g2000hse.googlegroups.com>...
> > On Jul 11, 8:53 pm, "Young Ryu" <ryuy...@gmail.com> wrote:
> > > Hi
>
> > > I have two arrays:
>
> > > A=[1, 2, 3, 4, 5]
> > > B=[2, 4, 6, 7, 11]
>
> > > I'd like to make a linear regression between A and B, but
> > > want to force the regression line to pass the origin. Can
> > > you help me how to do this?
>
> > m = B/A     % B = m*A
>
> > Hope this helps.
>
> > Greg
>
> in this case i think the solution is just simply:
> k=mean(B./A)
>
> where y=k*x is the linear line through origin

No.

The LMSE solution is B/A. Within roundoff error,
Roger's solution is equivalent. However, your
MSE is always larger.


close all, clear all, clc
n = 30;
m = 100;
randn('state',0)
for i = 1:m
    A = (1:n);
    m0 = abs(randn);
    B0 = m0*A;
    B = B0 + 0.2*std(B0)*randn(1,n);
    m1 = sum(A.*B)/sum(A.^2);
    m2 = B/A;
    m3 = mean(B./A);
    MSE1(i,1) = mse(B-m1*A); % Roger
    MSE2(i,1) = mse(B-m2*A); % Greg
    MSE3(i,1) = mse(B-m3*A); % Per
end

D12 = MSE1 - MSE2;
D32 = MSE3 - MSE2;
E12 = max(abs(D12)) % 6.2172e-015
E32 = max(abs(D32)) % 23.0877

NLTEQ32 = sum(MSE3<=MSE2) % 0

plot(MSE2,D12,'.'), hold on
plot(MSE2,D32,'.r')

Hope this helps,

Greg

Subject: forcing to pass origin in linear regression

From: Roger Stafford

Date: 12 Jul, 2008 16:34:02

Message: 7 of 11

"Young Ryu" <ryuyr77@gmail.com> wrote in message <g58va4$fep
$1@fred.mathworks.com>...
> Hi
>
> I have two arrays:
>
> A=[1, 2, 3, 4, 5]
> B=[2, 4, 6, 7, 11]
>
> I'd like to make a linear regression between A and B, but
> want to force the regression line to pass the origin. Can
> you help me how to do this?
>
> Thanks!

  Note that there is a different solution to this problem that minimizes the
mean square orthogonal distance of A, B pairs from the line b = m*a.

 [U,S,V] = svd([A(:) B(:)],0);
 m = -V(1,2)/V(2,2);

  If the values in A and B are both subject to the same kind of error, this is a
best-fitting line in the least squares sense. However, it is not ordinarily referred
to as linear regression. This line will lie in the angle between the two
regression lines in which 1) A is the independent variable and 2) B is the
independent variable. The three lines coincide only when the values in A
and B are exactly proportional.
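
The "lies between the two regression lines" claim can be checked on the arrays from the original question. This is only a sketch; the names mBA, mAB and mOrth are introduced here for illustration, and all three slopes are expressed as db/da so they can be compared directly:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];

% Regression of B on A (A independent): b = mBA*a
mBA = sum(A.*B)/sum(A.^2);

% Regression of A on B (B independent): a = c*b, rewritten as b = (1/c)*a
mAB = sum(B.^2)/sum(A.*B);

% Orthogonal-distance fit through the origin via the SVD.
% The second column of V is normal to the fitted line.
[U,S,V] = svd([A(:) B(:)],0);
mOrth = -V(1,2)/V(2,2);

% For this data mBA <= mOrth <= mAB
inBetween = (mBA <= mOrth) && (mOrth <= mAB);
```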

Roger Stafford

Subject: forcing to pass origin in linear regression

From: Greg Heath

Date: 13 Jul, 2008 13:33:54

Message: 8 of 11

On Jul 12, 12:34 pm, "Roger Stafford"
<ellieandrogerxy...@mindspring.com.invalid> wrote:
> "Young Ryu" <ryuy...@gmail.com> wrote in message <g58va4$fep
>
> $...@fred.mathworks.com>...
>
> > Hi
>
> > I have two arrays:
>
> > A=[1, 2, 3, 4, 5]
> > B=[2, 4, 6, 7, 11]
>
> > I'd like to make a linear regression between A and B, but
> > want to force the regression line to pass the origin. Can
> > you help me how to do this?
>
> > Thanks!
>
>   Note that there is a different solution to this problem that minimizes the
> mean square orthogonal distance of A, B pairs from the line b = m*a.
>
>  [U,S,V] = svd([A(:) B(:)],0);
>  m = -V(1,2)/V(2,2);
>
>   If values in A and B are both subject to the same kind of error, it is a best-
> fitting line in the least squares sense. However, this is not ordinarily referred
> to as linear regression. This line will lie in the angle between the two
> regression lines in which 1) A is the independent variable and 2) B is the
> independent variable. These three lines are only equal in case the values in A
> and B are exactly proportional.
>
> Roger Stafford

Roger,

What are the generalizations for multiple and
multivariate regressions?

TIA,

Greg

Subject: forcing to pass origin in linear regression

From: Young Ryu

Date: 13 Jul, 2008 21:56:02

Message: 9 of 11

"John D'Errico" <woodchips@rochester.rr.com> wrote in
message <g5a6fp$gac$1@fred.mathworks.com>...
> "Per Sundqvist" <sunkan@fy.chalmers.se> wrote in message
> <g5a332$kth$1@fred.mathworks.com>...
> > Greg Heath <heath@alumni.brown.edu> wrote in message
> > <9781e69c-6905-4cec-8942-26eefecc4ca6@m36g2000hse.googlegroups.com>...
> > > On Jul 11, 8:53 pm, "Young Ryu" <ryuy...@gmail.com> wrote:
> > > > Hi
> > > >
> > > > I have two arrays:
> > > >
> > > > A=[1, 2, 3, 4, 5]
> > > > B=[2, 4, 6, 7, 11]
> > > >
> > > > I'd like to make a linear regression between A and B, but
> > > > want to force the regression line to pass the origin. Can
> > > > you help me how to do this?
> > >
> > > m = B/A % B = m*A
> > >
> > > Hope this helps.
> > >
> > > Greg
> > in this case i think the solution is just simply:
> > k=mean(B./A)
> >
> > where y=k*x is the linear line through origin.
>
> Sorry, but this is a POOR way to solve the
> problem.
>
> What happens if any of the elements of A
> is exactly zero? Yes, you get a divide by
> zero. The resulting inf or NaN will be
> detrimental to computing the mean.
>
> Almost as bad is that this is NOT the least
> squares solution for the problem. You will
> get a weighted solution, where the weights
> are inversely proportional to the value of
> x at any point. So data points with small
> values for x will be weighted very highly.
> But, why should those particular points be
> important to compute the slope?
>
> The least squares solution to the problem
>
> B=m*A
>
> is just
>
> m = A(:)\B(:);
>
> I chose this form to be insensitive to
> whether A and B are row or column
> vectors.
>
> HTH,
> John

Thanks to all of you for your replies. Two more questions:
1. How do I calculate R2 in this linear regression?
2. How do I estimate the standard error of the slope?

Subject: forcing to pass origin in linear regression

From: Roger Stafford

Date: 14 Jul, 2008 17:32:01

Message: 10 of 11

"Young Ryu" <ryuyr77@gmail.com> wrote in message <g5dtli$9vf
$1@fred.mathworks.com>...
> Thanks for you all guy's replies. Two more questions.
> 1. How to calculate R2 in this linear regression?
> 2. How to estimate standard error of the slope?

  In question 1, I am not sure how you would define R2. You have forced the
model line to go through the origin and therefore the residual sum of squares
could easily exceed the total "sum of squares", sum((B-mean(B)).^2). This
would give you a negative R2 by the definition I am familiar with. See:

 http://en.wikipedia.org/wiki/Coefficient_of_determination

However, it is easy to compute the residual sum of squares:

 sum((B-m*A).^2) = (sum(B.^2)*sum(A.^2)-sum(A.*B)^2)/sum(A.^2)
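
The identity above, and the two ways R2 might be defined, can be sketched on the arrays from the original question. The variable names are introduced here for illustration. The "uncentered" version, which compares against sum(B.^2) instead of the spread about mean(B), is an alternative convention sometimes used for models without an intercept; it is an assumption of this sketch, not something the thread settles on:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];
m = sum(A.*B)/sum(A.^2);

% Residual sum of squares, computed directly and via the closed form
ssRes  = sum((B - m*A).^2);
ssRes2 = (sum(B.^2)*sum(A.^2) - sum(A.*B)^2)/sum(A.^2);

% R2 relative to the mean of B (usual definition; can go negative
% for a line forced through the origin)
r2Centered = 1 - ssRes/sum((B - mean(B)).^2);

% R2 relative to zero (alternative convention for no-intercept models)
r2Uncentered = 1 - ssRes/sum(B.^2);
```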

  On question 2 it seems to me the error in the slope would depend very
much on your assumptions about the underlying statistics of the data. If the
data is actually generated by a true linear function running through the origin
with a normal distribution of error in B, you could conceivably compute the
slope error. Lacking such information, I see no way of estimating the error in
the slope determination. For example, if the data is actually based on points
along a circular path with gaussian noise on B, the above computation would
be completely erroneous.
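
Under exactly that assumption (a true line through the origin, with independent, equal-variance normal errors in B only), the textbook estimate of the slope's standard error can be sketched as follows. The names n, s2 and seM are introduced here for illustration, and the result is only meaningful if the assumption holds:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];
n = numel(A);

m = sum(A.*B)/sum(A.^2);
ssRes = sum((B - m*A).^2);

% Noise variance estimate; one parameter is fitted, so n-1 degrees of freedom
s2 = ssRes/(n - 1);

% Standard error of the slope for the model B = m*A + noise
seM = sqrt(s2/sum(A.^2));
```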

Roger Stafford

Subject: forcing to pass origin in linear regression

From: thomas theunissen

Date: 12 Apr, 2010 09:03:02

Message: 11 of 11

For two column vectors with A*x = B:

[x,stdx]=lscov(A(:),B(:));

That's all!
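
A short sketch with the arrays from the original question: lscov expects one row per observation, so the row vectors are passed as columns here. The comparison values mHand and seHand are illustrative names. They use the usual through-origin formulas, which should agree with the lscov outputs when no weights or covariance are supplied:

```matlab
A = [1, 2, 3, 4, 5];
B = [2, 4, 6, 7, 11];

% Slope and its estimated standard error from lscov (column inputs)
[x, stdx] = lscov(A(:), B(:));

% Hand computation for comparison
mHand  = sum(A.*B)/sum(A.^2);
seHand = sqrt(sum((B - mHand*A).^2)/(numel(A) - 1)/sum(A.^2));
```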
