
Thread Subject:
Best fit line with constrained coefficients

Subject: Best fit line with constrained coefficients

From: Nathan

Date: 10 May, 2010 20:13:08

Message: 1 of 13

I am trying to fit a line to my data points, and while polyfit and regstats will easily fit a line, it may not be physically relevant. How do I edit these functions so they will fit a regression line with a positive slope?

If you care, here is some sample data.
x=[282.2540 285.8649 253.2350 271.8654 293.8727 293.8727 106.1968 226.1100];
y=[104.8101 116.0248 112.0172 106.1792 117.0507 64.0306 115.3988 102.3172]

And I know the line should be positive, but polyfit generates a line with negative slope.

Subject: Best fit line with constrained coefficients

From: Matt J

Date: 10 May, 2010 20:43:10

Message: 2 of 13

"Nathan " <ndn3@georgetown.edu.remove.this> wrote in message <hs9pck$618$1@fred.mathworks.com>...
> I am trying to fit a line to my data points, and while polyfit and regstats will easily fit a line, it may not be physically relevant. How do I edit these functions so they will fit a regression line with a positive slope?
>
> If you care, here is some sample data.
> x=[282.2540 285.8649 253.2350 271.8654 293.8727 293.8727 106.1968 226.1100];
> y=[104.8101 116.0248 112.0172 106.1792 117.0507 64.0306 115.3988 102.3172]
>
> And I know the line should be positive, but polyfit generates a line with negative slope.
===================


Fit using the following parametrized line

y(x) = m^2*x + b

and the objective function

f(m,b) = sum over i of ( m^2*x(i) + b - y(i) )^2

Set the gradient of f to zero and solve for m and b.
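
For example, a minimal sketch of this approach with fminsearch (assuming x and y from the original post are in the workspace):

 f = @(p) sum((p(1)^2*x + p(2) - y).^2);  % p = [m; b]
 p = fminsearch(f, [1; mean(y)]);         % crude starting guess
 slope = p(1)^2                           % nonnegative by construction
 intercept = p(2)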

Subject: Best fit line with constrained coefficients

From: Nathan

Date: 10 May, 2010 23:22:24

Message: 3 of 13

I should have been more specific. While m and b are important, I'd also like to get the MSE, p, and r values for the fit. Ideally, I was looking for a model option for regstats that let me add constraints to the coefficients. Something like regstats(X,Y,[>0]).

Subject: Best fit line with constrained coefficients

From: Roger Stafford

Date: 10 May, 2010 23:30:27

Message: 4 of 13

"Nathan " <ndn3@georgetown.edu.remove.this> wrote in message <hs9pck$618$1@fred.mathworks.com>...
> I am trying to fit a line to my data points, and while polyfit and regstats will easily fit a line, it may not be physically relevant. How do I edit these functions so they will fit a regression line with a positive slope?
>
> If you care, here is some sample data.
> x=[282.2540 285.8649 253.2350 271.8654 293.8727 293.8727 106.1968 226.1100];
> y=[104.8101 116.0248 112.0172 106.1792 117.0507 64.0306 115.3988 102.3172]
>
> And I know the line should be positive, but polyfit generates a line with negative slope.
- - - - - - - -
  Nathan, if you are using least sum of squares as the criterion for best fit, the data you have given is best fit by a line with negative slope; this is easily demonstrated mathematically. If you wish to constrain the slope to non-negative values, then the best slope in that least squares sense would be a slope of zero. That is the value you would get if you were to minimize the objective function Matt described while restricting slope m to real values.
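
A quick check with the sample data bears this out (a sketch using only core MATLAB):

 p = polyfit(x,y,1)          % p(1), the slope, is negative for this data
 m = 0; b = mean(y)          % the least-squares fit once the slope is pinned at zero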

  If these results are not in accordance with your needs, it would be necessary to define a different criterion for best fit.

Roger Stafford

Subject: Best fit line with constrained coefficients

From: Matt J

Date: 11 May, 2010 14:37:04

Message: 5 of 13

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <hsa4uj$6mv$1@fred.mathworks.com>...
> If you wish to constrain the slope to non-negative values, then the best slope in that least squares sense would be a slope of zero. That is the value you would get if you were to minimize the objective function Matt described while restricting slope m to real values.
====================

That is indeed the result I got when I performed the function minimization for the given data. It would not be the case for all data, however.

Subject: Best fit line with constrained coefficients

From: Matt J

Date: 11 May, 2010 14:50:19

Message: 6 of 13

"Nathan " <ndn3@georgetown.edu.remove.this> wrote in message <hsa4fg$7fv$1@fred.mathworks.com>...
> I should have been more specific. While m and b are important, I'd also like to get the MSE, p, and r values for the fit. Ideally, I was looking for a model option for regstats that let me add constraints to the coefficients. Something like regstats(X,Y,[>0]).
=================

Well, once you have m and b, you can certainly calculate residuals, their norm, and any other function based on the residuals that you like. If you have simulated ground truth values, then you can also compute MSE and the like by numerical simulation.
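
For example (a sketch, with m and b taken from whatever constrained fit you used):

 r   = y - (m*x + b);                 % residuals
 rss = sum(r.^2);                     % residual sum of squares
 s2  = rss/(numel(x) - 2)             % the usual noise-variance estimate for a line
 r2  = 1 - rss/sum((y - mean(y)).^2)  % coefficient of determination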

The reason that you're not going to find a turn-the-crank method/code for doing error analysis is that, with the slope constrained positive, the parameter estimator is no longer linear/unbiased. The statistical distribution of the estimate is therefore much harder to calculate analytically.
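
That said, if you are willing to posit a ground truth, a Monte Carlo sketch along these lines estimates the empirical MSE (m0, b0, and sigma below are hypothetical values):

 m0 = 0.05; b0 = 100; sigma = 10;          % hypothetical truth and noise level
 est = zeros(1000,2);
 for k = 1:1000
     ys = m0*x + b0 + sigma*randn(size(x));
     p  = polyfit(x,ys,1);
     if p(1) < 0, p = [0, mean(ys)]; end   % constrained solution when the slope clips
     est(k,:) = p;
 end
 mse = mean((est - repmat([m0 b0],1000,1)).^2)  % empirical MSE of [m b]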

Subject: Best fit line with constrained coefficients

From: Nathan

Date: 11 May, 2010 15:12:06

Message: 7 of 13

Is there a way to use orthogonal offsets instead of vertical offsets when doing the least squares fitting? I think that may be more of what I am envisioning.

And before it comes up, I am not interested in passing the line through the origin.

Subject: Best fit line with constrained coefficients

From: Bruno Luong

Date: 11 May, 2010 15:35:06

Message: 8 of 13

"Nathan " <ndn3@georgetown.edu.remove.this> wrote in message <hsbs46$q21$1@fred.mathworks.com>...
> Is there a way to use orthogonal offsets instead of vertical offsets when doing the least squares fitting? I think that may be more of what I am envisioning.

Such a regression is called TLS (total least squares). The standard numerical method for linear fitting uses the SVD. You could Google it.

Bruno

Subject: Best fit line with constrained coefficients

From: Matt J

Date: 11 May, 2010 16:37:04

Message: 9 of 13

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <hsbq2g$be5$1@fred.mathworks.com>...
> "Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <hsa4uj$6mv$1@fred.mathworks.com>...
> > If you wish to constrain the slope to non-negative values, then the best slope in that least squares sense would be a slope of zero. That is the value you would get if you were to minimize the objective function Matt described while restricting slope m to real values.
> ====================
>
> That is indeed the result I got when I performed the function minimization for the given data. It would not be the case for all data, however.
=============

Scratch that. I just convinced myself that it will happen for all data...

Subject: Best fit line with constrained coefficients

From: Roger Stafford

Date: 11 May, 2010 19:23:05

Message: 10 of 13

"Nathan " <ndn3@georgetown.edu.remove.this> wrote in message <hsbs46$q21$1@fred.mathworks.com>...
> Is there a way to use orthogonal offsets instead of vertical offsets when doing the least squares fitting? I think that may be more of what I am envisioning.
> .........

To Nathan:

  There is nothing difficult about finding orthogonal fits. For your data x and y you would do this:

 xm = mean(x); ym = mean(y);   % centroid of the data
 A = [x-xm; y-ym].';           % centered data, one point per row
 [U,S,V] = svd(A,0);           % economy-size SVD

Then V(:,1) is a unit vector parallel to the desired line, so its slope is

 m = V(2,1)/V(1,1)

and the line runs through the point (xm,ym).

  The assumption behind ordinary regression is that all the errors are in the y-coordinates, while these orthogonal fits assume that both x and y coordinates are equally in error. You can revise the above to weight the respective coordinates so as to obtain an optimum line in the sense of correspondingly weighted least square errors.
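
For instance, one way to do that weighting (a sketch; sx and sy below are hypothetical error scales for the two coordinates):

 sx = 1; sy = 2;                  % assumed error scales for x and y
 A = [(x-xm)/sx; (y-ym)/sy].';    % scale each coordinate before the SVD
 [U,S,V] = svd(A,0);
 m = (V(2,1)*sy)/(V(1,1)*sx)      % slope mapped back to the original coordinates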

  I seem to recall that the orthogonal-fit slope will always lie between that of the ordinary regression line on the one hand and the regression line with x and y interchanged on the other, and that the three slopes can only be equal when the original data is collinear. (I would have to brush off some mental cobwebs to prove this right now.)

To Matt J:

  When you write y(x) = m^2*x + b and remove all constraints, the optimization procedure will move toward a solution in which the partial derivatives of the objective function with respect to m and b are zero, since there is no longer a constraint barrier to stop it. If the data is such that it cannot find such a solution with m^2 > 0, in other words if the natural regression slope would be negative, then it will presumably gravitate (if its search is successful) toward m = 0, which is then the only way it can achieve a zero partial derivative with respect to m. That was the basis of my statement earlier.

Roger Stafford

Subject: Best fit line with constrained coefficients

From: Bruno Luong

Date: 11 May, 2010 19:36:04

Message: 11 of 13

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <hscaqo$g36$1@fred.mathworks.com>...

>
> I seem to recall that the orthogonal-fit slope will always lie between that of the ordinary regression line on the one hand and the regression line with x and y interchanged on the other, and that the three slopes can only be equal when the original data is collinear. (I would have to brush off some mental cobwebs to prove this right now.)

Roger, that reminds me of a series of academic papers on this by Christopher Paige, for example "Unifying Least Squares, Total Least Squares and Data Least Squares", where he showed that they all belong to the same regression with one hyper-parameter.

Bruno

Subject: Best fit line with constrained coefficients

From: Roger Stafford

Date: 11 May, 2010 20:02:06

Message: 12 of 13

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <hscbj4$49m$1@fred.mathworks.com>...
> Roger, that reminds me of a series of academic papers on this by Christopher Paige, for example "Unifying Least Squares, Total Least Squares and Data Least Squares", where he showed that they all belong to the same regression with one hyper-parameter.
>
> Bruno

  Yes, I suspect that "hyper-parameter" corresponds to the weights one uses. With the weighting all on the side of the y-coordinates you get ordinary regression; with the weights all on the side of the x-coordinates you obtain regression with the coordinates reversed; and with equal weights you find the orthogonal best fit, and similarly for any weights in between. As you vary the weighting, the slope changes monotonically and continuously between the two extreme regression values.
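
A quick sweep over such a weighting with the sample data illustrates this (a sketch; s plays the role of the hyper-parameter):

 xm = mean(x); ym = mean(y);
 for s = [0.01 0.1 1 10 100]      % s ~ assumed y-error relative to x-error
     A = [x-xm; (y-ym)/s].';
     [U,S,V] = svd(A,0);
     fprintf('s = %6.2f   slope = %8.4f\n', s, V(2,1)*s/V(1,1))
 end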

Roger Stafford

Subject: Best fit line with constrained coefficients

From: Matt J

Date: 11 May, 2010 20:49:04

Message: 13 of 13

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <hscaqo$g36$1@fred.mathworks.com>...

> When you write y(x) = m^2*x + b and remove all constraints, the optimization procedure will move toward a solution in which the partial derivatives of the objective function with respect to m and b are zero, since there is no longer a constraint barrier to stop it. If the data is such that it cannot find such a solution with m^2 > 0, in other words if the natural regression slope would be negative, then it will presumably gravitate (if its search is successful) toward m = 0, which is then the only way it can achieve a zero partial derivative with respect to m. That was the basis of my statement earlier.
===================

Yes, Roger, that's true. I was thinking of the case where both m and b are constrained, e.g., we are solving

min ||m*x + b - y||^2

subject to m >= 0 and b >= 0

Once you add constraints on both variables, it is no longer trivial to predict how the constraints will affect the sign of the minimizing m (or b).
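
For the record, that version is easy to solve numerically, e.g. with lsqlin from the Optimization Toolbox (a sketch, assuming that toolbox is available):

 C = [x(:) ones(numel(x),1)];                     % design matrix for p = [m; b]
 p = lsqlin(C, y(:), [], [], [], [], [0;0], []);  % lower bounds m >= 0, b >= 0
 m = p(1), b = p(2)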
