Linear Regression with Y as your Dependent Variable

Question

Tommaso Costantini on 22 Feb 2018


https://www.mathworks.com/matlabcentral/answers/384212-linear-regression-with-y-as-your-dependent-variable

Edited: Star Strider on 26 Feb 2018

Howdy! I have to fit a non-linear curve with y as the dependent variable, to get a regression for later steps in a problem. I used polyfit and polyval and it worked the first time (I'm not sure how, as the graph looked strange, but it gave the right values), but on a more highly non-linear example it breaks. How should I go about doing this?

I thought about inputting the y values as x and vice versa, but this introduces problems later on. I've attached the portion of the code that finds the regressions. Please excuse the comments, it's how I take notes as I think.

xD=0.85; xF=0.5; xB=0.03; degree=9; R=1.5;
% Rectifying (enriching) operating line: y = a*x + b, where
a=R/(R+1); b=xD/(R+1);
yF=a*xF+b; % This appears to be the intersection of the line with the q line
xe=[0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye=[0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];
% Data points as described in the question
pp1=polyfit(ye,xe,degree); % This is where the problem starts
y=0:0.01:1;
pp=polyfit(xe,ye,degree);
range=0:0.01:1; % This defines the range that we want the function to go over
y=polyval(pp,range);
plot(range,y,[0 1],[0 1],[xB xF],[xB yF],[xD xF],[xD yF],[xF xF],[xF yF],'--')
pE=polyfit([xD xF],[xD yF],1);

EDIT: I found that interp1 seems to work, but I'll leave this open for a bit if anyone has a more elegant solution.

2 Comments

Are Mjaavatten on 22 Feb 2018


plot(xe,ye,'o',range,y)

shows that pp gives a decent fit to the data, so it seems to me that you have solved the task.

The high polynomial degree means that extrapolating outside the given interval will not be meaningful, but that may not be relevant.

But I am obviously missing something, since I fail to see the idea behind the first three lines in your code and the resulting straight lines in the plot. Please explain!

Tommaso Costantini on 26 Feb 2018

The first three lines are used later; I apologize for the confusion. They are used at a later point to create the additional lines that bound the data (the stripping and enriching lines) in the McCabe-Thiele method. I used the higher degree polynomial because it fit the steps created later in the program, although you are correct that it does work for plotting the line.


Answer 1

Star Strider on 22 Feb 2018


https://www.mathworks.com/matlabcentral/answers/384212-linear-regression-with-y-as-your-dependent-variable#answer_306526

Do you actually need to fit a curve to your data? Consider using the interp1 function if you only need to read values off the curve.
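A minimal sketch of such a lookup, using the xe/ye data from the question. The names ytarget and xq are illustrative only, and the default linear method is an assumption; 'pchip' or 'spline' can be passed as a fourth argument for a smoother curve.

```matlab
xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

ytarget = 0.75;                  % illustrative value of y to look up
% Swapping the roles of xe and ye is valid here because ye is strictly increasing.
xq = interp1(ye, xe, ytarget);   % linear interpolation between the two bracketing points
```

With the default linear method this is just a weighted average of the two neighbouring data points, so no coefficients need to be fitted at all.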

2 Comments

Tommaso Costantini on 22 Feb 2018

Ideally yes, but I found a way around it. interp1 is working quite well, thank you so much!

Star Strider on 22 Feb 2018

Edited: Star Strider on 26 Feb 2018

As always, my pleasure!

ADDENDUM —

With respect to the McCabe-Thiele method, see the File Exchange contribution "McCabe-Thiele Method for an Ideal Binary Mixture".


Answer 2

John D'Errico on 22 Feb 2018


https://www.mathworks.com/matlabcentral/answers/384212-linear-regression-with-y-as-your-dependent-variable#answer_306538

Edited: John D'Errico on 22 Feb 2018


xe=[0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye=[0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

Let's look at your data.

plot(xe,ye,'o')

Now, one thing you need to understand about polynomials (and thus polyfit) is that they abhor a singularity. What do you have at x==0? It's pretty much a singularity (here, a point of nearly infinite slope). Polynomial models simply don't have points with an essentially infinite slope. To get any kind of fit there you would need a very high degree polynomial model, and high order polynomial models are a BAD idea.

If you think about it, even things like Taylor series (POLYNOMIALS!) represent functions with singularities very poorly. What you always see are massive convergence problems. As I said before, polynomials abhor a singularity.
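One way to see this on the data above is to compare the residuals of a degree-9 polyfit (the degree used in the question) against an interpolant that passes through every point. This is only a sketch: the names resPoly and resInterp are illustrative, and the use of pchip (an interp1 option) for the comparison is an assumption, not something the question used.

```matlab
xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

% Degree-9 polynomial; the [p,S,mu] form centers and scales x to improve conditioning.
[p, S, mu] = polyfit(xe, ye, 9);
resPoly = ye - polyval(p, xe, S, mu);

% A piecewise-cubic interpolant reproduces the data points themselves.
resInterp = ye - interp1(xe, ye, xe, 'pchip');

fprintf('max |residual|, degree-9 polyfit : %g\n', max(abs(resPoly)));
fprintf('max |residual|, pchip interpolant: %g\n', max(abs(resInterp)));
```

The residuals at the data points say nothing about wiggles between them, so plotting the polynomial on a fine grid is still worth doing.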

What can you do here? There are several entirely valid approaches. It would depend on what you will do with the model, and what other datasets that you will encounter look like.

You apparently have very little noise in the data. It seems quite smooth and well-behaved. If that point at x==0 is NOT expected to be a point of infinite slope, then a simple spline will usually be adequate. It is sometimes dangerous to use the spline option in interp1 for that, though. A traditional spline will often be poor on functions like this, because a spline is itself made of polynomials, and polynomials don't like points of infinite slope. You get lucky here, because at x==0 things seem not quite that bad.

xeint = linspace(0,1,500);
yeint = interp1(xe,ye,xeint,'spline');
plot(xe,ye,'ro',xeint,yeint,'b-')

So the spline interpolant actually did ok there. In some cases, a pchip interpolant (essentially just a different kind of spline that can be found as an option in interp1) would have been necessary.
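For comparison, a sketch of the pchip variant on the same data. The names ySpline, yPchip and isMonotone are illustrative. Because pchip is shape-preserving, a monotone data set gives a monotone curve, which the last line checks on the evaluation grid.

```matlab
xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

xeint   = linspace(0, 1, 500);
ySpline = interp1(xe, ye, xeint, 'spline');   % not-a-knot cubic spline, as above
yPchip  = interp1(xe, ye, xeint, 'pchip');    % shape-preserving piecewise cubic

% The largest gap between the two curves shows how much the choice matters here.
maxGap = max(abs(ySpline - yPchip));

% pchip cannot overshoot monotone data, so the curve never decreases.
isMonotone = all(diff(yPchip) >= 0);
```

Plotting both with plot(xe,ye,'ro',xeint,ySpline,'b-',xeint,yPchip,'k--') makes any difference near x==0 visible.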

As I said, we got lucky here. I entirely expected to see oscillations (extraneous bumps and wiggles in the curve between the data points) in the spline fit. That would have been indicative that a spline model was inadequate, a poor choice. But it seems to work entirely well on the data I see here.

2 Comments

Tommaso Costantini on 26 Feb 2018

Thank you so much for your input, I really enjoyed the insight gained from it! Basically, I can easily fit the data but I wanted to have the code give me the coefficients of the line as a function of y (x(y)=equation). In doing so, I could solve for the intersection of a horizontal line at a fixed y to make steps using the McCabe-Thiele method for calculating theoretical plates.

The idea is to go from the 45 degree line (y=x) to the data points, then down to the stripping/enriching line depending on where I'm at. As is, I'm experimenting more with the interpolation since it doesn't quite give perfect intersections and has a couple other errors in special cases. Thanks again for your time, and I apologize about the slow reply.
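The stepping described here could be sketched roughly as follows. This is not the poster's program: it assumes a vertical q-line at xF (as the plot in the question draws it), uses a pchip interpolant for x(y), and the names xOfY, s, nStages and the cap of 100 iterations are all hypothetical.

```matlab
xD = 0.85; xF = 0.5; xB = 0.03; R = 1.5;
a = R/(R+1);  b = xD/(R+1);       % enriching line: y = a*x + b
yF = a*xF + b;                    % where the enriching line meets the (assumed vertical) q-line
s  = (yF - xB)/(xF - xB);         % slope of the stripping line through (xB,xB) and (xF,yF)

xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];
xOfY = @(y) interp1(ye, xe, y, 'pchip');   % inverse equilibrium curve x(y)

x = xD;  y = xD;  nStages = 0;    % start on the 45 degree line at the distillate
while x > xB && nStages < 100     % the cap only guards against an endless loop
    x = xOfY(y);                  % horizontal step to the equilibrium curve
    nStages = nStages + 1;
    if x > xF
        y = a*x + b;              % vertical step down to the enriching line
    else
        y = xB + s*(x - xB);      % vertical step down to the stripping line
    end
end
fprintf('Stepped off %d stages\n', nStages);
```

Because x(y) is evaluated directly, each horizontal step lands exactly on the interpolated equilibrium curve, with no intersection solve needed.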

John D'Errico on 26 Feb 2018

Edited: John D'Errico on 26 Feb 2018


I'd need to see the special cases where you found problems with a spline to know how to fix things.

But solving the problem where you flip the relationship between x and y is simple enough. There are two good solutions available. One is trivial: just flip x and y. Since your curve is smooth and monotonic, this works directly.

xe=[0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye=[0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];
x_y = spline(ye,xe);
ytarget = 0.75;
fnval(x_y,ytarget) % fnval needs the Curve Fitting Toolbox; ppval(x_y,ytarget) is the base MATLAB equivalent
ans =
     0.4147

So that is the value of xe that yields a ye of 0.75.

Or, you could have left the relationship in the form ye(xe), and then just used a solver to find the location of interest.
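Since the original goal was to have coefficients for x as a function of y, it may help to note that the piecewise polynomial returned by spline already holds them, one cubic per interval. A sketch using unmkpp; the names breaks, coefs, k and xManual are illustrative only.

```matlab
xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

x_y = spline(ye, xe);                 % x as a piecewise cubic in y
[breaks, coefs] = unmkpp(x_y);        % coefs(k,:) applies on [breaks(k), breaks(k+1)]

ytarget = 0.75;
k = find(breaks <= ytarget, 1, 'last');   % interval containing ytarget
k = min(k, size(coefs, 1));               % guard for ytarget equal to the last break

% Each row is a cubic in the local variable (y - breaks(k)), highest power first.
xManual = polyval(coefs(k,:), ytarget - breaks(k));
```

So there is no single global equation, but each interval has its own explicit cubic that can be written out if needed.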

y_x = spline(xe,ye);
xtarget = fzero(@(x) ppval(y_x,x) - ytarget,[0 1])
xtarget =
      0.41473

Were you using my SLM toolbox, I provide a solver in there that would work.

slmsolve(y_x,ytarget)
ans =
      0.41473

But there is absolutely no reason to need it here. fzero is entirely sufficient.
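If several y values need to be inverted, the same fzero call can be wrapped in arrayfun. This is a sketch only: the target values and the names ytargets, xAtY and roundTrip are made up for illustration.

```matlab
xe = [0 .02 .05 .1 .2 .3 .4 .5 .6 .7 .8 .9 .94 .96 .98 1];
ye = [0 .192 .377 .527 .656 .713 .746 .771 .794 .822 .858 .912 .942 .959 .978 1];

y_x = spline(xe, ye);                 % forward relationship ye(xe)
ytargets = [0.3 0.5 0.75 0.9];        % illustrative y values to invert

% [0 1] brackets a root for every target, since the curve runs from 0 at x=0 to 1 at x=1.
xAtY = arrayfun(@(yt) fzero(@(x) ppval(y_x, x) - yt, [0 1]), ytargets);

% Sanity check: pushing the solutions back through the spline should recover the targets.
roundTrip = ppval(y_x, xAtY);
```

The round trip is a cheap way to catch a case where the bracket is wrong or the curve is not monotonic.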

Now, it is possible that one reason you indicated some problems is the data may not always be so well-behaved. That is something I cannot know, because all I have seen is one relationship that is well-behaved.

In some cases, I might recommend use of my SLM toolbox to build a model, in one of the directions I showed above. But I really cannot say, since a simple spline (or even interp1) is entirely adequate here, in either direction.

