Broken stick regression and find change point

24 views (last 30 days)
Hi!
I've got two vectors, one with data on outside temperature, and the other one is heat supply for a building. I can scatter plot the vectors to get a visual of how the heat supply increases with lower temperature (Celsius). For higher temperature heat is only needed for water and therefore there is no longer a dependency between increasing temperature and lowered heat demand. As can be seen in the picture below
I would like to find the change point, (the temperature where building heating is starting to be needed) and perform a segmented regression (broken stick), one line for the temperature dependent part, and one line for the independent. Is there a way to do that in for example the curve fitting app? Or how do i write it in code?
I know about the function Findchangepts, but it only works with 1 vector, not 2, and not with a matrix.
  3 Comments
Joton
Joton on 26 Oct 2017
Edited: Joton on 26 Oct 2017
Yes, it can be interpreted that way, there could be a slight slope but that is more due to random variations in usage of warm water than physical factors. So, in the figure above, there should be a line with a negative slope between -15 (Celsius) to somewhere around 3-5(Celsius). And from there to +15 (Celsius) a line with no slope.
Joton
Joton on 26 Oct 2017
Edited: Joton on 26 Oct 2017
The important part though, is the slope before the breaking point and the position of the breaking point. As the line above the breaking point is of no larger interest, a slight slope there doesn´t matter.

Sign in to comment.

Accepted Answer

John D'Errico
John D'Errico on 26 Oct 2017
Edited: John D'Errico on 26 Oct 2017
Since you provide no data, I'll make some up.
plot(xdata,ydata,'o')
grid on
Not quite the same as yours, since yours is clearly not uniform variance. I will assume there is an unknown break location, and employ a model where the function is constant above the break, and continuous but not differentiable at the break.
Two small helper functions...
coefs = @(xbreak) [ones(numel(xdata),1),min(xdata,xbreak)-xbreak]\ydata;
errfun = @(xbreak) norm([ones(numel(xdata),1),min(xdata,xbreak)-xbreak]*coefs(xbreak) - ydata);
They assume that xdata and ydata are column vectors of the same length, residing in your workspace.
Solve for the break point:
xbreak = fminbnd(errfun,0,1)
xbreak =
0.59438654958197
Not bad, since the real function that I created had a break at 0.60.
C = coefs(xbreak)
ans =
-0.462584832269373
-1.32149120536521
Above the break point, the function is constant, at -0.4625...
Below the break point, the slope is estimated at -1.32.
So, the line below the break point can be written as
y = (x - xbreak)*C(2) + C(1)
Above the break point, it is simply
y = C(1)
  3 Comments
John D'Errico
John D'Errico on 27 Oct 2017
It is a linear regression for ALL of the values, with an optimization in the middle, to find the break location. Because the function above the break was modeled as a constant one, I could get tricky, and solve it in a non-obvious way.
Effectively, this should have been modeled as a piecewise linear spline. Had I done it that way, there would have been more terms, and it would have looked more complex. But knowledge of the constant nature above the break allowed me to simplify things greatly using the min function.
As far as plotting goes, I would plot it as three points.
plot([min(x),xbreak,max(x)],[(min(x)-xbreak)*C(2)+C(1),C(1),C(1)],'r-')
For evaluation purposes, I would just use a piecewise function. An old FEX utility exists for such piecewise function evaluation, but the simplest solution is to just use a function like this:
yfun = @(x) (min(x,xbreak) - xbreak)*C(2) + C(1);
that works for any value of x.
Finally, I would point out that the fit that you got is a little high in the constant portion of the curve. That is because if you look in the lower part of that constant segment, there is a LOT of data that tends to draw the curve up a bit.
Remember that a simple regression ONLY assumes the data has noise in the Y direction. But really, this curve looks more like a curve where you want to model as if the noise is projected to the curve orthogonally. That is a significantly more difficult problem to solve, especially when the breakpoint is unknown.
Joton
Joton on 27 Oct 2017
Thank you for good answers and explanations!
/Anton

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!