line fitting between two independent line segments

I have an array of 640 values. When plotted, it is a 2D cross-sectional surface profile. I want to obtain an array that is a line fit to only the first 50 and last 50 points, with the rest of the array excluded from the fit. Essentially, I want to mask the in-between. polyfit is the closest I've come, but I cannot get it to work for non-contiguous segments of an array. Additionally, I do not want to pretend the masked values are just a straight line between points 50 and 590; I want to fit the profile as if the first 50 and last 50 points were continuous.

Accepted Answer

George:
Is this what you mean:
% Load data.
load('z_profile_x.mat')
load('z_profile_z.mat')
x = z_profile_x;
y = z_profile_z;
plot(x, y, 'b-', 'LineWidth', 3);
grid on;
% Find the flat parts
leftIndex = find(y > -74, 1, 'first')
rightIndex = find(y > -74, 1, 'last')
% Extract flat sections
xTrain = [x(1:leftIndex), x(rightIndex:end)];
yTrain = [y(1:leftIndex), y(rightIndex:end)];
% Do a linear regression fit.
coefficients = polyfit(xTrain, yTrain, 1);
% Get the yFitted at all x locations,
% not only the training locations.
yFitted = polyval(coefficients, x);
% Plot the fitted line
hold on;
plot(x, yFitted, 'r-', 'LineWidth', 3);
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
yFitted is a line that is calculated using only the two flat portions on the left and right but computed everywhere, even under the hump.
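For anyone without the attached .mat files, the same idea can be sketched on synthetic data. Everything about the data here is an assumption (a gently tilted baseline near -76 with a hump in the middle), and the flat sections are selected with a logical mask over the first and last 50 samples instead of the `find` threshold used above.

```matlab
% Synthetic stand-in for the attached profile (assumed shape: tilted
% baseline plus a central hump). This is NOT the poster's real data.
n = 640;
x = linspace(0, 315, n);
baseline = -76 + 0.002 * x;                  % assumed gentle tilt
hump = 8 * exp(-((x - 157.5) / 40) .^ 2);    % assumed central bump
y = baseline + hump;

% Logical mask that keeps only the first 50 and last 50 samples.
keep = false(1, n);
keep([1:50, n-49:n]) = true;

% Fit a line to the kept samples only, then evaluate it at every x,
% including under the hump.
coefficients = polyfit(x(keep), y(keep), 1);
yFitted = polyval(coefficients, x);

% Profile with the fitted baseline removed.
residual = y - yFitted;
```

Because the mask is applied to both x and y before calling polyfit, the middle 540 samples never influence the coefficients, while polyval still returns a value at all 640 locations.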

2 Comments

Image Analyst
Yes, that is pretty much what I am going for, thank you. I've modified it slightly though because I was getting the error that "dimensions of matrices being concatenated are not consistent" for xTrain. I also wanted to increase the degree of polyfit.
Thanks again for your help!
% Load data.
load('z_profile_x.mat')
load('z_profile_z.mat')
x = z_profile_x;
y = z_profile_z;
plot(x, y, 'b-', 'LineWidth', 3);
grid on;
% Find the flat parts
leftIndex = find(y > -74, 1, 'first')
rightIndex = find(y > -74, 1, 'last')
% Extract flat sections (x and y are column vectors here, so concatenate vertically)
xTrain = [x(1:leftIndex); x(rightIndex:end)];
yTrain = [y(1:leftIndex); y(rightIndex:end)];
% Do a quadratic (degree 2) polynomial fit.
coefficients = polyfit(xTrain, yTrain, 2);
% Get the yFitted at all x locations,
% not only the training locations.
yFitted = polyval(coefficients, x);
% Plot the fitted line
hold on;
plot(x, yFitted, 'r-', 'LineWidth', 3);
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
It worked fine for me. For some reason your x must be a column vector whereas it was a row vector for me. Anyway, glad it's working.
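The concatenation error comes from joining two column vectors of different lengths side by side with a comma. One way to make the code independent of the input orientation is to force both arrays to columns with `(:)` first. This is a minimal sketch on made-up data; the index values 50 and 591 are assumptions standing in for the `find` results.

```matlab
% Hypothetical data stored as column vectors (the case that failed).
x = (1:640).';
y = -76 + 0.001 * x;
leftIndex = 50;     % assumed end of the left flat section
rightIndex = 591;   % assumed start of the right flat section

% x(:) always returns a column, so vertical concatenation with ';'
% works whether the input was a row or a column vector.
x = x(:);
y = y(:);
xTrain = [x(1:leftIndex); x(rightIndex:end)];
yTrain = [y(1:leftIndex); y(rightIndex:end)];

% Fit on the two flat sections, evaluate everywhere.
coefficients = polyfit(xTrain, yTrain, 1);
yFitted = polyval(coefficients, x);
```

Note that yFitted comes back as a column in this version, so transpose it if the rest of your code expects a row.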


More Answers (2)

John D'Errico on 23 Dec 2015
Edited: John D'Errico on 23 Dec 2015
My SLM toolbox can do this, in the sense that it can fit a curve that is linear over the domains at each end and smoothly fit between those domains. So it would solve your entire problem at once.
Download it from the File Exchange. It does require the Optimization Toolbox, so unless you have that, don't bother trying or asking me how to use it without the Optimization Toolbox.

10 Comments

I have the Optimization Toolbox. Which .m file should I use in your toolbox?
George. I don't understand. You said you want data "that is a line fit to the first 50 and last 50 points" OK, so that's two lines, each with its own slope and offset. Then you say "I want to mask the in-between" so I thought that you wanted to mask both the training data and the fitted output data. But then you say "I want it to fit a profile as if the first 50 and last 50 points were continuous." which hints that you do actually want data for the middle/masked portion, and in fact that it should be continuous. So I'm picturing extrapolating the left and right lines, and if your data points (which you're not showing us for some strange reason) trend correctly, you could have both those lines meet in the middle. And you want the equation of the lines or the fitted points in the straight line sections up to the peak of the triangle. Is this correct? Am I visualizing now correctly what you want? You want the triangle, right?
I have attached two arrays, one to be the x-axis (z_profile_x) and the other as the y-axis values (z_profile_z). If you open the file, it begins as a somewhat straight line, there is a large hump in the middle, and then it goes back to the somewhat straight line again. What I want is a line to be fit for the beginning (first 50) and end (last 50) "straight lines" that disregards the middle hump during fitting. I cannot just make the middle 540 points a straight line because there is some curve to the points (and if there isn't, I still want a line to be fit for them). Does this make sense?
I haven't loaded your data yet (a screen shot would have been convenient in addition). So would fitting a parabola or hyperbola be sufficient? Those are relatively straight away from the vertex.
Here's the image. A parabola would be fine, the coefficients aren't expected to be large.
I inserted the picture for you with the green and brown frame icon. You might want to use that next time.
George, are you trying to find some overall offset as a function of x and subtract that from the whole signal? You said you don't want a line between 200 and 440, so what do you want? Zero? A parabola? A pair of lines like I showed? How about forgetting about all the 50 element stuff and just look at your central data between the corners at 200 and 440, and the two "linear" sections outside of that central section? Corners can be found in a variety of ways, such as via the triangle thresholding method.
An ideal output would help us help you. We can write code but don't know what code to write because we don't know what you want until you show us.
What I want is an array of points that describes a line fit to the linear portions of the profile, specifically the first and last 50 points. A parabola is fine. I want the line to pass through the linear portions without being fit to the middle "hump". I want there to be values between 200 and 440, but I want them to be in the region of -76 or so, which would make them part of the line that is fit to the linear portions. Basically, I want to polyfit the profile as if it didn't have the hump in the middle and just had a continuous surface reflecting the beginning and end.
So you want to fit a regression spline of some ilk, in the sense that it is a set of piecewise continuous functions. The end segments seem to be fairly constant from the picture, not just linear.
Note that a parabola in the center portion will exhibit some serious lack of fit.
If I had your data, instead of just a picture of it, I could show you how to use slmengine to do the fit.
Edit: Oh, I see you attached some data. I'll see what I can do then.
So, since you don't seem to really care what happens in the middle part, I just allowed it to be piecewise linear.
The first and last segments are close to constant, but not fully so. The break points are shown as vertical green lines.
slm = slmengine(z_profile_x,z_profile_z,'degree',1,'knots',[0 95:20:215 315],'plot','on');
Or, with a smoother fit, where I allow the curve to smoothly transition between behaviors:
slm = slmengine(z_profile_x,z_profile_z,'degree',3,'knots',[0:5:315],'plot','on','linearregion',[0,90;225,315],'concavedown',[140,180]);
With some more play time invested, I could probably come up with other solutions. Or, it can be done easily enough using just lsqlin, if you know how to create the proper matrices. On these things, easy is sometimes in the eye of the beholder.
Thanks for the help! It still fit to the "hump" but the problem was solved below anyway.
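For readers without the SLM toolbox, the piecewise-linear idea from the comments above can be roughly sketched with plain least squares. This is only an illustration under assumptions: it uses a hinge basis and an unconstrained backslash solve, which is a swapped-in technique and not what slmengine or lsqlin do internally, and the data are synthetic. The knots are the interior knots from the first slmengine call. Like that call, this fit follows the hump instead of ignoring it.

```matlab
% Synthetic stand-in data (assumed shape), not the attached profile.
x = linspace(0, 315, 640).';
y = -76 + 8 * exp(-((x - 157.5) / 40) .^ 2);

% Interior knots copied from the first slmengine call above.
knots = 95:20:215;

% Design matrix: intercept, slope, and one hinge max(0, x - k) per knot.
% Each hinge lets the slope change at its knot while the curve stays
% continuous there.
H = max(0, bsxfun(@minus, x, knots));
A = [ones(size(x)), x, H];

% Ordinary (unconstrained) least squares via backslash.
beta = A \ y;
yFitted = A * beta;
```

Adding shape constraints, such as forcing zero slope in the end segments, is where lsqlin or the SLM toolbox would come in.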


If it's not going to be a line in between, then you need to process as two separate fits. So just pass in those elements for each section one at a time:
% Process left section
xTrain = x(1:50);
yTrain = y(1:50);
coefficients = polyfit(xTrain, yTrain, 1);
% Get the yFitted at just the xTrain locations
yFittedLeft = polyval(coefficients, xTrain);
% Process right section
xTrain = x(end-49:end);
yTrain = y(end-49:end);
coefficients = polyfit(xTrain, yTrain, 1);
% Get the yFitted at just the xTrain locations
yFittedRight = polyval(coefficients, xTrain);
You can combine yFittedLeft and yFittedRight into a single array if you can decide what to put in the masked area in between, like zeros or whatever.
yFitted = [yFittedLeft, zeros(1, length(x)-100), yFittedRight];
Not sure you want to do that or not though.
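If zeros in the masked area are not what you want, one alternative is to fill it with NaN, which plot simply leaves blank. The sketch below is an assumption about what might be wanted, not part of the original answer. It uses made-up row-vector data and writes each fitted section into a preallocated array by index, which also avoids any row/column concatenation problems.

```matlab
% Hypothetical row-vector data standing in for the profile.
x = 1:640;
y = -76 + 0.001 * x;

% Start with NaN everywhere so the masked middle stays empty when plotted.
yFitted = nan(size(x));

% Left section: fit a line to the first 50 points and fill them in.
idxLeft = 1:50;
pLeft = polyfit(x(idxLeft), y(idxLeft), 1);
yFitted(idxLeft) = polyval(pLeft, x(idxLeft));

% Right section: fit a separate line to the last 50 points.
idxRight = numel(x)-49:numel(x);
pRight = polyfit(x(idxRight), y(idxRight), 1);
yFitted(idxRight) = polyval(pRight, x(idxRight));
```

yFitted keeps the same size and orientation as x, so it can be passed straight to plot(x, yFitted).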
