Collapse data to a single value for each value of Y

I have a set of data as shown below in the pic as well as a portion of the actual data:
12 12 12 12 11 11 11 11 11 11 11 11 10 10 10 10 10 10 10 10 10 10
I would like to do a fit but through the centre of each of the horizontal lines. How can I collapse each line to its centre value which will then allow me to use polyfit?

Accepted Answer

Star Strider on 3 Nov 2015
Your data are in a sense ‘weighted’, so fitting the centres would not be as good as simply fitting the data as they currently exist. Least squares routines should be robust to such repeated values, so you will likely get a better fit to your data if you fit them without any pre-processing.
Try it on your data as they exist. If the fit does not produce acceptable results, then pre-process it, but I doubt any pre-processing will be necessary to get good parameter estimates.
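As a minimal sketch of fitting the data as they exist: the y values below are the excerpt posted in the question, while the x values are an assumption (consecutive positions 1 to 22), since the original x data were not posted. The first-degree polynomial is also only an illustrative choice.

```matlab
% y: the excerpt posted in the question.
y = [12 12 12 12 11 11 11 11 11 11 11 11 10 10 10 10 10 10 10 10 10 10];
% x: ASSUMED positions (1, 2, ..., 22); the real x values were not posted.
x = 1:numel(y);

% Fit a straight line to every point, without any pre-processing.
% The degree (1) is an illustrative choice, not taken from the thread.
p = polyfit(x, y, 1);

% Evaluate the fitted line at the original x positions for comparison.
yFit = polyval(p, x);
```

Plotting `x, y` together with `x, yFit` should show whether the fit through the full data set is already acceptable.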
  1 Comment
John D'Errico on 3 Nov 2015
If there is uncertainty in x, then this is an errors-in-variables problem.
Ordinary least squares applied to an errors-in-variables problem is well known to yield biased estimates for the parameters.


More Answers (3)

Jan on 3 Nov 2015
Edited: Jan on 3 Nov 2015
You can use polyfit with these data also.
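% Y must be a row vector. Index holds the first index of each run of equal
% values, plus one past the end of the data as the closing boundary.
Index = [1, find(diff(Y)) + 1, length(Y) + 1];
Center = (Index(1:end-1) + Index(2:end) - 1) / 2   % centre index of each run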
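A sketch of how the run centres could feed polyfit, assuming Y is a row vector whose equal values are contiguous and that x is simply the sample position (the original x data were not posted). The names `starts`, `centerIdx`, `xCenter` and `yCenter` are illustrative. Each run is delimited here by the index of its first element, and a non-integer centre index is converted to an x value by linear interpolation.

```matlab
Y = [12 12 12 12 11 11 11 11 11 11 11 11 10 10 10 10 10 10 10 10 10 10];
X = 1:numel(Y);                      % ASSUMED x positions, not from the thread

% First index of every run, plus one past the end as a closing boundary.
starts = [1, find(diff(Y) ~= 0) + 1, numel(Y) + 1];

% Centre index of each run (a half-integer for runs of even length).
centerIdx = (starts(1:end-1) + starts(2:end) - 1) / 2;

% Convert centre indices to x values and read off the y level of each run.
xCenter = interp1(1:numel(X), X, centerIdx);
yCenter = Y(starts(1:end-1));

% Fit through the collapsed points; degree 1 is an illustrative choice.
p = polyfit(xCenter, yCenter, 1);
```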

Image Analyst on 3 Nov 2015
As long as none of the levels repeats in a non-contiguous x range, you can try unique():
[newY, ia, ic] = unique(y);
newX = x(ia);
hold on;
plot(newX, newY, 'r*', 'MarkerSize', 15);
but I think that will get x at the start of a line.
Anyway, you can do the fit through the center just by fitting all of the data. You don't need to cull it down, as long as there is a unique Y for each X (i.e., no X has two Y values).

John D'Errico on 3 Nov 2015
Edited: John D'Errico on 3 Nov 2015
It appears as if you wish to treat this as an errors in variables estimation. So you have noise in x, with known y values. The problem is that errors in variables estimates tend to yield biased estimates for the coefficients. You have several options.
1. Use a tool like my consolidator. It can average the values of x for EACH given value for y, producing a reduced set. If there are a variable number of points at each location, you could use that information, counting the number of points in each group (also using consolidator) to do a weighted regression. My polyfitn can do a weighted polynomial regression.
2. You might also use the inverse of the group standard deviation for each group (again, consolidator can compute that too) to give weights for the regression estimates. Thus a high standard deviation at one point would reduce the weight given to that point in the regression.
3. You could reverse the problem, fitting x as a function of y. A nice thing about this option is that you do not need to reduce the groups of points into singletons at all! No averaging needed. Yes, this will result in a model of the form x(y), so prediction might be more difficult, BUT the modeling is far better in that direction. You would then need to use the inverse model for prediction. But suppose that you choose a quadratic polynomial for the x(y) model? This is trivial to invert; in fact, you can write down the formula for the inverse!
4. Once the curve has been reduced to single points at each location in y (use consolidator), use a spline to fit the curve. An interpolating spline is good (so interp1 or just spline), but you can also use my SLM tools on the File Exchange to do that fit.
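Options 1 and 2 can be roughly sketched with built-in functions. The interfaces of consolidator and polyfitn are not shown in this thread, so this sketch substitutes unique and accumarray for the grouping and lscov for the weighted least squares; it is a stand-in, not those tools. The x values are assumed to be the positions 1 to 22, and the straight-line model is only illustrative.

```matlab
y = [12 12 12 12 11 11 11 11 11 11 11 11 10 10 10 10 10 10 10 10 10 10];
x = 1:numel(y);                      % ASSUMED x positions, not from the thread

% Group the points by y level; ic maps every point to its group number.
[yLevel, ~, ic] = unique(y(:));

% Per-group mean of x and number of points in each group
% (built-in stand-in for what consolidator is described as doing).
xMean = accumarray(ic, x(:), [], @mean);
nPts  = accumarray(ic, 1);

% Weighted straight-line fit yLevel = c(1)*xMean + c(2),
% with the group counts used as weights (option 1).
A = [xMean, ones(size(xMean))];
c = lscov(A, yLevel, nPts);
```

For option 2, the weight vector could instead be built from the inverse of `accumarray(ic, x(:), [], @std)`, provided no group has zero spread.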
I imagine I can come up with some other approaches, but one of the above should work for you.
Find the tools I mention above on the File Exchange: CONSOLIDATOR, POLYFITN, SLM TOOLBOX.
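Option 3 might look like the sketch below: fit x as a quadratic in y using every point, then invert with the quadratic formula. The x values are assumed to be the positions 1 to 22 (the real x data were not posted), and the query value `xq` and the rule of keeping the root that lies inside the observed y range are illustrative assumptions.

```matlab
y = [12 12 12 12 11 11 11 11 11 11 11 11 10 10 10 10 10 10 10 10 10 10];
x = 1:numel(y);                      % ASSUMED x positions, not from the thread

% Model x as a quadratic in y: x = q(1)*y^2 + q(2)*y + q(3).
% No averaging or collapsing of the groups is needed.
q = polyfit(y, x, 2);

% Invert the model for a query x value (hypothetical example value).
xq = 8.5;
disc   = q(2)^2 - 4*q(1)*(q(3) - xq);                 % discriminant
yRoots = (-q(2) + [-1, 1]*sqrt(disc)) / (2*q(1));     % both candidate roots

% Keep the root that falls inside the range of observed y values.
yq = yRoots(yRoots >= min(y) & yRoots <= max(y));
```

If the discriminant is negative or neither root lies in the observed range, `yq` comes back complex or empty, so a real application would need to handle those cases.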
  2 Comments
Jason on 3 Nov 2015
Thanks everyone, your comments are really helpful. The horizontal data are not error bars, John; they are due to my extraction of objects, where I currently only locate to the nearest pixel (i.e. where the object is brightest). I want to address this, so I will start another question for it.
Image Analyst on 3 Nov 2015
Can you attach your data file and code to read it in?
