Find Diverging point of two arrays

12 views (last 30 days)
chetan sharma
chetan sharma on 18 Oct 2015
Edited: Thorsten on 17 Nov 2015
Hi, I want to find the diverging point of two arrays. Please see the figure. In this figure the point may be around 44-48, where the arrays diverges fast. I am not able to do.
  1 Comment
dpb
dpb on 18 Oct 2015
Attach your data file for convenience; could generate some but why make the folks you're asking for help do more work than needed?
I gotta' run, but I think I'd probably start by fitting a trend line to each, remove that trend and compare residuals...

Sign in to comment.

Accepted Answer

dpb
dpb on 19 Oct 2015
Edited: dpb on 26 Oct 2015
Alternatively, if similar behavior is always expected, you could fit a piecewise linear model to each curve and compare the slopes of the second segment to the first to see which is the more pronounced change. If it's known the two curves always have same relative shape then probably that breakpoint for the one is going to be pretty good estimate. I posted code some months ago demonstrating solving the breakpoint altho I don't recall the question subect...
OK, here's the link--search did work one only second try to think of what used as catchword...
ADDENDUM
Seems to work reasonably well...I first plotted
x=1:length(d); x=x(:); % x as observation number
sx=std(x); mx=mean(x);
xpr=(x-mx)/sx; % standardize for better numeric properties later
figure
plotyy(x,d(1:2:end),x,d(2:2:end)) % see shapes together; same scaling
and concluded are similar enough to only deal with one specifically as representative of all.
So,
>> b=nlinfit(xpr,d(:,1),@piecewise,[-1 0 0 5]) % estimate breakpoint
b =
-0.1447 0.6965 -0.0294 6.5596
>> yh=piecewise(b,xpr); % evaluate the fitted regression
>> close % start new figure
>> plot(d(:,1),'x') % plot original
>> hold on
>> plot(yh,'r') % overlay the fit
Looks pretty good estimate of where the breakpoint would/should be...so what is it?
>> xBreak=b(3)*sx+mx % convert the break coefficient estimate to x-coordinates
xBreak =
48.6640
The piecewise regression function is
>> type piecewise
function y=piecewise(b,x)
% Piecewise linear regression
% y = a1 + b1*x, x<C
% y = a2 + b2*x, x>C
% = a1 + C*(b1-b2) + b2*x
% Input coefficients as vector of [a1 b1 C b2]
ix=x>b(3); % location> breakpoint
y(ix)=[b(1) + b(3)*(b(2)-b(4))+b(4)*x(ix)];
y(~ix)=[b(1)+b(2)*x(~ix)];
y=y(:);
>>
Given the characteristic shapes are so similar, the "divergence" location is pretty much the location at which the two pieces of the first curve change characteristics. This gives at least a reasonable estimate of where that breakpoint is.
It appears that the second portion is nearly quadratic; I've nor worked on the algebra to see about actually implementing that; you need another condition to tie the coefficients together as above where solve for a2 in terms of a1 and C by equating the two pieces at the intersection point, ie,
a1+b1*C = a2+b2*C --> a2 = a1 + (b1-b2)*C
One possible refinement might be to use the above for an initial estimate the fit the quadratic from floor(xBreak:end) and then look for where that residual peaks to the left of that point. This would rely on the data being quadratic as nearly as seems to be in the sample data, probably.
I just stuck in the actual value...
>> bq=polyfit(x(48:end)',d(48:end,1),2) % fit quadratic over the last
bq =
-0.0025 0.6033 -24.5031
>> yhq=polyval(bq,x(48:end)'); % evaluate it
>> hold on
>> plot(x(48:end)',yhq,'r') % the fitted quadratic
>> plot(x(48:end)',yhq-d(48:end,1),'g') % add the residual
>>
  4 Comments
chetan sharma
chetan sharma on 24 Oct 2015
This is actual data. This is my model generated data. There are around 1000 of my generated series and for each series I am getting these type of curves, you may say that I am re-sampling or dividing the data for my next simulation. My data varies like this only. I am not aware where will be the next divergence in next series. it may be at the beginning or middle or end or anywhere.
chetan sharma
chetan sharma on 17 Nov 2015
Sorry for late reply. I was out of town. This script is fairly accurate, but my problem is the divergence can be anywhere in x axis i.e. it can be in the starting or end. Accidentally, the data sheet I attached is having the divergence in middle. Can you please elaborate what b=nlinfit(xpr,d(:,1),@piecewise,[-1 0 0 5]) is doing?

Sign in to comment.

More Answers (3)

Thorsten
Thorsten on 19 Oct 2015
Edited: Thorsten on 19 Oct 2015
You can compute the difference between both lines and find the point of divergence when the difference exceeds some threshold.
threshold = 6; % sample threshold
find(abs(y1 - y2) > threshold, 1, 'first')
  5 Comments
dpb
dpb on 24 Oct 2015
Edited: dpb on 24 Oct 2015
Sorry about that but the browser here can't manage to download the link when it's not text (not your fault, it's Firefox). Can you attach a text file, perhaps after trimming its size some but retaining the key features? Apologize for the inconvenience but this seems an issue with how these links are interpreted that can't manage to actually download the file itself but copies it as text instead.
I do think, however, that for a characteristic curve such as you've shown, that finding that knee will be at least a first step in identifying the point of interest. I'll reiterate again, however ( :) ) that you've still not really provided a definition of what the definition of this point is except as some general description of "I'll know it when I see it", but that's not an implementable algorithm; there's needs to be some quantitative criterion/criteria.
ADDENDUM Or, I am contactable via the "Contributors" page; I can't promise when but I'd try to take a look at the data if you were to send it directly. I'd suggest attaching the files to your original question via "EDIT" rather as an Answer to the question you posed as it really isn't...
chetan sharma
chetan sharma on 24 Oct 2015
I really appreciate your help and support. I am trying but I am not getting what it should be. Firstly it looked simple by just looking at graph, but I am not able to tell machine how to look at the data my way..:) I have attached the text file.

Sign in to comment.


chetan sharma
chetan sharma on 24 Oct 2015
Please find the attached data. There are four pairs of data i.e. four data set of Y axis for two variables U_R and U_P varying with any X-axes.

Thorsten
Thorsten on 26 Oct 2015
Edited: Thorsten on 26 Oct 2015
I loaded your data and found that if y is the data of the higher curve, you can find the diverging point using
dp = find([1; diff(y)] < 0, 1, 'last');
The diverging point is determined by the upper curve only, the lower curves are pretty smooth. After diverging, the curve is monotonically increasing. The code finds the last point where the upper curve decreases.
  2 Comments
chetan sharma
chetan sharma on 17 Nov 2015
Thanks for reply. Can you please elaborate what is find([1; diff(y)]) doing?
Thorsten
Thorsten on 17 Nov 2015
Edited: Thorsten on 17 Nov 2015
diff takes the difference of adjacent elements. After diverging of curves, the upper curve is monotonically increasing, that is, diff is always > 0. The last position where diff is negative, i.e., where the curve decreases, is marked as the starting point. The 1; is used to shift this position by one to the right, because the divergence should start at the position of the lower number.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!