Thread Subject: Finding dips in a data set

Subject: Finding dips in a data set

From: Sachitha Obeysekara

Date: 3 Sep, 2007 11:10:22

Message: 1 of 9

Hi,
I have a data set of around 13000 points which effectively
look like:
a = 3.19 3.15 3.2 3.13 3.00 2.89 2.55 2.45 2.65 2.390 3.00
3.19 3.15 3.30

...so the data is basically a straight-ish line followed
by a dip (with lots of noise in between)....how can I find
the location of the dip?

(something involving the gradient() function???)

Any ideas?


Many thanks,
Sach

Subject: Finding dips in a data set

From: Dimitri Shvorob

Date: 3 Sep, 2007 11:23:07

Message: 2 of 9

http://blogs.mathworks.com/pick/?p=74

Subject: Finding dips in a data set

From: Sachitha Obeysekara

Date: 3 Sep, 2007 12:34:46

Message: 3 of 9

Thanks for the reply.

I've tried using something similar but found the results
unsatisfactory since you have to set the threshold level
manually. If the height of the dip changes even slightly
w.r.t. the threshold, the detection method fails.

Sach

"Dimitri Shvorob" <not.dimitri.shvorob@vanderbilt.edu>
wrote in message <fbgqqr$1hi$1@fred.mathworks.com>...
> http://blogs.mathworks.com/pick/?p=74

Subject: Finding dips in a data set

From: dpb

Date: 3 Sep, 2007 12:40:49

Message: 4 of 9

Sachitha Obeysekara wrote:
> Thanks for the reply.
>
> I've tried using something similar but found the results
> unsatisfactory since you have to set the threshold level
> manually. If the height of the dip changes even slightly
> w.r.t to the threshold the detection method fails.

I didn't look at your data closely, but in the presence of noise you
might try fitting a smooth function and then looking at the residuals...
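
A minimal sketch of the fit-then-residuals idea, assuming a straight-line baseline and made-up synthetic data. The names x, y, dipIdx, the MAD-based noise scale and the factor of 3 are all illustrative choices, not anything taken from the real data set:

```matlab
% Synthetic stand-in for the real data (assumption): a gently
% sloping baseline with a small ripple and one dip at samples 40..42.
x = 1:100;
y = 3 + 0.002*x + 0.02*sin(x);
y(40:42) = y(40:42) - [0.5 0.8 0.5];

% Fit a smooth function (here just a straight line) and form residuals.
p = polyfit(x, y, 1);
r = y - polyval(p, x);

% Robust noise scale from the median absolute deviation, so the dip
% itself barely inflates the estimate.
rmed = median(r);
s = 1.4826 * median(abs(r - rmed));

% Flag samples whose residual lies well below the bulk of the data.
dipIdx = find(r < rmed - 3*s);
disp(dipIdx)
```

The cut-off here scales with the noise in the residuals rather than with a hand-picked level, which may help when the dip height varies. A higher-order polynomial or a smoothing spline could replace polyfit if the baseline is not straight.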

Subject: Finding dips in a data set

From: Dimitri Shvorob

Date: 3 Sep, 2007 12:57:10

Message: 5 of 9

> you have to set the threshold level manually
How else would the computer recognize a dip? I imagine that
you have a choice between setting a threshold on the
deviation from local average, or limiting the number of dips
to pick out.
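
Both choices can be sketched in a few lines. This is only an illustration on made-up data; the half-window h, the 0.3 threshold and the count K are assumed values, not recommendations:

```matlab
% Synthetic example (assumption): a slow trend with two dips of
% different depth centred on samples 31 and 71.
y = 3 + 0.002*(1:100);
y(30:32) = y(30:32) - [0.4 0.8 0.4];
y(70:72) = y(70:72) - [0.2 0.5 0.2];

% Deviation from a local average over a window of 2*h+1 samples.
% 'valid' keeps only windows that fit entirely inside the data,
% so dev(i) belongs to sample i+h.
h = 10;
w = ones(1, 2*h+1) / (2*h+1);
dev = y(h+1:end-h) - conv(y, w, 'valid');

% Option 1: threshold on the deviation (0.3 is an arbitrary choice).
below = find(dev < -0.3) + h;

% Option 2: no threshold; keep the K deepest dips instead, blanking
% out a window around each pick so one dip is not counted twice.
K = 2;
loc = zeros(1, K);
d = dev;
for j = 1:K
    [~, i] = min(d);
    loc(j) = i + h;
    d(max(1, i-h):min(numel(d), i+h)) = Inf;
end
loc = sort(loc);
```

With the fixed threshold the shallower dip is only partly caught, whereas the K-deepest variant finds the centre of each dip but needs the number of dips to be known in advance.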

Subject: Finding dips in a data set

From: dpb

Date: 3 Sep, 2007 13:29:28

Message: 6 of 9

Sachitha Obeysekara wrote:
> Hi,
> I have a data set of around 13000 points which effectively
> look like:
> a = 3.19 3.15 3.2 3.13 3.00 2.89 2.55 2.45 2.65 2.390 3.00
> 3.19 3.15 3.30
>
> ...so the data is basically a straight-ish line followed
> by a dip (with lots of noise in between)....how can find
> the location of the dip?
>
...
After my previous posting I did paste your sample data into Matlab and
plotted it -- for this subsample, it doesn't look at all like "basically
a straight-ish line followed by a dip" -- on this scale it looks more
like a "U" with a bump in the bottom.

Over a much larger population the result may look reasonably linear, but
even then what the objective is wouldn't be clear to me from your sample
data [sub]set.

Do you want the single "bump" in the bottom (the 2.65) of the U isolated
or is the overall minimum shown from points 3 through 12 the object of
concern?

Or, there are two "little bumps" at points 2 and 13 between their
surrounding two points that tend to destroy the smoothness of the
plotted figure.

Need more guidance on what you would expect the result of the algorithm
for the sample dataset to be, and undoubtedly the question would arise
again if a larger dataset were to be shown...



Subject: Finding dips in a data set

From: John D'Errico

Date: 3 Sep, 2007 13:36:24

Message: 7 of 9

"Sachitha Obeysekara" <sachitha.2.obeysekara@gsk.com> wrote in message
<fbgq2u$k0q$1@fred.mathworks.com>...
> Hi,
> I have a data set of around 13000 points which effectively
> look like:
> a = 3.19 3.15 3.2 3.13 3.00 2.89 2.55 2.45 2.65 2.390 3.00
> 3.19 3.15 3.30
>
> ...so the data is basically a straight-ish line followed
> by a dip (with lots of noise in between)....how can find
> the location of the dip?
>
> (something involving the gradient() function???)
>
> Any ideas?

Gradient won't help where you have noise.

If you really mean that it is constant, with a few
interspersed dips, then just compute the global
mean and standard deviation. Look for any locations
where the curve dips more than 3*sigma below the mean.
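
A minimal sketch of that test, on made-up data (the vector a below is an assumed stand-in with a flat baseline and one rare dip, not the poster's data):

```matlab
% Made-up data (assumption): a flat, slightly noisy baseline of 200
% samples with a single dip at samples 100..102.
a = 3.15 + 0.03*sin(1:200);
a(100:102) = a(100:102) - [0.65 0.95 0.65];

% Global mean and standard deviation, then the 3*sigma test.
mu = mean(a);
sigma = std(a);
dipIdx = find(a < mu - 3*sigma);
disp(dipIdx)
```

This relies on the dips being rare. On a short sample where the dip makes up a large share of the points, the dip itself inflates sigma and nothing may fall below the line.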

Or, slicker yet, estimate the coefficients of the
least squares quadratic polynomial, using a
running estimator. Pinv, then conv do all the
heavy lifting here. For example...

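  % a is the data vector; the window has length 2*n+1
  n = 3;
  t = (-n:n)';
  M = pinv([ones(size(t)),t,t.^2]);
  
  % all we need is the quadratic coefficient.
  c2 = conv(a,M(3,:));
  % trim the ends so that c2(k) lines up with a(k). the first and
  % last n values still see conv's zero padding, so ignore them.
  c2 = c2((n+1):(end-n));
  plot(c2,'o')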

Now, wherever c2 is positive, the local fit curves
upward, so this appears to be a dip. The advantage of
this approach is that it is not hurt if your baseline
curve has a local trend in it instead of being a constant.
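
To turn c2 into a dip location, one option is to take the sample where c2 is largest. The sketch below does this on the 14-point sample from the first post; using the 'valid' option of conv and picking the single maximum are my assumptions for illustration, not part of the recipe above:

```matlab
% Sample data from the first post in this thread.
a = [3.19 3.15 3.2 3.13 3.00 2.89 2.55 2.45 2.65 2.390 3.00 ...
     3.19 3.15 3.30];

% Running least-squares quadratic over a window of length 2*n+1.
n = 3;
t = (-n:n)';
M = pinv([ones(size(t)), t, t.^2]);

% 'valid' keeps only windows that lie fully inside the data, so
% c2(i) is the quadratic coefficient centred on sample i+n.
% (conv flips its kernel, hence the fliplr; this row is symmetric,
% so it makes no difference here.)
c2 = conv(a, fliplr(M(3,:)), 'valid');

% Take the sample with the largest positive curvature as the dip.
[c2max, i] = max(c2);
dipIdx = i + n;
disp([dipIdx, c2max])
```

For this sample the largest c2 falls at sample 9, in the middle of the low region. With several dips, looking for separate runs where c2 exceeds some fraction of its maximum would be one way to extend this.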

HTH,
John

Subject: Finding dips in a data set

From: Sachitha Obeysekara

Date: 3 Sep, 2007 13:51:10

Message: 8 of 9

There are two dips in this set (I just made these up in
Excel to look similar to my data set):
a =
3.15
3.2
3.13
3.08
3.1
2.98
2.55
2.282
2.22
2.3
3.17
3.19
3.15
3.228
3.19
3.15
3.2
3.13
3.08
3.1
2.89
2.55
2.282
2.266
2.39
3.17
3.19
3.15
3.3

The height of the dip can vary, so I assumed a detection
method would involve looking at the gradient rather than
having to rely on setting some sort of threshold value.

Thanks
Sach

dpb <none@non.net> wrote in message <fbh2cc$dab$1@aioe.org>...
> Sachitha Obeysekara wrote:
> > Hi,
> > I have a data set of around 13000 points which effectively
> > look like:
> > a = 3.19 3.15 3.2 3.13 3.00 2.89 2.55 2.45 2.65 2.390 3.00
> > 3.19 3.15 3.30
> >
> > ...so the data is basically a straight-ish line followed
> > by a dip (with lots of noise in between)....how can find
> > the location of the dip?
> >
> ...
> After my previous posting I did paste your sample data into Matlab and
> plotted it -- for this subsample, it doesn't look at all like "basically
> a straight-ish line followed by a dip" -- on this scale it looks more
> like a "U" with a bump in the bottom.
>
> Over a much larger population the result may look reasonably linear, but
> what is the object even wouldn't be clear to me from your sample data
> [sub]set.
>
> Do you want the single "bump" in the bottom (the 2.65) of the U isolated
> or is the overall minimum shown from points 3 through 12 the object of
> concern?
>
> Or, there are two "little bumps" at points 2 and 13 between their
> surrounding two points that tend to destroy the smoothness of the
> plotted figure.
>
> Need more guidance on what you would expect the result of the algorithm
> for the sample dataset to be, and undoubtedly the question would arise
> again if a larger dataset were to be shown...

Subject: Finding dips in a data set

From: us

Date: 3 Sep, 2007 15:36:08

Message: 9 of 9

Sachitha Obeysekara:
<SNIP looking for a dip-finder...

one of the very many solutions is outlined below
note: this requires the <spline tbx>

us

% the data (your example)
     y=[
          3.150 3.200 3.130 3.080 3.100,...
          2.980 2.550 2.282 2.220 2.300,...
          3.170 3.190 3.150 3.228 3.190,...
          3.150 3.200 3.130 3.080 3.100,...
          2.890 2.550 2.282 2.266 2.390,...
          3.170 3.190 3.150 3.300,...
     ];
     x=1:numel(y);
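% the engine
% ...a macro: indices of local minima (each point <= both neighbours)
     fun=@(v) find(...
                    v(1:end-2)>=v(2:end-1) & ...
                    v(2:end-1)<=v(3:end)...
                   )+1;
% - this requires the <spline tbx>!
     nf=5;   % <- oversampling factor for the smoothed curve
     sf=.75; % <- play with the smoothness
     xs=linspace(x(1),x(end),nf*numel(x));
     ys=csaps(x,y,sf,xs).';
     is=fun(ys); % minima of the smoothed curve
% map each smoothed minimum to the data point just right of it
     [it,it]=histc(xs(is),x);
     it=it+1;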
 % the result
     r=[x(it);y(it)];
     disp(r);
 % ...on display
     line(x,y,...
         'marker','o',...
         'color',[1,0,0]);
     line(xs,ys);
     line(xs(is),ys(is),...
         'marker','+',...
         'linestyle','none');
     line(x(it),y(it),...
         'marker','o',...
         'markerfacecolor',[0,0,0],...
         'linestyle','none');
