How to count "gaps" consisting of designated values in a vector?

I have a very long data vector that contains gaps of "invalid" values, where missing data has been replaced either with a designated missing-data value (e.g. -1) or with NaN.
invalid_values = [-1 NaN];
sample_data_vector = [22 23 22 24 -1 -1 -1 25 20 24 NaN NaN NaN 25 24 -1 -1 22 20 NaN 23];
I want to summarize the lengths of the gaps of invalid data (there is no need to distinguish whether the value is NaN, -1, or some other number). It might be helpful to know that my goal is to compare gap lengths across successive trials to determine whether gaps are getting longer/shorter and more/less frequent.
So for the above sample vector I'd get a matrix output that summarizes the following info about the gaps:
Length: 1 Number of occurrences: 1
Length: 2 Number of occurrences: 1
Length: 3 Number of occurrences: 2
How would I write code to do this?

Accepted Answer

Matt J on 18 Dec 2018
Edited: Matt J on 18 Dec 2018
One way, assuming you have the Image Processing Toolbox,
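% Logical mask of invalid samples (ismember never matches NaN, so isnan is needed too)
bw = ismember(sample_data_vector, invalid_values) | isnan(sample_data_vector);
% Each connected run of true values is one gap; its Area is the gap length
S = regionprops(bw, 'Area');
gaplengths = [S.Area];
% H(k) is the number of gaps of length k
H = histcounts(gaplengths, 1:max(gaplengths)+1)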
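If the Image Processing Toolbox is not available, the same gap lengths can probably be obtained with base MATLAB only. The following is a minimal sketch, not part of the original answer: it assumes the data is a row vector containing at least one gap, and the names d, gap_starts, gap_ends, counts, lengths and summary are made up for illustration. It also builds the two-column matrix (gap length, number of occurrences) described in the question.

```matlab
% Sample data from the question (with NaN spelled correctly).
invalid_values = [-1 NaN];
sample_data_vector = [22 23 22 24 -1 -1 -1 25 20 24 NaN NaN NaN 25 24 -1 -1 22 20 NaN 23];

% Logical mask of invalid samples. ismember never matches NaN, so NaN is
% tested separately with isnan.
bw = ismember(sample_data_vector, invalid_values) | isnan(sample_data_vector);

% Pad with zeros on both sides so that gaps touching either end of the
% vector are still detected (assumes bw is a row vector).
d = diff([0, bw, 0]);
gap_starts = find(d == 1);         % index of the first sample of each gap
gap_ends   = find(d == -1) - 1;    % index of the last sample of each gap
gaplengths = gap_ends - gap_starts + 1;

% counts(k) is the number of gaps of length k (assumes at least one gap).
counts  = accumarray(gaplengths(:), 1);
lengths = (1:numel(counts)).';
summary = [lengths, counts];       % columns: gap length, number of occurrences
```

For the sample vector, gaplengths should come out as [3 3 2 1], so summary has the rows [1 1], [2 1] and [3 2]. Lengths that never occur appear with a count of 0, which keeps the rows aligned when comparing trials.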
  5 Comments
Al_G on 8 Jun 2021
Revisiting this a few years later in an attempt to improve my code. Is there a straightforward way to obtain the indices of where the gaps start and end?
The ideal output would look like:
gap.length = [3, 3, 2, 1];
gap.start_indices = [5, 11, 16, 20];
gap.end_indices = [7, 13, 17, 20];
Matt J on 8 Jun 2021
That will just be,
gap.start_indices = find( diff([0,bw])>0 );
gap.end_indices = find( diff([bw,0])<0 );
gap.length = gap.end_indices-gap.start_indices+1;
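To check these three lines against the ideal output listed in the comment above, here is a self-contained sketch that rebuilds bw from the sample vector and runs them. The fields gap.count and gap.mean_length are hypothetical additions, not from the original thread, included only as one possible way to compare gap frequency and length between trials.

```matlab
% Sample data from the question (with NaN spelled correctly).
sample_data_vector = [22 23 22 24 -1 -1 -1 25 20 24 NaN NaN NaN 25 24 -1 -1 22 20 NaN 23];

% Mask of invalid samples; -1 is the designated missing-data value here.
bw = (sample_data_vector == -1) | isnan(sample_data_vector);
bw = bw(:).';   % force a row vector so the concatenations below are valid

gap = struct();
% A gap starts where the zero-padded mask steps from 0 to 1 ...
gap.start_indices = find(diff([0, bw]) > 0);
% ... and ends where it steps from 1 back to 0.
gap.end_indices   = find(diff([bw, 0]) < 0);
gap.length        = gap.end_indices - gap.start_indices + 1;

% Hypothetical per-trial summary values (not from the original thread).
gap.count       = numel(gap.length);   % how many gaps occurred
gap.mean_length = mean(gap.length);    % average gap length in samples
```

Prepending or appending the 0 makes the result a double array, so diff works on it, and it also ensures that a gap at the very start or end of the vector is still detected.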

Release

R2015b
