Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Find numbers within (n)

Asked by LifeSux SuperHard on 17 Jun 2013

I'm trying to write a program that for a set of numbers with N groups of M numbers close (e.g. within 2) to each other will store each group separately and where each number M occurs in the original set. Groups MAY be adjacent in the set.

For example, the set

A = [300 301 300 350 350 350 400 399 401] 

Would produce

R = 
[300] [1]        [399] [8]
[300] [3]        [400] [7]
[301] [2]        [401] [9]

Or, preferably, the program would save into R only the start and end index of each group of numbers close to each other, e.g. for the above set

R = 
[300] [1]        [399] [7]
[300] [3]        [400] [9]
[301]            [401] 

Or, generally,

     R = 
      [number1]_ofgroup1 [Start]_ofgroup1
      [number2]_ofgroup1 [End]_ofgroup1
      ...
      [number_n]_ofgroup1 
      ...
      [number1]_ofgroupN [Start]_ofgroupN
      [number2]_ofgroupN [End]_ofgroupN
      ...
      [number_n]_ofgroupN

So far what I have been able to do is pick out the numbers close together and their location in the set. What I have NOT been able to do is group sets of numbers close to each other.

Now don't fixate on helping me figure out how to store this information into cells, I can do that pretty easily. Really what is most important to me is just how to separate the groups, like, with a for loop, or whatever.

% Test Set 
% Two sets of badness (307/308, 389/390), non-adjacent  
B = [
   267
   267  
   267
   267
   267
   267
   267
   267
   308
   308
   308
   308
   308
   308   
   350
   350
   350
   349
   349   
   350
   389
   390
   391
   391
   391
   389
   389
   390
   390
   ];
A = sort(B);  
%Find unique numbers in A
U = unique(A); %Unique in A 
% Visual Check (Print U, A, respective indices)
for i = 1:length(U)
    fprintf('%d %d\n', U(i), i); 
end
fprintf('U\n\n'); 
for i = 1:length(A)
    fprintf('%d %d\n', A(i), i);
end
fprintf('A\n\n'); 
%Find badness
l = 0; %Index: Number of indicies  in A identified as corresponding to offending numbers. 
%Alternatively, how much bad, total?
badL = []; %Array in which to store badness locations 
for i = 1:length(U)    
    for j = 1:length(A)       
        a = abs(U(i) - A(j));         
        if a <=2 && a ~= 0 %Within 2 of each other                         
            l = l + 1;             
            badL(l) = j;        %#ok<*SAGROW>
        end          
    end        
end
% Above seems to pick up duplicate bad indices when any #Bad couple differ
% by exactly two, not sure how to fix withiin the loop, but the problem does
% seem to be solved by "unique-ing" badL. 
badness = sort(unique(badL)); %Really, really hard to work with unsorted bad indicies
%Test initialization of R (just to show how I want the data formatted) 
for i = 1:3 %Group (N) 
    for k = 1:4 %Bad numbers 
        R{1,i}{1,1}{k,1} = [];
    end
    for m = 1:2 %Start and end indices for group N
        R{1,i}{1,2}{m,1} = []; 
    end        
end

0 Comments

LifeSux SuperHard

Products

No products are associated with this question.

4 Answers

Answer by Roger Stafford on 17 Jun 2013
 [B,p] = sort(A);
 f = find([true,diff(B)>2,true])';
 R = [f(1:end-1),f(2end)-1]; % Start and end indices w.r. to B

This isn't quite what you asked for. The indices in R refer to the sorted version of A, namely B. The vector 'p' could be used to find your way back to A. However what was a consecutive interval in B will not necessarily be one in A. I leave you with that problem.

1 Comment

LifeSux SuperHard on 17 Jun 2013

Yeah that's fine. I'll make that work for me.

I'm also just curious, how would you do that explicitly? Like if you couldn't use a bunch of matlab functions, but instead like only unique, round, and sort.

I'm sorry I'm just more interested in learning how to do this stuff explicitly than rely on some tool someone else, someone else who is a much, much better programmer than me, wrote for me.

But I'm certainly not going to ignore what you suggested.

Roger Stafford
Answer by Roger Stafford on 17 Jun 2013

The idea with the 'find' operation is to recognize a change of more than 2 in successive elements of the sorted list B. The 'diff' function just takes successive differences in B. You could just as well write

 B(2:end)-B(1:end-1)

instead of using 'diff' or an equivalent for-loop operation. You need the extra 'true' values at the beginning and the end to furnish the first start and the last end values which 'diff' wouldn't catch. The 'find' function itself could be reproduced with the appropriate for-loop, something like this:

 t = [true,diff(B)>2,true];
 f = []; % Start with f empty
 for k = 1:length(t)
   if t(k)
     f = [f,k]; % Append the index k to the f vector wherever t(k) is true
   end
 end

Then it is easy to see that f(1:end-1) is a list of the start indices and f(2:end)-1 a list of the end indices. The use of logical vectors for inputs to 'find' makes it a very useful function indeed. You should learn to be comfortable with it. The need for it pops up quite frequently in programming matlab.

Does that give you a feeling for the function calls involved?

0 Comments

Roger Stafford
Answer by Andrei Bobrov on 18 Jun 2013
B = [   267
   267
   391
   391
   391
   267
   267
   267
   308
   308
   308
   308
   308
   308
   350
   350
   350
   349
   349
   350
   389
   390
   391
   391
   391
   389
   389
   390
   390   ];
[a,c,c] = unique(B);
t = [true;diff(a) > 2];
tt = cumsum(t);
q = numel(B);
out = accumarray([tt,ones(numel(a),1);tt(c),2*ones(q,1)],[a;(1:q)'],[],@(x){x});

0 Comments

Andrei Bobrov
Answer by Jan Simon on 18 Jun 2013

Just to show how to pre-allocate efficiently, when the maximum number is known in advance:

badL = zeros(1, length(U)*length(A));
for i = 1:length(U)    
   for j = 1:length(A)       
      a = abs(U(i) - A(j));         
      if a <=2 && a ~= 0 %Within 2 of each other                         
          l = l + 1;             
          badL(l) = j;
      end
   end
end
badL = badL(1:l);

This reduces the memory footprint massively, when the result contains more than a few elements. E.g. if the result contains 45 elements, letting it grow iteratively allocates sum(1:45)=1035 elements and copies 990 elements. Then pre-allocating 1000 elements and cutting the array at the end should be more efficient already.

0 Comments

Jan Simon

Contact us