MATLAB Answers


Find unique numbers in a set

Asked by LifeSux SuperHard on 12 Jun 2013
Latest activity: edited by John Kelly on 12 Feb 2015

This is, lol, I guess another personal exercise (sort of, although it is kind of a step towards solving a bigger problem).

Anyway I'm having trouble figuring out how to find the unique numbers in an array/set.

Now, I KNOW about the unique function. It's cool. I'm even using it elsewhere. But I am also interested in how it is written, and the unique.m file in the matlab toolbox is basically incomprehensible to me.

So far I have started out really simple. What I am trying to do is simply look at an array, see if a number is repeated (literally true/false atm), then stick that number and its occurrence boolean into a cell. Here is the code so far:

%What we want to do is fill a cell as follows 
%C=cell{1,2,N(=Unique values in [S])} 
%C{1,1,N} = [N_unique] (N_unique => Nq)
%C{1,2,N} = [Occurrence of N_unique] (=> Oc(Nq))
%Such that 
%C{:,:,1} = 
%[Nq(1)] [Oc(Nq(1))] 
%C{:,:,N} = 
%[Nq(N)] [Oc(Nq(N))] 
%This requires an index N that iterates only once for every number,
%whether or not it occurs more than once. 
%This also requires another index Oc that iterates once for every time
%its respective number occurs. Oc must be at least one for every number. 
%So, the array [267 308 267] should produce 
%C{:,:,1} = 
%[267] [2] 
%C{:,:,2} = 
%[308] [1] 
A = [267 267 308]; 
N = 1; %Index for each unique number (never incremented yet)
On = 0; %Occurrence flag, true/false for now (stands in for Oc)
for i = 1:length(A)
    for j = 1:length(A)
        if A(i) == A(j) && i~=j
            %fprintf('%d %d %d %d %d\n', A(i),i, A(j),j,1);
            %A(i) has occurred more than once
            On = 1; %True for On(A(i))
        elseif A(i) ~= A(j)
            fprintf('%d %d %d %d %d\n', A(i),i, A(j),j, 0);
            %A(i) has not occurred more than once
            On = 0; %False for On(A(i))
        end
        C{1,1,N} = A(i);
        C{1,2,N} = On;
    end
end

Now this code is pretty unfinished, obviously. For example, N doesn't even increase. It just sits there the whole time at 1. This is because I can't figure out how to get it to iterate once PER NUMBER, as opposed to, for example, per loop.

Also Oc right now is just true/false, not the actual occurrence rate.
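One possible way to get N to advance once per number rather than once per loop pass, offered as a rough sketch and not as how unique.m does it: keep a running list of the values seen so far, and only grow it when A(i) is not already in it. The names vals and counts are made up for this illustration.

```matlab
A = [267 308 267];      % example array from the comments above
vals = [];              % unique values found so far (hypothetical name)
counts = [];            % occurrence count for each entry of vals (hypothetical name)
for i = 1:length(A)
    k = find(vals == A(i), 1);   % position of A(i) among the values already seen
    if isempty(k)
        vals(end+1) = A(i);      % first sighting: the unique-number index grows by one
        counts(end+1) = 1;
    else
        counts(k) = counts(k) + 1;   % repeat: only the count changes
    end
end
N = numel(vals);        % number of unique values
C = cell(1, 2, N);
for n = 1:N
    C{1,1,n} = vals(n);     % Nq(n)
    C{1,2,n} = counts(n);   % Oc(Nq(n))
end
```

For [267 308 267] this fills C{:,:,1} with [267] [2] and C{:,:,2} with [308] [1]. The search through vals on every pass makes it quadratic in the worst case, so it is meant to show the indexing idea, not to be fast.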

Anyway sorry for wasting time but I'm just really frustrated and curious about how to do this.

  1 Comment

on 13 Sep 2013

I have a similar issue: I need to sort a cell array containing numbers, but unique only works for strings. I have tried:

 %% Clean data of repeated value errors
%makes the numbers of the first col into strings so unique will work with them.
for idx=2:length(raw1(:,1))   %ignores the string at the top
%only works on single digit numbers :C
raw=raw1(idx_new);

Raw is a massive cell array with a column containing both letters and numbers.



2 Answers

Answer by Roger Stafford
on 12 Jun 2013
 Accepted answer

If I were writing a 'unique' function, the first operation would be to sort the set. There are many efficient routines to do that. After that it is a breeze picking out only those elements in the sorted sequence that are unique: pick the first (or last) of each consecutive subsequence of equal elements.


To first order

B = [N1 N2 N3 ... Nn]; 
A = sort(B); 
j = 1; 
for i = 1:length(A)
    if i+1>length(A)
       if A(i-1) ~= A(i)
          U(j) = A(i); 
       end
    elseif A(i) ~= A(i+1) 
       U(j) = A(i); 
       j = j+1; 
    end
end

Except this, like, skips the last value, wtf.

You are, in effect, storing the last element in each series of equal elements in A, but for a final sequence of the largest element there is no following element to compare them with. That is why you skipped the last one. You should unconditionally store the final element of A in U. In other words, leave out the "if A(i-1) ~= A(i)" part and just do "U(j) = A(i);" at that point.
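As a sketch of that suggestion, the loop with the final element stored unconditionally might look like the following. The input B is just an example chosen for illustration.

```matlab
B = [1 2 2 3 3 3 4 4 4 4];  % example input (assumed)
A = sort(B);
U = zeros(1, 0);            % start empty so an empty B also works
j = 1;
for i = 1:length(A)
    if i == length(A)
        U(j) = A(i);        % last element of A: always store it
    elseif A(i) ~= A(i+1)
        U(j) = A(i);        % last element of a run of equal values
        j = j + 1;
    end
end
```

With this input U comes out as [1 2 3 4], including the final 4 that the earlier version dropped.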

Here's an easier way to compute U:

 A = sort(B);
 U = A([true,diff(A)~=0]);

Of course you have to work harder to reorder U according to the original order in B. You have to work even harder to come back with the other two returned index vectors from matlab's 'unique' function.
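A rough sketch of that extra work, assuming B is a row vector and relying on sort keeping equal elements in their original order. The names ia, ic and Ustable are chosen here for illustration, and the choice of first versus last occurrence is not guaranteed to match what any particular release of 'unique' returns.

```matlab
B = [267 308 267 5];             % example input (assumed)
[A, order] = sort(B);            % order(k) is the position in B of A(k)
isFirst = [true, diff(A) ~= 0];  % true at the first element of each run of equal values
U  = A(isFirst);                 % sorted unique values
ia = order(isFirst);             % index into B of the first occurrence, so B(ia) equals U
groupId = cumsum(isFirst);       % which unique value each sorted element belongs to
ic = zeros(size(B));
ic(order) = groupId;             % map back to B's ordering, so U(ic) equals B
Ustable = B(sort(ia));           % unique values in order of first appearance in B
```

Here U is [5 267 308], ia is [4 1 2], ic is [2 3 2 1], and Ustable is [267 308 5].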

Answer by LifeSux SuperHard on 12 Jun 2013

B = [1 2 2 3 3 3 4 4 4 4];
A = sort(B);
j = 1; 
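for i = 1:length(A)
    if i+1 > length(A) 
        %Last element: always store it (also avoids A(i-1) when A has one element)
        O{1,1,j} = A(i);
    elseif A(i) ~= A(i+1)        
        %Last element of a run of equal values
        O{1,1,j} = A(i);
        j = j+1;            
    end
end
k = 0; 
for i = 1:size(O,3)
    %Count how many times the i-th unique value occurs in A
    for j = 1:length(A) 
        if O{1,1,i} == A(j)            
            k = k + 1;
        end
    end
    O{1,2,i} = k; 
    k = 0; 
end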

  1 Comment

If anyone can think of a better way to do or optimize this code, please let me know. It's roughly 0.01 s slower than MATLAB's own unique function; put another way, MATLAB's unique is about 6.5 times faster.
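One direction that might be worth trying, as an untimed sketch only: the counting pass can be replaced by differencing the start positions of each run in the sorted array, which avoids the nested loop. The names isFirst, starts, Nq and Oc are picked here for illustration and B is assumed to be a row vector.

```matlab
B = [1 2 2 3 3 3 4 4 4 4];
A = sort(B);
isFirst = [true, diff(A) ~= 0];        % true where a new value starts in sorted A
Nq = A(isFirst);                       % unique values
starts = find(isFirst);                % start position of each run of equal values
Oc = diff([starts, numel(A) + 1]);     % run lengths, i.e. occurrence counts
O = cell(1, 2, numel(Nq));             % preallocate the output cell
for n = 1:numel(Nq)
    O{1,1,n} = Nq(n);
    O{1,2,n} = Oc(n);
end
```

For this B, Nq is [1 2 3 4] and Oc is [1 2 3 4]. Whether this closes the gap to unique would need to be measured on the actual data.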
