Find unique numbers in a set

LifeSux SuperHard on 12 Jun 2013
Edited: John Kelly on 10 Nov 2017
This is, lol, I guess another personal exercise (sort of, although it is kind of a step towards solving a bigger problem).
Anyway I'm having trouble figuring out how to find the unique numbers in an array/set.
Now, I KNOW about the unique function. It's cool. I'm even using it elsewhere. But I am also interested in how it is written, and the unique.m file in the matlab toolbox is basically incomprehensible to me.
So far I have started out really simple. What I am trying to do is simply look at an array, see if a number is repeated (literally true/false atm), then stick that number and its occurrence boolean into a cell. Here is the code so far:
%What we want to do is fill a cell as follows
%C=cell{1,2,N(=Unique values in [S])}
%Where
%C{1,1,N} = [N_unique] (N_unique => Nq)
%C{1,2,N} = [Occurrence of N_unique] (=> Oc(Nq))
%Such that
%C{:,:,1} =
%[Nq(1)] [Oc(Nq(1))]
%...
%C{:,:,N} =
%[Nq(N)] [Oc(Nq(N))]
%
%This requires an index N that iterates only once for every number,
%whether or not it occurs more than once.
%This also requires another index Oc that iterates once for every time
%its respective number occurs. Oc must be at least one for every number.
%So, the array [267 308 267] should produce
%C{:,:,1} =
%[267] [2]
%C{:,:,2} =
%[308] [1]
A = [267 267 308];
N = 1; %Index for each unique number
On = 0;
for i = 1:length(A)
    for j = 1:length(A)
        if A(i) == A(j) && i~=j
            %fprintf('%d %d %d %d %d\n', A(i),i, A(j),j,1);
            %A(i) has occurred more than once
            On = 1; %True for On(A(i))
        elseif A(i) ~= A(j)
            fprintf('%d %d %d %d %d\n', A(i),i, A(j),j, 0);
            %A(i) has not occurred more than once
            On = 0; %False for On(A(i))
        end
        C{1,1,N} = A(i);
        C{1,2,N} = On;
    end
end
Now this code is pretty unfinished, obviously. For example, N doesn't even increase. It just sits there the whole time at 1. This is because I can't figure out how to get it to iterate once PER NUMBER, as opposed to, for example, per loop.
Also Oc (On in the code) right now is just true/false, not the actual occurrence count.
Anyway sorry for wasting time but I'm just really frustrated and curious about how to do this.
  1 Comment
¥agoth on 13 Sep 2013
I have a similar issue: I need to remove repeated values from a cell array containing numbers, but unique only works on cell arrays of strings. I have tried:
%% Clean data of repeated value errors
% Make the numbers of the first column into strings so unique will work with them.
%
raw1={'string';1;2;3;3;3000;20}
raw2=cell(length(raw1(:,1)),1);
for idx=2:length(raw1(:,1)) %ignores the string at the top
    raw2{idx}=num2str(raw1{idx,1});
end
%
[no_reps,idx_new,idx_0]=unique(raw2(:),'stable');
%only works on single digit numbers :C
num=num1(idx_new(2:length(idx_new))-1,:);
raw=raw1(idx_new);
Raw is a massive cell array with a column containing both letters and numbers.
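One possible way around the string conversion, sketched under assumptions: every numeric cell holds a single scalar, and unique supports the 'stable' option (as the snippet above already relies on). The sample data is made up, and isNum, numRows, vals, firstIdx and keep are illustrative names, not part of the original code.

```matlab
% Sketch: drop repeated numbers from a mixed cell column without num2str.
raw1 = {'string'; 1; 2; 3; 3; 3000; 20};   % made-up sample column

isNum   = cellfun(@isnumeric, raw1);       % logical mask of the numeric rows
numRows = find(isNum);                     % row indices of those numeric cells
vals    = cell2mat(raw1(isNum));           % numeric column vector (scalars assumed)

% With 'stable', the second output indexes the first occurrence of each value.
[~, firstIdx] = unique(vals, 'stable');

% Keep every non-numeric row plus the first occurrence of each number,
% in the original row order.
keep = sort([find(~isNum); numRows(firstIdx)]);
raw  = raw1(keep);
```

Comparing the numbers directly avoids the string round trip entirely, so multi-digit values such as 3000 need no special handling.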


Accepted Answer

Roger Stafford on 12 Jun 2013
If I were writing a 'unique' function, the first operation would be to sort the set. There are many efficient routines to do that. After that it is a breeze picking out only those elements in the sorted sequence that are unique: pick the first (or last) of each consecutive subsequence of equal elements.
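A minimal loop sketch of the sort-then-pick idea described above, picking the first element of each run. The sample data is made up; the names B, A and U follow the ones used further down this thread.

```matlab
% Sketch: sort, then keep the first element of every run of equal values.
B = [267 308 267 5 308];     % made-up sample input
A = sort(B);                 % equal values are now adjacent: [5 267 267 308 308]

U = zeros(1, 0);             % unique values, grown as new runs are found
for i = 1:numel(A)
    % A(i) starts a new run if it is the very first element
    % or differs from its predecessor.
    if i == 1 || A(i) ~= A(i-1)
        U(end+1) = A(i);
    end
end
disp(U)
```

Picking the first of each run means the boundary case is the first element rather than the last, and the short-circuit `||` handles it without indexing A(0).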
  8 Comments
LifeSux SuperHard on 12 Jun 2013
except this like skips the last value wtf.
Roger Stafford on 12 Jun 2013
You are, in effect, storing the last element in each series of equal elements in A, but for a final sequence of the largest element there is no following element to compare them with. That is why you skipped the last one. You should unconditionally store the final element of A in U. In other words, leave out the "if A(i-1) ~= A(i)" part and just do "U(j) = A(i);" at that point.
Here's an easier way to compute U:
A = sort(B);
U = A([true,diff(A)~=0]);
Of course you have to work harder to reorder U according to the original order in B. You have to work even harder to come back with the other two returned index vectors from matlab's 'unique' function.
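For what it's worth, here is one way that extra work might look. This is a sketch assuming B is a row vector; the names order, isFirst, ia, ic and Ustable are illustrative. ia and ic play roles analogous to unique's index outputs, but this is not how the built-in is implemented.

```matlab
% Sketch: recover index vectors and the original order from a sorted pass.
B = [308 267 308 5 267];              % made-up sample input (row vector)

[A, order] = sort(B);                 % A == B(order); MATLAB's sort keeps ties in order
isFirst = [true, diff(A) ~= 0];       % true at the first element of each run
U  = A(isFirst);                      % sorted unique values
ia = order(isFirst);                  % first occurrence of each value: B(ia) == U

ic = zeros(size(B));
ic(order) = cumsum(isFirst);          % run number of every element: U(ic) == B

Ustable = B(sort(ia));                % unique values in order of first appearance
```

The second output of sort does most of the work here: it maps every position in A back to its position in B, which is enough to build both index vectors.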


More Answers (1)

LifeSux SuperHard on 12 Jun 2013
B = [1 2 2 3 3 3 4 4 4 4];
A = sort(B);
% Store the last element of each run of equal values in the sorted array.
j = 1;
for i = 1:length(A)
    if i == length(A)
        % The final element always closes the last run, so store it unconditionally.
        O{1,1,j} = A(i);
    elseif A(i) ~= A(i+1)
        O{1,1,j} = A(i);
        j = j+1;
    end
end
% Count how many times each unique value occurs in A.
for i = 1:size(O,3)
    k = 0;
    for j = 1:length(A)
        if O{1,1,i} == A(j)
            k = k + 1;
        end
    end
    O{1,2,i} = k;
end
O
  1 Comment
LifeSux SuperHard on 12 Jun 2013
Edited: LifeSux SuperHard on 12 Jun 2013
If anyone can think of a better way to do/optimize this code, please let me know. It's roughly 0.01 s slower than MATLAB's own unique function; put another way, MATLAB's unique is about 6.5 times faster.
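One possible optimization, sketched under the assumption that B is a row vector: since A is sorted, the count of each value is just the length of its run, so the nested counting loop can be replaced by differences between run start positions. The names runStart and counts are illustrative, and no timing claim is made for this version.

```matlab
% Sketch: unique values and their counts from run boundaries in the sorted array.
B = [1 2 2 3 3 3 4 4 4 4];
A = sort(B);

runStart = find([true, diff(A) ~= 0]);      % index where each run of equal values begins
U        = A(runStart);                     % unique values
counts   = diff([runStart, numel(A) + 1]);  % run lengths, i.e. occurrences of each value

% Fill the same 1-by-2-by-N cell layout used above.
O = cell(1, 2, numel(U));
for n = 1:numel(U)
    O{1,1,n} = U(n);
    O{1,2,n} = counts(n);
end
```

This touches each element a fixed number of times after the sort, instead of rescanning all of A once per unique value.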
