Code covered by the BSD License

5.0 | 2 ratings | 3 Downloads (last 30 days) | File Size: 4.42 KB | File ID: #25209

NUMUNIQUE

Zhigang Xu

02 Sep 2009 (Updated )

Returns the unique elements of an array and all the indices of the repeated values.

File Information
Description

[B P]=numunique(A) returns the values of the array A with no
repetitions, sorted in ascending order. A must be numeric, as the name
of the function suggests.

The second output, P, is a row of cells containing indices of A such
that A(P{n})==B(n) is true for every n. P has the same number of cells
as B has elements.

Note that each cell of P lists all the indices of A that hold the
repeated value, not only the first or last occurrence. This differs from
the MathWorks function UNIQUE. Sometimes we need to know all the indices
at which A has the same value.
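As an illustration of the contract described above, the following is a minimal sketch that produces the same kind of outputs with built-in functions only. It is not the submission's implementation, which sorts the input and splits it into runs; the use of accumarray and the variable names J and idx are assumptions made for this example.

```matlab
% Sketch only: emulate [B, P] = numunique(A) with built-ins.
A = [3 1 3 2 1];

% J maps every element of A to its position in the sorted unique list B.
[B, ~, J] = unique(A);

% Collect, for each unique value, all indices of A that hold it.
% accumarray does not guarantee the order inside a group, so sort it.
idx = (1:numel(A)).';
P = accumarray(J(:), idx, [], @(v) {sort(v).'}).';

% Check the documented property A(P{n}) == B(n) for every n.
for n = 1:numel(B)
    assert(all(A(P{n}) == B(n)));
end
```

For this input, B is [1 2 3] and the cells of P are [2 5], 4 and [1 3]. The [B, ~, J] syntax needs a release newer than the R2008b listed here; on older releases a dummy variable can take the place of the tilde.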

MATLAB release: MATLAB 7.7 (R2008b)
Comments and Ratings (10)
05 Sep 2009 Zhigang Xu

Ah, the explanation of the puzzle in my earlier comment (NaN vs. double) is that whenever NaN is concatenated with values of an integer class, the result takes the integer class and NaN is converted to zero first. Thus,

uint8(nan)

ans =

0

int32(nan)

ans =

0

[nan uint16([1 2])]

ans =

0 1 2

whereas
[nan double([1 2])]

ans =

NaN 1 2

A lesson to remember is that NaN is a not-a-number only within the floating-point classes (double and single). It becomes zero when it is converted to an integer class.
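The conversions shown above can be checked in one short script. This is only a sketch; the variable names a, b, c and sentinelIsSafe are made up for the illustration, and isfloat is suggested here as one possible guard before using NaN as a sentinel value.

```matlab
% Sketch: how a concatenated NaN behaves in different numeric classes.
a = [NaN double([1 2])];   % result is double, NaN is preserved
b = [NaN single([1 2])];   % result is single, NaN is preserved
c = [NaN uint16([1 2])];   % result is uint16, NaN has become 0

fprintf('%-6s first element is NaN: %d\n', class(a), isnan(a(1)));
fprintf('%-6s first element is NaN: %d\n', class(b), isnan(b(1)));
fprintf('%-6s first element is NaN: %d\n', class(c), isnan(c(1)));

% A NaN sentinel can only be trusted when the data are floating point.
sentinelIsSafe = isfloat(c);
```

Here sentinelIsSafe is false, because c has class uint16.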

Comment only
05 Sep 2009 Zhigang Xu

This is a very thoughtful suggestion. Thanks! I did a quick experiment, shown below, which confirms that prepending NaN gives the right logical indices only for the double class. I will implement your suggestion in the next update. Thanks again!

>> diff([nan [0 1]])~=0

ans =

1 1

>> diff([nan uint8([0 1])])~=0

ans =

0 1

Comment only
05 Sep 2009 Bruno Luong

To make the function work for integer classes as well, I would suggest modifying the two lines:

d=diff([NaN; x]);
d=[true; d~=0];

to:

d=diff(x);
d=[true; d~=0];

Otherwise an unexpected result is obtained with
[u p]=numunique(uint8([0 1]))
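To see why the change matters, the sketch below compares both forms of the test on the uint8 input from this comment. The names dSentinel and dPlain are made up for the illustration, and x is assumed to be a sorted column vector, as it is inside the function at that point.

```matlab
x = sort(uint8([0 1]).');   % sorted column vector of an integer class

% NaN sentinel: NaN is converted to uint8(0) on concatenation, so the
% first element looks like a repeat of the sentinel and is not flagged.
dSentinel = diff([NaN; x]) ~= 0;

% Sentinel-free form: the first element always starts a new group, and
% the remaining flags come from differences between neighbours.
dPlain = [true; diff(x) ~= 0];

disp(dSentinel.');
disp(dPlain.');
```

With the sentinel, the value 0 is never marked as the start of a group, which explains the unexpected result; the sentinel-free form marks both elements.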

Comment only
04 Sep 2009 Zhigang Xu

Matt,

Thanks for your comments and rating. Your suggestion clearly makes the code more concise. However, logical indexing is usually faster than FIND, as the MATLAB Editor advises. For this reason, I hesitate to implement it for now. I tested your suggestion with a large input array (x=randi(999, 7e6,1)) and did not find any significant improvement in speed, although it was not slower either. Could you send me your test script that shows the significant difference in speed? I will be very happy to implement your suggestion once I understand where the gain comes from. Thanks!

Zhigang

Comment only
04 Sep 2009 Matt Fig

Well done. About the only thing I would change would be to replace these lines:

n = 1:N;
d = diff([nan; x]);
d = d~=0;
n = n(d);

with

n = find(diff([nan; x]));

On my machines this can make a significant difference in speed. Other than that, great code!
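A comparison of the two variants could be set up along the following lines. This is only a sketch of such a test script, not the one used by either commenter; the array size N and the timer variable names are assumptions, and the timings will depend on the machine and release.

```matlab
% Sketch of a timing comparison; N is kept moderate so it runs quickly.
N = 1e6;
x = sort(randi(999, N, 1));

% Variant 1: logical indexing into 1:N.
tic;
n1 = 1:N;
d  = diff([NaN; x]);
d  = d ~= 0;
n1 = n1(d);
tLogical = toc;

% Variant 2: a single call to find (NaN counts as nonzero).
tic;
n2 = find(diff([NaN; x]));
tFind = toc;

% Both variants must select the same start positions.
assert(isequal(n1(:), n2(:)));
fprintf('logical indexing: %.4f s, find: %.4f s\n', tLogical, tFind);
```

A single tic/toc pair is a rough measurement, so repeating each variant several times gives a more reliable figure.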

04 Sep 2009 Bruno Luong

Good coding

04 Sep 2009 Zhigang Xu

The updated version is now up (as of 04-Sep-2009 11:09:21), so please feel free to download it. Thanks! --- Zhigang

Comment only
03 Sep 2009 Zhigang Xu

I found a bug in yesterday's submission, NUMUNIQUE. I have fixed it and already submitted an updated version, which should show up here in a day or so. Those of you who downloaded yesterday's submission (only 5 downloads as of 03-Sep-2009 18:52:49) may simply cut and paste the following function body as the replacement, or re-download the new version when it comes up. Sorry for any inconvenience. Zhigang

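% Remember the input orientation, then work on a sorted column vector.
isrowvec=size(x,1)==1;
x=x(:);
[x s]=sort(x);   % s maps sorted positions back to the original indices
N=numel(x);
n=1:N;
s=s.';

% Flag the first element of each run of equal values.
d=diff([nan; x]);
d=d~=0;
n=n(d);          % start position of each run
x=x(n);          % unique values
K=numel(x);

% Collect the original indices that belong to each unique value.
if K==1
    p{1}=s;
else
    p{K}=s(n(K):N);
    for k=1:K-1
        p{k}=s(n(k):n(k+1)-1);
    end
end

if isrowvec, x=x.'; end   % restore the input orientation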

Comment only
03 Sep 2009 Zhigang Xu

Jos,

You may want to avoid using FIND unnecessarily, since it is slow when the input is a very large array.

Zhigang

Comment only
03 Sep 2009 Jos (10584)

Nice implementation with good help (although missing a see also to UNIQUE). Here is another approach you might consider:

B = unique(A) ;
P = cell(1,numel(B)) ;
for k = 1:numel(B)
    P{k} = find(A==B(k)) ;
end

and in more recent MATLAB versions:
P = arrayfun(@(x) find(x==A),B,'UniformOutput',false)

Comment only
03 Sep 2009

There was a bug and now it has been fixed.

04 Sep 2009

Correction for the two minor typos in the help text.

04 Sep 2009

Corrected the same typos where they appeared in the General Information about the file.

05 Sep 2009

Suggestions from Matt Fig and Bruno Luong are implemented. A third optional output is implemented in case one also needs the representative indices, which are chosen as those for the first occurrences.

05 Sep 2009

Suggestions from Matt Fig and Bruno Luong are implemented, along with handling of the case of unique values only. A third optional output is implemented in case one also needs the representative indices, which are chosen as those for the first occurrences.
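The third output is described only in words here, so its exact name and shape are not known from this page. As a hedged sketch, representative indices for the first occurrences can be derived from a cell array like P as follows; firstIdx is a hypothetical name, and B and P are written out by hand for A = [3 1 3 2 1].

```matlab
% Sketch: pick the first occurrence of each unique value from P.
A = [3 1 3 2 1];
B = [1 2 3];
P = {[2 5], 4, [1 3]};

% The smallest index in each cell is the first occurrence in A.
firstIdx = cellfun(@min, P);

% The representative indices reproduce the unique values.
assert(isequal(A(firstIdx), B));
```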