File Exchange

Unique Rows for a cell array

version 1.1 (2.98 KB) by

Find unique rows of a cell array containing columns with strings or scalars, or N-D matrices

18 Ratings



unique(cA,'rows') does not work for cell arrays; the 'rows' option is ignored (at least as of R2009b). This function handles that case. It works for strings, scalars, or both, but each column must contain only one or the other.
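
To make the idea in the description concrete, here is a minimal sketch of one common way to get unique rows from a cell array whose columns each hold either strings or scalars. It is not the submission's implementation: the variable names and sample data are hypothetical, the approach (encode each column as integer codes, then call unique on the numeric matrix) is a swapped-in technique, and the 'stable' option assumes R2012a or later.

```matlab
% Hypothetical sample data: column 1 holds scalars, column 2 holds strings.
cA = {1, 'red'; 2, 'green'; 1, 'red'; 2, 'blue'};

nCols = size(cA, 2);
codes = zeros(size(cA, 1), nCols);   % one integer code per cell
for k = 1:nCols
    col = cA(:, k);
    if iscellstr(col)
        [~, ~, codes(:, k)] = unique(col);            % string column
    else
        [~, ~, codes(:, k)] = unique(cell2mat(col));  % scalar column
    end
end

% Unique rows of the numeric code matrix, kept in order of first appearance.
% Assumption: the 'stable' option is available (R2012a or later).
[~, ia] = unique(codes, 'rows', 'stable');
uniqueCA = cA(ia, :);
```

Note that unique treats every NaN as distinct, so this sketch does not merge rows that differ only by NaN entries.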

Comments and Ratings (23)

per isakson

Thanks for sharing this function!

I made COL_ORDER an input argument according to the tip in the comment on line 36. That exposed an issue:

line 105 reads
tmp = char(iCA(ndx,k)); %Nice and quick
but it should read
tmp = char(iCA(ndx,colUse));


ghills

nice & fast

Thanks for publishing this, but I am having some issues with this function... I am calling it on a character cell array and I am getting the following error:

"Error using char
Cell elements must be character arrays.

Error in uniqueRowsCA (line 101)
tmp = char(iCA(ndx,k)); %Nice and quick"

When I tried removing the char() call from the statement above, I got an error saying that the next function used, sortrowsc(), does not exist.

Jim Hokanson


My guess is that you have NaN values. This function by default considers NaN values to be equal. This can be changed by passing in false as the 2nd input to this function.

If that's not the case then I'd be more than willing to get an example data set from you to try and figure out where the problem is arising.
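
As background for the NaN remark above, the following sketch shows how MATLAB's built-in comparisons treat NaN, which is why a unique-rows routine has to choose whether NaN values count as equal. It uses only built-in functions; isequaln assumes R2012a or later, and the variable names are illustrative.

```matlab
% NaN never compares equal to itself with == or isequal.
nanEqOp  = (NaN == NaN);        % false
nanIsEq  = isequal(NaN, NaN);   % false

% isequaln (R2012a and later) treats NaN values as equal to each other.
nanIsEqN = isequaln(NaN, NaN);  % true

% The built-in unique keeps every NaN as a separate value.
u = unique([1 NaN NaN 1]);      % [1 NaN NaN]
```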




I am trying to use your function, and am running into something strange. I have a matrix M, size 2520x3, that I turn into a cell array, size 2520x4 with the first column being all letters.

I run unique on the original M matrix first, to get an idea of how many rows I would be losing. I end up with a matrix that is 1194x3. One would assume that just adding a letter would have no bearing on the uniqueness of the remaining rows, so I should end up with the same number of rows as with the original matrix.

When I use your function, I have an 1105x4 cell array.

Do you have any idea what could be a problem?

Thank you

Jim Hokanson

Hi Arnold,

I'm not sure I understand what is going wrong in your case. If you send an example input to me I can try and make it work for your case.



arnold

It would be great if this worked on mixed cell arrays. If I use it on a cell array with mixed entries (numbers and strings), it gives me this error:

Error using char
Cell elements must be character arrays.
Error in uniqueRowsCA (line 101)
tmp = char(iCA(ndx,k)); %Nice and quick


Alize

Good job, worked fast on a 6000x3 ca of strings.

Mike Shen



Tony

Jessica Lam


Ulrik

Input check not implemented well, e.g. if nargin == 2 then it will fail with an error.


Ulrik


Nate

Great job. Exactly what I was looking for.


Worked for me fast (1s) on a 50000x3 array of strings


Nick

Excellent, and works very fast even on 100,000+ rows.


Sriram

is way faster than the code by Matt Fig for cell arrays of strings

epoxy patch

thanks - exactly what i needed, and it works well

Jim Hokanson

Thanks for the comment Matt. The approach you suggest is definitely faster and more flexible for a small example like the one shown. Unfortunately the approach shown becomes nearly intractable for a larger number of rows when the number of unique rows is also large.

I tried a modified A matrix, that followed the rules outlined in the code. I updated the code tonight to begin to handle N-D matrices (currently they need to be of the same size, although I have a decent idea on how I can change the code to handle mismatches).

A = {1,'red', [3 2;4 5]; ...
2,'green', [4 4;5 6]};

To get more samples I added the following code:
A = repmat(A,[500 1]); %I had 2 rows to begin with
A(:,1) = num2cell(rand(size(A,1),1));

The second line gives the variability and can be commented out to see how large an effect multiple output rows can have.

On my computer the calculation took 30 ms for 1000 rows, compared to 10 ms with your code. However, if I make the numeric entries random, my code takes about the same amount of time as before, but your code takes 2 seconds. For 10000 rows, this changes to approximately 200 seconds (I only ran it once and got 213) vs. 0.2 seconds for my code.

Interestingly, one of the slowest aspects of this code turned out to be cell2mat, which is called by sortrows. I rewrote my own version based on a news thread online. It is a bit frustrating, though, that TMW doesn't have a really quick way of going from something like {1 2 3} to [1 2 3].
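
For the simple case mentioned above, a cell array of scalars such as {1 2 3}, one widely used idiom is to expand the cell into a comma-separated list and concatenate it directly. This is only a sketch of that idiom, not the replacement routine described in the comment, and no timing claim is made for it. It assumes every cell holds a scalar (or arrays with compatible sizes); the variable names are illustrative.

```matlab
c = {1 2 3};

% c{:} expands to the comma-separated list 1, 2, 3, so the brackets
% concatenate the contents horizontally.
rowVec = [c{:}];          % [1 2 3]

% vertcat stacks the same list as a column instead.
colVec = vertcat(c{:});   % [1; 2; 3]
```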


Loginatorist

This is much faster, and has no requirement as to types or ordering:

ii = 1;
while ii<size(A,1)
    tf = true(size(A,1),1);
    for jj = ii:size(A,1)
        if isequal(A(ii,:),A(jj,:)) && ii~=jj
            tf(jj) = 0; % mark later duplicates of row ii for removal
        end
    end
    A = A(tf,:);
    ii = ii + 1;
end

For example, try it with:
A = {1,'red',magic(3);'blue',magic(4),'green';1,'red',magic(3)}



Increased speed. Put in support for matrices. Put in options for the usual additional outputs for a "unique" function in Matlab.

MATLAB Release
MATLAB 7.9 (R2009b)
