Thread Subject:
Count occurrences by row

Subject: Count occurrences by row

From: Namo Namo

Date: 16 Jul, 2010 18:17:20

Message: 1 of 13

Say I have a matrix

a = [ 1 2 2
  2 3 3
  1 4 5 ];

I want to count the occurrences of 1, 2, ..., 5, but by the number of rows each value appears in. That is, although 2 appears 3 times, it appears in only 2 rows. I can count the total occurrences with tabulate(a(:)), histc, accumarray, etc. But to count the rows, I am using a for loop:

N = max(a(:));        % largest value to count
count = zeros(1, N);  % preallocate the result
for i = 1:N
    count(i) = sum( any( a==i, 2) );  % number of rows containing i
end

Any advice on how to speed this up for large N? Thanks.

Subject: Count occurrences by row

From: us

Date: 16 Jul, 2010 18:45:20

Message: 2 of 13

one of the solutions

% the data
     a=[
          1 2 2
          2 3 3
          1 4 5
     ];
% the engine
     nx=1:max(a(:));
     n=histc(a.',nx);
     n=sum(n~=0,2);
% the result
     disp([nx;n.']);
%{
          1 2 3 4 5 % <= unique val...
          2 2 1 1 1 % <- #rows
%}

us

Subject: Count occurrences by row

From: Sean

Date: 16 Jul, 2010 19:09:04

Message: 3 of 13

Another solution:

A = [2 3 2; 4 2 5; 1 3 7];
Acell = cellfun(@unique,mat2cell(A,ones(1,3),3),'UniformOutput',false);
U = unique(A);
[n] = histc(cell2mat(Acell'),U);
table = [U, n'] % U is the value, n is the number of rows containing it

Subject: Count occurrences by row

From: Namo Namo

Date: 16 Jul, 2010 19:13:04

Message: 4 of 13

I have actually used the histc approach suggested by us before. It becomes slow for me because my a has about 1e5 rows and 5 columns, and max(a(:)) = 4000. Running histc on each row of a means sorting 5 numbers into 4000 bins 1e5 times, which takes a long time and generates a very large n :-(. Thanks anyway.
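
To put numbers on the sizes mentioned above, here is a rough back-of-the-envelope sketch (not from the thread). It assumes the intermediate counts matrix is stored as a full double array with 8 bytes per element; the variable names are made up for illustration.

```matlab
% Rough size of the intermediate n = histc(a.', nx) for the quoted sizes.
% Assumption: full double storage, 8 bytes per element.
nRows = 1e5;     % rows of a
nCols = 5;       % columns of a
nBins = 4000;    % max(a(:)), i.e. the number of histc bins

bytesFull = nBins * nRows * 8;   % one count per (bin, row) pair
fprintf('full counts matrix: %.1f GB\n', bytesFull / 1e9);

% Each row of a contributes at most nCols nonzero counts,
% so almost all of that matrix is zeros.
fillRatio = (nRows * nCols) / (nBins * nRows);
fprintf('nonzero fraction at most: %.4f%%\n', 100 * fillRatio);
```

That is about 3.2 GB for a matrix in which at most 0.125% of the entries can be nonzero.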

Subject: Count occurrences by row

From: Bruno Luong

Date: 16 Jul, 2010 19:18:04

Message: 5 of 13

Another solution:

a=[1 2 2;
2 3 3;
1 4 5];

sum(accumarray([mod((0:numel(a)-1)',size(a,1))+1 a(:)],1)>0)

Bruno
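
For readers unpacking the one-liner above, here is a step-by-step sketch of the same idea on the small example. The names rowIdx, hits and count are introduced only for illustration, and repmat is swapped in for the mod expression (they produce the same row indices).

```matlab
a = [1 2 2; 2 3 3; 1 4 5];
[nr, nc] = size(a);

% Row number of every element of a(:) (column-major order).
% Equivalent to mod((0:numel(a)-1)', nr) + 1.
rowIdx = repmat((1:nr)', nc, 1);

% hits(r, v) = number of times value v occurs in row r.
hits = accumarray([rowIdx a(:)], 1);

% A value counts once per row, however often it occurs in that row.
count = sum(hits > 0, 1);
disp(count)   % 2 2 1 1 1
```

Passing the dimension explicitly to sum keeps the result correct even if a happens to have a single row.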

Subject: Count occurrences by row

From: us

Date: 16 Jul, 2010 19:40:24

Message: 6 of 13

the problem with all these nice solutions
- given the OP's mat size...

     a=ceil(4000*rand(1e5,5));
     sum(accumarray([mod((0:numel(a)-1)',size(a,1))+1 a(:)],1)>0)
%{
??? Error using ==> accumarray
Out of memory. Type HELP MEMORY for your options.
%}
% same with the HISTC approach...

us

Subject: Count occurrences by row

From: Namo Namo

Date: 16 Jul, 2010 19:52:04

Message: 7 of 13

Indeed. Sorry, I should have made my question clearer.

I know there is a tradeoff between time and memory. Vectorized code often needs more memory to hold indices. I guess I will just go with the for loop for now. Thanks for all the replies!

Subject: Count occurrences by row

From: Bruno Luong

Date: 16 Jul, 2010 19:57:04

Message: 8 of 13

Well, the OP can break the matrix into smaller pieces; there is no need to eat an elephant all at once.

Bruno
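
A minimal sketch of the chunking idea, not code from the thread: process the rows in blocks so the dense accumarray intermediate stays small, and add up the per-block row counts. The block size of 1000 rows is an arbitrary assumption (about 32 MB per block for 4000 values), and the variable names are hypothetical.

```matlab
a = ceil(4000*rand(1e5,5));   % test data of the size described in the thread
N = max(a(:));
blockSize = 1000;             % assumed block size; tune to available memory
count = zeros(1, N);

for first = 1:blockSize:size(a,1)
    last = min(first + blockSize - 1, size(a,1));
    blk  = a(first:last, :);
    nr   = size(blk, 1);

    % Row number (within the block) of every element of blk(:).
    rowIdx = repmat((1:nr)', size(blk,2), 1);

    % Dense nr-by-N table of counts for this block only.
    hits = accumarray([rowIdx blk(:)], 1, [nr N]);

    % Rows of this block containing each value, added to the running total.
    count = count + sum(hits > 0, 1);
end
```

Because each row belongs to exactly one block, summing the per-block counts gives the same result as processing the whole matrix at once.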

Subject: Count occurrences by row

From: Sean

Date: 16 Jul, 2010 20:04:04

Message: 9 of 13

tic
A=ceil(4000*rand(1e5,5));
Acell = cellfun(@unique,mat2cell(A,ones(1,size(A,1)),size(A,2)),'UniformOutput',false);
U = unique(A);
[n] = histc(cell2mat(Acell'),U);
table = [U, n']; % U is the value, n is the number of rows containing it
toc
%Elapsed time is 4.247591 seconds.
%table = 4000 x 2 double

Subject: Count occurrences by row

From: Bruno Luong

Date: 16 Jul, 2010 20:06:07

Message: 10 of 13

Almost the same solution, but using a sparse matrix so that it works for large sizes:

a=ceil(4000*rand(1e5,5));

full(sum( sparse(mod((0:numel(a)-1)',size(a,1))+1, a(:), 1) > 0,1))

% Bruno
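
As a sanity check (a sketch added here, not part of the original post), the sparse version can be compared against the for loop from the first message on the small example. It relies on sparse(i, j, v) summing the values of repeated (i, j) pairs, so S holds at most numel(a) nonzeros regardless of how many distinct values there are.

```matlab
a = [1 2 2; 2 3 3; 1 4 5];
[nr, nc] = size(a);
rowIdx = repmat((1:nr)', nc, 1);   % same as mod((0:numel(a)-1)', nr) + 1

% S(r, v) = number of times value v occurs in row r, stored sparsely.
S = sparse(rowIdx, a(:), 1);
count = full(sum(S > 0, 1));

% Reference result from the loop in the original question.
N = max(a(:));
ref = zeros(1, N);
for i = 1:N
    ref(i) = sum(any(a == i, 2));
end

assert(isequal(count, ref));   % both give 2 2 1 1 1
```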

Subject: Count occurrences by row

From: Sean

Date: 16 Jul, 2010 20:13:21

Message: 11 of 13

That wins:
%Elapsed time is 0.199475 seconds.

Subject: Count occurrences by row

From: us

Date: 16 Jul, 2010 21:05:09

Message: 12 of 13

but with a lot of unnecessary data...

     a=[ceil(10*rand(1e5,5))+100000;ceil(10*rand(1e5,5))+200000];
     r=full(sum( sparse(mod((0:numel(a)-1)',size(a,1))+1,a(:),1)>0,1));
     whos r;
%{
  Name Size Bytes Class Attributes
  r 1x200010 1600080 double
%}
% while
     au=unique(a);
     size(au)
% ans = 20 1

just a pedestrian thought...
us
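
One possible way to avoid the unused columns, sketched here as an assumption rather than something posted in the thread: use the third output of unique to map each value to a compact index first, so the result has one entry per distinct value. The names k, rowIdx, count and result are made up for this example.

```matlab
a = [ceil(10*rand(1e5,5))+100000; ceil(10*rand(1e5,5))+200000];

% au holds the distinct values; k maps each element of a(:) to its index in au.
[au, ~, k] = unique(a);

[nr, nc] = size(a);
rowIdx = repmat((1:nr)', nc, 1);   % row number of each element of a(:)

% nr-by-numel(au) sparse table instead of nr-by-max(a(:)).
S = sparse(rowIdx, k(:), 1, nr, numel(au));
count = full(sum(S > 0, 1));

result = [au, count.'];   % column 1: value, column 2: number of rows containing it
```

The extra call to unique costs a sort of all elements, so this trades some time for a much smaller result when the values are widely spread.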

Subject: Count occurrences by row

From: Bruno Luong

Date: 16 Jul, 2010 21:33:03

Message: 13 of 13

I assume it is no coincidence that the OP posted this related thread:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/287043

Bruno
