Thread Subject:
Is there a better way to count number of occurrences of string when grouping?

Subject: Is there a better way to count number of occurrences of string when grouping?

From: Tyler

Date: 20 Mar, 2012 18:15:54

Message: 1 of 7

Hi everyone:

I have a dataset svdatm with columns:

% dnum (datenum)
% symid (symbol id - a cell array)

I would like to find the number of unique elements (i.e., symid values)
that occur each day.

% Use a loop - this becomes slow for large array
mydates = unique(svdatm.dnum);
NDAYS = length(mydates);
symcount = nan(NDAYS,1);
for i = 1:NDAYS
    idxDay = svdatm.dnum==mydates(i);
    tmpD = svdatm(idxDay,:);
    symcount(i) = length(unique(tmpD.symid));
end

I've tried to speed up the process using something like the following,

symcount = unstack(svdatm,'symid','dnum','AggregationFun',@numel);

or similarly grpstats, but these all fail because the function is being
applied to a cell array. Is there a better way to do this?

Cheers,

t.

Subject: Is there a better way to count number of occurrences of string

From: Peter Perkins

Date: 20 Mar, 2012 19:30:43

Message: 2 of 7

Tyler, it's pretty hard to say without actually knowing more about what
your data look like, but you might try changing the body of your loop to

for i = 1:NDAYS
     symidDay = svdatm.symid(svdatm.dnum==mydates(i));
     symcount(i) = length(unique(symidDay));
end
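
The loop can also be avoided altogether. What follows is only a sketch, not something posted in the thread: it assumes dnum is a numeric column and symid is a cell array of strings, and it uses small made-up sample vectors in place of svdatm.dnum and svdatm.symid. The idea is to turn both columns into integer group indices with unique, reduce them to distinct (day, symbol) pairs, and count the pairs per day with accumarray.

```matlab
% Hypothetical sample data standing in for svdatm.dnum and svdatm.symid
dnum  = datenum(2012,3,20) + [0; 0; 0; 1; 1; 2];
symid = {'AAA'; 'BBB'; 'AAA'; 'AAA'; 'CCC'; 'BBB'};

% Map each date and each symbol to an integer group index
[mydates, ~, dayIdx] = unique(dnum);
[~, ~, symIdx]       = unique(symid);

% Keep one row per distinct (day, symbol) pair
pairs = unique([dayIdx(:) symIdx(:)], 'rows');

% Count the distinct pairs that fall on each day
symcount = accumarray(pairs(:,1), 1, [numel(mydates) 1]);
```

With the sample data, symcount has one entry per row of mydates. Whether this beats the loop on a large dataset would need to be timed on the real data.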




Subject: Is there a better way to count number of occurrences of string when grouping?

From: Bruno Luong

Date: 20 Mar, 2012 20:19:36

Message: 3 of 7

Tyler <hayes.tyler@gmail.com> wrote in message <9fac79ee-3505-4307-b232-b280773e7cbe@ow8g2000pbc.googlegroups.com>...
> Hi everyone:
>
> I have a dataset svdatm with columns:
>
> % dnum (datenum)
> % symid (symbol id - a cell array)
>
> I would like to find the count or number of elements (i.e., symid)
> that occur each day.
>
> % Use a loop - this becomes slow for large array
> mydates = unique(svdatm.dnum);
> NDAYS = length(mydates);
> symcount = nan(NDAYS,1);
> for i = 1:NDAYS
> idxDay = svdatm.dnum==mydates(i);
> tmpD = svdatm(idxDay,:);
> symcount(i) = length(unique(tmpD.symid));
> end
>

Please provide an example of svdatm. It is not clear what kind of structure svdatm is.

Bruno

Subject: Is there a better way to count number of occurrences of string

From: Peter Perkins

Date: 20 Mar, 2012 20:43:14

Message: 4 of 7

On 3/20/2012 4:19 PM, Bruno Luong wrote:
> Please provide an example of svdatm. It is not clear what kind of
> structure is svdatm.
>
> Bruno

I believe this is a dataset array from the Statistics Toolbox.

Subject: Is there a better way to count number of occurrences of string

From: Tyler

Date: 20 Mar, 2012 20:57:18

Message: 5 of 7

Peter is correct; it is a MATLAB "dataset" array with the two variables.

Cheers,

t.

Subject: Is there a better way to count number of occurrences of string

From: Bruno Luong

Date: 20 Mar, 2012 21:05:15

Message: 6 of 7

Tyler <hayes.tyler@gmail.com> wrote in message <8a63ef11-c4f4-4ab1-b624-e0a4ced5da5f@qg3g2000pbc.googlegroups.com>...
> Peter is correct; it is a MATLAB "dataset" array with the two variables.
>

Let me reiterate my request: Please provide an example of svdatm.

Bruno
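
For illustration, a small dataset array of the kind described might be built as below. The values are invented for this sketch and are not Tyler's data, and the dataset constructor requires the Statistics Toolbox. The loop suggested earlier in the thread is then run against it.

```matlab
% Hypothetical sample values; the real svdatm was never posted
dnum  = datenum(2012,3,20) + [0; 0; 0; 1; 1; 2];
symid = {'AAA'; 'BBB'; 'AAA'; 'AAA'; 'CCC'; 'BBB'};

% Build a dataset array with the two variables named in the question
svdatm = dataset({dnum, 'dnum'}, {symid, 'symid'});

% Count the unique symid values on each day
mydates  = unique(svdatm.dnum);
NDAYS    = length(mydates);
symcount = nan(NDAYS, 1);
for i = 1:NDAYS
    symidDay    = svdatm.symid(svdatm.dnum == mydates(i));
    symcount(i) = length(unique(symidDay));
end
```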

Subject: Is there a better way to count number of occurrences of string when grouping?

From: Tom Lane

Date: 20 Mar, 2012 22:11:00

Message: 7 of 7

> I have a dataset svdatm with columns:
>
> % dnum (datenum)
> % symid (symbol id - a cell array)
>
> I would like to find the count or number of elements (i.e., symid)
> that occur each day.

In addition to the other suggestions, consider trying

   crosstab(svdatm.dnum, svdatm.symid)

I don't know if it would be any faster, and it might require too much memory
if there are lots of unique symid values, but it's a thought.

-- Tom
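
crosstab returns a days-by-symbols table of occurrence counts rather than the per-day number of distinct symbols, so one more step is needed. A minimal sketch, assuming the Statistics Toolbox is available and using made-up sample vectors in place of svdatm.dnum and svdatm.symid:

```matlab
% Hypothetical sample data standing in for svdatm.dnum and svdatm.symid
dnum  = datenum(2012,3,20) + [0; 0; 0; 1; 1; 2];
symid = {'AAA'; 'BBB'; 'AAA'; 'AAA'; 'CCC'; 'BBB'};

% tbl(i,j) = number of rows with the i-th date and the j-th symbol
tbl = crosstab(dnum, symid);

% A symbol occurs on a day if its cell is nonzero; count those per row
symcount = sum(tbl > 0, 2);
```

The table has one column per unique symid, which is where the memory concern mentioned above comes from.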
