Thread Subject: Creating a new vector based on unique entries

Subject: Creating a new vector based on unique entries

From: Anna Chen

Date: 14 May, 2009 13:46:02

Message: 1 of 11

Hello there,
I have a question about something that I just can't get to work correctly. I have a dataset imported from Excel that looks like this:

cat 2
cat 3
cat 0
dog 4
dog 5
mouse 6

The first column is a cell array of strings and the second is a double array. How would I create a vector with three elements: the first averaging all the "cats," the second averaging all the "dogs," and the third averaging all the "mice"? I've been trying combinations of for and while loops, but the different data classes are messing me up!

Thanks!

Subject: Creating a new vector based on unique entries

From: Andrey Rubshtein

Date: 14 May, 2009 14:01:02

Message: 2 of 11

a very quick and dirty solution to give you an idea

x = {'cat','cat','cat','dog','dog','mouse'};
y = [2,3,0,4,5,6];

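% strmatch returns the indices of the entries that begin with the given string
% (pass 'exact' as a third argument to require a full match)
cats = strmatch('cat',x);
dogs = strmatch('dog',x);
mice = strmatch('mouse',x);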

catsAvg = mean(y(cats));
dogsAvg = mean(y(dogs));
miceAvg = mean(y(mice));

disp(catsAvg);
disp(dogsAvg);
disp(miceAvg);


Now write a loop that works out which different animals there are in x, and do it in a more generic way.
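
As a rough sketch of the generic loop suggested above: unique can list the distinct labels, and strcmp can pick out the matching entries for each one. The names `names`, `avg` and `mask` are made up for illustration; `x` and `y` are the same as in the example.

```matlab
x = {'cat','cat','cat','dog','dog','mouse'};
y = [2,3,0,4,5,6];

names = unique(x);                 % distinct labels, sorted alphabetically
avg = zeros(1, numel(names));      % one average per label

for k = 1:numel(names)
    mask = strcmp(names{k}, x);    % logical mask: true where x equals this label exactly
    avg(k) = mean(y(mask));        % average of the matching values
end

disp(names);
disp(avg);
```

Unlike strmatch without the 'exact' option, strcmp only matches whole strings, so a label such as 'caterpillar' would not be counted as a 'cat'.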

Subject: Creating a new vector based on unique entries

From: Lothar Schmidt

Date: 14 May, 2009 13:58:46

Message: 3 of 11

like this?

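k = 0;
% assumes animal.type is a 1-by-N (row) cell array, so that the loop
% visits one unique type per iteration (for iterates over columns)
for tmp = unique(animal.type),
    k = k + 1;
    list.name{k} = tmp{1};                   % tmp is a 1-by-1 cell; store the string itself
    index = find(strcmp(tmp, animal.type));  % positions of the entries equal to tmp
    list.mean(k) = mean(animal.num(index));  % average of the matching numbers
end
list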

Subject: Creating a new vector based on unique entries

From: Anna Chen

Date: 14 May, 2009 14:24:02

Message: 4 of 11

Thanks for everyone's help! I guess I was having trouble with the string matching.
Lothar, I have a question for you. When you do find(strcmp(tmp, animal.type)), won't that fail because tmp will be a cell with 3 elements and animal.type has 5?
Just wanted to understand the logic and improve my skills!



Subject: Creating a new vector based on unique entries

From: Lothar Schmidt

Date: 14 May, 2009 16:08:18

Message: 5 of 11

Anna Chen wrote:
> Thanks for everyone's help! I guess I was having trouble with the string matching.
> Lothar, I have a question for you. When you do find(strcmp(tmp, animal.type)), won't that fail because tmp will be a cell with 3 elements and animal.type has 5?
> Just wanted to understand the logic and improve my skills!

supposing that

animal.type{1}='cat'
animal.type{2}='dog'
...
animal.num(1)=4
animal.num(2)=7
...

tmp will be a cell containing one of the animal types.

strcmp(tmp,animal.type)

compares every animal type to the current tmp (the type of animal) and
gives you 1 where type==tmp and 0 where type~=tmp.
find(this_logical) gives you the indices of the matching animal types.

mean(animal.num(index)) gives the mean of the corresponding numbers.

Does this answer your question?
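
To make the three steps concrete, here is a small sketch on made-up data. The first two entries follow the example above; the remaining entries and the variable name `mask` are hypothetical and only there to give the comparison something to find.

```matlab
% first two entries as in the example above; the last two are invented
animal.type = {'cat','dog','cat','mouse'};
animal.num  = [4, 7, 2, 9];

tmp = {'cat'};                        % a 1-by-1 cell holding one animal type

mask  = strcmp(tmp, animal.type);     % logical vector, true where the type equals 'cat'
index = find(mask);                   % positions of the matches
m     = mean(animal.num(index));      % mean of the numbers at those positions

disp(mask);
disp(index);
disp(m);
```

Here mask is [1 0 1 0], index is [1 3], and m is the mean of 4 and 2. The find call is optional, since animal.num(mask) selects the same elements.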

Subject: Creating a new vector based on unique entries

From: Anna Chen

Date: 14 May, 2009 18:42:01

Message: 6 of 11

Hi Lothar,
I had to change it to

tmp = unique(animal.type);
for k = 1:numel(tmp)
    list.name{k} = tmp{k};
    index = find(strcmp(tmp{k}, animal.type));
    list.mean(k) = mean(animal.num(index));
end

or else the two things in the strcmp wouldn't match.

Thanks so much for your help and your explanations! Now I have a better grasp on this =)
Also, I never knew before that creating list.x and list.y would make a list with "x" and "y"!
Thanks again!



Subject: Creating a new vector based on unique entries

From: us

Date: 14 May, 2009 19:33:01

Message: 7 of 11


one of the many solutions

% the data
% - note: your data combined in one cell for sake of brevity...
     d={
          'cat' 1
          'cat' 2
          'dog' 4
          'cat' 3
          'dog' 8
          'mouse' -10
     };
% the engine
     n=cat(1,d{:,2}); % <- your 2nd data set...
     [du,ix,ix]=unique(d(:,1));
     r=accumarray(ix,n,[],@mean);
% the result
     disp([du,num2cell(r)]);
%{
    'cat' [ 2]
    'dog' [ 6]
    'mouse' [-10]
%}

us

Subject: Creating a new vector based on unique entries

From: us

Date: 14 May, 2009 19:58:01

Message: 8 of 11

"us"
the broken TMW newsreader starts to get on my nerves...

one of the many solutions (hope it works this time...)
- copy/paste

% the data
% - note: your data combined in one cell for sake of brevity...
     d={
          'cat' 1
          'cat' 2
          'dog' 4
          'cat' 3
          'dog' 8
          'mouse' -10
     };
% the engine
     n = cat(1,d{:,2}); % <- your 2nd data set...
     [du,ix,ix] = unique(d(:,1));
     r = accumarray(ix,n,[],@mean);
% the result
     disp([du,num2cell(r)]);
%{
    'cat' [ 2]
    'dog' [ 6]
    'mouse' [-10]
%}

us
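
For readers unfamiliar with the two calls in "the engine", the sketch below prints the intermediate values on the same data. It assumes MATLAB R2009b or later for the `~` placeholder (the posted code reuses `ix` for the second and third outputs instead); the name `labels` is made up for illustration.

```matlab
labels = {'cat';'cat';'dog';'cat';'dog';'mouse'};
n      = [1; 2; 4; 3; 8; -10];

% du: the distinct labels (sorted)
% ix: for every row of labels, the position of its label within du
[du, ~, ix] = unique(labels);
ix = ix(:);                           % accumarray expects a column of subscripts

% accumarray collects the values of n that share a subscript in ix
% and applies @mean to each group
r = accumarray(ix, n, [], @mean);

disp(du);
disp(ix.');
disp(r.');
```

With this data ix is [1 1 2 1 2 3], so rows 1, 2 and 4 are averaged into r(1), rows 3 and 5 into r(2), and row 6 into r(3).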

Subject: Creating a new vector based on unique entries

From: Siyi

Date: 14 May, 2009 20:10:16

Message: 9 of 11


us, I realized that accumarray(...,@mean) can be slower than using


r = accumarray(ix,n)./accumarray(ix,1);
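
A quick way to see that the two forms agree is to run both on the data from the earlier post. This is only an equivalence check, not a benchmark; the names `r_mean` and `r_fast` are made up, and `ix` and `n` are written out by hand to keep the snippet self-contained.

```matlab
ix = [1; 1; 2; 1; 2; 3];              % group index per row (cat, cat, dog, cat, dog, mouse)
n  = [1; 2; 4; 3; 8; -10];            % the values to average

r_mean = accumarray(ix, n, [], @mean);          % generic form: apply @mean per group
r_fast = accumarray(ix, n) ./ accumarray(ix, 1); % per-group sum divided by per-group count

disp([r_mean, r_fast]);
```

One difference to keep in mind: if some index value never occurs in ix, the @mean form leaves 0 in that slot, while the sum/count form yields 0/0, i.e. NaN.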

Subject: Creating a new vector based on unique entries

From: us

Date: 14 May, 2009 20:32:02

Message: 10 of 11

Siyi
> us, I realized that accumarray(...,@mean) can be slower than using
> r = accumarray(ix,n)./accumarray(ix,1);

that may well be correct in this particular case...
however, i used the more generic syntax for educational purposes - e.g., what if the OP needs @sum or some other function applied to the clusters...

just a thought
us
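
To illustrate the point about the generic syntax, the sketch below swaps other functions into the fourth argument of accumarray. The choice of @max and of a range function is arbitrary, and the result names are made up; `ix` and `n` repeat the data from the earlier post.

```matlab
ix = [1; 1; 2; 1; 2; 3];              % group index per row
n  = [1; 2; 4; 3; 8; -10];            % values per row

r_max   = accumarray(ix, n, [], @max);                  % largest value per group
r_range = accumarray(ix, n, [], @(v) max(v) - min(v));  % spread per group
r_count = accumarray(ix, 1);                            % number of rows per group

disp([r_max, r_range, r_count]);
```

Only the function handle changes between the calls, which is what makes the four-argument form convenient when the per-group statistic is not yet fixed.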

Subject: Creating a new vector based on unique entries

From: Jos

Date: 15 May, 2009 11:21:02

Message: 11 of 11


You might be interested in my GROUP2CELL function:

animals = {'cat','cat','cat','dog','dog','mouse'};
val = [2,3,0,4,5,6];

%engine
[R,gri] = group2cell(val,animals)
avg = cellfun(@mean,R) ;

% display result
[animals(gri).' num2cell(avg)]

GROUP2CELL can be found here:
http://www.mathworks.com/matlabcentral/fileexchange/11192

hth
Jos
