Thread Subject:
Selecting rows from cell arrays

Subject: Selecting rows from cell arrays

From: Hanah

Date: 26 Aug, 2010 20:11:21

Message: 1 of 5

Hi,

I have a question: Is there an easy way to select/index rows from cell arrays? I have loaded some alphanumeric data into MATLAB using the 'textscan' function. The problem is that I have to analyze this data in a format similar to pivot tables in Excel.

What is the easiest way to select rows from a cell array that contain a specific number or text in a given column? For example:

%cell array "input{}"...
input = [
12 'cats' 2 'girls'
17 'dogs' 5 'boys'
23 'cats' 2 'boys'
];

%select only rows containing the string 'cats' in the second column...
output_1 = [
12 'cats' 2 'girls'
23 'cats' 4 'boys'
];

%select only rows containing the number 2 in the third column...
output_2 = [
12 'cats' 2 'girls'
23 'cats' 2 'boys'
];

%select only rows containing the number 17 in the first column...
output_3 = [
17 'dogs' 5 'boys'
];

%etc...

This is for a very large set of data where computation time is important, so if there is a function/notation to easily achieve this and prevent looping, that would be excellent. I have tried 'grpstats' and 'accumarray' but can't seem to crack it. I will really appreciate any guidance!

Thanks,

- Han

Subject: Selecting rows from cell arrays

From: Oleg Komarov

Date: 26 Aug, 2010 21:39:22

Message: 2 of 5

"Hanah " <zadonix@yahoo.com> wrote in message <i56hp9$lfn$1@fred.mathworks.com>...

It is not clear what you want to do after the selection, but I can say that your grouping criteria don't allow you to manipulate the information you store.

"I have to analyze this data in a format similar to pivot tables "
What do you mean by this?
I wrote a pivot function that replicates the spreadsheet functionality (search for it on the File Exchange). Check it out if it may be useful to you, or explain what you need with simple but precise examples.

Anyway you can achieve the desired output as follows:
input = {
12 'cats' 2 'girls'
17 'dogs' 5 'boys'
23 'cats' 2 'boys'
};

input(strcmp('cats',input(:,2)),:)
ans =
    [12] 'cats' [2] 'girls'
    [23] 'cats' [2] 'boys' % Note the typo in your example

input([input{:,3}] == 2,:)
ans =
    [12] 'cats' [2] 'girls'
    [23] 'cats' [2] 'boys'

input([input{:,1}] == 17,:)
ans =
    [17] 'dogs' [5] 'boys'

You may also find the first four chapters of the Getting Started guide useful.

Oleg

Subject: Selecting rows from cell arrays

From: Hanah

Date: 27 Aug, 2010 04:18:04

Message: 3 of 5

Hello Oleg,

Thank you so much for your elegant and quick solution. I was thinking along those lines, but that does not appear to solve the problem. Was your original variable, "input", identified as a cell array or a matrix? I think there's a complication with the data types that makes it tough to implement your clean solution.

For example, if in your current directory, you create a csv file called "input" with the following data in 3x4 cells:

12 cats 2 girls
17 dogs 5 boys
23 cats 2 boys

And then execute the following in MATLAB:

fid = fopen('input.txt');
input = textscan(fid, '%d8%s%d8%s%*[^\n]', 'delimiter', ',');
fclose(fid);

MATLAB reads the file perfectly, but your commands then fail:

input(strcmp('cats',input(:,2)),:) %Returns an empty cell array: 0-by-4
input([input{:,3}] == 2,:) %??? Index exceeds matrix dimensions.
input([input{:,1}] == 17,:) %??? Index exceeds matrix dimensions.

I tried to debug your first command by modifying the "strcmp" command as follows:

strcmp('cats',input{:,2})

That is, I used curly braces instead of parentheses to index "input". But if I continue with the same logic and write:

input{strcmp('cats',input{:,2}),:}

I get an "Index exceeds matrix dimensions" error. I'm new to MATLAB but intuitively, I know there has to be a simple solution to this. I will keep trying but let me know if there's something I don't see. Thanks much for your time, it's well appreciated!
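
The behavior described above can be reproduced without a file. This is a minimal sketch, assuming the data is comma-delimited and parsed from a string rather than from disk; the names C, wrongMask, and rowMask are illustrative and do not come from the thread. It shows that textscan returns one cell per column, which is why row indexing on the result fails.

```matlab
% Assumption: comma-delimited data, parsed from a string so that the
% example does not depend on a file on disk.
txt = sprintf('12,cats,2,girls\n17,dogs,5,boys\n23,cats,2,boys\n');
C = textscan(txt, '%d8%s%d8%s', 'Delimiter', ',');

disp(size(C))        % 1-by-4: one cell per column, not one row per record
disp(class(C{1}))    % int8: a 3-by-1 numeric column
disp(class(C{2}))    % cell: a 3-by-1 cell array of strings

% C(:,2) is a 1-by-1 cell wrapping the whole string column, so strcmp
% compares 'cats' against a cell (not a string) and returns a single false.
wrongMask = strcmp('cats', C(:,2));

% C{2} unwraps the column, so strcmp returns one logical value per row.
rowMask = strcmp('cats', C{2});
```

Indexing C(wrongMask,:) selects no rows, which matches the 0-by-4 result, and a 3-element mask applied to the single row of C runs past its first dimension, which matches the "Index exceeds matrix dimensions" error.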

- Han

Subject: Selecting rows from cell arrays

From: Matt Fig

Date: 27 Aug, 2010 05:31:05

Message: 4 of 5

Using your suggested text file:

fid = fopen('input.txt');
input = textscan(fid, '%d8%s%d8%s');
fclose(fid);

% Now it seems like what you want is really this, explicitly written out:
I = input;
I = {I{1}(1) I{2}{1} I{3}(1) I{4}{1};
     I{1}(2) I{2}{2} I{3}(2) I{4}{2};
     I{1}(3) I{2}{3} I{3}(3) I{4}{3}}

% Now Oleg's scheme will work.
I(strcmp('cats',I(:,2)),:)
I([I{:,3}] == 2,:)
I([I{:,1}] == 17,:)
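
The explicit write-out above could be generalized with num2cell, so that nothing has to be typed per row. This is only a sketch under assumptions: the data is comma-delimited and parsed from a string, and the names C, T, and nRows are illustrative. The loop runs over columns, not rows.

```matlab
% Assumption: comma-delimited data, parsed from a string for a
% self-contained example.
txt = sprintf('12,cats,2,girls\n17,dogs,5,boys\n23,cats,2,boys\n');
C = textscan(txt, '%d8%s%d8%s', 'Delimiter', ',');

nRows = numel(C{1});
T = cell(nRows, numel(C));        % one cell per value, rows-by-columns
for k = 1:numel(C)
    if iscell(C{k})
        T(:,k) = C{k};            % string column: already one cell per row
    else
        T(:,k) = num2cell(C{k});  % numeric column: wrap each value in a cell
    end
end

% The row-selection scheme from earlier in the thread now applies to T.
catsRows = T(strcmp('cats', T(:,2)), :);
twoRows  = T([T{:,3}] == 2, :);
```

Storing every value in its own cell may use considerably more memory than keeping the columns as textscan returns them, so this form is probably best suited to small or medium tables.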


But what does your real data look like? Is it many thousands of rows but only 4 columns, or vice versa, or what? What are you going to do with these comparison/indexing operations? These questions should be answered before I can make further recommendations.

Subject: Selecting rows from cell arrays

From: Hanah

Date: 27 Aug, 2010 05:55:23

Message: 5 of 5

Hey Matt,

Thanks much for your prompt and helpful response. What I actually meant was a csv file, thus:

fid = fopen('input.csv');
input = textscan(fid, '%d8%s%d8%s');
fclose(fid);

But to clarify, I'm working with a file containing 75,000 rows x 50 columns of alphanumeric data. Some columns have integers, some have floats, some have strings, and some have dates, but the data type is the same within each column. I have to implement some stochastic analyses with all this data.

With this in mind, it will be impossible to write anything out explicitly. The processing I want to do with the data involves grabbing every row that has a given value in a specified column. Using the example above, if I want to grab all rows containing the string 'cats' in the second column, the result should be a 2x4 array.

I will have to run this through a multitude of similar files, so I am trying to avoid looping at this level, especially since it appears that there is a possible simple maneuver. Most recently, I have tried to convert the cell array to a simple matrix using the "cell2mat" function, to see if it's easier to manipulate but I can't do that without having uniform data types.
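
One possible way to grab matching rows without converting the data or looping over rows is to build a logical mask from a single column and apply it to every column. This is a hedged sketch, not a tested solution for the real 75,000 x 50 file: it assumes comma-delimited input parsed from a string, and selectRows is a hypothetical helper defined here, not a built-in function.

```matlab
% Assumption: comma-delimited data, parsed from a string for a
% self-contained example.
txt = sprintf('12,cats,2,girls\n17,dogs,5,boys\n23,cats,2,boys\n');
C = textscan(txt, '%d8%s%d8%s', 'Delimiter', ',');

% Hypothetical helper: apply one row mask to every column in the 1-by-N cell.
selectRows = @(cols, mask) cellfun(@(col) col(mask), cols, ...
                                   'UniformOutput', false);

cats      = selectRows(C, strcmp('cats', C{2}));  % rows with 'cats' in column 2
twos      = selectRows(C, C{3} == 2);             % rows with 2 in column 3
seventeen = selectRows(C, C{1} == 17);            % rows with 17 in column 1
```

Each result keeps the textscan layout (one cell per column), so numeric columns stay numeric and cell2mat is not needed. Masks can also be combined, for example strcmp('cats', C{2}) & C{3} == 2.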

I hope this helps to clarify.

Thanks,

Han
