Search for a part of a string in cell of cell array

Hello everyone,
I've got some trouble with searching for a string in a cell array. I've tried looking it up, but I didn't find a solution that works for me.
In fact, I got so confused with cell arrays or cell-in-cell-arrays that I don't know which one I am searching in... When I use the class-function on my 1x68 cell "VarNames" it says:
class(VarNames)
ans = cell
class(VarNames(1))
ans = cell
class(VarNames{1})
ans = char
So do I have a cell-in-cell-array or just a cell array with strings? And how do I search in it? It looks like this:
VarNames =
Columns 1 through 5
'Trend 1 - abc def' 'Trend 2 - ghi mmm jkl' 'Trend 3 - mno ttt kkk pqr' 'Trend 4 - stu vwx aaa' []
Columns 6 through 10
'Trend 6 - yzz yxw' 'Trend 7 - vut rrr srq' 'Trend 8 - pon mlk' 'Trend 9 - jih gfe'
So there are empty cells in it as well as a non-constant amount of chars. If I want to look for the cell-index in which the string "vwx" is included, how do I do it? I tried
find(strcmp('vwx',VarNames),1)
ans =
Empty matrix: 1-by-0
I also tried lots of other functions like strncmp, regexp or using cellfun to access the cell array, but I can't get it to work. Any advice?
Thanks in advance!
J.
  6 Comments
Stephen23 on 15 Dec 2015
So people can read it: Cell array indexing is simple:
  • curly braces {} refer to the cell contents.
  • parentheses () refer to the cells themselves.
Note that this usage of parentheses () is consistent in all of MATLAB: if you use them to access part of a numeric matrix, you get a numeric matrix. Use them to access part of a table, you get a table. Use them to access part of a cell array, you get a cell array. Ditto character, etc.
This is totally different to curly braces {}, which refer to the data inside a cell array or table.
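A minimal sketch of the rule above, assuming a small made-up cell array C (the variable names are illustrative only and are not from the original post):

```matlab
% Hypothetical three-element cell array: two strings and one empty numeric.
C = {'abc', 'defg', []};

sub  = C(1);      % parentheses: a 1x1 cell array containing 'abc'
item = C{1};      % curly braces: the char array 'abc' itself
ch   = item(1);   % parentheses on a char array return a char: 'a'

class(sub)        % cell
class(item)       % char
class(ch)         % char
class(C{3})       % double, because [] is an empty numeric array
```

The last line is the case that matters later in this thread: an empty [] stored in a cell is a double, not a string.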
dpb on 15 Dec 2015
Yes, that's true as far as it goes. The problems arise, as they did for the OP, when one actually tries to use the content of the cell array in general programming constructs; that is when the interactions and gotchas appear.
As in the example we finally got to further down: it turns out that the OP had placed an empty numeric cell among the rest of a cell array of strings. That hiccuped when addressed by the string functions (not surprisingly, agreed), but to the newer user it isn't at all apparent at the higher level where things went south.
So granted, it's not that the individual components are difficult; it's that in application one gets into situations that aren't that easy to solve. Some of these are errors, as in the OP's case, which he solved by correcting the underlying issue (although he didn't really recognize the issue precisely as being one of type, not content). Others come from the fact that, when addressing portions of interest for a given computation, the result isn't always just a cell or a whatever; it's a list of whatevers, which has its own set of rules. On top of which there's the problem that you can't always even generate the desired list, as illustrated above.


Answers (2)

dpb on 10 Dec 2015
It's a cellstr array (a cell array of strings), and yes, it can be confusing how to address them. The content of a cell is a character string referenced by "the curlies", whereas the cell itself, or a set of cells, is addressed with regular parens. There's no difference in type between V and V(1); the latter is just a single cell of the former array. The V{} notation, however, returns the CONTENT of the cell, which is something entirely different: it may be another cell, a double, a handle or, as in this case, a string (which is, in turn, an array of char).
I put a subset into a shorter variable name V here for illustration --
>> whos V % a cell array of nine elements
Name Size Bytes Class Attributes
V 1x9 832 cell
>> v=V(1); whos v % make an object of the first cell -- same excepting just one element now
Name Size Bytes Class Attributes
v 1x1 94 cell
>> v=V{1}; whos v % same thing for the _content_; it contains a character string (array of char)
Name Size Bytes Class Attributes
v 1x17 34 char
>> v=v(1); whos v % the content within -- one element is just a single letter now...
Name Size Bytes Class Attributes
v 1x1 2 char
>> v
v =
T
>> V(1)
ans =
'Trend 1 - abc def'
>> V{1}
ans =
Trend 1 - abc def
>>
Note in particular the difference between the last two: the cell is displayed inside single quotes, which mark it as a cellstring; the content is shown simply as the string of characters themselves, with no quotes.
Cell strings are a great convenience, as you can reference them with a single subscript and many functions are enhanced to handle them. But, granted, it is a somewhat roundabout road to get results such as you're looking for from the search functions, etc., because operations on cell arrays tend to return cell arrays.
To find your substring, try something like
>> ~cellfun(@isempty,strfind(V,'vwx'))
ans =
0 0 0 1 0 0 0 0 0
which is a logical array of T for the cell index(/indices) which contain the subject string, F elsewhere. You can get the single numeric array index by wrapping this in a call to find --
>> find(~cellfun(@isempty,strfind(V,'vwx')))
ans =
4
>>
Note here that cellfun is the way to get the results of the test for the various cell content compacted into an "ordinary" vector from the cell array of individual results. It takes some getting used to, but with familiarity it becomes more or less natural.
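To make the intermediate step visible, here is a small sketch of what strfind returns before cellfun compacts it. The three-element V below is a hypothetical stand-in for the real list and contains only strings:

```matlab
% Hypothetical cell array of strings (no non-char cells).
V = {'Trend 1 - abc def', 'Trend 4 - stu vwx aaa', ''};

hits = strfind(V, 'vwx');          % cell array: start indices per cell, [] where there is no match
mask = ~cellfun(@isempty, hits);   % logical vector, true where a match was found
idx  = find(mask);                 % numeric index of the matching cell(s)
```

Here hits is itself a 1x3 cell array, which is why it cannot be passed to find directly; cellfun reduces it to one logical value per cell first.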
  5 Comments
Johannes Elfner on 15 Dec 2015
Yea, strcmp was not the right choice... Especially if used like above. But I tried it in all kinds of combinations and the way I wrote it down here was just what I thought it would be the easiest way to show my problem. And well, I failed at that... ;) I've been using strcmp and mostly strncmp in the right way in my code, but still didn't get any results. I guess it was due to empty cells like in cell 3 of V:
V =
'abc' 'abc' [] 'abc' 'abc'
My workaround is to fill all empty cells with strings:
IndexEmptyFields = cellfun('isempty',V);
V(IndexEmptyFields) = {'-'};
Using this workaround searching the cell-array works fine using either your cellfun-code or strncmp. Thanks for your help!
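For reference, a short sketch of why the strcmp attempt from the question finds nothing: strcmp tests whether whole strings are equal, while strfind looks for a substring. The list V below is a made-up, shortened example:

```matlab
% Hypothetical shortened version of the list from the question.
V = {'Trend 3 - mno', 'Trend 4 - stu vwx aaa', 'vwx'};

whole = strcmp('vwx', V);                        % true only where the entire string equals 'vwx'
part  = ~cellfun(@isempty, strfind(V, 'vwx'));   % true wherever 'vwx' occurs as a substring

find(whole)   % 3
find(part)    % 2 3
```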
dpb on 15 Dec 2015
V = 'abc' 'abc' [] 'abc' 'abc'
There's your problem, indeed--
>> class(V{3})
ans =
double
>>
You're trying to treat cell content that isn't a string as if it were; "You can't do that!" :) It's the right idea above but to be safer, use the empty string {''} to fill the locations (or, better yet, use that when creating the cellstr array in the first place); then there's still the "isempty" return to identify the locations but the string functions will function as intended.
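A small sketch of that suggestion, using the five-element V from the comment above. The iscellstr check is an addition for illustration and is not something the original posts used:

```matlab
V = {'abc', 'abc', [], 'abc', 'abc'};

before = iscellstr(V);               % false: cell 3 holds a double, not a char array

emptyIdx = cellfun(@isempty, V);     % locate the empty cells
V(emptyIdx) = {''};                  % fill them with the empty string

after      = iscellstr(V);           % true: every cell now holds a char array
stillEmpty = cellfun(@isempty, V);   % the empty string still reports as empty
```

After the replacement the empty location is still identifiable through isempty, but every cell is now of class char.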



Guillaume on 15 Dec 2015
VarNames = {'Trend 1 - abc def', ...
'Trend 2 - ghi mmm jkl', ...
'Trend 3 - mno ttt kkk pqr', ...
'Trend 4 - stu vwx aaa', ...
[], ...
'Trend 6 - yzz yxw', ...
'Trend 7 - vut rrr srq', ...
'Trend 8 - pon mlk', ...
'Trend 9 - jih gfe'};
VarNames(cellfun(@isempty, VarNames)) = {''}; %replace empty matrices by empty strings
idx = find(~cellfun(@isempty, strfind(VarNames, 'vwx'))) %find cells that match
As to indexing () vs {}:
  • () returns a portion of the cell array. It always outputs a cell array, even if you only ask for one cell. You use this when you want to manipulate the container and don't care about what's in the cells.
  • {} returns the content of the cell array. It is always of the type of whatever is inside the cells. You use this when you want to access what is in the container.
