Index and Search Using Categorical Arrays

    Note:   The nominal and ordinal array data types might be removed in a future release. To represent ordered and unordered discrete, nonnumeric data, use the MATLAB® categorical data type instead.

Index By Category

It is often useful to index and search data by its category, or group. If you store categories as string labels in a cell array of strings or a char array, indexing and searching by category can be difficult. With categorical arrays, you can easily:

  • Index elements from particular categories. For both nominal and ordinal arrays, you can use the relational operators == and ~= to index the observations that are in, or not in, a particular category. For ordinal arrays, which have an encoded order, you can also use the inequalities >, >=, <, and <= to find observations in categories above or below a particular category.

  • Search for members of a category. In addition to the relational operator ==, you can use ismember to find observations in a particular group.

  • Find elements that are not in a defined category. Categorical arrays display elements that do not belong to a defined category as <undefined>. You can use isundefined to find observations that have no category.

  • Delete observations that are in a particular category. You can use logical indexing to include or exclude observations from particular categories. Even if you remove all observations from a category, the category level remains defined unless you remove it using droplevels.
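
The operations in the list above can be sketched on a small ordinal array. This is a minimal illustration that is separate from the carsmall example below: the size labels and the variable names (rawSizes, sizes, atLeastMedium, smallOrLarge, notSmall) are hypothetical, and the code assumes the ordinal class is available.

```matlab
% Hypothetical data: sizes with the encoded order small < medium < large.
rawSizes = {'medium','large','small','small','medium','large'};
sizes = ordinal(rawSizes, {}, {'small','medium','large'});

% Inequalities follow the level order, not alphabetical order.
atLeastMedium = sizes >= 'medium';

% ismember can test for several categories at once.
smallOrLarge = ismember(sizes, {'small','large'});

% Logical indexing keeps only the observations that are not 'small'.
notSmall = sizes(sizes ~= 'small');
```

Each comparison returns a logical array the same size as sizes, so the result can be used directly as an index.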

Common Indexing and Searching Methods

This example shows several common indexing and searching methods.

Load the sample data.

load('carsmall');

Convert the char array, Origin, to a nominal array. This variable contains the country of origin, or manufacture, for each sample car.

Origin = nominal(Origin);

Search for observations in a category. Determine if there are any cars in the sample that were manufactured in Canada.

any(Origin=='Canada')
ans =

     0

There are no sample cars manufactured in Canada.

List the countries that are levels of Origin.

getlevels(Origin)
ans = 

     France      Germany      Italy      Japan      Sweden      USA 

Index elements that are in a particular category. Plot a histogram of the acceleration measurements for cars made in the U.S.

figure();
hist(Acceleration(Origin=='USA'))
title('Acceleration of Cars Made in the USA')

Delete observations that are in a particular category. Delete all cars made in Sweden from Origin.

Origin = Origin(Origin~='Sweden');
any(ismember(Origin,'Sweden'))
ans =

     0

The cars made in Sweden are deleted from Origin, but Sweden is still a level of Origin.

getlevels(Origin)
ans = 

     France      Germany      Italy      Japan      Sweden      USA 

Remove Sweden from the levels of Origin.

Origin = droplevels(Origin,'Sweden');
getlevels(Origin)
ans = 

     France      Germany      Italy      Japan      USA 
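
Indexing Origin on its own shortens only Origin. The other carsmall variables, such as Acceleration, keep their original length. If several variables need to stay aligned, one option is to build the logical mask once and apply it to each of them. The sketch below loads the data into a struct so that it does not disturb the variables used in this example. The names cars, originAll, keep, originKept, and accelKept are hypothetical.

```matlab
% Load into a struct so the workspace variables of the example are untouched.
cars = load('carsmall');
originAll = nominal(cars.Origin);

% Build the mask once, then apply it to every variable that must stay aligned.
keep = originAll ~= 'Sweden';
originKept = originAll(keep);
accelKept = cars.Acceleration(keep);
```

After this, originKept and accelKept have the same number of elements, so the nth origin still describes the nth acceleration measurement.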

Check for observations not in a defined category. Get the indices for the cars made in France.

ix = find(Origin=='France')
ix =

    11
    27
    39
    61

There are four cars from France. Remove France from the levels of Origin.

Origin = droplevels(Origin,'France');

This command issues a warning indicating that you are dropping a category level that has elements in it. These observations are no longer in a defined category, which is indicated by <undefined>.

Origin(ix)
ans = 

     <undefined> 
     <undefined> 
     <undefined> 
     <undefined> 

You can use isundefined to search for observations with an undefined category.

find(isundefined(Origin))
ans =

    11
    27
    39
    61

These indices correspond to the observations that were in the category France before that category was dropped from Origin.
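
Because the note at the top of this page recommends the categorical data type, the same steps might look roughly like this with categorical. This is a sketch, assuming a MATLAB release that includes categorical. Here categories and removecats play the roles of getlevels and droplevels, and the names cars, originCat, catsBefore, ixFrance, and sameIdx are hypothetical.

```matlab
% Load into a struct and convert the char array to a categorical array.
cars = load('carsmall');
originCat = categorical(cellstr(cars.Origin));

catsBefore = categories(originCat);            % list the categories

% Delete the Sweden observations, then drop the now-empty category.
originCat = originCat(originCat ~= 'Sweden');
originCat = removecats(originCat, 'Sweden');

% Removing a category that still has elements leaves them <undefined>.
ixFrance = find(originCat == 'France');
originCat = removecats(originCat, 'France');
sameIdx = isequal(find(isundefined(originCat)), ixFrance);
```

As with the nominal version, isundefined recovers the indices of the observations whose category was removed.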
