Create Categorical Arrays

This example shows how to create a categorical array. The categorical data type stores values drawn from a finite set of discrete categories. These categories can have a natural order, but an order is not required. A categorical array provides efficient storage and convenient manipulation of data while keeping meaningful names for the values. Categorical arrays are often used in a table to define groups of rows.

By default, categorical arrays contain categories that have no mathematical ordering. For example, the discrete set of pet categories {'dog' 'cat' 'bird'} has no meaningful mathematical ordering, so MATLAB® uses the alphabetical ordering {'bird' 'cat' 'dog'}. Ordinal categorical arrays, by contrast, contain categories that have a meaningful mathematical ordering. For example, the discrete set of size categories {'small', 'medium', 'large'} has the mathematical ordering small < medium < large.

Create Categorical Array from Cell Array of Strings

You can use the categorical function to create a categorical array from a numeric array, logical array, cell array of strings, or an existing categorical array.

Create a 1-by-11 cell array of strings containing state names from New England.

state = {'MA','ME','CT','VT','ME','NH','VT','MA','NH','CT','RI'};

Convert the cell array of strings, state, to a categorical array that has no mathematical order.

state = categorical(state)

class(state)
state = 

  Columns 1 through 9

     MA      ME      CT      VT      ME      NH      VT      MA      NH 

  Columns 10 through 11

     CT      RI 


ans =

categorical

List the discrete categories in the variable state.

categories(state)
ans = 

    'CT'
    'MA'
    'ME'
    'NH'
    'RI'
    'VT'

The categories are listed in alphabetical order.
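
As noted at the start of this section, the categorical function also accepts numeric and logical input. The following is a minimal sketch of both cases; the variable names ratings, flags, and answers and the category names 'no' and 'yes' are hypothetical and chosen only for illustration.

```matlab
% Numeric input: each distinct value becomes a category.
ratings = categorical([3 1 2 1 3]);
ratingNames = categories(ratings);   % names are strings, sorted by value

% Logical input with explicit (hypothetical) category names.
flags = [true false true true];
answers = categorical(flags,[false true],{'no','yes'});
answerNames = categories(answers);

% Neither array is ordinal unless 'Ordinal',true is specified.
ratingsAreOrdinal = isordinal(ratings);
answersAreOrdinal = isordinal(answers);
```

Passing a value set and category names, as in the logical case, lets you control both the order and the labels of the categories.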

Create Ordinal Categorical Array from Cell Array of Strings

Create a 1-by-8 cell array of strings containing the sizes of eight objects.

AllSizes = {'medium','large','small','small','medium',...
            'large','medium','small'};

The cell array of strings, AllSizes, has three distinct values: 'large', 'medium', and 'small'. A cell array of strings offers no convenient way to indicate that small < medium < large.

Convert the cell array of strings, AllSizes, to an ordinal categorical array. Use valueset to specify the values small, medium, and large, which define the categories. For an ordinal categorical array, the first category specified is the smallest and the last category is the largest.

valueset = {'small','medium','large'};
sizeOrd = categorical(AllSizes,valueset,'Ordinal',true)

class(sizeOrd)
sizeOrd = 

  Columns 1 through 6

     medium      large      small      small      medium      large 

  Columns 7 through 8

     medium      small 


ans =

categorical

The conversion does not reorder the elements: the values in the categorical array, sizeOrd, appear in the same order as in AllSizes.

List the discrete categories in the categorical variable, sizeOrd.

categories(sizeOrd)
ans = 

    'small'
    'medium'
    'large'

The categories are listed in the specified order to match the mathematical ordering small < medium < large.
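
Because sizeOrd is ordinal, relational operators, sort, min, and max follow the category order rather than alphabetical order. The following sketch rebuilds sizeOrd so that it runs on its own; the result variable names are hypothetical.

```matlab
AllSizes = {'medium','large','small','small','medium',...
            'large','medium','small'};
valueset = {'small','medium','large'};
sizeOrd = categorical(AllSizes,valueset,'Ordinal',true);

% Relational operators compare each element against a category name.
atLeastMedium = sizeOrd >= 'medium';

% Sorting and min/max use the order small < medium < large.
sortedSizes = sort(sizeOrd);
smallest = min(sizeOrd);
largest = max(sizeOrd);
```

The same comparisons are not defined for a categorical array created without 'Ordinal',true, which supports only equality tests.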

Create Ordinal Categorical Array by Binning Data

Create a vector of 100 integers between 1 and 44.

x = gallery('integerdata',44,[100,1],1);

Use the histc function to create three bins for the data from x. Put all values from 1 up to, but not including, 15 into bin one; all values from 15 up to, but not including, 30 into bin two; and all values from 30 up to, but not including, 45 into bin three. Each bin includes its left endpoint and excludes its right endpoint.

[~,bin] = histc(x,[1,15,30,45]);

bin is a 100-by-1 vector indicating the bin number for each entry from x.
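
To see how the bin edges are applied, it can help to run histc on a few values that sit at the boundaries. In this small sketch, probe and probeBin are hypothetical names and the probe values are chosen by hand.

```matlab
edges = [1,15,30,45];
probe = [1 14 15 29 30 44];   % hand-picked values at the bin boundaries

% The second output is the bin index of each element of probe.
[~,probeBin] = histc(probe,edges);
```

Here 14 falls into bin one while 15 falls into bin two, and 29 falls into bin two while 30 falls into bin three, which confirms that each bin includes its left edge and excludes its right edge.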

Create an ordinal categorical array, sizeOrd2, where the three bins become the categories small, medium, and large.

valueset = 1:3;
catnames = {'small','medium','large'};

sizeOrd2 = categorical(bin,valueset,catnames,'Ordinal',true);

sizeOrd2 is a 100-by-1 ordinal categorical array with three categories, such that small < medium < large.

Use the summary function to print a summary of the categorical array.

summary(sizeOrd2)
     small       33 
     medium      36 
     large       31 

There are 33 elements in the category small, 36 in the category medium, and 31 in the category large.
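
The categorical array can also be used to index back into the original data. The following sketch repeats the steps above so that it runs on its own, then selects the values of x in a given category; largeValues, smallValues, and counts are hypothetical names.

```matlab
x = gallery('integerdata',44,[100,1],1);
[~,bin] = histc(x,[1,15,30,45]);
sizeOrd2 = categorical(bin,1:3,{'small','medium','large'},'Ordinal',true);

% Logical indexing: original values that fell into a given category.
largeValues = x(sizeOrd2 == 'large');
smallValues = x(sizeOrd2 == 'small');

% countcats returns the per-category counts as a numeric vector,
% in the same category order that summary displays.
counts = countcats(sizeOrd2);
```

Use countcats instead of summary when you need the counts as data for further computation instead of as printed output.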
