# categorical

Create categorical array

Use `categorical` to create a categorical array from data with values from a finite set of discrete categories. To group numeric data into categories, use `discretize`.

## Syntax

• `B = categorical(A)`
• `B = categorical(A,valueset)`
• `B = categorical(A,valueset,catnames)`
• `B = categorical(A,___,Name,Value)`

## Description

`B = categorical(A)` creates a categorical array from the array `A`. The categories of `B` are the sorted unique values from `A`.

For more information on creating and using categorical arrays, see Categorical Arrays.

`B = categorical(A,valueset)` creates one category for each value in `valueset`. The categories of `B` are in the same order as the values of `valueset`.

You can use `valueset` to include categories for values not present in `A`. Conversely, if `A` contains any values not present in `valueset`, the corresponding elements of `B` are undefined.
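As a small illustration of the undefined-element behavior, the following sketch uses hypothetical inputs (not taken from the examples on this page) in which one value of `A` is absent from `valueset`. It assumes the `isundefined` function is available to flag elements that did not map to any category.

```matlab
% Hypothetical data: 'c' appears in A but not in valueset.
A = {'a' 'b' 'c' 'a'};
valueset = {'a' 'b' 'd'};    % 'd' becomes a category with no elements

B = categorical(A,valueset);

cats = categories(B);        % categories follow the order of valueset
tf = isundefined(B);         % true where A had a value outside valueset
```

Here `tf` is true only for the third element, and `cats` lists `'a'`, `'b'`, and `'d'` even though `'d'` never occurs in `A`.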

`B = categorical(A,valueset,catnames)` names the categories in `B` by matching the category values in `valueset` with the names in `catnames`.

`B = categorical(A,___,Name,Value)` creates a categorical array with additional options specified by one or more `Name,Value` pair arguments. You can include any of the input arguments in previous syntaxes.

For example, you can specify that the categories have a mathematical ordering.

## Examples


### Create Categorical Array from Strings

Convert a cell array of strings to a categorical array.

Create a cell array of strings.

`A = {'r' 'b' 'g'; 'g' 'r' 'b'; 'b' 'r' 'g'}`
```
A = 
    'r'    'b'    'g'
    'g'    'r'    'b'
    'b'    'r'    'g'
```

`A` is a 3-by-3 cell array containing three unique values.

Convert the cell array of strings, `A`, to a categorical array, `B`.

`B = categorical(A)`
```
B = 
     r      b      g 
     g      r      b 
     b      r      g 
```

The contents of `B` match the contents of `A`.

Display the categories of `B`.

`categories(B)`
```
ans = 
    'b'
    'g'
    'r'
```

The categories of `B` are the unique values from `A` in alphabetical order.

### Create Categorical Array and Specify Possible Unique Values

Convert a cell array of strings, `A`, to a categorical array. Specify a list of categories that includes values that are not present in `A`.

Create a cell array of strings.

`A = {'republican' 'democrat'; 'democrat' 'democrat'; 'democrat' 'republican'}`
```
A = 
    'republican'    'democrat'  
    'democrat'      'democrat'  
    'democrat'      'republican'
```

`A` is a 3-by-2 cell array containing two unique values.

Convert the cell array of strings, `A`, to a categorical array, `B`, and include a category for `independent`.

```
valueset = {'democrat' 'republican' 'independent'};
B = categorical(A,valueset)
```
```
B = 
     republican      democrat   
     democrat        democrat   
     democrat        republican 
```

The contents of `B` match the contents of `A`.

Display the categories of `B`.

`categories(B)`
```
ans = 
    'democrat'
    'republican'
    'independent'
```

The categories of `B` are in the same order as the values specified in `valueset`.

### Create Categorical Array and Specify Category Names

Create a cell array of strings.

`A = {'r' 'b' 'g'; 'g' 'r' 'b'; 'b' 'r' 'g'}`
```
A = 
    'r'    'b'    'g'
    'g'    'r'    'b'
    'b'    'r'    'g'
```

`A` is a 3-by-3 cell array containing three unique values.

Convert the cell array of strings, `A`, to a categorical array, `B`, and specify category names.

`B = categorical(A,{'r' 'g' 'b'},{'red' 'green' 'blue'})`
```
B = 
     red        blue      green 
     green      red       blue  
     blue       red       green 
```

`B` uses the specified category names for the contents from `A`.

Display the categories of `B`.

`categories(B)`
```
ans = 
    'red'
    'green'
    'blue'
```

The categories of `B` are in the order they were specified.

### Create Categorical Array from Integers

Create a 2-by-3 numeric array.

`A = gallery('integerdata',3,[2,3],3)`
```
A =
     2     1     2
     1     1     3
```

`A` contains the values `1`, `2`, and `3`.

Convert the numeric array, `A`, to a categorical array. Use the values `1`, `2`, and `3` to define the categories `car`, `bus`, and `bike`, respectively.

```
valueset = 1:3;
catnames = {'car' 'bus' 'bike'};
B = categorical(A,valueset,catnames)
```
```
B = 
     bus      car      bus  
     car      car      bike 
```

`categorical` maps the numeric values in `valueset` to the category names in `catnames`.

The 2-by-3 categorical array, `B`, is not ordinal. Therefore, you can only compare the values in `B` for equality. To compare the values in `B` using relational operators, such as less than and greater than, you must include the `'Ordinal',true` name-value pair argument.
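The equality comparison described above can be sketched as follows. To keep the snippet self-contained, `A` is entered directly from the output displayed in this example rather than regenerated with `gallery`. The variable names `isCar` and `tfOrdinal` are illustrative only.

```matlab
% Values copied from the displayed output of A in this example.
A = [2 1 2; 1 1 3];
B = categorical(A,1:3,{'car' 'bus' 'bike'});

isCar = (B == 'car');        % element-wise equality against a category name
tfOrdinal = isordinal(B);    % false, because 'Ordinal' was not specified
```

`isCar` is a logical array the same size as `B`. A relational comparison such as `B < 'bus'` is not supported for this array because it is not ordinal.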

### Create Ordinal Categorical Array from Integers

Create a 5-by-2 numeric array.

`A = gallery('integerdata',3,[5,2],1)`
```
A =
     3     2
     3     3
     3     2
     2     1
     3     2
```

`A` contains the values `1`, `2`, and `3`.

Convert the numeric array, `A`, to an ordinal categorical array where `1`, `2`, and `3` represent `child`, `adult`, and `senior`, respectively.

```
valueset = [1:3];
catnames = {'child' 'adult' 'senior'};
B = categorical(A,valueset,catnames,'Ordinal',true)
```
```
B = 
     senior      adult  
     senior      senior 
     senior      adult  
     adult       child  
     senior      adult  
```

Since `B` is ordinal, the categories of `B` have a mathematical ordering, `child < adult < senior`.
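Because the ordering is defined, relational operators become available. The following is a minimal sketch, with `A` entered directly from the output displayed in this example. The names `isGrown` and `youngest` are illustrative.

```matlab
% Values copied from the displayed output of A in this example.
A = [3 2; 3 3; 3 2; 2 1; 3 2];
valueset = 1:3;
catnames = {'child' 'adult' 'senior'};
B = categorical(A,valueset,catnames,'Ordinal',true);

isGrown = (B >= 'adult');    % relational comparison against a category name
youngest = min(B(:));        % smallest category that occurs in B
```

`isGrown` is false only for the single `child` element, and `youngest` is `child`.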

### Create Categorical Array by Binning Numeric Data

Use the `discretize` function (instead of `categorical`) to bin 100 random numbers into three categories.

```
x = rand(100,1);
y = discretize(x,[0 .25 .75 1],'categorical',{'small','medium','large'});
summary(y)
```
```
     small       22 
     medium      46 
     large       32 
```

## Input Arguments


### `A` — Input array

numeric array | logical array | categorical array | cell array of strings | ...

Input array, specified as a numeric array, logical array, categorical array, or cell array of strings.

If `A` contains missing values, the corresponding elements of `B` are `<undefined>`. Missing values are `NaN` for numeric arrays, the empty string (`''`) for cell arrays of strings, and `<undefined>` for categorical arrays. `B` does not have a category for undefined values. To create an explicit category for missing or undefined values, you must include the desired category name in `catnames`, and `NaN`, the empty string, or `<undefined>` in `valueset`.
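One way to read the last sentence is sketched below for numeric input. The data and the category name `'missing'` are hypothetical, and the snippet assumes that pairing `NaN` in `valueset` with a name in `catnames` behaves as described above.

```matlab
% Hypothetical numeric data with one missing value.
A = [1 NaN 2 1];

% Pair NaN in valueset with an explicit category name in catnames.
valueset = [1 2 NaN];
catnames = {'one' 'two' 'missing'};
B = categorical(A,valueset,catnames);

cats = categories(B);                 % includes the 'missing' category
nUndefined = nnz(isundefined(B));     % count of elements left undefined
```

Checking `nUndefined` is a quick way to confirm whether the missing value was assigned to the explicit category.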

In addition to an array, `A` can be an object with the following class methods:

• `unique`

• `eq`

### `valueset` — Values to define categories

`unique(A)` (default) | vector of unique values

Values to define categories, specified as a vector of unique values. The data type of `valueset` and the data type of `A` must be the same.

### `catnames` — Category names

cell array of strings

Category names, specified as a cell array of strings. If you do not specify the `catnames` input argument, `categorical` uses the values in `valueset` as category names.

To merge multiple distinct values in `A` into a single category in `B`, include duplicate names corresponding to those values.
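Merging with duplicate names might look like the following sketch. The data and the names `'low'` and `'high'` are hypothetical.

```matlab
% Hypothetical data: map 1 and 2 to 'low', 3 and 4 to 'high'.
A = [1 2 3 4 2 3];
valueset = 1:4;
catnames = {'low' 'low' 'high' 'high'};   % duplicate names merge values

B = categorical(A,valueset,catnames);
cats = categories(B);                     % only two categories remain
```

`B` contains `low low high high low high`, and `cats` lists `'low'` and `'high'`.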

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside single quotes (`' '`). You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'Ordinal',true` specifies that the categories have a mathematical ordering.

### `'Ordinal'` — Mathematical ordering indicator

`false` (default) | `true` | `0` | `1`

Mathematical ordering indicator, specified as the comma-separated pair consisting of `'Ordinal'` and either `false`, `true`, `0`, or `1`.

| Value | Description |
| --- | --- |
| `false` | `categorical` creates a categorical array that is not ordinal. This is the default behavior. The categories of `B` have no mathematical ordering. Therefore, you can only compare the values in `B` for equality. |
| `true` | `categorical` creates an ordinal categorical array. The categories of `B` have a mathematical ordering, such that the first category specified is the smallest and the last category is the largest. You can compare the values in `B` using relational operators, such as less than and greater than, in addition to comparing the values for equality. |

### `'Protected'` — Category protection indicator

`false` | `true` | `0` | `1`

Category protection indicator, specified as the comma-separated pair consisting of `'Protected'` and either `false`, `true`, `0`, or `1`. The categories of ordinal categorical arrays are always protected. The default value is `true` when you specify `'Ordinal',true` and `false` otherwise.

| Value | Description |
| --- | --- |
| `false` | When you assign new values to `B`, the categories update automatically. Therefore, you can combine (nonordinal) categorical arrays that have different categories. The categories can update accordingly to include the categories from both arrays. |
| `true` | When you assign new values to `B`, the values must belong to one of the existing categories. Therefore, you can only combine arrays that have the same categories. To add new categories to `B`, you must use the function `addcats`. |
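The difference between the two settings can be sketched as follows. The arrays `U` and `P` and the category `'c'` are hypothetical, and `isprotected` is used only to report the protection state.

```matlab
% Unprotected array: assigning a new value adds its category.
U = categorical({'a' 'b'});
U(3) = 'c';                        % 'c' is added as a new category

% Protected array: add the category explicitly before assigning.
P = categorical({'a' 'b'},'Protected',true);
P = addcats(P,'c');
P(3) = 'c';                        % allowed because 'c' now exists

tfU = isprotected(U);              % false
tfP = isprotected(P);              % true
```

Without the call to `addcats`, the assignment `P(3) = 'c'` results in an error because `'c'` is not an existing category of the protected array.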