MATLAB Examples

Merge Category Levels

This example shows how to merge levels of a nominal array using mergelevels. Merging is useful for collapsing categories that have few observations.

Load sample data.

load carsmall

Create a nominal array.

The variable Origin is a character array containing the country of origin for 100 sample cars. Convert Origin to a nominal array.

Origin = nominal(Origin);
getlevels(Origin)
ans = 

  1x6 nominal array

     France      Germany      Italy      Japan      Sweden      USA 

There are six unique countries of origin in the data.

Tabulate category counts.

Count the number of observations in each level of the nominal array.

tabulate(Origin)
    Value    Count   Percent
   France        4      4.00%
  Germany        9      9.00%
    Italy        1      1.00%
    Japan       15     15.00%
   Sweden        2      2.00%
      USA       69     69.00%

Each European country has relatively few observations: France, Germany, Italy, and Sweden together account for only 16 of the 100 cars.
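
The sparse levels can also be identified programmatically instead of by reading the table. The following is a minimal sketch, assuming a cutoff of 10 observations; the cutoff and the variable names minCount and sparseLabels are illustrative choices, not part of the original example.

```matlab
% Sketch: list the levels that have fewer observations than a chosen cutoff.
load carsmall
Origin = nominal(Origin);

counts = levelcounts(Origin);   % number of observations in each level
labels = getlabels(Origin);     % level names, in the same order as counts

minCount = 10;                  % hypothetical cutoff, chosen for illustration
sparseLabels = labels(counts < minCount)
```

Given the counts tabulated above, a cutoff of 10 would select the four European countries and leave Japan and USA unchanged.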

Merge categories.

Merge the categories France, Germany, Italy, and Sweden into one category called Europe.

Origin = mergelevels(Origin,{'France','Germany','Italy','Sweden'},...
                     'Europe');
getlevels(Origin)
ans = 

  1x3 nominal array

     Europe      Japan      USA 

The variable Origin now has only three category levels.

Tabulate category counts.

Count the number of observations in each level after the merge.

tabulate(Origin)
   Value    Count   Percent
  Europe       16     16.00%
   Japan       15     15.00%
     USA       69     69.00%

The category Europe now contains 16% of the observations, which were previously distributed across four countries.
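
In newer MATLAB releases, the categorical data type is recommended over nominal arrays, and the same merge can be written with mergecats. The following is a sketch of that equivalent workflow, assuming a release that includes categorical arrays; the variable name OriginCat is introduced here for illustration only.

```matlab
% Sketch: the same merge using a categorical array instead of a nominal array.
load carsmall

% Origin is a character array; cellstr converts each row to a character
% vector and removes trailing padding before the conversion.
OriginCat = categorical(cellstr(Origin));

% Merge the four European countries into a single category.
OriginCat = mergecats(OriginCat,{'France','Germany','Italy','Sweden'},...
                      'Europe');

categories(OriginCat)   % category names after the merge
countcats(OriginCat)    % number of observations in each category
```

Here categories and countcats play the roles of getlevels and tabulate in the nominal version of the example.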