Merge Category Levels

This example shows how to merge categories in a categorical array using mergelevels. This is useful for collapsing categories with few observations.

Load sample data.

load('carsmall')

Create a nominal array.

The variable Origin is a character array containing the country of origin for 100 sample cars. Convert Origin to a nominal array.

Origin = nominal(Origin);
getlevels(Origin)
ans = 

     France      Germany      Italy      Japan      Sweden      USA 

There are six unique countries of origin in the data.

Tabulate category counts.

Count the observations in each category of the nominal array.

tabulate(Origin)
    Value    Count   Percent
   France        4      4.00%
  Germany        9      9.00%
    Italy        1      1.00%
    Japan       15     15.00%
   Sweden        2      2.00%
      USA       69     69.00%

Each European country of origin (France, Germany, Italy, and Sweden) has relatively few observations.
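
One way to identify such sparse levels programmatically is sketched below. It assumes the levelcounts and getlabels methods for nominal arrays. The cutoff of 10 observations and the variable names counts, labels, and rare are hypothetical choices for illustration, not part of the original example.

% Count the observations in each level (same order as getlevels)
counts = levelcounts(Origin);
% Level labels as a cell array of strings
labels = getlabels(Origin);
% Labels of levels with fewer than 10 observations (hypothetical cutoff)
rare = labels(counts < 10)

Given the counts tabulated above, rare should contain France, Germany, Italy, and Sweden.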

Merge categories.

Merge the categories France, Germany, Italy, and Sweden into one category called Europe.

Origin = mergelevels(Origin,{'France','Germany','Italy','Sweden'},...
                     'Europe');
getlevels(Origin)
ans = 

     Japan      USA      Europe 

The variable Origin now has only three category levels.

Tabulate category counts.

Count the observations in each of the merged categories.

tabulate(Origin)
   Value    Count   Percent
   Japan       15     15.00%
     USA       69     69.00%
  Europe       16     16.00%

The category Europe now contains the 16% of observations that were previously distributed across four countries.
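
This figure can be checked quickly: the merged count should equal the sum of the four original counts, 4 + 9 + 1 + 2 = 16. The line below is a sketch that assumes nominal arrays support == comparison with a level name. The variable name nEurope is hypothetical.

% Number of observations now labeled Europe
nEurope = sum(Origin == 'Europe')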
