standardizeMissing

Insert missing value indicators into table

Syntax

  • B = standardizeMissing(A,id)
  • B = standardizeMissing(A,id,'DataVariables',vars)

Description

B = standardizeMissing(A,id) replaces all instances of the values specified in id occurring within table A with the standard missing value indicators. B is a table.

The standard missing value indicators depend on the data type:

  • NaN for double and single floating-point arrays

  • <undefined> for categorical arrays

  • empty string, {''}, for cell arrays of strings

  • blank string, [' '], for character arrays

standardizeMissing checks double and single variables in A against the numeric values in id, and checks string and categorical variables in A against the strings in id. It ignores integer variables because integer data types cannot contain NaN.
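
As a minimal sketch of the type rules above, the following builds a small hypothetical table in which the same sentinel value appears in a double variable and an int32 variable. The variable names dblVar and intVar and the sentinel -99 are illustrative assumptions, not part of the function. Based on the description above, only the double variable should change.

```matlab
% Hypothetical data: -99 stands in for "missing" in both variables.
dblVar = [1; -99; 3];
intVar = int32([1; -99; 3]);
A = table(dblVar,intVar);

% Ask for -99 to be treated as a missing value indicator.
B = standardizeMissing(A,-99);

% Per the description above, row 2 of the double variable should become
% NaN, while the int32 variable is expected to be left untouched.
isnan(B.dblVar)
isequal(B.intVar,A.intVar)
```

If a sentinel in an integer variable must be flagged, one option is to convert that variable to double before calling standardizeMissing.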

B = standardizeMissing(A,id,'DataVariables',vars) replaces values only in variables specified by vars.

Examples

Replace All Instances of Specified Values

Create a table containing Inf and 'NA' to represent missing values.

dblVar = [NaN;3;Inf;7;9];
cellstrVar = {'one';'three';'';'NA';'nine'};
charVar = ['A';'C';'E';' ';'I'];
categoryVar = categorical({'red';'yellow';'blue';'violet';''});

A = table(dblVar,cellstrVar,charVar,categoryVar)
A = 

    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

    NaN       'one'         A          red        
      3       'three'       C          yellow     
    Inf       ''            E          blue       
      7       'NA'                     violet     
      9       'nine'        I          <undefined>

Replace all instances of Inf with NaN and replace all instances of 'NA' with the empty string, ''.

B = standardizeMissing(A,{Inf,'NA'})
B = 

    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

    NaN       'one'         A          red        
      3       'three'       C          yellow     
    NaN       ''            E          blue       
      7       ''                       violet     
      9       'nine'        I          <undefined>

Replace Only Values in Specified Variables

Replace instances of Inf and 'N/A' that occur in specified variables of a table with the standard missing value indicators.

Create a table containing Inf and 'N/A' to represent missing values.

a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];

A = table(a,x,y)
A = 

        a         x      y  
    _________    ___    ____

    'alpha'        1      57
    'bravo'      NaN     732
    'charlie'      3      93
    ''           Inf    1398
    'N/A'          5     Inf

For the variables a and x, replace instances of Inf with NaN and 'N/A' with the empty string, ''.

B = standardizeMissing(A,{Inf,'N/A'},'DataVariables',{'a','x'})
B = 

        a         x      y  
    _________    ___    ____

    'alpha'        1      57
    'bravo'      NaN     732
    'charlie'      3      93
    ''           NaN    1398
    ''             5     Inf

Inf in the variable y remains unchanged because y is not included in the 'DataVariables' name-value pair argument.

Input Arguments

A — Input table
table

Input table, specified as a table.

id — Nonstandard missing value indicators
numeric vector | string | cell array containing numeric values and strings

Nonstandard missing value indicators, specified as a numeric vector, string, or cell array containing numeric values and strings.

vars — Subset of variables to consider
positive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Subset of variables to consider, specified as a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.
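
The accepted forms of vars can be sketched as follows, reusing the table from the "Replace Only Values in Specified Variables" example. The output names B1, B2, and B3 are illustrative. The three calls are intended to select the same two variables, a and x, by name, by position, and by logical mask.

```matlab
% Same table as in the "Replace Only Values in Specified Variables" example.
a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];
A = table(a,x,y);
id = {Inf,'N/A'};

% Three ways to select the variables a and x.
B1 = standardizeMissing(A,id,'DataVariables',{'a','x'});          % variable names
B2 = standardizeMissing(A,id,'DataVariables',[1 2]);              % positions
B3 = standardizeMissing(A,id,'DataVariables',[true true false]);  % logical vector

% isequaln treats NaN values as equal to each other, unlike isequal.
isequaln(B1,B2)
isequaln(B1,B3)
```

Selecting by name is usually the more robust choice, since positions and logical masks silently point at different variables if the table's column order changes.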

Output Arguments

B — Output table
table

Output table, returned as a table. The table can store metadata such as descriptions, variable units, variable names, and row names. For more information, see Table Properties.

More About

Algorithms

standardizeMissing treats leading and trailing white space differently for cell arrays of strings, character arrays, and categorical arrays.

  • For cell arrays of strings, standardizeMissing does not ignore white space. All strings must match exactly a string specified in id.

  • For character arrays, standardizeMissing ignores trailing white space.

  • For categorical arrays, standardizeMissing ignores leading and trailing white space.
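
The exact-match rule for cell arrays of strings can be illustrated with a short sketch. The variable name c and the padded entry ' N/A' are hypothetical, chosen only to show the effect of a leading space. According to the rule above, only the entry that matches id exactly should be replaced.

```matlab
% Hypothetical cell array of strings: the first entry has a leading space.
c = {' N/A';'N/A'};
A = table(c);

% 'N/A' is the only nonstandard missing value indicator requested.
B = standardizeMissing(A,'N/A');

% Per the rule above, ' N/A' is expected to survive because it does not
% match 'N/A' exactly, while the second entry should become ''.
B.c
```

If padded entries should also be treated as missing, one option is to trim the strings with strtrim before calling standardizeMissing.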
