standardizeMissing

Insert missing value indicators into table

Syntax

  • B = standardizeMissing(A,id)
  • B = standardizeMissing(A,id,'DataVariables',vars)

Description

B = standardizeMissing(A,id) replaces all instances of the values specified in id occurring within table A with the standard missing value indicators. B is a table.

The standard missing value indicators depend on the data type:

  • NaN for double and single floating-point arrays

  • <undefined> for categorical arrays

  • empty character vector, {''}, for cell arrays of character vectors

  • blank character vector, [' '], for character arrays

standardizeMissing checks double and single variables in A against numeric values from id and checks character vector and categorical variables in A against character vectors from id. The function standardizeMissing ignores integer data types because they cannot contain NaN.
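The type rule above can be sketched with a small table. The variable names d and n and the sentinel value -99 are hypothetical, chosen only for illustration. Because n is an integer variable, it is expected to be skipped, while the matching entry in the double variable d should become NaN.

```matlab
% Hypothetical data: -99 marks a missing reading in both variables.
d = [1.5; -99; 3.2];          % double variable
n = int32([10; -99; 30]);     % integer variable (cannot contain NaN)
T = table(d, n);

% Treat the numeric value -99 as a nonstandard missing value indicator.
B = standardizeMissing(T, -99);

% Per the description above, B.d(2) is expected to be NaN,
% while B.n is ignored and keeps its -99 entry.
disp(B)
```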

B = standardizeMissing(A,id,'DataVariables',vars) replaces values only in variables specified by vars.

Examples

Replace All Instances of Specified Values

Create a table containing Inf and 'NA' to represent missing values.

dblVar = [NaN;3;Inf;7;9];
cellstrVar = {'one';'three';'';'NA';'nine'};
charVar = ['A';'C';'E';' ';'I'];
categoryVar = categorical({'red';'yellow';'blue';'violet';''});

A = table(dblVar,cellstrVar,charVar,categoryVar)
A = 

    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

    NaN       'one'         A          red        
      3       'three'       C          yellow     
    Inf       ''            E          blue       
      7       'NA'                     violet     
      9       'nine'        I          <undefined>

Replace all instances of Inf with NaN and replace all instances of 'NA' with the empty character vector, ''.

B = standardizeMissing(A,{Inf,'NA'})
B = 

    dblVar    cellstrVar    charVar    categoryVar
    ______    __________    _______    ___________

    NaN       'one'         A          red        
      3       'three'       C          yellow     
    NaN       ''            E          blue       
      7       ''                       violet     
      9       'nine'        I          <undefined>

Replace Only Values in Specified Variables

Replace instances of Inf and 'N/A' that occur in specified variables of a table with the standard missing value indicators.

Create a table containing Inf and 'N/A' to represent missing values.

a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];

A = table(a,x,y)
A = 

        a         x      y  
    _________    ___    ____

    'alpha'        1      57
    'bravo'      NaN     732
    'charlie'      3      93
    ''           Inf    1398
    'N/A'          5     Inf

For the variables a and x, replace instances of Inf with NaN and 'N/A' with the empty character vector, ''.

B = standardizeMissing(A,{Inf,'N/A'},'DataVariables',{'a','x'})
B = 

        a         x      y  
    _________    ___    ____

    'alpha'        1      57
    'bravo'      NaN     732
    'charlie'      3      93
    ''           NaN    1398
    ''             5     Inf

Inf in the variable y remains unchanged because y is not included in the 'DataVariables' name-value pair argument.

Input Arguments

A — Input table
table

Input table, specified as a table.

id — Nonstandard missing value indicators
numeric vector | character vector | cell array containing numeric values and character vectors

Nonstandard missing value indicators, specified as a numeric vector, character vector, or cell array containing numeric values and character vectors.

vars — Subset of variables to consider
positive integer | vector of positive integers | variable name | cell array of variable names | logical vector

Subset of variables to consider, specified as a positive integer, vector of positive integers, variable name, cell array of variable names, or logical vector.
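As a rough illustration of the forms vars can take, the sketch below reuses the table from the second example. The output names B1 through B5 are hypothetical, and the calls assume that positions and logical flags refer to the table variables in order (a, x, y).

```matlab
% Table from the "Replace Only Values in Specified Variables" example.
a = {'alpha';'bravo';'charlie';'';'N/A'};
x = [1;NaN;3;Inf;5];
y = [57;732;93;1398;Inf];
A = table(a,x,y);

id = {Inf,'N/A'};   % nonstandard missing value indicators

% Each call restricts the replacement to a subset of variables.
B1 = standardizeMissing(A,id,'DataVariables',2);                  % positive integer: x
B2 = standardizeMissing(A,id,'DataVariables',[1 2]);              % vector of positive integers: a and x
B3 = standardizeMissing(A,id,'DataVariables','x');                % variable name: x
B4 = standardizeMissing(A,id,'DataVariables',{'a','x'});          % cell array of variable names: a and x
B5 = standardizeMissing(A,id,'DataVariables',[true true false]);  % logical vector: a and x
```

Under those assumptions, B2, B4, and B5 select the same variables, so Inf in y should remain unchanged in all three.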

Output Arguments

B — Output table
table

Output table, returned as a table. The table can store metadata such as descriptions, variable units, variable names, and row names. For more information, see Table Properties.

More About

Algorithms

standardizeMissing treats leading and trailing white space differently for cell arrays of character vectors, character arrays, and categorical arrays.

  • For cell arrays of character vectors, standardizeMissing does not ignore white space. A character vector is replaced only if it exactly matches a character vector specified in id.

  • For character arrays, standardizeMissing ignores trailing white space.

  • For categorical arrays, standardizeMissing ignores leading and trailing white space.
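The white space rules for cell arrays of character vectors and for character arrays can be sketched as follows. The variable names c and s and the indicator 'NA' are hypothetical, and the comments describe what the rules above imply rather than verified output.

```matlab
% Hypothetical data with trailing white space.
c = {'NA'; 'NA '; 'ok'};      % cell array; second entry has a trailing space
s = ['NA '; 'ok '; 'abc'];    % character array; rows are padded to equal width
T = table(c, s);

B = standardizeMissing(T, 'NA');

% Cell array: only the exact match c{1} is expected to become '';
% c{2}, 'NA ' with a trailing space, should be left unchanged.
% Character array: trailing white space is ignored, so the row 'NA '
% is expected to match 'NA' and be replaced with blanks.
disp(B)
```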

Introduced in R2013b
