Data with Missing Values

Many data sets have one or more missing values. It is convenient to code missing values as NaN (Not a Number) to preserve the structure of data sets across multiple variables and observations.

Normal MATLAB® arithmetic operations yield NaN values when operands are NaN. Removing the NaN values would destroy the matrix structure. Removing the rows containing the NaN values would discard data. Statistics and Machine Learning Toolbox™ functions in the following table remove NaN values only for the purposes of computation.
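To make the trade-off concrete, the following minimal sketch shows what is lost under each removal strategy. The 3-by-3 matrix and the variable names v and Xrows are illustrative assumptions, not part of the toolbox.

```matlab
% Illustrative data: a 3-by-3 matrix with two values coded as missing.
X = magic(3);
X([1 5]) = NaN;

% Removing the NaN elements collapses the matrix into a column vector,
% so the observation-by-variable structure is lost.
v = X(~isnan(X));
size(v)

% Removing every row that contains a NaN keeps the columns intact,
% but discards the valid values stored in those rows.
Xrows = X(~any(isnan(X), 2), :)
```

In this sketch, v is a 7-by-1 vector, and Xrows keeps only the row 4 9 2, so four valid values are thrown away along with the two NaN values.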


Function     Description
nancov       Covariance matrix, ignoring NaN values
nanmax       Maximum, ignoring NaN values
nanmean      Mean, ignoring NaN values
nanmedian    Median, ignoring NaN values
nanmin       Minimum, ignoring NaN values
nanstd       Standard deviation, ignoring NaN values
nansum       Sum, ignoring NaN values
nanvar       Variance, ignoring NaN values

Other Statistics and Machine Learning Toolbox functions also ignore NaN values. These include iqr, kurtosis, mad, prctile, range, skewness, and trimmean.
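As a brief sketch of how these NaN-ignoring functions behave, the example below applies nanmean, nanmax, nanmin, and range column by column. The sample matrix is an illustrative assumption, and the code requires Statistics and Machine Learning Toolbox.

```matlab
% Illustrative data: X = [NaN 1 6; 3 NaN 7; 4 9 2]
X = magic(3);
X([1 5]) = NaN;

m  = nanmean(X)   % column means of the non-NaN values: 3.5 5 5
hi = nanmax(X)    % column maxima: 4 9 7
lo = nanmin(X)    % column minima: 3 1 2
r  = range(X)     % column ranges (maximum minus minimum): 1 8 5
```

Each result has one entry per column, so the shape of the output matches what the corresponding function would return for a matrix with no missing data.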

Working with Data with Missing Values

Create a 3-by-3 matrix of sample data. Remove two data values by replacing them with NaN.

X = magic(3);
X([1 5]) = [NaN NaN]
X =

   NaN     1     6
     3   NaN     7
     4     9     2

Compute the sum for each column of the sample data matrix using the sum function.

s1 = sum(X)
s1 =

   NaN   NaN    15

If a column contains a NaN value, then the sum function returns NaN as the sum of the data in that column.

For comparison, compute the sum for each column of the sample data matrix using the nansum function.

s2 = nansum(X)
s2 =

     7    10    15

If a column contains a NaN value, then the nansum function ignores the NaN value and returns the sum of the remaining values in the column.
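For readers without the toolbox, a similar result can be sketched with the 'omitnan' option of the base MATLAB sum function. This is an alternative that this topic does not cover, and it assumes a release that supports the option (R2015a or later). The variable name s3 is illustrative.

```matlab
% Same sample data as above.
X = magic(3);
X([1 5]) = NaN;

s2 = nansum(X);            % Statistics and Machine Learning Toolbox
s3 = sum(X, 'omitnan');    % base MATLAB option (assumes R2015a or later)

isequal(s2, s3)            % both give 7 10 15
```

Both calls skip the NaN entries only during the computation and leave X itself unchanged.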
