Often, you represent missing or unavailable data values in MATLAB® code with the special value NaN, which stands for Not-a-Number. The IEEE® floating-point arithmetic convention defines NaN as the result of an undefined operation, such as 0/0.

When you perform calculations on an IEEE variable that contains NaNs, the NaN values are often propagated to the final result. This behavior might render the result useless.
For example, consider a matrix containing the 3-by-3 magic square with its center element replaced with NaN:

    a = magic(3); a(2,2) = NaN

    a =
         8     1     6
         3   NaN     7
         4     9     2

Compute the sum for each column in the matrix:

    sum(a)

    ans =
        15   NaN    15

Notice that the sum of the elements in the middle column is NaN because that column contains a NaN. Sometimes removing NaNs from the data yields a more meaningful result.
There are multiple strategies for removing NaNs from array computations. Some functions, such as sum, allow an optional input argument that causes MATLAB to ignore NaNs in the calculation. Instead of using sum(a) in the previous example, you can use the following command:

    sum(a,'omitnan')

    ans =
        15    10    15

The NaN in the second column is ignored, and only the non-NaN elements are summed.

It is often useful to identify where the NaNs are located within an array before deciding on a strategy for removing them. The functions isnan and ismissing can identify which elements of an array are NaNs. For an input array a, both functions return a logical array of the same size as a. Elements of the logical array are 1 (true) when the corresponding elements of a are NaNs, and 0 (false) otherwise.
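
As a minimal sketch of locating NaNs before deciding how to handle them, the following reuses the magic-square example. The variable names mask, sameMask, row, col, and nanPerColumn are illustrative and not part of the original text.

```matlab
% Rebuild the example matrix with a NaN at its center
a = magic(3);
a(2,2) = NaN;

% Logical mask that is true wherever a contains NaN
mask = isnan(a);

% For a numeric array, ismissing flags the same elements as isnan
sameMask = isequal(mask, ismissing(a));

% Row and column subscripts of every NaN in a
[row, col] = find(mask);

% Count of NaNs in each column
nanPerColumn = sum(mask, 1);
```

Here mask is true only at position (2,2), so nanPerColumn shows one NaN in the middle column and none elsewhere.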
The rmmissing function directly removes NaNs from data. rmmissing removes NaNs from vectors and can remove entire rows or columns from a matrix if there is at least one NaN in that row or column.
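
The behavior described above can be sketched as follows. The variables v and A are illustrative, and the example assumes a MATLAB release that includes rmmissing.

```matlab
% Vector: every NaN entry is removed
v = [1 NaN 3 NaN 5];
vClean = rmmissing(v);

% Matrix: by default, rows that contain at least one NaN are removed
A = magic(3);
A(2,2) = NaN;
rowsKept = rmmissing(A);

% Passing 2 as the dimension removes columns that contain a NaN instead
colsKept = rmmissing(A, 2);
```

With this input, rowsKept keeps the first and third rows of A, while colsKept keeps the first and third columns.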
The following table summarizes techniques for removing NaNs from data.
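Note: By IEEE arithmetic convention, the logical comparison NaN == NaN always evaluates to false, so an equality test cannot be used to find NaNs in data. Use isnan instead.

| Code | Description |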
---|---|
| Remove |
| Remove |
| Remove |
| Find the indices of elements in a vector |
| Remove any rows containing NaNs from a matrix A. |
| Remove any rows containing |
| Remove NaNs along any dimension dim of a multidimensional array M. For example, if M is a matrix, use rmmissing(M,2) to remove columns containing NaN. |
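
The operations the table describes can be sketched with common MATLAB idioms. These are not necessarily the table's original code entries, and the variables x, x1, x2, x3, idx, and A are illustrative assumptions.

```matlab
x = [4 NaN 7 NaN 1];

% An equality test never matches NaN, so it cannot locate the NaNs
neverEqual = any(x == NaN);

% Keep only the non-NaN elements with a logical mask
x1 = x(~isnan(x));

% Delete the NaN elements in place
x2 = x;
x2(isnan(x2)) = [];

% Find the indices of the non-NaN elements, then keep those elements
idx = find(~isnan(x));
x3 = x(idx);

% Remove every row of a matrix that contains at least one NaN
A = [1 2; NaN 4; 5 6];
A(any(isnan(A), 2), :) = [];
```

All three vector approaches produce the same result; logical indexing is usually the shortest, while find is useful when the positions of the retained elements are needed later.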
When NaNs are present in data, you can replace them with non-NaN values. The fillmissing function offers several methods for replacing missing values. You can fill NaNs with the following:

- a constant
- 'previous': previous non-missing value
- 'next': next non-missing value
- 'nearest': nearest non-missing value
- 'linear': linear interpolation of neighboring, non-missing values
- 'spline': piecewise cubic spline interpolation
- 'pchip': shape-preserving piecewise cubic interpolation
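
A few of these fill methods can be compared on one small vector. This is a sketch under the assumption that the samples are evenly spaced at 1, 2, 3, and so on, which is the default when no sample points are supplied; the variable names are illustrative.

```matlab
x = [1 NaN 3 NaN NaN 6];

% Replace every NaN with the constant 0
fConst = fillmissing(x, 'constant', 0);    % [1 0 3 0 0 6]

% Carry the previous non-missing value forward
fPrev = fillmissing(x, 'previous');        % [1 1 3 3 3 6]

% Carry the next non-missing value backward
fNext = fillmissing(x, 'next');            % [1 3 3 6 6 6]

% Interpolate linearly between the neighboring non-missing values
fLinear = fillmissing(x, 'linear');        % [1 2 3 4 5 6]
```

The choice of method depends on the data: a constant or 'previous' fill suits step-like signals, while 'linear', 'spline', or 'pchip' suit smoothly varying ones.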

Although fillmissing works on numeric arrays containing NaNs, it also operates on arrays, tables, and timetables that can contain non-numeric data types such as categorical, datetime, duration, and string. For example, a missing datetime value is represented as NaT, and a missing categorical value is represented as <undefined>. fillmissing, ismissing, standardizeMissing, and rmmissing can all operate on arrays, tables, and timetables containing non-numeric data types.
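
The same functions can be sketched on non-numeric data. The example below assumes that -99 is a placeholder code for missing readings; that value and the variable names d, c, and raw are illustrative and not from the original text.

```matlab
% datetime vector with one missing entry (NaT)
d = [datetime(2020,1,1) NaT datetime(2020,1,3)];
dMask = ismissing(d);                  % true only for the NaT entry
dFilled = fillmissing(d, 'linear');    % interpolates the missing date

% categorical vector: 'green' is outside the value set, so it is <undefined>
c = categorical({'red', 'green', 'blue'}, {'red', 'blue'});
cMask = ismissing(c);                  % true only for the undefined entry
cFilled = fillmissing(c, 'previous');  % fills with the preceding category
cClean = rmmissing(c);                 % drops the undefined entry

% standardizeMissing converts a placeholder value (-99 here) to NaN
raw = [1 -99 3];
std = standardizeMissing(raw, -99);
```

Interpolating methods such as 'linear' apply to datetime and duration data, whereas categorical data can only be filled with methods that copy an existing value, such as 'previous' or 'next'.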

For numeric 1-D data, you can also interpolate over missing values with the interp1 function. For more information, see interp1.
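
One way to do this, shown here as a sketch, is to interpolate from the valid samples onto the positions of the missing ones. The sample points t and the variable names are assumptions made for illustration.

```matlab
t = 1:6;                          % sample points (assumed evenly spaced)
y = [10 NaN 30 NaN NaN 60];       % data with missing values

good = ~isnan(y);                 % mask of the valid samples

% Linear interpolation (the interp1 default) at the missing positions
yFilled = y;
yFilled(~good) = interp1(t(good), y(good), t(~good));
```

With this input, yFilled is [10 20 30 40 50 60]. By default interp1 returns NaN for query points outside the range of the valid samples, so leading or trailing NaNs remain unfilled unless extrapolation is requested.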