[W,H] = nnmf(A,k) factors the nonnegative n-by-m matrix A into
nonnegative factors W (n-by-k)
and H (k-by-m).
The factorization is not exact; W*H is a lower-rank
approximation to A. The factors W and H are
chosen to minimize the root-mean-squared residual D between A and W*H:
D = norm(A-W*H,'fro')/sqrt(N*M)
The factorization uses an iterative method starting with random
initial values for W and H.
Because the root-mean-squared residual D may have
local minima, repeated factorizations may yield different W and H.
Sometimes the algorithm converges to a solution of lower rank than k,
which may indicate that the result is not optimal.
W and H are normalized
so that the rows of H have unit length. The columns
of W are ordered by decreasing length.
[W,H] = nnmf(A,k,param1,val1,param2,val2,...) specifies
optional parameter name/value pairs from the following table.
Parameter
Value
'algorithm'
Either 'als' (the default) to use
an alternating least-squares algorithm, or 'mult' to
use a multiplicative update algorithm.
In general, the 'als' algorithm
converges faster and more consistently. The 'mult' algorithm
is more sensitive to initial values, which makes it a good choice
when using 'replicates' to find W and H from
multiple random starting values.
'w0'
An n-by-k matrix
to be used as the initial value for W.
'h0'
A k-by-m matrix
to be used as the initial value for H.
'options'
An options structure as created by the statset function. nnmf uses
the following fields of the options structure:
Display — Level of display.
Choices:
'off' (default) — No display
'final' — Display final
result
'iter' — Iterative display
of intermediate results
MaxIter — Maximum number
of iterations. Default is 100. Unlike in optimization
settings, reaching MaxIter iterations is treated
as convergence.
TolFun — Termination tolerance
on change in size of the residual. Default is 1e-4.
TolX — Termination tolerance
on relative change in the elements of W and H.
Default is 1e-4.
UseParallel — Set to true to
compute in parallel. Default is false.
UseSubstreams — Set to true to
compute in parallel in a reproducible fashion. Default is false.
To compute reproducibly, set Streams to a type
allowing substreams: 'mlfg6331_64' or 'mrg32k3a'.
Streams — A RandStream object or cell array of such
objects. If you do not specify Streams, nnmf uses
the default stream or streams. If you choose to specify Streams,
use a single object except in the case
You have an open Parallel pool
UseParallel is true
UseSubstreams is false
In that case, use a cell array the same size as the
Parallel pool.
'replicates'
The number of times to repeat the factorization, using
new random starting values for W and H,
except at the first replication if 'w0' and 'h0' are
given. This is most beneficial with the 'mult'
algorithm. The default is 1.
[W,H,D] = nnmf(...) also returns D,
the root mean square residual.
Compute a nonnegative rank-two approximation of the measurements of the four variables in Fisher's iris data.
[W,H] = nnmf(meas,2);
H
H =
0.6943 0.2853 0.6223 0.2220
0.8020 0.5685 0.1830 0.0147
The first and third variables in meas (sepal length and petal length, with coefficients 0.6852 and 0.6357, respectively) provide relatively strong weights to the first column of W . The first and second variables in meas (sepal length and sepal width, with coefficients 0.8011 and 0.5740) provide relatively strong weights to the second column of W .
Create a biplot of the data and the variables in meas in the column space of W .
rep iteration rms resid |delta x|
1 24 0.257336 0.00271859
Final root mean square residual = 0.257336
References
[1] Berry, M. W., et al. "Algorithms
and Applications for Approximate Nonnegative Matrix Factorization." Computational
Statistics and Data Analysis. Vol. 52, No. 1, 2007, pp.
155–173.