Statistics Toolbox™
[W,H] = nnmf(A,k)
[W,H] = nnmf(A,k,param1,val1,param2,val2,...)
[W,H,D] = nnmf(...)
[W,H] = nnmf(A,k) factors the nonnegative n-by-m matrix A into nonnegative factors W (n-by-k) and H (k-by-m). The factorization is not exact; W*H is a lower-rank approximation to A. The factors W and H are chosen to minimize the root-mean-squared residual D between A and W*H:
D = norm(A-W*H,'fro')/sqrt(n*m)
The factorization uses an iterative method starting with random initial values for W and H. Because the root-mean-squared residual D may have local minima, repeated factorizations may yield different W and H. Sometimes the algorithm converges to a solution of lower rank than k, which may indicate that the result is not optimal.
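One way to see the effect of the random starting values is to repeat the factorization and compare the residuals. The following is a minimal sketch, not part of the original example set: the test matrix A, the rank k, and the variable names nRuns, effRank, bestD, and bestRun are assumptions chosen for illustration.

```matlab
% Sketch: repeat the factorization and compare the residual of each run.
% A is an arbitrary nonnegative test matrix (illustrative assumption).
A = rand(40,6)*rand(6,30);      % nonnegative, rank at most 6
k = 4;
nRuns = 5;                      % hypothetical number of repeated runs
D = zeros(nRuns,1);
effRank = zeros(nRuns,1);
for i = 1:nRuns
    [W,H,D(i)] = nnmf(A,k);     % each call starts from new random W and H
    effRank(i) = rank(W*H);     % a value below k suggests a lower-rank solution
end
[bestD,bestRun] = min(D);       % smallest residual and the run that produced it
disp([D effRank])
```

Runs with noticeably different values of D have likely ended in different local minima. The 'replicates' parameter performs this kind of repetition inside a single call.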
W and H are normalized so that the rows of H have unit length. The columns of W are ordered by decreasing length.
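The normalization can be checked directly on the outputs. This is a small sketch under assumed inputs: the random matrix A and the names rowLenH and colLenW are illustrative, and the check assumes the factorization did not converge to a solution of lower rank than k (in which case a row of H can be all zeros).

```matlab
% Sketch: check the scaling and ordering of the returned factors.
A = rand(30,8);                 % arbitrary nonnegative test matrix
k = 3;
[W,H] = nnmf(A,k);
rowLenH = sqrt(sum(H.^2,2));    % Euclidean length of each row of H (k-by-1)
colLenW = sqrt(sum(W.^2,1));    % Euclidean length of each column of W (1-by-k)
disp(rowLenH')                  % expected to be close to 1 for every row
disp(colLenW)                   % expected to be nonincreasing from left to right
```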
[W,H] = nnmf(A,k,param1,val1,param2,val2,...) specifies optional parameter name/value pairs from the following table.
| Name | Value |
|---|---|
| 'algorithm' | Either 'als' (the default) to use an alternating least-squares algorithm, or 'mult' to use a multiplicative update algorithm. In general, the 'als' algorithm converges faster and more consistently. The 'mult' algorithm is more sensitive to initial values, which makes it a good choice when using 'replicates' to find W and H from multiple random starting values. |
| 'w0' | An n-by-k matrix to be used as the initial value for W. |
| 'h0' | A k-by-m matrix to be used as the initial value for H. |
| 'options' | An options structure as created by the statset function. nnmf uses the following fields of the options structure: Display, TolX, TolFun, and MaxIter. Unlike in optimization settings, reaching MaxIter iterations is treated as convergence. |
| 'replicates' | The number of times to repeat the factorization, using new random starting values for W and H, except at the first replication if 'w0' and 'h0' are given. This is most beneficial with the 'mult' algorithm. The default is 1. |
[W,H,D] = nnmf(...) also returns D, the root mean square residual.
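As a quick consistency check, the returned D can be compared with the residual recomputed from W and H, that is, the Frobenius norm of A-W*H divided by the square root of the number of elements in A. The sketch below uses an arbitrary random matrix; the name Dcheck is an assumption for illustration.

```matlab
% Sketch: recompute the root-mean-squared residual from the factors.
A = rand(25,10);                          % arbitrary nonnegative test matrix
[n,m] = size(A);
[W,H,D] = nnmf(A,3);
Dcheck = norm(A - W*H,'fro')/sqrt(n*m);   % RMS of the elements of A - W*H
fprintf('D = %g, recomputed = %g\n', D, Dcheck);
```

The two values should agree up to floating-point rounding.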
Compute a nonnegative rank-two approximation of the measurements of the four variables in Fisher's iris data:
load fisheriris
[W,H] = nnmf(meas,2);
H
H =
0.6852 0.2719 0.6357 0.2288
0.8011 0.5740 0.1694 0.0087
The first and third variables in meas (sepal length and petal length, with coefficients 0.6852 and 0.6357, respectively) provide relatively strong weights to the first column of W. The first and second variables in meas (sepal length and sepal width, with coefficients 0.8011 and 0.5740) provide relatively strong weights to the second column of W.
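To get a rough sense of how well the rank-two factors reproduce the measurements, the approximation W*H can be compared with meas. This is a sketch only: the names approx, relErr, and dominant are assumptions, and because the starting values are random, the numbers and the order of the rows of H can differ from run to run.

```matlab
% Sketch: compare the rank-two approximation with the original data.
load fisheriris                          % provides meas (150-by-4)
[W,H,D] = nnmf(meas,2);
approx = W*H;                            % rank-two approximation of meas
relErr = norm(meas - approx,'fro')/norm(meas,'fro');   % relative Frobenius error
[ignore,dominant] = max(H,[],2);         % variable with the largest coefficient in each row of H
fprintf('RMS residual %g, relative error %g\n', D, relErr);
disp(dominant')
```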
Create a biplot of the data and the variables in meas in the column space of W:
biplot(H','scores',W,'varlabels',{'sl','sw','pl','pw'});
axis([0 1.1 0 1.1])
xlabel('Column 1')
ylabel('Column 2')

Starting from a random array X with rank 20, try a few iterations at several replicates using the multiplicative algorithm:
X = rand(100,20)*rand(20,50);
opt = statset('MaxIter',5,'Display','final');
[W0,H0] = nnmf(X,5,'replicates',10,...
'options',opt,...
'algorithm','mult');
rep iteration rms resid |delta x|
1 5 0.560887 0.0245182
2 5 0.66418 0.0364471
3 5 0.609125 0.0358355
4 5 0.608894 0.0415491
5 5 0.619291 0.0455135
6 5 0.621549 0.0299965
7 5 0.640549 0.0438758
8 5 0.673015 0.0366856
9 5 0.606835 0.0318931
10 5 0.633526 0.0319591
Final root mean square residual = 0.560887
Continue with more iterations from the best of these results using alternating least squares:
opt = statset('MaxIter',1000,'Display','final');
[W,H] = nnmf(X,5,'w0',W0,'h0',H0,...
'options',opt,...
'algorithm','als');
rep iteration rms resid |delta x|
1 80 0.256914 9.78625e-005
Final root mean square residual = 0.256914
[1] M.W. Berry et al., "Algorithms and Applications for Approximate Nonnegative Matrix Factorization," Computational Statistics and Data Analysis, Vol. 52, No. 1, pp. 155-173, 2007.