image thumbnail

LBFGSB (L-BFGS-B) mex wrapper

by

 

16 Feb 2012 (Updated )

Mex wrapper for lbfgsb v3.0 fortan library. L-bfgs-b solves box-constrained optimization.

Non-negative least-squares (NNLS) using L-BFBS-B

Non-negative least-squares (NNLS) using L-BFBS-B

Non-negative least-squares solves the following problem: $$ \min_x \|Ax \textrm{--} b\|^2_2 \quad\textrm{such that}\quad x \ge 0 $$

The matrix 'A' may have more columns than rows (the 'underdetermined' case), or more rows than columns (the 'overdetermined' case), or the same number of rows and columns. Some solvers, such as the PQN method described in "Tackling Box-Constrained Optimization via a new Projected Quasi-Newton Approach" by Dongmin Kim, Suvrit Sra, and Inderjit Dhillon (http://www.cs.utexas.edu/users/inderjit/public_papers/pqnj_sisc10.pdf), only work for the overdetermined case.

To quote from that paper, "Not surprisingly, some constrained optimization methods have also been applied to solve NNLS. It is interesting to note that for large scale problems these specialized algorithms are outperformed by modern methods such as TRON, LBFGS-B, or the methods of this paper. Curiously this fact has not yet been widely adopted by the wider research community (footnote: This could be because Matlab continues to ship the antiquated lsqnonneg function, which is an implementation of the original NNLS algorithm of Lawson and Hanson 1974 )."

The Kim/Sra/Dhillon paper compares the following algorithms:

Fast NNLS, by Rasmus Bro. Available at: http://www.mathworks.com/matlabcentral/fileexchange/3388-nnls-and-constrained-regression

mtron, mex wrapper by Christoph Ortner, available at: http://www.mathworks.com/matlabcentral/fileexchange/14848-mtron Based on the fortran tron algorithm by Chih-Jen Lin and Jorge Moreé, "Newton's method for large bound-constrained optimization problems", SIAM Journal on Optimization, 9(4), pp. 1100-1127, 1999. http://www-unix.mcs.anl.gov/~more/tron/

L-BFGS-B. mex wrapper for v2.1 of the fortran files. R. Byrd, P. Lu, J. Nocedal, and C. Zhu, "A Limited Memory Algorithm for Bound Constrained Optimization", SIAM Journal on Scientific Computing, 16 (1995), pp. 1190--1208.

Contents

This demo

Here, we use the mex wrapper for L-BFGS-B v3.0, which is a significantly improved version of L-BFGS-B from v2.1. We show how to use the software and the fminunc_wrapper helper file.

It also compares to some NNLS implementations availabe on the matworks file exchange. In addition to Fast NNLS (FNNLS), mtron, and LBFGS, we compare with the following algorithms, all written by Uriel Roque and based on: Portugal, Judice and Vicente, "A comparison of block pivoting and interior pointalgorithms for linear least squares problems with nonnegative variables", Mathematics of Computation, 63(1994), pp. 625-643

activeset.m This is pretty fast for medium-scale and smaller problems

blocknnls.m Similar to activeset.m in performance

newton.m Very slow for large problems

pcnnls.m (predictor-corrector method) Very slow for large problems

The most interesting tests use large matrices. For small matrices, tests are pointless, because any of the methods are suitable.

Setup a problem

% The best codes handle N = 20,000 as long as the matrix is very sparse.
% N   = 3000; M = 4000; % Large scale. Things start to get interesting
N   = 1000; M = 1500;     % at this size, some algo take a long time!
% N   = 100; M = 150;     % at this size, all algorithms take < 14 seconds
A   = randn(M,N);
b   = randn(M,1);

fcn     = @(x) norm( A*x - b)^2;
% here are two equivalent ways to make the gradient. grad2 is sometimes faster
grad1    = @(x) 2*A'*(A*x-b);
AtA     = A'*A; Ab = A'*b;
grad2    = @(x) 2*( AtA*x - Ab );

grad    = grad2;


x = [];
time = [];

Solve NNLS with L-BFGS-B

l  = zeros(N,1);    % lower bound
u  = inf(N,1);      % there is no upper bound
tstart=tic;
fun     = @(x)fminunc_wrapper( x, fcn, grad);
% Request very high accuracy for this test:
opts    = struct( 'factr', 1e4, 'pgtol', 1e-8, 'm', 10);
opts.printEvery     = 5;
if N > 10000
    opts.m  = 50;
end
% Run the algorithm:
[xk, ~, info] = lbfgsb(fun, l, u, opts );
t=toc(tstart)
% Record results
x.lbfgsb    = xk;
time.lbfgsb = t;
Iteration    5, f = 1.02e+03, ||g||_inf = 2.24e+02
Iteration   10, f = 1.02e+03, ||g||_inf = 2.22e+02
Iteration   15, f = 1.02e+03, ||g||_inf = 2.22e+02
Iteration   20, f = 1.02e+03, ||g||_inf = 2.22e+02
Iteration   25, f = 1.02e+03, ||g||_inf = 2.22e+02

t =

    0.0723

Solve with TRON, via MTRON interface

% Only run this if you have mtron installed and it is in the path
if exist( 'itron.m', 'file' )

    x0   = zeros(N,1);
    xl   = zeros(N,1);
    xu   = +1e300*ones(N,1);
    fmin = -1e300;
    H       = sparse(AtA/2); % will crash if not a sparse matrix
    tstart=tic;
    hess    = @(x) H;
    fun     = @(x)fminunc_wrapper( x, fcn, grad, hess );
    [xk, fval, exitflag, output] = itron(fun, x0, xl, xu, fmin );
    t=toc(tstart)
    x.tron    = xk;
    time.tron = t;
end
    n          F            ||G||           delta       #PCG
--------------------------------------------------------------
     0    1.467953e+03   2.628940e+02   1.000000e+00      0
nnz is 999200 and 0
     1    1.214217e+03   1.365992e+02   4.630697e-01      1
     2    1.126312e+03   9.195589e+01   4.630697e-01      1
     3    1.080455e+03   6.437660e+01   4.630697e-01      1
     4    1.058020e+03   4.912519e+01   2.489937e-01      2
     5    1.043430e+03   3.835540e+01   2.489937e-01      1
     6    1.033755e+03   3.079695e+01   2.489937e-01      1
     7    1.021824e+03   2.689033e+01   1.496872e-01      1
     8    1.018031e+03   2.012526e+01   1.496872e-01      1
     9    1.016678e+03   1.223149e+01   4.550400e-02      2
    10    1.015961e+03   8.451285e+00   4.550400e-02      1
    11    1.015568e+03   6.149324e+00   4.550400e-02      1
    12    1.015387e+03   4.663170e+00   2.099033e-02      2
    13    1.015276e+03   3.478738e+00   2.099033e-02      1
    14    1.015209e+03   2.985610e+00   2.099033e-02      1
    15    1.015174e+03   4.751213e+00   1.024076e-02      1
    16    1.015140e+03   4.403443e+00   1.024076e-02      1
    17    1.015123e+03   4.218798e+00   1.024076e-02      1
    18    1.015115e+03   4.131072e+00   3.851126e-03      2
    19    1.015111e+03   4.085310e+00   3.851126e-03      1
    20    1.015108e+03   4.065339e+00   3.851126e-03      1
    21    1.015107e+03   4.057408e+00   1.712096e-03      2
    22    1.015106e+03   4.057592e+00   1.712096e-03      1
    23    1.015106e+03   4.062857e+00   1.712096e-03      1
    24    1.015106e+03   4.088518e+00   8.230127e-04      1
    25    1.015106e+03   4.102803e+00   8.230127e-04      1
    26    1.015105e+03   4.112518e+00   8.230127e-04      1
    27    1.015105e+03   4.118033e+00   3.519882e-04      2
    28    1.015105e+03   4.121924e+00   3.519882e-04      1
    29    1.015105e+03   4.124739e+00   3.519882e-04      1
    30    1.015105e+03   4.126361e+00   1.576322e-04      2
    31    1.015105e+03   4.127610e+00   1.576322e-04      1
    32    1.015105e+03   4.128612e+00   1.576322e-04      1
    33    1.015105e+03   4.133059e+00   7.585400e-05      1
    34    1.015105e+03   4.132609e+00   7.585400e-05      1
    35    1.015105e+03   4.132417e+00   7.585400e-05      1
    36    1.015105e+03   4.132341e+00   3.273477e-05      2
    37    1.015105e+03   4.132361e+00   3.273477e-05      1
    38    1.015105e+03   4.132446e+00   3.273477e-05      1
    39    1.015105e+03   4.132846e+00   1.456003e-05      1
    40    1.015105e+03   4.133034e+00   1.456003e-05      1
    41    1.015105e+03   4.133165e+00   1.456003e-05      1
    42    1.015105e+03   4.133257e+00   1.456003e-05      1
    43    1.015105e+03   4.133308e+00   5.400072e-06      2
    44    1.015105e+03   4.133345e+00   5.400072e-06      1
    45    1.015105e+03   4.133372e+00   5.400072e-06      1
    46    1.015105e+03   4.133388e+00   2.396176e-06      2
    47    1.015105e+03   4.133400e+00   2.396176e-06      1
itron: Change in function value less then tol.
CONVERGENCE: FRTOL TEST SATISFIED                           

t =

   11.2811

Active set. Fast on medium problems

if exist( 'activeset.m', 'file' )
    tstart=tic;
    [xk,y]  = activeset(A,b);
    t=toc(tstart)
    x.activeset    = xk;
    time.activeset = t;
end
t =

   38.6452

Block pivoting. Fast on medium problems

if exist( 'blocknnls.m', 'file' )
    tstart=tic;
    [xk]  = blocknnls(A,b, 'fixed');
    t=toc(tstart)
    x.blockPivot    = xk;
    time.blockPivot = t;
end
t =

    0.7564

Newton. Slow!

if exist( 'newton.m', 'file' ) && N < 500
    tstart=tic;
    [xk,y]  = newton(A,b, ones(N,1), 100); % can't have 0 starting vector
    t=toc(tstart)
    x.newton    = xk;
    time.newton = t;
else
    fprintf('Skipping Newton method because we can''t find it, or it is too slow\n');
end
Skipping Newton method because we can't find it, or it is too slow

Predictor-Corrector. Can be very slow

if exist( 'pcnnls.m', 'file' ) && N < 500
    tstart=tic;
    [xk,y,nits]  = pcnnls(A,b,ones(N,1), 3000);
    t=toc(tstart)
    x.predCorr    = xk;
    time.predCorr = t;
else
    fprintf('Skipping predCorr method because we can''t find it, or it is too slow\n');
end
Skipping predCorr method because we can't find it, or it is too slow

Run Matlab's default (Lawson and Hanson) Very slow on large problems

tstart=tic;
xk = lsqnonneg(A,b);
t=toc(tstart)
x.lsqnonneg   = xk;
time.lsqnonneg = t;
t =

   36.5255

Fast NNLS, modification of Lawson and Hanson. Much better for large problems

if exist( 'fnnls.m', 'file' )
    tstart=tic;
    [xk]  = fnnls(A'*A,A'*b);
    t=toc(tstart)
    x.fnnls    = xk;
    time.fnnls = t;
end
t =

    2.0540

PQN-LBFGS and PQN-BB algorithms of Kim/Sra/Dhillon. Very fast.

if exist( 'solnls.m', 'file' )
    opt = solopt;
    opt.maxtime     = 2000;
    opt.verbose     = 0;
    tstart=tic;
    % run their 'BB' variant
    opt.algo = 'BB';
    out     = solnls( A, b, zeros(N,1), opt );
    t=toc(tstart)
    x.PQN_BB   = out.x;
    time.PQN_BB = t;

    % and run their 'PLB' variant (their 'PQN' variant is much slower)
    %   which uses L-BFGS (not to be confused with L-BFGS-B)
    opt.algo = 'PLB';

    tstart=tic;
    out     = solnls( A, b, zeros(N,1), opt );
    t=toc(tstart)

    x.PQN   = out.x;
    time.PQN = t;

end
t =

    0.3320


t =

    0.3373

Results

Find the best answer, and use that as the reference.

fMin = Inf;
for f=fieldnames(x)',
    if fcn(x.(f{1})) < fMin,
        fMin = fcn(x.(f{1}));
        best = f{1};
    end
end
xReference = x.(best);
errFcn      = @(x) norm(x-xReference)/norm(xReference);

% Print out info. Verify that the solution is indeed non-negative (hence the
%   min(x) information), and the objective function, and the error
%   against the reference solution. Also display the time.
fprintf('== Size of problem is %d x %d == \n', M, N );
for f=fieldnames(x)',
    fprintf('%10s:  obj is %7.2f, min(x) is %7.1d, err is %.2e, time is %6.3f s\n', ...
        f{1}, fcn(x.(f{1})), min(x.(f{1})), errFcn(x.(f{1})), time.(f{1}) );
end
== Size of problem is 1500 x 1000 == 
    lbfgsb:  obj is 1015.11, min(x) is       0, err is 1.60e-06, time is  0.072 s
      tron:  obj is 1015.11, min(x) is       0, err is 2.37e-06, time is 11.281 s
 activeset:  obj is 1015.11, min(x) is       0, err is 2.22e-08, time is 38.645 s
blockPivot:  obj is 1015.11, min(x) is       0, err is 2.22e-08, time is  0.756 s
 lsqnonneg:  obj is 1015.11, min(x) is       0, err is 2.22e-08, time is 36.526 s
     fnnls:  obj is 1015.11, min(x) is       0, err is 2.22e-08, time is  2.054 s
    PQN_BB:  obj is 1015.11, min(x) is       0, err is 2.22e-08, time is  0.332 s
       PQN:  obj is 1015.11, min(x) is       0, err is 0.00e+00, time is  0.337 s

Contact us