Documentation |
On this page… |
---|
Linear programming is the problem of finding a vector x that minimizes a linear function f^{T}x subject to linear constraints:
$$\underset{x}{\mathrm{min}}{f}^{T}x$$
such that one or more of the following hold: A·x ≤ b, Aeq·x = beq, l ≤ x ≤ u.
The default interior-point method is based on LIPSOL ([52]), which is a variant of Mehrotra's predictor-corrector algorithm ([47]), a primal-dual interior-point method.
The algorithm begins by applying a series of preprocessing steps (see Preprocessing). After preprocessing, the problem has the form
$$\underset{x}{\mathrm{min}}{f}^{T}x\text{suchthat}\{\begin{array}{c}A\cdot x=b\\ 0\le x\le u.\end{array}$$ | (8-1) |
The upper bounds constraints are implicitly included in the constraint matrix A. With the addition of primal slack variables s, Equation 8-1 becomes
$$\underset{x}{\mathrm{min}}{f}^{T}x\text{suchthat}\{\begin{array}{c}A\cdot x=b\\ x+s=u\\ x\ge 0,\text{}s\ge 0.\end{array}$$ | (8-2) |
which is referred to as the primal problem: x consists of the primal variables and s consists of the primal slack variables. The dual problem is
$$\mathrm{max}{b}^{T}y-{u}^{T}w\text{suchthat}\{\begin{array}{c}{A}^{T}\cdot y-w+z=f\\ z\ge 0,\text{}w\ge 0,\end{array}$$ | (8-3) |
where y and w consist of the dual variables and z consists of the dual slacks. The optimality conditions for this linear program, i.e., the primal Equation 8-2 and the dual Equation 8-3, are
$$\begin{array}{l}F(x,y,z,s,w)=\left(\begin{array}{c}A\cdot x-b\\ x+s-u\\ {A}^{T}\cdot y-w+z-f\\ {x}_{i}{z}_{i}\\ {s}_{i}{w}_{i}\end{array}\right)=0,\\ \text{}x\ge 0,\text{}z\ge 0,\text{}s\ge 0,\text{}w\ge 0,\end{array}$$ | (8-4) |
where x_{i}z_{i} and s_{i}w_{i} denote component-wise multiplication.
The quadratic equations x_{i}z_{i} = 0 and s_{i}w_{i} = 0 are called the complementarity conditions for the linear program; the other (linear) equations are called the feasibility conditions. The quantity
x^{T}z + s^{T}w
is the duality gap, which measures the residual of the complementarity portion of F when (x,z,s,w) ≥ 0.
The algorithm is a primal-dual algorithm, meaning that both the primal and the dual programs are solved simultaneously. It can be considered a Newton-like method, applied to the linear-quadratic system F(x,y,z,s,w) = 0 in Equation 8-4, while at the same time keeping the iterates x, z, w, and s positive, thus the name interior-point method. (The iterates are in the strictly interior region represented by the inequality constraints in Equation 8-2.)
The algorithm is a variant of the predictor-corrector algorithm proposed by Mehrotra. Consider an iterate v = [x;y;z;s;w], where [x;z;s;w] > 0 First compute the so-called prediction direction
$$\Delta {v}_{p}=-{\left({F}^{T}(v)\right)}^{-1}F(v),$$
which is the Newton direction; then the so-called corrector direction
$$\Delta {v}_{c}=-{\left({F}^{T}(v)\right)}^{-1}F\left(v+\Delta {v}_{p}\right)-\mu \widehat{e},$$
where μ > 0 is called the centering parameter and must be chosen carefully. $$\widehat{e}$$ is a zero-one vector with the ones corresponding to the quadratic equations in F(v), i.e., the perturbations are only applied to the complementarity conditions, which are all quadratic, but not to the feasibility conditions, which are all linear. The two directions are combined with a step length parameter α > 0 and update v to obtain the new iterate v^{+}:
$${v}^{+}=v+\alpha \left(\Delta {v}_{p}+\Delta {v}_{c}\right),$$
where the step length parameter α is chosen so that
v^{+} = [x^{+};y^{+};z^{+};s^{+};w^{+}]
satisfies
[x^{+};z^{+};s^{+};w^{+}] > 0.
In solving for the preceding predictor/corrector directions, the algorithm computes a (sparse) direct factorization on a modification of the Cholesky factors of A·A^{T}. If A has dense columns, it instead uses the Sherman-Morrison formula. If that solution is not adequate (the residual is too large), it performs an LDL factorization of an augmented system form of the step equations to find a solution. (See Example 4 — The Structure of D in the MATLAB^{®} ldl function reference page.)
The algorithm then loops until the iterates converge. The main stopping criteria is a standard one:
$$\frac{\Vert {r}_{b}\Vert}{\mathrm{max}\left(1,\Vert b\Vert \right)}+\frac{\Vert {r}_{f}\Vert}{\mathrm{max}\left(1,\Vert f\Vert \right)}+\frac{\Vert {r}_{u}\Vert}{\mathrm{max}\left(1,\Vert u\Vert \right)}+\frac{\left|{f}^{T}x-{b}^{T}y+{u}^{T}w\right|}{\mathrm{max}\left(1,\left|{f}^{T}x\right|,\left|{b}^{T}y-{u}^{T}w\right|\right)}\le tol,$$
where
$$\begin{array}{c}{r}_{b}=Ax-b\\ {r}_{f}={A}^{T}y-w+z-f\\ {r}_{u}=x+s-u\end{array}$$
are the primal residual, dual residual, and upper-bound feasibility respectively, and
$${f}^{T}x-{b}^{T}y+{u}^{T}w$$
is the difference between the primal and dual objective values, and tol is some tolerance. The sum in the stopping criteria measures the total relative errors in the optimality conditions in Equation 8-4.
A number of preprocessing steps occur before the actual iterative algorithm begins. The resulting transformed problem is one where
All variables are bounded below by zero.
All constraints are equalities.
Fixed variables, those with equal upper and lower bounds, are removed.
Rows of all zeros in the constraint matrix are removed.
Columns of all zeros in the constraint matrix are removed.
When a significant number of singleton rows exist in the constraint matrix, the associated variables are solved for and the rows removed.
While these preprocessing steps can do much to speed up the iterative part of the algorithm, if the Lagrange multipliers are required, the preprocessing steps must be undone since the multipliers calculated during the algorithm are for the transformed problem, and not the original. Thus, if the multipliers are not requested, this transformation back is not computed, and might save some time computationally.
The medium-scale active-set linear programming algorithm is a variant of the sequential quadratic programming algorithm used by fmincon (Sequential Quadratic Programming (SQP)). The difference is that the quadratic term is set to zero.
At each major iteration of the SQP method, a QP problem of the following form is solved, where A_{i} refers to the ith row of the m-by-n matrix A.
$$\begin{array}{c}\underset{d\in {\Re}^{n}}{\mathrm{min}}q(d)={c}^{T}d,\\ {A}_{i}d={b}_{i},\text{}i=1,\mathrm{...},{m}_{e}\\ {A}_{i}d\le {b}_{i},\text{}i={m}_{e}+1,\mathrm{...},m.\end{array}$$
The method used in Optimization Toolbox™ functions is an active set strategy (also known as a projection method) similar to that of Gill et al., described in [18] and [17]. It has been modified for both Linear Programming (LP) and Quadratic Programming (QP) problems.
The solution procedure involves two phases. The first phase involves the calculation of a feasible point (if one exists). The second phase involves the generation of an iterative sequence of feasible points that converge to the solution. In this method an active set, $${\overline{A}}_{k}$$, is maintained that is an estimate of the active constraints (i.e., those that are on the constraint boundaries) at the solution point. Virtually all QP algorithms are active set methods. This point is emphasized because there exist many different methods that are very similar in structure but that are described in widely different terms.
$${\overline{A}}_{k}$$ is updated at each iteration k, and this is used to form a basis for a search direction $${\widehat{d}}_{k}$$. Equality constraints always remain in the active set $${\overline{A}}_{k}$$. The notation for the variable $${\widehat{d}}_{k}$$ is used here to distinguish it from d_{k} in the major iterations of the SQP method. The search direction $${\widehat{d}}_{k}$$ is calculated and minimizes the objective function while remaining on any active constraint boundaries. The feasible subspace for $${\widehat{d}}_{k}$$ is formed from a basis Z_{k} whose columns are orthogonal to the estimate of the active set $${\overline{A}}_{k}$$ (i.e., $${\overline{A}}_{k}{Z}_{k}=0$$). Thus a search direction, which is formed from a linear summation of any combination of the columns of Z_{k}, is guaranteed to remain on the boundaries of the active constraints.
The matrix Z_{k} is formed from the last m – l columns of the QR decomposition of the matrix $${\overline{A}}_{k}^{T}$$, where l is the number of active constraints and l < m. That is, Z_{k} is given by
$${Z}_{k}=Q\left[:,l+1:m\right],$$ | (8-5) |
where
$${Q}^{T}{\overline{A}}_{k}^{T}=\left[\begin{array}{c}R\\ 0\end{array}\right].$$
Once Z_{k} is found, a new search direction $${\widehat{d}}_{k}$$ is sought that minimizes q(d) where $${\widehat{d}}_{k}$$ is in the null space of the active constraints. That is, $${\widehat{d}}_{k}$$ is a linear combination of the columns of Z_{k}: $${\widehat{d}}_{k}={Z}_{k}p$$ for some vector p.
Then if you view the quadratic as a function of p, by substituting for $${\widehat{d}}_{k}$$, you have
$$q(p)=\frac{1}{2}{p}^{T}{Z}_{k}^{T}H{Z}_{k}p+{c}^{T}{Z}_{k}p.$$ | (8-6) |
Differentiating this with respect to p yields
$$\nabla q(p)={Z}_{k}^{T}H{Z}_{k}p+{Z}_{k}^{T}c.$$ | (8-7) |
∇q(p) is referred to as the projected gradient of the quadratic function because it is the gradient projected in the subspace defined by Z_{k}. The term $${Z}_{k}^{T}H{Z}_{k}$$ is called the projected Hessian. Assuming the Hessian matrix H is positive definite (which is the case in this implementation of SQP), then the minimum of the function q(p) in the subspace defined by Z_{k} occurs when ∇q(p) = 0, which is the solution of the system of linear equations
$${Z}_{k}^{T}H{Z}_{k}p=-{Z}_{k}^{T}c.$$ | (8-8) |
A step is then taken of the form
$${x}_{k+1}={x}_{k}+\alpha {\widehat{d}}_{k},\text{where}{\widehat{d}}_{k}={Z}_{k}^{T}p.$$ | (8-9) |
At each iteration, because of the quadratic nature of the objective function, there are only two choices of step length α. A step of unity along $${\widehat{d}}_{k}$$ is the exact step to the minimum of the function restricted to the null space of $${\overline{A}}_{k}$$. If such a step can be taken, without violation of the constraints, then this is the solution to QP (Equation 8-6). Otherwise, the step along $${\widehat{d}}_{k}$$ to the nearest constraint is less than unity and a new constraint is included in the active set at the next iteration. The distance to the constraint boundaries in any direction $${\widehat{d}}_{k}$$ is given by
$$\alpha =\underset{i\in \left\{1,\mathrm{...},m\right\}}{\mathrm{min}}\left\{\frac{-\left({A}_{i}{x}_{k}-{b}_{i}\right)}{{A}_{i}{\widehat{d}}_{k}}\right\},$$ | (8-10) |
which is defined for constraints not in the active set, and where the direction $${\widehat{d}}_{k}$$ is towards the constraint boundary, i.e., $${A}_{i}{\widehat{d}}_{k}>0,\text{}i=1,\mathrm{...},m$$.
When n independent constraints are included in the active set, without location of the minimum, Lagrange multipliers, λ_{k}, are calculated that satisfy the nonsingular set of linear equations
$${\overline{A}}_{k}^{T}{\lambda}_{k}=c.$$ | (8-11) |
If all elements of λ_{k} are positive, x_{k} is the optimal solution of QP (Equation 8-6). However, if any component of λ_{k} is negative, and the component does not correspond to an equality constraint, then the corresponding element is deleted from the active set and a new iterate is sought.
The algorithm requires a feasible point to start. If the current point from the SQP method is not feasible, then you can find a point by solving the linear programming problem
$$\begin{array}{l}\underset{\gamma \in \Re ,\text{}x\in {\Re}^{n}}{\mathrm{min}}\gamma \text{suchthat}\\ {A}_{i}x={b}_{i},\text{}i=1,\mathrm{...},{m}_{e}\\ {A}_{i}x-\gamma \le {b}_{i},\text{}i={m}_{e}+1,\mathrm{...},m.\end{array}$$ | (8-12) |
The notation A_{i} indicates the ith row of the matrix A. You can find a feasible point (if one exists) to Equation 8-12 by setting x to a value that satisfies the equality constraints. You can determine this value by solving an under- or overdetermined set of linear equations formed from the set of equality constraints. If there is a solution to this problem, then the slack variable γ is set to the maximum inequality constraint at this point.
You can modify the preceding QP algorithm for LP problems by setting the search direction to the steepest descent direction at each iteration, where g_{k} is the gradient of the objective function (equal to the coefficients of the linear objective function).
$${\widehat{d}}_{k}=-{Z}_{k}{Z}_{k}^{T}{g}_{k}.$$ | (8-13) |
If a feasible point is found using the preceding LP method, the main QP phase is entered. The search direction $${\widehat{d}}_{k}$$ is initialized with a search direction $${\widehat{d}}_{1}$$ found from solving the set of linear equations
$$H{\widehat{d}}_{1}=-{g}_{k},$$ | (8-14) |
where g_{k} is the gradient of the objective function at the current iterate x_{k} (i.e., Hx_{k} + c).
If a feasible solution is not found for the QP problem, the direction of search for the main SQP routine is taken as one that minimizes γ.
The simplex algorithm, invented by George Dantzig in 1947, is one of the earliest and best known optimization algorithms. The algorithm solves the linear programming problem
$$\underset{x}{\mathrm{min}}{f}^{T}x\text{suchthat}\{\begin{array}{c}A\cdot x\le b,\\ Aeq\cdot x=beq,\\ lb\le x\le ub.\end{array}$$
The algorithm moves along the edges of the polyhedron defined by the constraints, from one vertex to another, while decreasing the value of the objective function, f^{T}x, at each step. This section describes an improved version of the original simplex algorithm that returns a vertex optimal solution.
This section covers the following topics:
The simplex algorithm has two phases:
Phase 1 — Compute an initial basic feasible point.
Phase 2 — Compute the optimal solution to the original problem.
Note You cannot supply an initial point x0 for linprog with the simplex algorithm. If you pass in x0 as an input argument, linprog ignores x0 and computes its own initial point for the algorithm. |
Phase 1. In phase 1, the algorithm finds an initial basic feasible solution (see Basic and Nonbasic Variables for a definition) by solving an auxiliary piecewise linear programming problem. The objective function of the auxiliary problem is the linear penalty function $$P={\displaystyle \sum _{j}{P}_{j}\left({x}_{j}\right)}$$,
where P_{j}(x_{j}) is defined by
$${P}_{j}\left({x}_{j}\right)=\{\begin{array}{ll}{x}_{j}-{u}_{j}\hfill & \text{if}{x}_{j}{u}_{j}\hfill \\ 0\hfill & \text{if}{l}_{j}\le {x}_{j}\le {u}_{j}\hfill \\ {l}_{j}-{x}_{j}\hfill & \text{if}{l}_{j}{x}_{j}.\hfill \end{array}$$
P(x) measures how much a point x violates the lower and upper bound conditions. The auxiliary problem is
$$\underset{x}{\mathrm{min}}{\displaystyle \sum _{j}{P}_{j}}\text{subjectto}\{\begin{array}{c}A\cdot x\le b\\ Aeq\cdot x=beq.\end{array}$$
The original problem has a feasible basis point if and only if the auxiliary problem has minimum value 0.
The algorithm finds an initial point for the auxiliary problem by a heuristic method that adds slack and artificial variables as necessary. The algorithm then uses this initial point together with the simplex algorithm to solve the auxiliary problem. The solution is the initial point for phase 2 of the main algorithm.
Phase 2. In phase 2, the algorithm applies the simplex algorithm, starting at the initial point from phase 1, to solve the original problem. At each iteration, the algorithm tests the optimality condition and stops if the current solution is optimal. If the current solution is not optimal, the algorithm
Chooses one variable, called the entering variable, from the nonbasic variables and adds the corresponding column of the nonbasis to the basis (see Basic and Nonbasic Variables for definitions).
Chooses a variable, called the leaving variable, from the basic variables and removes the corresponding column from the basis.
Updates the current solution and the current objective value.
The algorithm chooses the entering and the leaving variables by solving two linear systems while maintaining the feasibility of the solution.
The algorithm detects when there is no progress in the Phase 2 solution process. It attempts to continue by performing bound perturbation. For an explanation of this part of the algorithm, see Applegate, Bixby, Chvatal, and Cook [59].
The simplex algorithm uses the same preprocessing steps as the interior-point linear programming solver, which are described in Preprocessing. In addition, the algorithm uses two other steps:
Eliminates columns that have only one nonzero element and eliminates their corresponding rows.
For each constraint equation a·x = b, where a is a row of Aeq, the algorithm computes the lower and upper bounds of the linear combination a·x as rlb and rub if the lower and upper bounds are finite. If either rlb or rub equals b, the constraint is called a forcing constraint. The algorithm sets each variable corresponding to a nonzero coefficient of a·x equal to its upper or lower bound, depending on the forcing constraint. The algorithm then deletes the columns corresponding to these variables and deletes the rows corresponding to the forcing constraints.
To use the simplex method, set the Algorithm option to 'simplex' using optimoptions.
options = optimoptions(@linprog,'Algorithm','simplex');
Then call linprog with the options input argument. See the reference page for linprog for more information.
linprog returns empty output arguments for x and fval if it detects infeasibility or unboundedness in the preprocessing procedure. linprog returns the current point when it
Exceeds the maximum number of iterations
Detects that the problem is infeasible or unbounded in phases 1 or 2
When the problem is unbounded, linprog returns x and fval in the unbounded direction.
This section defines the terms basis, nonbasis, and basic feasible solutions for a linear programming problem. The definition assumes that the problem is given in the following standard form:
$$\underset{x}{\mathrm{min}}{f}^{T}x\text{suchthat}\{\begin{array}{c}A\cdot x=b,\\ lb\le x\le ub.\end{array}$$
(Note that A and b are not the matrix and vector defining the inequalities in the original problem.) Assume that A is an m-by-n matrix, of rank m < n, whose columns are {a_{1}, a_{2}, ..., a_{n}}. Suppose that $$\left\{{a}_{{i}_{1}},{a}_{{i}_{2}},\mathrm{...},{a}_{{i}_{m}}\right\}$$ is a basis for the column space of A, with index set B = {i_{1}, i_{2}, ..., i_{m}}, and that N = {1, 2, ..., n}\B is the complement of B. The submatrix A_{B} is called a basis and the complementary submatrix A_{N} is called a nonbasis. The vector of basic variables is x_{B } and the vector of nonbasic variables is x_{N}. At each iteration in phase 2, the algorithm replaces one column of the current basis with a column of the nonbasis and updates the variables x_{B} and x_{N} accordingly.
If x is a solution to A·x = b and all the nonbasic variables in x_{N} are equal to either their lower or upper bounds, x is called a basic solution. If, in addition, the basic variables in x_{B} satisfy their lower and upper bounds, so that x is a feasible point, x is called a basic feasible solution.
At a high level, the linprog 'dual-simplex' algorithm essentially performs a simplex algorithm on the dual problem.
The algorithm begins by preprocessing as described in Preprocessing. For details, see Andersen and Andersen [1] and Nocedal and Wright [4], Chapter 13. This preprocessing reduces the original linear programming problem to the form of Equation 8-1:
$$\underset{x}{\mathrm{min}}{f}^{T}x\text{suchthat}\{\begin{array}{c}A\cdot x=b\\ 0\le x\le u.\end{array}$$
A and b are transformed versions of the original constraint matrices. This is the primal problem.
As explained in Equation 8-3, the dual problem is to find vectors y and w, and a slack variable vector z that solve
$$\mathrm{max}{b}^{T}y-{u}^{T}w\text{suchthat}\{\begin{array}{c}{A}^{T}\cdot y-w+z=f\\ z\ge 0,\text{}w\ge 0.\end{array}$$
It is well known (for example, see [4]) that any finite solution of the dual problem gives a solution to the primal problem, and any finite solution of the primal problem gives a solution of the dual problem. Furthermore, if either the primal or dual problem is unbounded, then the other problem is infeasible. And if either the primal or dual problem is infeasible, then the other problem is either infeasible or unbounded. Therefore, the two problems are equivalent in terms of obtaining a finite solution, if one exists. Because the primal and dual problems are mathematically equivalent, but the computational steps differ, it can be better to solve the primal problem by solving the dual problem.
To help alleviate degeneracy (see Nocedal and Wright [4], page 366), the dual simplex algorithm begins by perturbing the objective function.
Phase 1 of the dual simplex algorithm is to find a dual feasible point. The algorithm does this by solving an auxiliary linear programming problem, similar to Phase 1 for the simplex algorithm.
During Phase 2, the solver repeatedly chooses an entering variable and a leaving variable, analogously to Phase 2 for the primal simplex algorithm. The algorithm chooses a leaving variable according to a technique suggested by Forrest and Goldfarb [2] called dual steepest-edge pricing. The algorithm chooses an entering variable using the variation of Harris' ratio test suggested by Koberstein [3]. To help alleviate degeneracy, the algorithm can introduce additional perturbations during Phase 2.
The solver iterates, attempting to maintain dual feasibility while reducing primal infeasibility, until the solution to the reduced, perturbed problem is both primal feasible and dual feasible. The algorithm unwinds the perturbations that it introduced. If the solution (to the perturbed problem) is dual infeasible for the unperturbed (original) problem, then the solver restores dual feasibility using primal simplex or Phase 1 algorithms. Finally, the solver unwinds the preprocessing steps to return the solution to the original problem.
[1] Andersen, E. D., and K. D. Andersen. Presolving in linear programming. Math. Programming 71, pp. 221–245, 1995.
[2] Forrest, J. J., and D. Goldfarb. Steepest-edge simplex algorithms for linear programming. Math. Programming 57, pp. 341–374, 1992.
[3] Koberstein, A. Progress in the dual simplex algorithm for solving large scale LP problems: techniques for a fast and stable implementation. Computational Optim. and Application 41, pp. 185–204, 2008.
[4] Nocedal, J., and S. J. Wright. Numerical Optimization, Second Edition. Springer Series in Operations Research, Springer-Verlag, 2006.