Find the least-squares solution to a large, sparse, linear system of equations.
The function solves Ax = b or min ||b - Ax||^2 or min ||Ax - b||^2 + d^2 ||x||^2.
The matrix A may be square or rectangular (over-determined or under-determined), and may have any rank.
1. Unsymmetric equations -- solve A*x = b
2. Linear least squares -- solve A*x = b
in the least-squares sense
3. Damped least squares -- solve ( A )*x = ( b )
( damp*I ) ( 0 )
in the least-squares sense
Parameters : | A : {sparse matrix, ndarray, LinearOperatorLinear}
b : (m,) ndarray
damp : float
atol, btol : float
conlim : float
iter_lim : int
show : bool
calc_var : bool
|
---|---|
Returns : | x : ndarray of float
istop : int
itn : int
r1norm : float
r2norm : float
anorm : float
acond : float
arnorm : float
xnorm : float
var : ndarray of float
|
Notes
LSQR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of A should therefore be avoided where possible.
For example, in problem 1 the solution is unaltered by row-scaling. If a row of A is very small or large compared to the other rows of A, the corresponding row of ( A b ) should be scaled up or down.
In problems 1 and 2, the solution x is easily recovered following column-scaling. Unless better information is known, the nonzero columns of A should be scaled so that they all have the same Euclidean norm (e.g., 1.0).
In problem 3, there is no freedom to re-scale if damp is nonzero. However, the value of damp should be assigned only after attention has been paid to the scaling of A.
The parameter damp is intended to help regularize ill-conditioned systems, by preventing the true solution from being very large. Another aid to regularization is provided by the parameter acond, which may be used to terminate iterations before the computed solution becomes very large.
If some initial estimate x0 is known and if damp == 0, one could proceed as follows:
- Compute a residual vector r0 = b - A*x0.
- Use LSQR to solve the system A*dx = r0.
- Add the correction dx to obtain a final solution x = x0 + dx.
This requires that x0 be available before and after the call to LSQR. To judge the benefits, suppose LSQR takes k1 iterations to solve A*x = b and k2 iterations to solve A*dx = r0. If x0 is “good”, norm(r0) will be smaller than norm(b). If the same stopping tolerances atol and btol are used for each system, k1 and k2 will be similar, but the final solution x0 + dx should be more accurate. The only way to reduce the total work is to use a larger stopping tolerance for the second system. If some value btol is suitable for A*x = b, the larger value btol*norm(b)/norm(r0) should be suitable for A*dx = r0.
Preconditioning is another way to reduce the number of iterations. If it is possible to solve a related system M*x = b efficiently, where M approximates A in some helpful way (e.g. M - A has low rank or its elements are small relative to those of A), LSQR may converge more rapidly on the system A*M(inverse)*z = b, after which x can be recovered by solving M*x = z.
If A is symmetric, LSQR should not be used!
Alternatives are the symmetric conjugate-gradient method (cg) and/or SYMMLQ. SYMMLQ is an implementation of symmetric cg that applies to any symmetric A and will converge more rapidly than LSQR. If A is positive definite, there are other implementations of symmetric cg that require slightly less work per iteration than SYMMLQ (but will take the same number of iterations).
References
[R108] | C. C. Paige and M. A. Saunders (1982a). “LSQR: An algorithm for sparse linear equations and sparse least squares”, ACM TOMS 8(1), 43-71. |
[R109] | C. C. Paige and M. A. Saunders (1982b). “Algorithm 583. LSQR: Sparse linear equations and least squares problems”, ACM TOMS 8(2), 195-209. |
[R110] | M. A. Saunders (1995). “Solution of sparse rectangular systems using LSQR and CRAIG”, BIT 35, 588-604. |