scipy.optimize.minimize

scipy.optimize.minimize(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None)
Minimization of scalar function of one or more variables.
Parameters:  fun : callable
The objective function to be minimized.
fun(x, *args) -> float
where x is a 1-D array with shape (n,) and args is a tuple of the fixed parameters needed to completely specify the function.
 x0 : ndarray, shape (n,)
Initial guess. Array of real elements of size (n,), where ‘n’ is the number of independent variables.
 args : tuple, optional
Extra arguments passed to the objective function and its derivatives (fun, jac and hess functions).
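As a minimal sketch of how args reaches the objective, the example below minimizes a hypothetical quadratic whose center and scale are fixed parameters. The function name shifted_quadratic and its parameters are illustrative assumptions, not part of SciPy.

```python
import numpy as np
from scipy.optimize import minimize

def shifted_quadratic(x, center, scale):
    """Hypothetical objective: scale * ||x - center||^2."""
    return scale * np.sum((x - center) ** 2)

center = np.array([1.0, -2.0, 0.5])

# The tuple in `args` is unpacked after x: fun(x, center, 3.0).
res = minimize(shifted_quadratic, x0=np.zeros(3), args=(center, 3.0))
print(res.x)
```

Since no bounds or constraints are given, the solver is chosen automatically; the minimizer should lie close to center.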
 method : str or callable, optional
Type of solver. Should be one of
 ‘Nelder-Mead’
 ‘Powell’
 ‘CG’
 ‘BFGS’
 ‘Newton-CG’
 ‘L-BFGS-B’
 ‘TNC’
 ‘COBYLA’
 ‘SLSQP’
 ‘trust-constr’
 ‘dogleg’
 ‘trust-ncg’
 ‘trust-exact’
 ‘trust-krylov’
 custom - a callable object (added in version 0.14.0), see below for description.
If not given, chosen to be one of BFGS, L-BFGS-B, SLSQP, depending on whether the problem has constraints or bounds.
 jac : {callable, ‘2-point’, ‘3-point’, ‘cs’, bool}, optional
Method for computing the gradient vector. Only for CG, BFGS, Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr. If it is a callable, it should be a function that returns the gradient vector:
jac(x, *args) -> array_like, shape (n,)
where x is an array with shape (n,) and args is a tuple with the fixed parameters. Alternatively, the keywords {‘2-point’, ‘3-point’, ‘cs’} select a finite difference scheme for numerical estimation of the gradient. Options ‘3-point’ and ‘cs’ are available only to ‘trust-constr’. If jac is a Boolean and is True, fun is assumed to return the gradient along with the objective function. If False, the gradient will be estimated using ‘2-point’ finite difference estimation.
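The Boolean form of jac can be sketched as follows: the objective returns a (value, gradient) pair and jac=True tells the solver to unpack it. The function value_and_grad is a hypothetical example, not a SciPy function.

```python
import numpy as np
from scipy.optimize import minimize

def value_and_grad(x):
    """Hypothetical objective returning both f(x) and its gradient."""
    f = np.sum((x - 3.0) ** 2)
    g = 2.0 * (x - 3.0)
    return f, g

# jac=True: `fun` is assumed to return (objective, gradient).
res = minimize(value_and_grad, x0=np.zeros(4), method='BFGS', jac=True)
print(res.x)
```

This form is convenient when the value and the gradient share expensive intermediate computations.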
 hess : {callable, ‘2-point’, ‘3-point’, ‘cs’, HessianUpdateStrategy}, optional
Method for computing the Hessian matrix. Only for Newton-CG, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr. If it is callable, it should return the Hessian matrix:
hess(x, *args) -> {LinearOperator, spmatrix, array}, (n, n)
where x is a (n,) ndarray and args is a tuple with the fixed parameters. LinearOperator and sparse matrix returns are allowed only for the ‘trust-constr’ method. Alternatively, the keywords {‘2-point’, ‘3-point’, ‘cs’} select a finite difference scheme for numerical estimation. Or, objects implementing the HessianUpdateStrategy interface can be used to approximate the Hessian. Available quasi-Newton methods implementing this interface are BFGS and SR1.
Whenever the gradient is estimated via finite differences, the Hessian cannot be estimated with options {‘2-point’, ‘3-point’, ‘cs’} and needs to be estimated using one of the quasi-Newton strategies. Finite-difference options {‘2-point’, ‘3-point’, ‘cs’} and HessianUpdateStrategy are available only for the ‘trust-constr’ method.
 hessp : callable, optional
Hessian of objective function times an arbitrary vector p. Only for Newton-CG, trust-ncg, trust-krylov, trust-constr. Only one of hessp or hess needs to be given. If hess is provided, then hessp will be ignored. hessp must compute the Hessian times an arbitrary vector:
hessp(x, p, *args) -> ndarray shape (n,)
where x is a (n,) ndarray, p is an arbitrary vector with dimension (n,) and args is a tuple with the fixed parameters.
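A short sketch of supplying a Hessian-vector product instead of the full Hessian, using the Rosenbrock helpers shipped in scipy.optimize (rosen_hess_prod has the hessp(x, p) signature described above). The starting point and the xtol value are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess_prod

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

# Newton-CG only needs products H(x) @ p, so the (n, n) Hessian
# is never formed explicitly.
res = minimize(rosen, x0, method='Newton-CG',
               jac=rosen_der, hessp=rosen_hess_prod,
               options={'xtol': 1e-8})
print(res.x)
```

Avoiding the dense Hessian mainly pays off when n is large.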
 bounds : sequence or Bounds, optional
Bounds on variables for L-BFGS-B, TNC, SLSQP and trust-constr methods. There are two ways to specify the bounds:
 Instance of Bounds class.
 Sequence of (min, max) pairs for each element in x. None is used to specify no bound.
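The two ways of specifying bounds can be compared on a small problem. This is a sketch with a hypothetical objective whose unconstrained minimum (2, -1) lies outside the box, so both calls should stop on the boundary at (1, 0).

```python
import numpy as np
from scipy.optimize import minimize, Bounds

def objective(x):
    """Hypothetical objective with unconstrained minimum at (2, -1)."""
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2

x0 = [0.5, 0.5]

# Sequence of (min, max) pairs; None means "no bound" on that side.
res_seq = minimize(objective, x0, method='L-BFGS-B',
                   bounds=[(0, 1), (0, None)])

# Equivalent Bounds instance: lower and upper limits given as arrays.
res_obj = minimize(objective, x0, method='L-BFGS-B',
                   bounds=Bounds([0, 0], [1, np.inf]))

print(res_seq.x, res_obj.x)
```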
 constraints : {Constraint, dict} or List of {Constraint, dict}, optional
Constraints definition (only for COBYLA, SLSQP and trust-constr). Constraints for ‘trust-constr’ are defined as a single object or a list of objects specifying constraints to the optimization problem. Available constraints are LinearConstraint and NonlinearConstraint.
Constraints for COBYLA, SLSQP are defined as a list of dictionaries. Each dictionary has the fields:
 type : str
Constraint type: ‘eq’ for equality, ‘ineq’ for inequality.
 fun : callable
The function defining the constraint.
 jac : callable, optional
The Jacobian of fun (only for SLSQP).
 args : sequence, optional
Extra arguments to be passed to the function and Jacobian.
Equality constraint means that the constraint function result is to be zero whereas inequality means that it is to be non-negative. Note that COBYLA only supports inequality constraints.
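For the object-based form used by ‘trust-constr’, the sketch below states three linear inequalities (those of Example 16.4 in [5]) as a single LinearConstraint. The starting point, the explicit jac='2-point' and hess=BFGS() choices, and the variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds, BFGS

def objective(x):
    return (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2

# Each row of A is one constraint: lb <= A @ x <= ub.
A = np.array([[1.0, -2.0],
              [-1.0, -2.0],
              [-1.0, 2.0]])
linear = LinearConstraint(A, lb=[-2.0, -6.0, -2.0], ub=np.full(3, np.inf))
bounds = Bounds([0.0, 0.0], [np.inf, np.inf])

# Gradient by finite differences, Hessian by a quasi-Newton update.
res = minimize(objective, x0=[2.0, 0.5], method='trust-constr',
               jac='2-point', hess=BFGS(),
               constraints=[linear], bounds=bounds)
print(res.x)
```

The result should be close to the theoretical solution (1.4, 1.7) of that example.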
 tol : float, optional
Tolerance for termination. For detailed control, use solver-specific options.
 options : dict, optional
A dictionary of solver options. All methods accept the following generic options:
 maxiter : int
Maximum number of iterations to perform.
 disp : bool
Set to True to print convergence messages.
For method-specific options, see show_options.
 callback : callable, optional
Called after each iteration. For ‘trust-constr’ it is a callable with the signature:
callback(xk, OptimizeResult state) -> bool
where xk is the current parameter vector and state is an OptimizeResult object, with the same fields as the ones from the return. If callback returns True the algorithm execution is terminated. For all the other methods, the signature is:
callback(xk)
where xk is the current parameter vector.
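A minimal sketch of the callback(xk) form: the hypothetical function record stores a copy of each iterate so that the optimization path can be inspected afterwards.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

history = []

def record(xk):
    """Hypothetical callback: keep a copy of the current iterate."""
    history.append(np.copy(xk))

res = minimize(rosen, [1.3, 0.7, 0.8, 1.9, 1.2], method='BFGS',
               jac=rosen_der, callback=record)
print(len(history), "iterates recorded")
```

Copying xk guards against the solver reusing the same array between iterations.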
Returns:  res : OptimizeResult
The optimization result represented as an OptimizeResult object. Important attributes are: x the solution array, success a Boolean flag indicating if the optimizer exited successfully and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.
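The attributes above are typically checked before the solution is used. The sketch below deliberately caps maxiter at a very small value (an illustrative choice) so that the solver stops early and success is False.

```python
from scipy.optimize import minimize, rosen

# Five iterations are far too few for Nelder-Mead on this problem,
# so the run ends without converging.
res = minimize(rosen, [1.3, 0.7, 0.8, 1.9, 1.2], method='Nelder-Mead',
               options={'maxiter': 5})

if res.success:
    print("solution:", res.x)
else:
    print("optimizer stopped early:", res.message)
```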
See also
minimize_scalar
 Interface to minimization algorithms for scalar univariate functions
show_options
 Additional options accepted by the solvers
Notes
This section describes the available solvers that can be selected by the ‘method’ parameter. The default method is BFGS when the problem has no bounds or constraints.
Unconstrained minimization
Method Nelder-Mead uses the Simplex algorithm [1], [2]. This algorithm is robust in many applications. However, if numerical computation of derivatives can be trusted, other algorithms using first and/or second derivative information might be preferred for their better performance in general.
Method Powell is a modification of Powell’s method [3], [4] which is a conjugate direction method. It performs sequential one-dimensional minimizations along each vector of the directions set (direc field in options and info), which is updated at each iteration of the main minimization loop. The function need not be differentiable, and no derivatives are taken.
Method CG uses a nonlinear conjugate gradient algorithm by Polak and Ribiere, a variant of the Fletcher-Reeves method described in [5] pp. 120-122. Only the first derivatives are used.
Method BFGS uses the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) [5] pp. 136. It uses the first derivatives only. BFGS has proven good performance even for non-smooth optimizations. This method also returns an approximation of the Hessian inverse, stored as hess_inv in the OptimizeResult object.
Method Newton-CG uses a Newton-CG algorithm [5] pp. 168 (also known as the truncated Newton method). It uses a CG method to compute the search direction. See also TNC method for a box-constrained minimization with a similar algorithm. Suitable for large-scale problems.
Method dogleg uses the dog-leg trust-region algorithm [5] for unconstrained minimization. This algorithm requires the gradient and Hessian; furthermore the Hessian is required to be positive definite.
Method trust-ncg uses the Newton conjugate gradient trust-region algorithm [5] for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems.
Method trust-krylov uses the Newton GLTR trust-region algorithm [14], [15] for unconstrained minimization. This algorithm requires the gradient and either the Hessian or a function that computes the product of the Hessian with a given vector. Suitable for large-scale problems. On indefinite problems it usually requires fewer iterations than the trust-ncg method and is recommended for medium and large-scale problems.
Method trust-exact is a trust-region method for unconstrained minimization in which quadratic subproblems are solved almost exactly [13]. This algorithm requires the gradient and the Hessian (which is not required to be positive definite). In many situations it is the Newton method that converges in the fewest iterations, and it is the most recommended for small and medium-size problems.
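As a sketch of a method that needs both the gradient and the full Hessian, the call below runs trust-exact on the Rosenbrock function using the rosen_der and rosen_hess helpers from scipy.optimize. The starting point and gtol value are illustrative.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

# trust-exact requires `jac` and `hess`; the Hessian may be indefinite.
res = minimize(rosen, x0, method='trust-exact',
               jac=rosen_der, hess=rosen_hess,
               options={'gtol': 1e-8})
print(res.x, res.nit)
```

Replacing method='trust-exact' with 'dogleg' or 'trust-ncg' keeps the same call shape, subject to the Hessian requirements stated for each method.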
Bound-Constrained minimization
Method L-BFGS-B uses the L-BFGS-B algorithm [6], [7] for bound constrained minimization.
Method TNC uses a truncated Newton algorithm [5], [8] to minimize a function with variables subject to bounds. This algorithm uses gradient information; it is also called Newton Conjugate-Gradient. It differs from the Newton-CG method described above as it wraps a C implementation and allows each variable to be given upper and lower bounds.
Constrained Minimization
Method COBYLA uses the Constrained Optimization BY Linear Approximation (COBYLA) method [9], [10], [11]. The algorithm is based on linear approximations to the objective function and each constraint. The method wraps a FORTRAN implementation of the algorithm. The constraint functions ‘fun’ may return either a single number or an array or list of numbers.
Method SLSQP uses Sequential Least SQuares Programming to minimize a function of several variables with any combination of bounds, equality and inequality constraints. The method wraps the SLSQP Optimization subroutine originally implemented by Dieter Kraft [12]. Note that the wrapper handles infinite values in bounds by converting them into large floating values.
Method trust-constr is a trust-region algorithm for constrained optimization. It switches between two implementations depending on the problem definition. It is the most versatile constrained minimization algorithm implemented in SciPy and the most appropriate for large-scale problems. For equality constrained problems it is an implementation of the Byrd-Omojokun Trust-Region SQP method described in [17] and in [5], p. 549. When inequality constraints are imposed as well, it switches to the trust-region interior point method described in [16]. This interior point algorithm, in turn, solves inequality constraints by introducing slack variables and solving a sequence of equality-constrained barrier problems for progressively smaller values of the barrier parameter. The previously described equality constrained SQP method is used to solve the subproblems with increasing levels of accuracy as the iterate gets closer to a solution.
Finite-Difference Options
For Method trust-constr the gradient and the Hessian may be approximated using three finite-difference schemes: {‘2-point’, ‘3-point’, ‘cs’}. The scheme ‘cs’ is, potentially, the most accurate but it requires the function to correctly handle complex inputs and to be differentiable in the complex plane. The scheme ‘3-point’ is more accurate than ‘2-point’ but requires twice as many operations.
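The accuracy ordering of the three schemes can be illustrated with plain NumPy on a scalar test function. This is a sketch of the underlying formulas only: the function f, the evaluation point, and the step sizes h and h_cs are assumptions chosen for illustration, and SciPy selects its own step sizes internally.

```python
import numpy as np

def f(x):
    """Hypothetical smooth test function that also accepts complex input."""
    return np.exp(x) * np.sin(x)

def exact_derivative(x):
    return np.exp(x) * (np.sin(x) + np.cos(x))

x = 1.0
h = 1e-6       # assumed step for the real-valued schemes
h_cs = 1e-20   # complex step can be tiny: no subtractive cancellation

# '2-point': forward difference, one extra evaluation, O(h) error.
two_point = (f(x + h) - f(x)) / h
# '3-point': central difference, two extra evaluations, O(h**2) error.
three_point = (f(x + h) - f(x - h)) / (2 * h)
# 'cs': complex step, needs f to be analytic and complex-safe.
complex_step = np.imag(f(x + 1j * h_cs)) / h_cs

exact = exact_derivative(x)
err_2 = abs(two_point - exact)
err_3 = abs(three_point - exact)
err_cs = abs(complex_step - exact)
print(err_2, err_3, err_cs)
```

The central difference costs two function evaluations per variable where the forward difference costs one, which is the factor of two mentioned above.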
Custom minimizers
It may be useful to pass a custom minimization method, for example when using a frontend to this method such as scipy.optimize.basinhopping or a different library. You can simply pass a callable as the method parameter.
The callable is called as method(fun, x0, args, **kwargs, **options) where kwargs corresponds to any other parameters passed to minimize (such as callback, hess, etc.), except the options dict, which has its contents also passed as method parameters pair by pair. Also, if jac has been passed as a bool type, jac and fun are mangled so that fun returns just the function values and jac is converted to a function returning the Jacobian. The method shall return an OptimizeResult object.
The provided method callable must be able to accept (and possibly ignore) arbitrary parameters; the set of parameters accepted by minimize may expand in future versions and then these parameters will be passed to the method. You can find an example in the scipy.optimize tutorial.
New in version 0.11.0.
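A custom method might look like the sketch below: a simple coordinate search that accepts and ignores any extra keyword arguments and returns an OptimizeResult. The name coordinate_search and its options step, shrink and xtol are hypothetical and chosen only for illustration; for brevity the sketch does not invoke the callback.

```python
import numpy as np
from scipy.optimize import minimize, OptimizeResult

def coordinate_search(fun, x0, args=(), maxiter=500, step=0.5,
                      shrink=0.5, xtol=1e-8, **unused):
    """Hypothetical derivative-free minimizer (illustration only).

    `**unused` swallows jac, hess, bounds, callback, etc. that
    minimize forwards to every custom method.
    """
    x = np.asarray(x0, dtype=float).copy()
    fx = fun(x, *args)
    nfev, nit = 1, 0
    while nit < maxiter and step > xtol:
        nit += 1
        improved = False
        for i in range(x.size):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                f_trial = fun(trial, *args)
                nfev += 1
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            step *= shrink  # no coordinate move helped: refine the mesh
    converged = bool(step <= xtol)
    return OptimizeResult(x=x, fun=fx, nit=nit, nfev=nfev,
                          success=converged,
                          message="converged" if converged
                          else "maximum number of iterations reached")

def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# Entries of `options` arrive as keyword arguments of the callable.
res = minimize(objective, [0.0, 0.0], method=coordinate_search,
               options={'step': 0.5})
print(res.x, res.nit)
```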
References
[1] Nelder, J A, and R Mead. 1965. A Simplex Method for Function Minimization. The Computer Journal 7: 308-13.
[2] Wright M H. 1996. Direct search methods: Once scorned, now respectable, in Numerical Analysis 1995: Proceedings of the 1995 Dundee Biennial Conference in Numerical Analysis (Eds. D F Griffiths and G A Watson). Addison Wesley Longman, Harlow, UK. 191-208.
[3] Powell, M J D. 1964. An efficient method for finding the minimum of a function of several variables without calculating derivatives. The Computer Journal 7: 155-162.
[4] Press W, S A Teukolsky, W T Vetterling and B P Flannery. Numerical Recipes (any edition), Cambridge University Press.
[5] Nocedal, J, and S J Wright. 2006. Numerical Optimization. Springer New York.
[6] Byrd, R H and P Lu and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific and Statistical Computing 16 (5): 1190-1208.
[7] Zhu, C and R H Byrd and J Nocedal. 1997. L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software 23 (4): 550-560.
[8] Nash, S G. Newton-Type Minimization Via the Lanczos Method. 1984. SIAM Journal of Numerical Analysis 21: 770-778.
[9] Powell, M J D. A direct search optimization method that models the objective and constraint functions by linear interpolation. 1994. Advances in Optimization and Numerical Analysis, eds. S. Gomez and J-P Hennart, Kluwer Academic (Dordrecht), 51-67.
[10] Powell M J D. Direct search algorithms for optimization calculations. 1998. Acta Numerica 7: 287-336.
[11] Powell M J D. A view of algorithms for optimization without derivatives. 2007. Cambridge University Technical Report DAMTP 2007/NA03.
[12] Kraft, D. A software package for sequential quadratic programming. 1988. Tech. Rep. DFVLR-FB 88-28, DLR German Aerospace Center – Institute for Flight Mechanics, Koln, Germany.
[13] Conn, A. R., Gould, N. I., and Toint, P. L. Trust region methods. 2000. Siam. pp. 169-200.
[14] F. Lenders, C. Kirches, A. Potschka: “trlib: A vector-free implementation of the GLTR method for iterative solution of the trust region problem”, https://arxiv.org/abs/1611.04718
[15] N. Gould, S. Lucidi, M. Roma, P. Toint: “Solving the Trust-Region Subproblem using the Lanczos Method”, SIAM J. Optim., 9(2), 504–525, (1999).
[16] Byrd, Richard H., Mary E. Hribar, and Jorge Nocedal. 1999. An interior point algorithm for large-scale nonlinear programming. SIAM Journal on Optimization 9.4: 877-900.
[17] Lalee, Marucha, Jorge Nocedal, and Todd Plantenga. 1998. On the implementation of an algorithm for large-scale equality constrained optimization. SIAM Journal on Optimization 8.3: 682-706.
Examples
Let us consider the problem of minimizing the Rosenbrock function. This function (and its respective derivatives) is implemented in rosen (resp. rosen_der, rosen_hess) in scipy.optimize.
>>> from scipy.optimize import minimize, rosen, rosen_der
A simple application of the Nelder-Mead method is:
>>> x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> res = minimize(rosen, x0, method='Nelder-Mead', tol=1e-6)
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])
Now using the BFGS algorithm, using the first derivative and a few options:
>>> res = minimize(rosen, x0, method='BFGS', jac=rosen_der,
...                options={'gtol': 1e-6, 'disp': True})
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 26
         Function evaluations: 31
         Gradient evaluations: 31
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])
>>> print(res.message)
Optimization terminated successfully.
>>> res.hess_inv
array([[ 0.00749589,  0.01255155,  0.02396251,  0.04750988,  0.09495377],  # may vary
       [ 0.01255155,  0.02510441,  0.04794055,  0.09502834,  0.18996269],
       [ 0.02396251,  0.04794055,  0.09631614,  0.19092151,  0.38165151],
       [ 0.04750988,  0.09502834,  0.19092151,  0.38341252,  0.7664427 ],
       [ 0.09495377,  0.18996269,  0.38165151,  0.7664427 ,  1.53713523]])
Next, consider a minimization problem with several constraints (namely Example 16.4 from [5]). The objective function is:
>>> fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2
There are three constraints defined as:
>>> cons = ({'type': 'ineq', 'fun': lambda x:  x[0] - 2 * x[1] + 2},
...         {'type': 'ineq', 'fun': lambda x: -x[0] - 2 * x[1] + 6},
...         {'type': 'ineq', 'fun': lambda x: -x[0] + 2 * x[1] + 2})
And variables must be positive, hence the following bounds:
>>> bnds = ((0, None), (0, None))
The optimization problem is solved using the SLSQP method as:
>>> res = minimize(fun, (2, 0), method='SLSQP', bounds=bnds,
...                constraints=cons)
It should converge to the theoretical solution (1.4, 1.7).