doc/manual/tao.md

(ch_tao)=

# TAO: Optimization Solvers

The Toolkit for Advanced Optimization (TAO) focuses on algorithms for the
solution of large-scale optimization problems on high-performance
architectures. Methods are available for

- {any}`sec_tao_leastsquares`
- {any}`sec_tao_quadratic`
- {any}`sec_tao_unconstrained`
- {any}`sec_tao_bound`
- {any}`sec_tao_constrained`
- {any}`sec_tao_complementary`
- {any}`sec_tao_pde_constrained`

(sec_tao_getting_started)=

## Getting Started: A Simple TAO Example

To help the user start using TAO immediately, we introduce here a simple
uniprocessor example. Please read {any}`sec_tao_solvers`
for a more in-depth discussion on using the TAO solvers. The code
presented {any}`below <tao_example1>` minimizes the
extended Rosenbrock function $f: \mathbb R^n \to \mathbb R$
defined by

$$
f(x) = \sum_{i=0}^{m-1} \left( \alpha(x_{2i+1}-x_{2i}^2)^2 + (1-x_{2i})^2 \right),
$$

where $n = 2m$ is the number of variables. Note that while we use
the C language to introduce the TAO software, the package is fully
usable from C++ and Fortran.
{any}`ch_fortran` discusses additional
issues concerning Fortran usage.

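As a concrete reading of the formula above, the following plain-C sketch evaluates the extended Rosenbrock function and its gradient on an ordinary array. It is not taken from `rosenbrock1.c` and uses no TAO or PETSc types; the name `rosenbrock_fg` and the use of `double` arrays are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not part of TAO): evaluates
 *   f(x) = sum_{i=0}^{m-1} alpha*(x[2i+1] - x[2i]^2)^2 + (1 - x[2i])^2
 * and writes the gradient into g. Both x and g have length n = 2*m. */
static double rosenbrock_fg(size_t m, double alpha, const double *x, double *g)
{
  double f = 0.0;
  for (size_t i = 0; i < m; i++) {
    double t1 = x[2 * i + 1] - x[2 * i] * x[2 * i]; /* x_{2i+1} - x_{2i}^2 */
    double t2 = 1.0 - x[2 * i];                     /* 1 - x_{2i} */

    f += alpha * t1 * t1 + t2 * t2;
    g[2 * i]     = -4.0 * alpha * t1 * x[2 * i] - 2.0 * t2; /* df/dx_{2i} */
    g[2 * i + 1] = 2.0 * alpha * t1;                        /* df/dx_{2i+1} */
  }
  return f;
}
```

The minimizer is $x = (1, \ldots, 1)$ with $f(x) = 0$ and a zero gradient, which makes a convenient sanity check for any implementation of the call-back routines discussed later.
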
The code in {any}`the example <tao_example1>` contains many of
the components needed to write most TAO programs and thus is
illustrative of the features present in complex optimization problems.
Note that for display purposes we have omitted some nonessential lines
of code as well as the (essential) code required for the routine
`FormFunctionGradient`, which evaluates the function and gradient, and
the code for `FormHessian`, which evaluates the Hessian matrix for
Rosenbrock’s function. The complete code is available in
<a href="PETSC_DOC_OUT_ROOT_PLACEHOLDER/src/tao/unconstrained/tutorials/rosenbrock1.c.html">\$TAO_DIR/src/unconstrained/tutorials/rosenbrock1.c</a>.
The following sections annotate the lines of code in
{any}`the example <tao_example1>`.

(tao_example1)=

:::{admonition} Listing: `src/tao/unconstrained/tutorials/rosenbrock1.c`
```{literalinclude} /../src/tao/unconstrained/tutorials/rosenbrock1.c
:append: return ierr;}
:end-at: PetscFinalize
:prepend: '#include <petsctao.h>'
:start-at: typedef struct
```
:::

(sec_tao_workflow)=

## TAO Workflow

Many TAO applications will follow an ordered set of procedures for
solving an optimization problem: The user creates a `Tao` context and
selects a default algorithm. Call-back routines as well as vector
(`Vec`) and matrix (`Mat`) data structures are then set. These
call-back routines will be used for evaluating the objective function,
gradient, and perhaps the Hessian matrix. The user then invokes TAO to
solve the optimization problem and finally destroys the `Tao` context.
A list of the necessary functions for performing these steps using TAO
is shown below.

```
TaoCreate(MPI_Comm comm, Tao *tao);
TaoSetType(Tao tao, TaoType type);
TaoSetSolution(Tao tao, Vec x);
TaoSetObjectiveAndGradient(Tao tao, Vec g, PetscErrorCode (*FormFGradient)(Tao, Vec, PetscReal*, Vec, void*), void *user);
TaoSetHessian(Tao tao, Mat H, Mat Hpre, PetscErrorCode (*FormHessian)(Tao, Vec, Mat, Mat, void*), void *user);
TaoSolve(Tao tao);
TaoDestroy(Tao *tao);
```

Note that the solver algorithm selected through the function
`TaoSetType()` can be overridden at runtime by using an options
database. Through this database, the user not only can select a
minimization method (e.g., limited-memory variable metric, conjugate
gradient, Newton with line search or trust region) but also can
prescribe the convergence tolerance, set various monitoring routines,
set iterative methods and preconditioners for solving the linear systems,
and so forth. See {any}`sec_tao_solvers` for more
information on the solver methods available in TAO.

### Header File

TAO applications written in C/C++ should have the statement

```
#include <petsctao.h>
```

in each file that uses a routine in the TAO libraries.

### Creation and Destruction

A TAO solver can be created by calling the

```
TaoCreate(MPI_Comm, Tao*);
```

routine. Much like creating PETSc vector and matrix objects, the first
argument is an MPI *communicator*. An MPI [^mpi]
communicator indicates a collection of processors that will be used to
evaluate the objective function, compute constraints, and provide
derivative information. When only one processor is being used, the
communicator `PETSC_COMM_SELF` can be used with no understanding of
MPI. Even parallel users need to be familiar with only the basic
concepts of message passing and distributed-memory computing. Most
applications running TAO in parallel environments can employ the
communicator `PETSC_COMM_WORLD` to indicate all processes known to
PETSc in a given run.

The routine

```
TaoSetType(Tao, TaoType);
```

can be used to set the algorithm TAO uses to solve the application. The
various types of TAO solvers and the flags that identify them will be
discussed in the following sections. The solution method should be
carefully chosen depending on the problem being solved. Some solvers,
for instance, are meant for problems with no constraints, whereas other
solvers acknowledge constraints in the problem and handle them
accordingly. The user must also be aware of the derivative information
that is available. Some solvers require second-order information, while
other solvers require only gradient or function information. The command
line option `-tao_type` followed by
a TAO method will override any method specified by the second argument.
The command line option `-tao_type bqnls`, for instance, will
specify the limited-memory quasi-Newton line search method for
bound-constrained problems. Note that the `TaoType` variable is a string that
requires quotation marks in an application program, but quotation marks
are not required at the command line.

Each TAO solver that has been created should also be destroyed by using
the

```
TaoDestroy(Tao *tao);
```

command. This routine takes the address of the `Tao` object and frees
the internal data structures used by the solver.

### Command-line Options

Additional options for the TAO solver can be set from the command
line by using the

```
TaoSetFromOptions(Tao);
```

routine. This command also provides information about runtime options
when the user includes the `-help` option on the command line.

In addition to common command line options shared by all TAO solvers, each TAO
method also implements its own specialized options. Please refer to the
documentation for individual methods for more details.

### Defining Variables

In all the optimization solvers, the application must provide a `Vec`
object of appropriate dimension to represent the variables. This vector
will be cloned by the solvers to create additional work space within the
solver. If this vector is distributed over multiple processors, it
should have a parallel distribution that allows for efficient scaling,
inner products, and function evaluations. This vector can be passed to
the application object by using the

```
TaoSetSolution(Tao, Vec);
```

routine. When using this routine, the application should initialize the
vector with an approximate solution of the optimization problem before
calling the TAO solver. This vector will be used by the TAO solver to
store the solution. Elsewhere in the application, this solution vector
can be retrieved from the application object by using the

```
TaoGetSolution(Tao, Vec*);
```

routine. This routine takes the address of a `Vec` in the second
argument and sets it to the solution vector used in the application.

### User Defined Call-back Routines

Users of TAO are required to provide routines that perform function
evaluations. Depending on the solver chosen, they may also have to write
routines that evaluate the gradient vector and Hessian matrix.

#### Application Context

Writing a TAO application may require use of an *application context*.
An application context is a structure or object defined by an
application developer, passed into a routine also written by the
application developer, and used within the routine to perform its stated
task.

For example, a routine that evaluates an objective function may need
parameters, work vectors, and other information. This information, which
may be specific to an application and necessary to evaluate the
objective, can be collected in a single structure and used as one of the
arguments in the routine. The address of this structure will be cast as
type `(void*)` and passed to the routine in the final argument. Many
examples of these structures are included in the TAO distribution.

This technique offers several advantages. In particular, it allows for a
uniform interface between TAO and the applications. The fundamental
information needed by TAO appears in the arguments of the routine, while
data specific to an application and its implementation is confined to an
opaque pointer. The routines can access information created outside the
local scope without the use of global variables. The TAO solvers and
application objects will never access this structure, so the application
developer has complete freedom to define it. If no such structure is
needed by the application, then a NULL pointer can be used.

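The opaque-pointer pattern described above can be sketched in plain C without any TAO types. Everything in this block is hypothetical and exists only to show the mechanics: `AppCtx`, `eval_objective`, and the toy `MiniSolver` are illustrative names, and `MiniSolver` merely stands in for a solver that stores a function pointer together with a `void*` it never inspects.

```c
#include <stddef.h>

/* Hypothetical application context: everything the objective needs
 * beyond the point x itself. */
typedef struct {
  size_t n;     /* number of variables */
  double shift; /* problem-specific parameter */
} AppCtx;

/* Hypothetical call-back: f(x) = sum_i (x[i] - shift)^2.
 * Returns 0 on success, mirroring the convention of TAO call-backs. */
static int eval_objective(const double *x, double *f, void *ptr)
{
  AppCtx *user = (AppCtx *)ptr; /* cast the opaque pointer back */
  double  sum  = 0.0;

  for (size_t i = 0; i < user->n; i++) {
    double d = x[i] - user->shift;
    sum += d * d;
  }
  *f = sum;
  return 0;
}

/* Toy stand-in for a solver: it stores the call-back and the context
 * but never looks inside the context. */
typedef struct {
  int (*objective)(const double *, double *, void *);
  void *ctx;
} MiniSolver;

static int mini_solver_evaluate(const MiniSolver *s, const double *x, double *f)
{
  return s->objective(x, f, s->ctx);
}
```

Because the stand-in solver only forwards the pointer, the layout of `AppCtx` can change freely without touching the solver side, which is the property the text attributes to TAO's application contexts.
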
(sec_tao_fghj)=

#### Objective Function and Gradient Routines

TAO solvers that minimize an objective function require the application
to evaluate the objective function. Some solvers may also require the
application to evaluate derivatives of the objective function. Routines
that perform these computations must be identified to the application
object and must follow a strict calling sequence.

Routines should follow the form

```
PetscErrorCode EvaluateObjective(Tao, Vec, PetscReal*, void*);
```

in order to evaluate an objective function
$f: \, \mathbb R^n \to \mathbb R$. The first argument is the TAO
Solver object, the second argument is the $n$-dimensional vector
that identifies where the objective should be evaluated, and the fourth
argument is an application context. This routine should use the third
argument to return the objective value evaluated at the point specified
by the vector in the second argument.

This routine, and the application context, should be passed to the
application object by using the

```
TaoSetObjective(Tao, PetscErrorCode(*)(Tao,Vec,PetscReal*,void*), void*);
```

routine. The first argument in this routine is the TAO solver object,
the second argument is a function pointer to the routine that evaluates
the objective, and the third argument is the pointer to an appropriate
application context. Although the final argument may point to anything,
it must be cast as a `(void*)` type. This pointer will be passed back
to the developer in the fourth argument of the routine that evaluates
the objective. In this routine, the pointer can be cast back to the
appropriate type. Examples of these structures and their usage are
provided in the distribution.

Many TAO solvers also require gradient information from the application.
The gradient of the objective function is specified in a similar manner.
Routines that evaluate the gradient should have the calling sequence

```
PetscErrorCode EvaluateGradient(Tao, Vec, Vec, void*);
```

where the first argument is the TAO solver object, the second argument
is the variable vector, the third argument is the gradient vector, and
the fourth argument is the user-defined application context. Only the
third argument in this routine is different from the arguments in the
routine for evaluating the objective function. The numbers in the
gradient vector have no meaning when passed into this routine, but they
should represent the gradient of the objective at the specified point at
the end of the routine. This routine, and the user-defined pointer, can
be passed to the application object by using the

```
TaoSetGradient(Tao, Vec, PetscErrorCode (*)(Tao,Vec,Vec,void*), void*);
```

routine. In this routine, the first argument is the Tao object, the second
argument is the optional vector to hold the computed gradient, the
third argument is the function pointer, and the fourth argument is the
application context, cast to `(void*)`.

Instead of evaluating the objective and its gradient in separate
routines, TAO also allows the user to evaluate the function and the
gradient in the same routine. In fact, some solvers are more efficient
when both function and gradient information can be computed in the same
routine. These routines should follow the form

```
PetscErrorCode EvaluateFunctionAndGradient(Tao, Vec, PetscReal*, Vec, void*);
```

where the first argument is the TAO solver and the second argument
points to the input vector for use in evaluating the function and
gradient. The third argument should return the function value, while the
fourth argument should return the gradient vector. The fifth argument is
a pointer to a user-defined context. This context and the name of the
routine should be set with the call

```
TaoSetObjectiveAndGradient(Tao, Vec, PetscErrorCode (*)(Tao,Vec,PetscReal*,Vec,void*), void*);
```

where the arguments are the TAO application, the optional vector to be
used to hold the computed gradient, a function pointer, and a
pointer to a user-defined context.

The TAO example problems demonstrate the use of these application
contexts as well as specific instances of function, gradient, and
Hessian evaluation routines. All these routines should return the
integer $0$ after successful completion and a nonzero integer if
the function is undefined at that point or an error occurred.

(sec_tao_matrixfree)=

#### Hessian Evaluation

Some optimization routines also require a Hessian matrix from the user.
The routine that evaluates the Hessian should have the form

```
PetscErrorCode EvaluateHessian(Tao, Vec, Mat, Mat, void*);
```

where the first argument of this routine is a TAO solver object. The
second argument is the point at which the Hessian should be evaluated.
The third argument is the Hessian matrix, and the fifth argument is a
user-defined context. Since the Hessian matrix is usually used in
solving a system of linear equations, a preconditioner for the matrix is
often needed. The fourth argument is the matrix that will be used for
preconditioning the linear system; in most cases, this matrix will be
the same as the Hessian matrix.

One can set the Hessian evaluation routine by calling the

```
TaoSetHessian(Tao, Mat, Mat, PetscErrorCode (*)(Tao,Vec,Mat,Mat,void*), void*);
```

routine. The first argument is the TAO Solver object. The second and
third arguments are, respectively, the Mat object where the Hessian will
be stored and the Mat object that will be used for the preconditioning
(they may be the same). The fourth argument is the function that
evaluates the Hessian, and the fifth argument is a pointer to a
user-defined context, cast to `(void*)`.

##### Finite Differences

Finite-difference approximations can be used to compute the gradient and
the Hessian of an objective function. These approximations will slow the
solve considerably and are recommended primarily for checking the
accuracy of hand-coded gradients and Hessians. These routines are

```
TaoDefaultComputeGradient(Tao, Vec, Vec, void*);
```

and

```
TaoDefaultComputeHessian(Tao, Vec, Mat, Mat, void*);
```

respectively. They can be set by using `TaoSetGradient()` and
`TaoSetHessian()` or through the options database with the
options `-tao_fd_gradient` and `-tao_fd_hessian`, respectively.

The efficiency of the finite-difference Hessian can be improved if the
coloring of the matrix is known. If the application programmer creates a
PETSc `MatFDColoring` object, it can be applied to the
finite-difference approximation by setting the Hessian evaluation
routine to

```
TaoDefaultComputeHessianColor(Tao, Vec, Mat, Mat, void*);
```

and using the `MatFDColoring` object as the last (`void *`) argument
to `TaoSetHessian()`.

One also can use finite-difference approximations to directly check the
correctness of the gradient and/or Hessian evaluation routines. This
process can be initiated from the command line by using the special TAO
solver `tao_fd_test` together with the option `-tao_test_gradient`
or `-tao_test_hessian`.

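The idea behind such a gradient check can be sketched in plain C. This is not how TAO implements its tests; it is a minimal forward-difference comparison under stated assumptions, and the names `ObjectiveFn`, `fd_gradient_error`, and `sum_squares` are hypothetical.

```c
#include <stddef.h>

typedef double (*ObjectiveFn)(size_t n, const double *x, void *ctx);

/* Hypothetical helper (not a TAO routine): returns the largest absolute
 * difference between a hand-coded gradient g and a forward-difference
 * approximation with step h. The array x is perturbed in place and
 * restored before returning. */
static double fd_gradient_error(ObjectiveFn f, size_t n, double *x, const double *g, double h, void *ctx)
{
  double f0    = f(n, x, ctx);
  double worst = 0.0;

  for (size_t i = 0; i < n; i++) {
    double xi = x[i];
    double fd, err;

    x[i] = xi + h;                 /* perturb one coordinate */
    fd   = (f(n, x, ctx) - f0) / h;
    x[i] = xi;                     /* restore it */

    err = fd - g[i];
    if (err < 0.0) err = -err;
    if (err > worst) worst = err;
  }
  return worst;
}

/* Example objective f(x) = sum_i x[i]^2, whose gradient is 2*x. */
static double sum_squares(size_t n, const double *x, void *ctx)
{
  double s = 0.0;

  (void)ctx; /* unused in this example */
  for (size_t i = 0; i < n; i++) s += x[i] * x[i];
  return s;
}
```

The loop needs one extra function evaluation per variable, which illustrates why finite differences are recommended mainly for verification rather than for production solves.
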
##### Matrix-Free Methods

TAO fully supports matrix-free methods. The matrices specified in the
Hessian evaluation routine need not be conventional matrices; instead,
they can point to the data required to implement a particular
matrix-free method. The matrix-free variant is allowed *only* when the
linear systems are solved by an iterative method in combination with no
preconditioning (`PCNONE` or `-pc_type none`), a user-provided
preconditioner matrix, or a user-provided preconditioner shell
(`PCSHELL`). In other words, matrix-free methods cannot be used if a
direct solver is to be employed. Details about using matrix-free methods
are provided in the {doc}`/manual/index`.

:::{figure} /images/manual/taofig.svg
:name: fig_taocallbacks

Tao use of PETSc and callbacks
:::

(sec_tao_bounds)=

#### Constraints

Some optimization problems also impose constraints on the variables or
intermediate application states. The user defines these constraints through
the appropriate TAO interface functions and call-back routines where necessary.

##### Variable Bounds

The simplest type of constraint on an optimization problem puts lower or
upper bounds on the variables. Vectors that represent lower and upper
bounds for each variable can be set with the

```
TaoSetVariableBounds(Tao, Vec, Vec);
```

command. The first vector and second vector should contain the lower and
upper bounds, respectively. When no upper or lower bound exists for a
variable, the bound may be set to `PETSC_INFINITY` or `PETSC_NINFINITY`,
respectively. After the two bound vectors have been set, they may be
accessed with the command `TaoGetVariableBounds()`.

Since not all solvers recognize the presence of bound constraints on
variables, the user must be careful to select a solver that acknowledges
these bounds.

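Because a solver that ignores bounds may return a point outside them, it can be useful to verify feasibility after a solve. The plain-C sketch below shows one way to do this on ordinary arrays. It is not a TAO routine: `count_bound_violations` is a hypothetical name, and the C macro `INFINITY` stands in for the role that `PETSC_INFINITY` and `PETSC_NINFINITY` play in TAO.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not a TAO routine): counts the entries of x that
 * violate xl[i] <= x[i] <= xu[i]. An absent lower bound is marked with
 * -INFINITY and an absent upper bound with +INFINITY. */
static size_t count_bound_violations(size_t n, const double *x, const double *xl, const double *xu)
{
  size_t bad = 0;

  for (size_t i = 0; i < n; i++) {
    if (x[i] < xl[i] || x[i] > xu[i]) bad++;
  }
  return bad;
}
```

An infinite bound never triggers either comparison, so variables without a bound on one side need no special handling.
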
(sec_tao_programming)=

##### General Constraints

Some TAO algorithms also support general constraints as a linear or nonlinear
function of the optimization variables. These constraints can be imposed either
as equalities or inequalities. TAO currently does not make any distinctions
between linear and nonlinear constraints, and implements them through the
same software interfaces.

In the equality constrained case, TAO assumes that the constraints are
formulated as $c_e(x) = 0$ and requires the user to implement a call-back
routine for evaluating $c_e(x)$ at a given vector of optimization
variables,

```
PetscErrorCode EvaluateEqualityConstraints(Tao, Vec, Vec, void*);
```

As in the previous call-back routines, the first argument is the TAO solver
object. The second and third arguments are the vector of optimization variables
(input) and vector of equality constraints (output), respectively. The final
argument is a pointer to the user-defined application context, cast into
`(void*)`.

Generally constrained TAO algorithms also require a second user call-back
function to compute the constraint Jacobian matrix $\nabla_x c_e(x)$,

```
PetscErrorCode EvaluateEqualityJacobian(Tao, Vec, Mat, Mat, void*);
```

where the first and last arguments are the TAO solver object and the application
context pointer as before. The second argument is the vector of optimization
variables at which the computation takes place. The third and fourth arguments
are the constraint Jacobian and its pseudoinverse, respectively. The
pseudoinverse is optional; if it is not available, the user can simply set it
to the constraint Jacobian itself.

These call-back functions are then given to the TAO solver using the
interface functions

```
TaoSetEqualityConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao,Vec,Vec,void*), void*);
```

and

```
TaoSetJacobianEqualityRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao,Vec,Mat,Mat,void*), void*);
```

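To make the pairing of a constraint routine with its Jacobian concrete, here is a small worked example in plain C. It uses ordinary arrays instead of `Vec` and `Mat`, so it only illustrates the mathematics a pair of call-backs would have to compute; the constraint functions and the names `eval_equality_constraints` and `eval_equality_jacobian` are hypothetical.

```c
/* Hypothetical example with two variables and two equality constraints:
 *   c_0(x) = x_0^2 + x_1^2 - 2
 *   c_1(x) = x_0 - x_1
 * Both constraints vanish at x = (1, 1). */
static int eval_equality_constraints(const double *x, double *c, void *ctx)
{
  (void)ctx; /* no application data needed here */
  c[0] = x[0] * x[0] + x[1] * x[1] - 2.0;
  c[1] = x[0] - x[1];
  return 0;
}

/* Dense 2x2 Jacobian with J[i][j] = d c_i / d x_j at the point x. */
static int eval_equality_jacobian(const double *x, double J[2][2], void *ctx)
{
  (void)ctx;
  J[0][0] = 2.0 * x[0];
  J[0][1] = 2.0 * x[1];
  J[1][0] = 1.0;
  J[1][1] = -1.0;
  return 0;
}
```

Each row of the Jacobian holds the gradient of one constraint, so the matrix has as many rows as constraints and as many columns as optimization variables.
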
Inequality constraints are assumed to be formulated as $c_i(x) \leq 0$
and follow the same workflow as equality constraints using the
`TaoSetInequalityConstraintsRoutine()` and `TaoSetJacobianInequalityRoutine()`
interfaces.

Some TAO algorithms may adopt an alternative double-sided
$c_l \leq c_i(x) \leq c_u$ formulation and require the lower and upper
bounds $c_l$ and $c_u$ to be set using the
`TaoSetInequalityBounds(Tao,Vec,Vec)` interface. Please refer to the
documentation for each TAO algorithm for further details.

### Solving

Once the application and solver have been set up, the solver can be
called by using the

```
TaoSolve(Tao);
```

routine. We discuss several universal options below.

*7f296bb3SBarry Smith(sec_tao_customize)=
*7f296bb3SBarry Smith
#### Convergence

Although TAO and its solvers set default parameters that are useful for
many problems, the user may need to modify these parameters in order to
change the behavior and convergence of various algorithms.

One convergence criterion for most algorithms concerns the number of
digits of accuracy needed in the solution. In particular, the
convergence test employed by TAO attempts to stop when the error in the
constraints is less than $\epsilon_{crtol}$ and either

$$
\begin{array}{lcl}
||g(X)|| &\leq& \epsilon_{gatol}, \\
||g(X)||/|f(X)| &\leq& \epsilon_{grtol}, \quad \text{or} \\
||g(X)||/||g(X_0)|| &\leq& \epsilon_{gttol},
\end{array}
$$

where $X$ is the current approximation to the true solution
$X^*$ and $X_0$ is the initial guess. $X^*$ is
unknown, so TAO estimates $f(X) - f(X^*)$ with either the square
of the norm of the gradient or the duality gap. A relative tolerance of
$\epsilon_{frtol}=0.01$ indicates that two significant digits are
desired in the objective function. Each solver sets its own convergence
tolerances, but they can be changed by using the routine
`TaoSetTolerances()`. Another set of convergence tolerances terminates
the solver when the norm of the gradient function (or Lagrangian
function for bound-constrained problems) is sufficiently close to zero.

Other stopping criteria include a minimum trust-region radius or a
maximum number of iterations. These parameters can be set with the
routines `TaoSetTrustRegionTolerance()` and
`TaoSetMaximumIterations()`. Similarly, a maximum number of function
evaluations can be set with the routine
`TaoSetMaximumFunctionEvaluations()`. The iteration and function
evaluation limits can also be set with the command line options
`-tao_max_it` and `-tao_max_funcs`.

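The gradient-based stopping test described in this section can be written out as a short sketch. This is plain C for illustration only and is not TAO's implementation: the `Tolerances` struct and the function `gradient_test_passes` are hypothetical names, and the caller is assumed to supply the norms. The constraint check is treated as non-strict here so that an unconstrained problem with zero constraint error passes, which is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical container for the tolerances named in the text. */
typedef struct {
  double gatol; /* absolute gradient tolerance */
  double grtol; /* gradient tolerance relative to |f(X)| */
  double gttol; /* gradient tolerance relative to ||g(X_0)|| */
  double crtol; /* constraint tolerance */
} Tolerances;

static double abs_value(double v) { return v < 0.0 ? -v : v; }

/* Returns true when the constraint error is within crtol and at least one
 * of the three gradient tests holds. The ratios are written in product
 * form so that f = 0 or gnorm0 = 0 cannot cause a division by zero. */
bool gradient_test_passes(double f, double gnorm, double gnorm0, double cnorm,
                          const Tolerances *tol)
{
  if (cnorm > tol->crtol) return false;
  return gnorm <= tol->gatol
      || gnorm <= tol->grtol * abs_value(f)
      || gnorm <= tol->gttol * gnorm0;
}
```

Only one of the three gradient conditions has to hold, so setting a tolerance to zero effectively switches that particular test off in this sketch.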
#### Viewing Status

To see parameters and performance statistics for the solver, the routine

```
TaoView(Tao tao, PetscViewer viewer)
```

can be used. This routine will display to standard output the number of
function evaluations needed by the solver and other information specific
to the solver. This same output can be produced by using the command
line option `-tao_view`.

The progress of the optimization solver can be monitored with the
runtime option `-tao_monitor`. Although monitoring routines can be
customized, the default monitoring routine will print out several
relevant statistics to the screen.

The user also has access to information about the current solution. The
current iteration number, objective function value, gradient norm,
infeasibility norm, and step length can be retrieved with the following
command.

```
TaoGetSolutionStatus(Tao tao, PetscInt* iterate, PetscReal* f,
                  PetscReal* gnorm, PetscReal* cnorm, PetscReal* xdiff,
                  TaoConvergedReason* reason)
```

The last argument returns a code that indicates the reason that the
solver terminated. Positive numbers indicate that a solution has been
found, while negative numbers indicate a failure. A list of reasons can
be found in the manual page for `TaoGetConvergedReason()`.

#### Obtaining a Solution

After exiting the `TaoSolve()` function, the solution and the gradient can be
recovered with the following routines.

```
TaoGetSolution(Tao, Vec*);
TaoGetGradient(Tao, Vec*, NULL, NULL);
```

Note that the `Vec` returned by `TaoGetSolution()` will be the
same vector passed to `TaoSetSolution()`. This information can be
obtained during user-defined routines, such as a function evaluation or a
customized monitoring routine, or after the solver has terminated.

### Special Problem Structures

Certain special classes of problems solved with TAO utilize specialized
code interfaces that are described below per problem type.

(sec_tao_pde_constrained)=

#### PDE-constrained Optimization

TAO solves PDE-constrained optimization problems of the form

$$
\begin{array}{ll}
\displaystyle \min_{u,v} & f(u,v) \\
\text{subject to} & g(u,v) = 0,
\end{array}
$$

where the state variable $u$ is the solution to the discretized
partial differential equation defined by $g$ and parametrized by
the design variable $v$, and $f$ is an objective function.
The Lagrange multipliers on the constraint are denoted by $y$.
This method is set by using the linearly constrained augmented
Lagrangian TAO solver `tao_lcl`.

We make two main assumptions when solving these problems: first, the
objective function and PDE constraints have been discretized so that we
can treat the optimization problem as finite dimensional; second,
$\nabla_u g(u,v)$ is invertible for all $u$ and $v$.

Unlike other TAO solvers where the solution vector contains only the
optimization variables, PDE-constrained problems solved with `tao_lcl`
combine the design and state variables together in a monolithic solution vector
$x^T = [u^T, v^T]$. Consequently, the user must provide index sets to
separate the two,

```
TaoSetStateDesignIS(Tao, IS, IS);
```

where the first IS is a PETSc IndexSet containing the indices of the
state variables and the second IS contains the indices of the design
variables.

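What the two index sets express can be sketched without any PETSc types. The snippet below is a plain C illustration with a hypothetical helper `gather`: given the monolithic vector $x$ and a list of indices, it extracts the corresponding entries. In a PETSc application the index lists would instead be wrapped in `IS` objects and handed to `TaoSetStateDesignIS()`.

```c
#include <stddef.h>

/* Copy the entries of x selected by idx into out: out[k] = x[idx[k]].
 * Calling this once with the state indices and once with the design
 * indices splits the monolithic vector x = [u, v] into u and v. */
void gather(const double *x, const size_t *idx, size_t n, double *out)
{
  for (size_t k = 0; k < n; k++) out[k] = x[idx[k]];
}
```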
PDE constraints have the general form $g(x) = 0$,
where $g: \mathbb R^n \to \mathbb R^m$. These constraints should
be specified in a routine, written by the user, that evaluates
$g(x)$. The routine that evaluates the constraint equations
should have the form

```
PetscErrorCode EvaluateConstraints(Tao, Vec, Vec, void*);
```

The first argument of this routine is a TAO solver object. The second
argument is the variable vector at which the constraint function should
be evaluated. The third argument is the vector of function values
$g(x)$, and the fourth argument is a pointer to a user-defined
context. This routine and the user-defined context should be set in the
TAO solver with the

```
TaoSetConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao,Vec,Vec,void*), void*);
```

command. In this function, the first argument is the TAO solver object,
the second argument is a vector in which to store the constraints, the
third argument is a function pointer to the routine for evaluating the
constraints, and the fourth argument is a pointer to a user-defined
context.

The Jacobian of $g(x)$ is the matrix in
$\mathbb R^{m \times n}$ such that each column contains the
partial derivatives of $g(x)$ with respect to one variable. The
evaluation of the Jacobian of $g$ should be performed by calling
the

```
PetscErrorCode JacobianState(Tao, Vec, Mat, Mat, Mat, void*);
PetscErrorCode JacobianDesign(Tao, Vec, Mat*, void*);
```

routines. In these functions, the first argument is the TAO solver
object. The second argument is the variable vector at which to evaluate
the Jacobian matrix, the third argument is the Jacobian matrix, and the
last argument is a pointer to a user-defined context. The fourth and
fifth arguments of the Jacobian evaluation with respect to the state
variables are for providing PETSc matrix objects for the preconditioner
and for applying the inverse of the state Jacobian, respectively. This
inverse matrix may be `PETSC_NULL`, in which case TAO will use a PETSc
Krylov subspace solver to solve the state system. These evaluation
routines should be registered with TAO by using the

```
TaoSetJacobianStateRoutine(Tao, Mat, Mat, Mat,
                        PetscErrorCode (*)(Tao,Vec,Mat,Mat,Mat,void*),
                        void*);
TaoSetJacobianDesignRoutine(Tao, Mat,
                        PetscErrorCode (*)(Tao,Vec,Mat*,void*),
                        void*);
```

routines. The first argument is the TAO solver object, and the second
argument is the matrix in which the Jacobian information can be stored.
For the state Jacobian, the third argument is the matrix that will be
used for preconditioning, and the fourth argument is an optional matrix
for the inverse of the state Jacobian. One can use `PETSC_NULL` for
this inverse argument and let PETSc apply the inverse using a KSP
method, but faster results may be obtained by manipulating the structure
of the Jacobian and providing an inverse. The fifth argument is the
function pointer, and the sixth argument is an optional user-defined
context. Since no solve is performed with the design Jacobian, there is
no need to provide preconditioner or inverse matrices.

(sec_tao_evalsof)=

#### Nonlinear Least Squares

For nonlinear least squares applications, we are solving the
optimization problem

$$
\min_{x} \;\frac{1}{2}||r(x)||_2^2.
$$

For these problems, the objective function value should be computed as a
vector of residuals, $r(x)$, computed with a function of the form

```
PetscErrorCode EvaluateResidual(Tao, Vec, Vec, void*);
```

and set with the

```
TaoSetResidualRoutine(Tao, Vec, PetscErrorCode (*)(Tao,Vec,Vec,void*), void*);
```

routine. If required by the algorithm, the Jacobian of the residual,
$J = \partial r(x) / \partial x$, should be computed with a
function of the form

```
PetscErrorCode EvaluateJacobian(Tao, Vec, Mat, Mat, void*);
```

and set with the

```
TaoSetJacobianResidualRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao,Vec,Mat,Mat,void*), void *);
```

routine. The two `Mat` arguments are the Jacobian and the matrix used to
construct its preconditioner; they may be the same matrix.

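The pieces a least-squares solver works with can be sketched on a tiny made-up problem. The snippet is plain C with fixed-size arrays standing in for `Vec` and `Mat`, and all function names are hypothetical. It evaluates a residual vector $r(x)$, its Jacobian $J$, and then forms the objective $\frac{1}{2}||r(x)||_2^2$ together with its gradient $J^T r$. In a TAO application only the residual and Jacobian callbacks would be written by the user.

```c
enum { NRES = 3, NVAR = 2 };

/* Residual vector r(x) of a small illustrative problem. */
void residual(const double x[NVAR], double r[NRES])
{
  r[0] = x[0] * x[1] - 2.0;
  r[1] = x[0] - 1.0;
  r[2] = x[1] - 1.0;
}

/* Jacobian of the residual: J[i][j] = d r_i / d x_j. */
void residual_jacobian(const double x[NVAR], double J[NRES][NVAR])
{
  J[0][0] = x[1]; J[0][1] = x[0];
  J[1][0] = 1.0;  J[1][1] = 0.0;
  J[2][0] = 0.0;  J[2][1] = 1.0;
}

/* Returns f(x) = 0.5 * ||r(x)||^2 and stores its gradient J^T r in g. */
double objective_and_gradient(const double x[NVAR], double g[NVAR])
{
  double r[NRES], J[NRES][NVAR], f = 0.0;

  residual(x, r);
  residual_jacobian(x, J);
  for (int j = 0; j < NVAR; j++) g[j] = 0.0;
  for (int i = 0; i < NRES; i++) {
    f += 0.5 * r[i] * r[i];
    for (int j = 0; j < NVAR; j++) g[j] += J[i][j] * r[i];
  }
  return f;
}
```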
(sec_tao_complementary)=

#### Complementarity

Complementarity applications have equality constraints in the form of
nonlinear equations $C(X) = 0$, where
$C: \mathbb R^n \to \mathbb R^m$. These constraints should be
specified in a routine written by the user with the form

```
PetscErrorCode EqualityConstraints(Tao, Vec, Vec, void*);
```

that evaluates $C(X)$. The first argument of this routine is a TAO
solver object. The second argument is the variable vector $X$ at
which the constraint function should be evaluated. The third argument is
the output vector of function values $C(X)$, and the fourth
argument is a pointer to a user-defined context.

This routine and the user-defined context must be registered with TAO by
using the

```
TaoSetConstraintsRoutine(Tao, Vec, PetscErrorCode (*)(Tao,Vec,Vec,void*), void*);
```

command. In this command, the first argument is the TAO solver object, the
second argument is the vector in which to store the function values, the
third argument is the user-defined routine that evaluates $C(X)$,
and the fourth argument is a pointer to a user-defined context that will
be passed back to the user.

The Jacobian of the function is the matrix in
$\mathbb R^{m \times n}$ such that each column contains the
partial derivatives of $C$ with respect to one variable. The
evaluation of the Jacobian of $C$ should be performed in a routine
of the form

```
PetscErrorCode EvaluateJacobian(Tao, Vec, Mat, Mat, void*);
```

In this function, the first argument is the TAO solver object and the
second argument is the variable vector at which to evaluate the Jacobian
matrix. The third argument is the Jacobian matrix, and the fifth
argument is a pointer to a user-defined context. Since the Jacobian
matrix may be used in solving a system of linear equations, a
preconditioner for the matrix may be needed. The fourth argument is the
matrix that will be used for preconditioning the linear system; in most
cases, this matrix will be the same as the Jacobian matrix.

This routine should be specified to TAO by using the

```
TaoSetJacobianRoutine(Tao, Mat, Mat, PetscErrorCode (*)(Tao,Vec,Mat,Mat,void*), void*);
```

command. The first argument is the TAO solver object; the second and
third arguments are the Mat objects in which the Jacobian will be stored
and the Mat object that will be used for the preconditioning (they may
be the same), respectively. The fourth argument is the function pointer;
and the fifth argument is an optional user-defined context. The Jacobian
matrix should be created in a way such that the product of it and the
variable vector can be stored in the constraint vector.

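The layout of the Jacobian described above, with column $j$ holding the partial derivatives with respect to variable $j$, can be checked numerically. The following plain C sketch uses a made-up constraint function with $m = n = 2$ and compares a hand-coded Jacobian against a forward-difference approximation; all names are hypothetical and no PETSc types are involved. A comparison of this kind is one common way to catch mistakes in a user-written Jacobian routine before handing it to a solver.

```c
enum { NC = 2, NX = 2 };

/* A small illustrative constraint function C: R^2 -> R^2. */
void constraints(const double x[NX], double c[NC])
{
  c[0] = x[0] * x[0] - x[1];
  c[1] = x[0] + x[1] - 2.0;
}

/* Hand-coded Jacobian: J[i][j] = d C_i / d x_j, so column j holds the
 * partial derivatives with respect to x_j. */
void jacobian_exact(const double x[NX], double J[NC][NX])
{
  J[0][0] = 2.0 * x[0]; J[0][1] = -1.0;
  J[1][0] = 1.0;        J[1][1] = 1.0;
}

/* Forward-difference approximation, built one column at a time by
 * perturbing a single variable with step h. */
void jacobian_fd(const double x[NX], double h, double J[NC][NX])
{
  double c0[NC], c1[NC], xp[NX];

  constraints(x, c0);
  for (int j = 0; j < NX; j++) {
    for (int k = 0; k < NX; k++) xp[k] = x[k];
    xp[j] += h;
    constraints(xp, c1);
    for (int i = 0; i < NC; i++) J[i][j] = (c1[i] - c0[i]) / h;
  }
}
```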
(sec_tao_solvers)=

## TAO Algorithms

TAO includes a variety of optimization algorithms for several classes of
problems (unconstrained, bound-constrained, and PDE-constrained
minimization, nonlinear least-squares, and complementarity). The TAO
algorithms for solving these problems are detailed in this section. A
particular algorithm can be chosen by using the `TaoSetType()` function
or by using the command line argument `-tao_type <name>`. For those
interested in extending these algorithms or using new ones, please see
{any}`sec_tao_addsolver` for more information.

(sec_tao_unconstrained)=

### Unconstrained Minimization

Unconstrained minimization is used to minimize a function of many
variables without any constraints on the variables, such as bounds. The
methods available in TAO for solving these problems can be classified
according to the amount of derivative information required:

1. Function evaluation only – Nelder-Mead method (`tao_nm`)
2. Function and gradient evaluations – limited-memory, variable-metric
   method (`tao_lmvm`) and nonlinear conjugate gradient method
   (`tao_cg`)
3. Function, gradient, and Hessian evaluations – Newton-Krylov methods:
   Newton line search (`tao_nls`), Newton trust-region (`tao_ntr`),
   and Newton trust-region line-search (`tao_ntl`)

The best method to use depends on the particular problem being solved
and the accuracy required in the solution. If a Hessian evaluation
routine is available, then the Newton line search and Newton
trust-region methods will likely perform best. When a Hessian evaluation
routine is not available, then the limited-memory, variable-metric
method is likely to perform best. The Nelder-Mead method should be used
only as a last resort when no gradient information is available.

Each solver has a set of options associated with it that can be set with
command line arguments. These algorithms and the associated options are
briefly discussed in this section.

#### Newton-Krylov Methods

TAO features three Newton-Krylov algorithms, separated by their globalization methods
for unconstrained optimization: line search (NLS), trust region (NTR), and trust
region with a line search (NTL). They are available via the TAO solvers
`TAONLS`, `TAONTR` and `TAONTL`, respectively, or the `-tao_type`
`nls`/`ntr`/`ntl` flag.

##### Newton Line Search Method (NLS)

The Newton line search method solves the symmetric system of equations

$$
H_k d_k = -g_k
$$

to obtain a step $d_k$, where $H_k$ is the Hessian of the
objective function at $x_k$ and $g_k$ is the gradient of the
objective function at $x_k$. For problems where the Hessian matrix
is indefinite, the perturbed system of equations

$$
(H_k + \rho_k I) d_k = -g_k
$$

is solved to obtain the direction, where $\rho_k$ is a positive
constant. If the direction computed is not a descent direction, the
(scaled) steepest descent direction is used instead. Having obtained the
direction, a Moré-Thuente line search is applied to obtain a step
length, $\tau_k$, that approximately solves the one-dimensional
optimization problem

$$
\min_\tau f(x_k + \tau d_k).
$$

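The effect of the perturbation $\rho_k$ can be seen on a two-variable example. The sketch below is plain C and is not how TAO computes the step: TAO uses a Krylov method on the full system, whereas this illustration solves the $2 \times 2$ system $(H + \rho I) d = -g$ directly with Cramer's rule. The function names and the choice of $\rho$ are hypothetical. With an indefinite $H$ the unperturbed solution may fail the descent test $g^T d < 0$, while a sufficiently large $\rho$ restores it.

```c
#include <stdbool.h>

/* Solve (H + rho*I) d = -g for a 2x2 matrix H using Cramer's rule.
 * Returns false if the perturbed matrix is singular. */
bool perturbed_newton_step(double H[2][2], const double g[2], double rho,
                           double d[2])
{
  double a = H[0][0] + rho, b = H[0][1];
  double c = H[1][0],       e = H[1][1] + rho;
  double det = a * e - b * c;

  if (det == 0.0) return false;
  d[0] = (-g[0] * e + b * g[1]) / det;
  d[1] = (-a * g[1] + c * g[0]) / det;
  return true;
}

/* A direction d is a descent direction when g^T d < 0. */
bool is_descent(const double g[2], const double d[2])
{
  return g[0] * d[0] + g[1] * d[1] < 0.0;
}
```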
The Newton line search method can be selected by using the TAO solver
`tao_nls`. The options available for this solver are listed in
{numref}`table_nlsoptions`. For the best efficiency, function and
gradient evaluations should be performed simultaneously when using this
algorithm.

*7f296bb3SBarry Smith> ```{eval-rst}
*7f296bb3SBarry Smith> .. table:: Summary of ``nls`` options
*7f296bb3SBarry Smith>    :name: table_nlsoptions
*7f296bb3SBarry Smith>
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    | Name  ``-tao_nls_``      | Value          | Default            | Description        |
*7f296bb3SBarry Smith>    +==========================+================+====================+====================+
*7f296bb3SBarry Smith>    |          ``ksp_type``    | cg, nash,      | stcg               | KSPType for        |
*7f296bb3SBarry Smith>    |                          |                |                    | linear system      |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pc_type``     | none, jacobi   | lmvm               | PCType for linear  |
*7f296bb3SBarry Smith>    |                          |                |                    | system             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``sval``        | real           | :math:`0`          | Initial            |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``imin``        | real           | :math:`10^{-4}`    | Minimum            |
*7f296bb3SBarry Smith>    |                          |                |                    | initial            |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``imax``        | real           | :math:`100`        | Maximum            |
*7f296bb3SBarry Smith>    |                          |                |                    | initial            |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``imfac``       | real           | :math:`0.1`        | Gradient norm      |
*7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
*7f296bb3SBarry Smith>    |                          |                |                    | initializing       |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pmax``        | real           | :math:`100`        | Maximum            |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    |                          |                |                    | when               |
*7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pgfac``       | real           | :math:`10`         | Perturbation growth|
*7f296bb3SBarry Smith>    |                          |                |                    | when               |
*7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pmgfac``      | real           | :math:`0.1`        | Gradient norm      |
*7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
*7f296bb3SBarry Smith>    |                          |                |                    | increasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pmin``        | real           | :math:`10^{-12}`   | Minimum non-zero   |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    |                          |                |                    | when               |
*7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``psfac``       | real           | :math:`0.4`        | Perturbation shrink|
*7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
*7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | value              |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``pmsfac``      | real           | :math:`0.1`        | Gradient norm      |
*7f296bb3SBarry Smith>    |                          |                |                    | factor when        |
*7f296bb3SBarry Smith>    |                          |                |                    | decreasing         |
*7f296bb3SBarry Smith>    |                          |                |                    | perturbation       |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``nu1``         | real           | 0.25               | :math:`\nu_1`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``nu2``         | real           | 0.50               | :math:`\nu_2`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``nu3``         | real           | 1.00               | :math:`\nu_3`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``nu4``         | real           | 1.25               | :math:`\nu_4`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``omega1``      | real           | 0.25               | :math:`\omega_1`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``omega2``      | real           | 0.50               | :math:`\omega_2`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``omega3``      | real           | 1.00               | :math:`\omega_3`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``omega4``      | real           | 2.00               | :math:`\omega_4`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``omega5``      | real           | 4.00               | :math:`\omega_5`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in ``step``        |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``eta1``        | real           | :math:`10^{-4}`    | :math:`\eta_1`     |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``eta2``        | real           | 0.25               | :math:`\eta_2`     |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``eta3``        | real           | 0.50               | :math:`\eta_3`     |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``eta4``        | real           | 0.90               | :math:`\eta_4`     |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``alpha1``      | real           | 0.25               | :math:`\alpha_1`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``alpha2``      | real           | 0.50               | :math:`\alpha_2`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``alpha3``      | real           | 1.00               | :math:`\alpha_3`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``alpha4``      | real           | 2.00               | :math:`\alpha_4`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``alpha5``      | real           | 4.00               | :math:`\alpha_5`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``reduction``      |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``mu1``         | real           | 0.10               | :math:`\mu_1`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``mu2``         | real           | 0.50               | :math:`\mu_2`      |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``gamma1``      | real           | 0.25               | :math:`\gamma_1`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``gamma2``      | real           | 0.50               | :math:`\gamma_2`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``gamma3``      | real           | 2.00               | :math:`\gamma_3`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``gamma4``      | real           | 4.00               | :math:`\gamma_4`   |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith>    |          ``theta``       | real           | 0.05               | :math:`\theta`     |
*7f296bb3SBarry Smith>    |                          |                |                    | in                 |
*7f296bb3SBarry Smith>    |                          |                |                    | ``interpolation``  |
*7f296bb3SBarry Smith>    |                          |                |                    | update             |
*7f296bb3SBarry Smith>    +--------------------------+----------------+--------------------+--------------------+
*7f296bb3SBarry Smith> ```
*7f296bb3SBarry Smith
The system of equations is approximately solved by applying the
conjugate gradient method, Nash conjugate gradient method,
Steihaug-Toint conjugate gradient method, generalized Lanczos method, or
an alternative Krylov subspace method supplied by PETSc. The method used
to solve the systems of equations is specified with the command line
argument `-tao_nls_ksp_type <cg,nash,stcg,gltr,gmres,...>`; `stcg`
is the default. See the PETSc manual for further information on changing
the behavior of the linear system solvers.

A good preconditioner reduces the number of iterations required to solve
the linear system of equations. For the conjugate gradient methods and
generalized Lanczos method, this preconditioner must be symmetric and
positive definite. The available options are to use no preconditioner
(`none`), the absolute value of the diagonal of the Hessian matrix
(`jacobi`), a limited-memory BFGS approximation to the Hessian matrix
(`lmvm`), or one of the other preconditioners provided by the PETSc
package (for example, `icc` or `ilu`). The preconditioner is selected
with the command line argument
`-tao_nls_pc_type <none,jacobi,icc,ilu,lmvm>`. The default is the
`lmvm` preconditioner, which uses a BFGS approximation of the inverse
Hessian. See the PETSc manual for further information on changing the
behavior of the preconditioners.

The perturbation $\rho_k$ is added when the direction returned by
the Krylov subspace method is not a descent direction, the Krylov method
diverged due to an indefinite preconditioner or matrix, or a direction
of negative curvature was found. In the last two cases, if the step
returned is a descent direction, it is used during the line search.
Otherwise, a steepest descent direction is used during the line search.
The perturbation is decreased as long as the Krylov subspace method
reports success and increased if further problems are encountered. There
are three cases: initializing, increasing, and decreasing the
perturbation. These cases are described below.

1. If $\rho_k$ is zero and a problem was detected with either the
   direction or the Krylov subspace method, the perturbation is
   initialized to

   $$
   \rho_{k+1} = \text{median}\left\{\text{imin}, \text{imfac} * \|g(x_k)\|, \text{imax}\right\},
   $$

   where $g(x_k)$ is the gradient of the objective function and
   `imin` is set with the command line argument
   `-tao_nls_imin <real>` with a default value of $10^{-4}$,
   `imfac` by `-tao_nls_imfac` with a default value of 0.1, and
   `imax` by `-tao_nls_imax` with a default value of 100. When using
   the `gltr` method to solve the system of equations, an estimate of
   the minimum eigenvalue $\lambda_1$ of the Hessian matrix is
   available. This value is used to initialize the perturbation to
   $\rho_{k+1} = \max\left\{\rho_{k+1}, -\lambda_1\right\}$ in
   this case.

2. If $\rho_k$ is nonzero and a problem was detected with either
   the direction or the Krylov subspace method, the perturbation is
   increased to

   $$
   \rho_{k+1} = \min\left\{\text{pmax}, \max\left\{\text{pgfac} * \rho_k, \text{pmgfac} * \|g(x_k)\|\right\}\right\},
   $$

   where $g(x_k)$ is the gradient of the objective function and
   `pgfac` is set with the command line argument `-tao_nls_pgfac`
   with a default value of 10, `pmgfac` by `-tao_nls_pmgfac` with a
   default value of 0.1, and `pmax` by `-tao_nls_pmax` with a
   default value of 100.

3. If $\rho_k$ is nonzero and no problems were detected with
   either the direction or the Krylov subspace method, the perturbation
   is decreased to

   $$
   \rho_{k+1} = \min\left\{\text{psfac} * \rho_k, \text{pmsfac} * \|g(x_k)\|\right\},
   $$

   where $g(x_k)$ is the gradient of the objective function,
   `psfac` is set with the command line argument `-tao_nls_psfac`
   with a default value of 0.4, and `pmsfac` is set by
   `-tao_nls_pmsfac` with a default value of 0.1. Moreover, if
   $\rho_{k+1} < \text{pmin}$, then $\rho_{k+1} = 0$, where
   `pmin` is set with the command line argument `-tao_nls_pmin` and
   has a default value of $10^{-12}$.
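
The three cases above can be sketched in plain C as follows. This is an
illustrative sketch only, not TAO's implementation: the struct
`PertParams` and all function names are hypothetical, the defaults are
copied from the text, the `gltr` eigenvalue adjustment in case 1 is
omitted, and leaving a zero perturbation at zero when no problem is
detected is an assumption.

```c
/* Hypothetical container for the perturbation parameters; field names
   mirror the option names in the text. */
typedef struct {
  double imin, imfac, imax;   /* initialization (case 1) */
  double pgfac, pmgfac, pmax; /* increase (case 2) */
  double psfac, pmsfac, pmin; /* decrease (case 3) */
} PertParams;

/* Default values as documented in the text. */
const PertParams pert_defaults = {1.0e-4, 0.1, 100.0, 10.0, 0.1, 100.0, 0.4, 0.1, 1.0e-12};

double pert_min(double a, double b) { return a < b ? a : b; }
double pert_max(double a, double b) { return a > b ? a : b; }

/* Median of three values. */
double pert_median3(double a, double b, double c)
{
  return pert_max(pert_min(a, b), pert_min(pert_max(a, b), c));
}

/* Return rho_{k+1} from rho_k, the gradient norm ||g(x_k)||, and a flag
   saying whether a problem was detected with the direction or the
   Krylov subspace method. */
double update_perturbation(const PertParams *p, double rho, double gnorm, int problem)
{
  if (problem) {
    if (rho == 0.0) return pert_median3(p->imin, p->imfac * gnorm, p->imax); /* case 1 */
    return pert_min(p->pmax, pert_max(p->pgfac * rho, p->pmgfac * gnorm));   /* case 2 */
  }
  if (rho == 0.0) return 0.0; /* assumption: nothing to decrease */
  rho = pert_min(p->psfac * rho, p->pmsfac * gnorm);                         /* case 3 */
  return rho < p->pmin ? 0.0 : rho;
}
```

For example, with the defaults and $\|g(x_k)\| = 50$, a first failure
initializes the perturbation to $\text{median}\{10^{-4}, 5, 100\} = 5$, a
second failure raises it to $\min\{100, \max\{50, 5\}\} = 50$, and a
subsequent success lowers it to $\min\{20, 5\} = 5$.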

Near a local minimizer of the unconstrained optimization problem, the
Hessian matrix will be positive-semidefinite; the perturbation will
shrink toward zero, and one would eventually observe a superlinear
convergence rate.

When using `nash`, `stcg`, or `gltr` to solve the linear systems
of equations, a trust-region radius needs to be initialized and updated.
This trust-region radius simultaneously limits the size of the step
computed and reduces the number of iterations of the conjugate gradient
method. The method for initializing the trust-region radius is set with
the command line argument
`-tao_nls_init_type <constant,direction,interpolation>`;
`interpolation`, which chooses an initial value based on the
interpolation scheme found in {cite}`cgt`, is the default.
This scheme performs a number of function and gradient evaluations to
determine a radius such that the reduction predicted by the quadratic
model along the gradient direction coincides with the actual reduction
in the nonlinear function. The iterate obtaining the best objective
function value is used as the starting point for the main line search
algorithm. The `constant` method initializes the trust-region radius
by using the value specified with the `-tao_trust0 <real>` command
line argument, where the default value is 100. The `direction`
technique solves the first quadratic optimization problem by using a
standard conjugate gradient method and initializes the trust region to
$\|s_0\|$.

The method for updating the trust-region radius is set with the command
line argument `-tao_nls_update_type <step,reduction,interpolation>`;
`step` is the default. The `step` method updates the trust-region
radius based on the value of $\tau_k$. In particular,

$$
\Delta_{k+1} = \left\{\begin{array}{ll}
\omega_1 \min(\Delta_k, \|d_k\|) & \text{if } \tau_k \in [0, \nu_1) \\
\omega_2 \min(\Delta_k, \|d_k\|) & \text{if } \tau_k \in [\nu_1, \nu_2) \\
\omega_3 \Delta_k & \text{if } \tau_k \in [\nu_2, \nu_3) \\
\max(\Delta_k, \omega_4 \|d_k\|) & \text{if } \tau_k \in [\nu_3, \nu_4) \\
\max(\Delta_k, \omega_5 \|d_k\|) & \text{if } \tau_k \in [\nu_4, \infty),
\end{array}
\right.
$$

where
$0 < \omega_1 < \omega_2 < \omega_3 = 1 < \omega_4 < \omega_5$ and
$0 < \nu_1 < \nu_2 < \nu_3 < \nu_4$ are constants. The
`reduction` method computes the ratio of the actual reduction in the
objective function to the reduction predicted by the quadratic model for
the full step,
$\kappa_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(x_k) - q_k(x_k + d_k)}$,
where $q_k$ is the quadratic model. The radius is then updated as

$$
\Delta_{k+1} = \left\{\begin{array}{ll}
\alpha_1 \min(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in (-\infty, \eta_1) \\
\alpha_2 \min(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in [\eta_1, \eta_2) \\
\alpha_3 \Delta_k & \text{if } \kappa_k \in [\eta_2, \eta_3) \\
\max(\Delta_k, \alpha_4 \|d_k\|) & \text{if } \kappa_k \in [\eta_3, \eta_4) \\
\max(\Delta_k, \alpha_5 \|d_k\|) & \text{if } \kappa_k \in [\eta_4, \infty),
\end{array}
\right.
$$

where
$0 < \alpha_1 < \alpha_2 < \alpha_3 = 1 < \alpha_4 < \alpha_5$ and
$0 < \eta_1 < \eta_2 < \eta_3 < \eta_4$ are constants. The
`interpolation` method uses the same interpolation mechanism as in the
initialization to compute a new value for the trust-region radius.
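
The `step` and `reduction` rules share the same five-interval structure
and differ only in the ratio tested and in the constants. The following
C sketch makes that explicit. It is an illustration rather than TAO's
code: the function and array names are hypothetical, and the constants
are the defaults listed in the `nls` options table.

```c
/* Defaults from the nls options table: breakpoints and scaling factors. */
const double step_nu[4]    = {0.25, 0.50, 1.00, 1.25};       /* nu1..nu4 */
const double step_omega[5] = {0.25, 0.50, 1.00, 2.00, 4.00}; /* omega1..omega5 */
const double red_eta[4]    = {1.0e-4, 0.25, 0.50, 0.90};     /* eta1..eta4 */
const double red_alpha[5]  = {0.25, 0.50, 1.00, 2.00, 4.00}; /* alpha1..alpha5 */

/* Five-interval radius update.
   delta : current radius Delta_k
   dnorm : norm of the step, ||d_k||
   ratio : tau_k for the step rule, kappa_k for the reduction rule
   t     : the four breakpoints (nu or eta)
   w     : the five scaling factors (omega or alpha) */
double tr_update_radius(double delta, double dnorm, double ratio,
                        const double t[4], const double w[5])
{
  double smaller = delta < dnorm ? delta : dnorm;
  double grow4   = w[3] * dnorm;
  double grow5   = w[4] * dnorm;

  if (ratio < t[0]) return w[0] * smaller;
  if (ratio < t[1]) return w[1] * smaller;
  if (ratio < t[2]) return w[2] * delta;
  if (ratio < t[3]) return delta > grow4 ? delta : grow4;
  return delta > grow5 ? delta : grow5;
}
```

With $\Delta_k = 1$ and $\|d_k\| = 0.5$, a step length $\tau_k = 0.1$
shrinks the radius to $0.25 \cdot 0.5 = 0.125$ under the `step` rule,
while a ratio $\kappa_k = 0.95$ grows it to $\max(1, 4 \cdot 0.5) = 2$
under the `reduction` rule.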

This algorithm will be deprecated in the next version and replaced by
the Bounded Newton Line Search (BNLS) algorithm, which can solve both
bound-constrained and unconstrained problems.

##### Newton Trust-Region Method (NTR)

The Newton trust-region method solves the constrained quadratic
programming problem

$$
\begin{array}{ll}
\min_d  & \frac{1}{2}d^T H_k d  + g_k^T d \\
\text{subject to} & \|d\| \leq \Delta_k
\end{array}
$$

to obtain a direction $d_k$, where $H_k$ is the Hessian of
the objective function at $x_k$, $g_k$ is the gradient of
the objective function at $x_k$, and $\Delta_k$ is the
trust-region radius. If $x_k + d_k$ sufficiently reduces the
nonlinear objective function, then the step is accepted, and the
trust-region radius is updated. However, if $x_k + d_k$ does not
sufficiently reduce the nonlinear objective function, then the step is
rejected, the trust-region radius is reduced, and the quadratic program
is re-solved by using the updated trust-region radius. The Newton
trust-region method can be set by using the TAO solver `tao_ntr`. The
options available for this solver are listed in
{numref}`table_ntroptions`. For the best efficiency, function and
gradient evaluations should be performed separately when using this
algorithm.
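
The accept/reject logic described above can be illustrated on a
one-dimensional problem, where the trust-region subproblem has a
closed-form solution. The sketch below is not TAO's implementation: all
names are hypothetical, and the acceptance threshold ($10^{-4}$) and the
simple shrink-by-4/grow-by-2 radius rule are assumptions chosen for
brevity in place of the `step`, `reduction`, and `interpolation` rules.

```c
/* Example objective f(x) = (x - 3)^2 with its first and second
   derivatives (illustrative only). */
double quad_f(double x)   { return (x - 3.0) * (x - 3.0); }
double quad_df(double x)  { return 2.0 * (x - 3.0); }
double quad_d2f(double x) { (void)x; return 2.0; }

/* Solve min_d 0.5*h*d^2 + g*d subject to |d| <= delta in one dimension. */
double ntr_step_1d(double g, double h, double delta)
{
  if (h > 0.0) {
    double d = -g / h; /* unconstrained Newton step */
    if (d > delta) return delta;
    if (d < -delta) return -delta;
    return d;
  }
  /* Nonpositive curvature: the model is minimized on the boundary. */
  return g > 0.0 ? -delta : delta;
}

/* Basic trust-region iteration: accept the step if the actual reduction
   is a sufficient fraction of the predicted reduction; otherwise shrink
   the radius and re-solve from the same point. */
double ntr_minimize_1d(double (*f)(double), double (*df)(double),
                       double (*d2f)(double), double x, double delta, int maxit)
{
  const double accept = 1.0e-4; /* assumed sufficient-decrease threshold */
  int k;

  for (k = 0; k < maxit; k++) {
    double g = df(x), h = d2f(x), d, pred, ared;

    if (g > -1.0e-12 && g < 1.0e-12) break; /* gradient is (nearly) zero */
    d    = ntr_step_1d(g, h, delta);
    pred = -(0.5 * h * d * d + g * d);      /* reduction predicted by the model */
    ared = f(x) - f(x + d);                 /* actual reduction */
    if (pred > 0.0 && ared >= accept * pred) {
      x += d;                                        /* accept the step */
      if (d >= delta || d <= -delta) delta *= 2.0;   /* step hit the boundary */
    } else {
      delta *= 0.25;                                 /* reject and shrink */
    }
  }
  return x;
}
```

Starting from $x_0 = 0$ with $\Delta_0 = 1$, the first step is clipped to
the trust-region boundary ($d_0 = 1$), the radius doubles, and the second
step reaches the minimizer $x = 3$.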
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith> ```{eval-rst}
*7f296bb3SBarry Smith> .. table:: Summary of ``ntr`` options
*7f296bb3SBarry Smith>    :name: table_ntroptions
*7f296bb3SBarry Smith>
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    | Name ``-tao_ntr_``        | Value          | Default          | Description          |
*7f296bb3SBarry Smith>    +===========================+================+==================+======================+
*7f296bb3SBarry Smith>    | ``ksp_type``              | nash, stcg     | stcg             | KSPType for          |
*7f296bb3SBarry Smith>    |                           |                |                  | linear system        |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    | ``pc_type``               | none, jacobi   | lmvm             | PCType for linear    |
*7f296bb3SBarry Smith>    |                           |                |                  | system               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``init_type``    | constant,      | interpolation    | Radius               |
*7f296bb3SBarry Smith>    |                           | direction,     |                  | initialization       |
*7f296bb3SBarry Smith>    |                           | interpolation  |                  | method               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``mu1_i``        | real           | 0.35             | :math:`\mu_1`        |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``mu2_i``        | real           | 0.50             | :math:`\mu_2`        |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma1_i``     | real           | 0.0625           | :math:`\gamma_1`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma2_i``     | real           | 0.50             | :math:`\gamma_2`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma3_i``     | real           | 2.00             | :math:`\gamma_3`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma4_i``     | real           | 5.00             | :math:`\gamma_4`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``theta_i``      | real           | 0.25             | :math:`\theta`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | init                 |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``update_type``  | step,          | step             | Radius               |
*7f296bb3SBarry Smith>    |                           | reduction,     |                  | update method        |
*7f296bb3SBarry Smith>    |                           | interpolation  |                  |                      |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``eta1``         | real           | :                | :math:`\eta_1`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``eta2``         | real           | 0.25             | :math:`\eta_2`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``eta3``         | real           | 0.50             | :math:`\eta_3`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``eta4``         | real           | 0.90             | :math:`\eta_4`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``alpha1``       | real           | 0.25             | :math:`\alpha_1`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``alpha2``       | real           | 0.50             | :math:`\alpha_2`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``alpha3``       | real           | 1.00             | :math:`\alpha_3`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``alpha4``       | real           | 2.00             | :math:`\alpha_4`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``alpha5``       | real           | 4.00             | :math:`\alpha_5`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in ``reduction``     |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``mu1``          | real           | 0.10             | :math:`\mu_1`        |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``mu2``          | real           | 0.50             | :math:`\mu_2`        |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma1``       | real           | 0.25             | :math:`\gamma_1`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma2``       | real           | 0.50             | :math:`\gamma_2`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma3``       | real           | 2.00             | :math:`\gamma_3`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``gamma4``       | real           | 4.00             | :math:`\gamma_4`     |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith>    |          ``theta``        | real           | 0.05             | :math:`\theta`       |
*7f296bb3SBarry Smith>    |                           |                |                  | in                   |
*7f296bb3SBarry Smith>    |                           |                |                  | ``interpolation``    |
*7f296bb3SBarry Smith>    |                           |                |                  | update               |
*7f296bb3SBarry Smith>    +---------------------------+----------------+------------------+----------------------+
*7f296bb3SBarry Smith> ```
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe quadratic optimization problem is approximately solved by applying
*7f296bb3SBarry Smiththe Nash or Steihaug-Toint conjugate gradient methods or the generalized
*7f296bb3SBarry SmithLanczos method to the symmetric system of equations
*7f296bb3SBarry Smith$H_k d = -g_k$. The method used to solve the system of equations
*7f296bb3SBarry Smithis specified with the command line argument
*7f296bb3SBarry Smith`-tao_ntr_ksp_type <nash,stcg,gltr>`; `stcg` is the default. See the
*7f296bb3SBarry SmithPETSc manual for further information on changing the behavior of these
*7f296bb3SBarry Smithlinear system solvers.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithA good preconditioner reduces the number of iterations required to
*7f296bb3SBarry Smithcompute the direction. For the Nash and Steihaug-Toint conjugate
*7f296bb3SBarry Smithgradient methods and generalized Lanczos method, this preconditioner
must be symmetric and positive definite. The available options are to
use no preconditioner (`none`), the absolute value of the diagonal of the
Hessian matrix (`jacobi`), a limited-memory BFGS approximation to the
Hessian matrix (`lmvm`), or one of the other preconditioners provided by
the PETSc package, such as `icc` or `ilu`. The preconditioner is selected
with the command line argument
`-tao_ntr_pc_type <none,jacobi,icc,ilu,lmvm>`. The
default is the `lmvm` preconditioner. See the PETSc manual for further
*7f296bb3SBarry Smithinformation on changing the behavior of the preconditioners.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe method for computing an initial trust-region radius is set with the
command line argument
*7f296bb3SBarry Smith`-tao_ntr_init_type <constant,direction,interpolation>`;
*7f296bb3SBarry Smith`interpolation`, which chooses an initial value based on the
*7f296bb3SBarry Smithinterpolation scheme found in {cite}`cgt`, is the default.
*7f296bb3SBarry SmithThis scheme performs a number of function and gradient evaluations to
*7f296bb3SBarry Smithdetermine a radius such that the reduction predicted by the quadratic
*7f296bb3SBarry Smithmodel along the gradient direction coincides with the actual reduction
*7f296bb3SBarry Smithin the nonlinear function. The iterate obtaining the best objective
*7f296bb3SBarry Smithfunction value is used as the starting point for the main trust-region
*7f296bb3SBarry Smithalgorithm. The `constant` method initializes the trust-region radius
*7f296bb3SBarry Smithby using the value specified with the `-tao_trust0 <real>` command
*7f296bb3SBarry Smithline argument, where the default value is 100. The `direction`
*7f296bb3SBarry Smithtechnique solves the first quadratic optimization problem by using a
*7f296bb3SBarry Smithstandard conjugate gradient method and initializes the trust region to
*7f296bb3SBarry Smith$\|s_0\|$.
*7f296bb3SBarry Smith
The method for updating the trust-region radius is set with the command
line argument `-tao_ntr_update_type <reduction,interpolation>`;
`reduction` is the default. The `reduction` method computes the
ratio of the actual reduction in the objective function to the reduction
predicted by the quadratic model for the full step,
$\kappa_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(x_k) - q_k(x_k + d_k)}$,
where $q_k$ is the quadratic model. The radius is then updated as
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
*7f296bb3SBarry Smith\alpha_1 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in (-\infty, \eta_1) \\
*7f296bb3SBarry Smith\alpha_2 \text{min}(\Delta_k, \|d_k\|) & \text{if } \kappa_k \in [\eta_1, \eta_2) \\
*7f296bb3SBarry Smith\alpha_3 \Delta_k & \text{if } \kappa_k \in [\eta_2, \eta_3) \\
*7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_4 \|d_k\|) & \text{if } \kappa_k \in [\eta_3, \eta_4) \\
*7f296bb3SBarry Smith\text{max}(\Delta_k, \alpha_5 \|d_k\|) & \text{if } \kappa_k \in [\eta_4, \infty),
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith\right.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere
*7f296bb3SBarry Smith$0 < \alpha_1 < \alpha_2 < \alpha_3 = 1 < \alpha_4 < \alpha_5$ and
*7f296bb3SBarry Smith$0 < \eta_1 < \eta_2 < \eta_3 < \eta_4$ are constants. The
*7f296bb3SBarry Smith`interpolation` method uses the same interpolation mechanism as in the
*7f296bb3SBarry Smithinitialization to compute a new value for the trust-region radius.
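
The `reduction` update rule above can be sketched in plain C as follows. This is an illustrative sketch rather than the TAO source: the struct `TRReductionParams` and the functions `tr_reduction_ratio` and `tr_update_radius` are hypothetical names introduced here, and the thresholds and scale factors are supplied by the caller.

```c
#include <assert.h>

/* Thresholds and scale factors for the reduction-based update.
   eta[0..3] hold eta_1..eta_4; alpha[0..4] hold alpha_1..alpha_5.
   (Hypothetical container, not a TAO data structure.) */
typedef struct {
  double eta[4];
  double alpha[5];
} TRReductionParams;

static double min2(double a, double b) { return a < b ? a : b; }
static double max2(double a, double b) { return a > b ? a : b; }

/* kappa_k: actual reduction divided by the reduction predicted by the model. */
double tr_reduction_ratio(double f_old, double f_new, double q_old, double q_new)
{
  return (f_old - f_new) / (q_old - q_new);
}

/* Returns Delta_{k+1} given kappa_k, the current radius Delta_k, and ||d_k||. */
double tr_update_radius(const TRReductionParams *p, double kappa, double delta, double dnorm)
{
  assert(delta > 0.0 && dnorm >= 0.0);
  if (kappa < p->eta[0]) return p->alpha[0] * min2(delta, dnorm); /* (-inf, eta_1) */
  if (kappa < p->eta[1]) return p->alpha[1] * min2(delta, dnorm); /* [eta_1, eta_2) */
  if (kappa < p->eta[2]) return p->alpha[2] * delta;              /* [eta_2, eta_3) */
  if (kappa < p->eta[3]) return max2(delta, p->alpha[3] * dnorm); /* [eta_3, eta_4) */
  return max2(delta, p->alpha[4] * dnorm);                        /* [eta_4, inf)   */
}
```

For instance, with $\alpha_5 = 4$, $\eta_4 = 0.9$, $\Delta_k = 1$, $\|d_k\| = 0.5$, and $\kappa_k = 0.95$, the last branch gives $\Delta_{k+1} = \max(1, 4 \cdot 0.5) = 2$.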
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
*7f296bb3SBarry Smiththe Bounded Newton Trust Region (BNTR) algorithm that can solve both
*7f296bb3SBarry Smithbound constrained and unconstrained problems.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Newton Trust Region with Line Search (NTL)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithNTL safeguards the trust-region globalization such that a line search
*7f296bb3SBarry Smithis used in the event that the step is initially rejected by the
*7f296bb3SBarry Smithpredicted versus actual decrease comparison. If the line search fails to
*7f296bb3SBarry Smithfind a viable step length for the Newton step, it falls back onto a
*7f296bb3SBarry Smithscaled gradient or a gradient descent step. The trust radius is then
*7f296bb3SBarry Smithmodified based on the line search step length.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
*7f296bb3SBarry Smiththe Bounded Newton Trust Region with Line Search (BNTL) algorithm that
*7f296bb3SBarry Smithcan solve both bound constrained and unconstrained problems.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Limited-Memory Variable-Metric Method (LMVM)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe limited-memory, variable-metric method (LMVM) computes a positive definite
*7f296bb3SBarry Smithapproximation to the Hessian matrix from a limited number of previous
*7f296bb3SBarry Smithiterates and gradient evaluations. A direction is then obtained by
*7f296bb3SBarry Smithsolving the system of equations
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithH_k d_k = -\nabla f(x_k),
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $H_k$ is the Hessian approximation obtained by using the
*7f296bb3SBarry SmithBFGS update formula. The inverse of $H_k$ can readily be applied
*7f296bb3SBarry Smithto obtain the direction $d_k$. Having obtained the direction, a
*7f296bb3SBarry SmithMoré-Thuente line search is applied to compute a step length,
*7f296bb3SBarry Smith$\tau_k$, that approximately solves the one-dimensional
*7f296bb3SBarry Smithoptimization problem
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\min_\tau f(x_k + \tau d_k).
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe current iterate and Hessian approximation are updated, and the
*7f296bb3SBarry Smithprocess is repeated until the method converges. This algorithm is the
*7f296bb3SBarry Smithdefault unconstrained minimization solver and can be selected by using
*7f296bb3SBarry Smiththe TAO solver `tao_lmvm`. For best efficiency, function and gradient
*7f296bb3SBarry Smithevaluations should be performed simultaneously when using this
*7f296bb3SBarry Smithalgorithm.
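
To illustrate how the inverse of a limited-memory BFGS approximation can be applied without forming a matrix, the following sketch uses the standard two-loop recursion. It is a textbook formulation offered under stated assumptions, not the `MATLMVM` implementation: the function `lbfgs_direction`, the cap `LMVM_MAX_PAIRS`, and the choice of scalar initial scaling from the newest pair are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define LMVM_MAX_PAIRS 32 /* hypothetical cap on stored pairs for this sketch */

static double dot(size_t n, const double *a, const double *b)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++) sum += a[i] * b[i];
  return sum;
}

/* Computes d = -(inverse approximation) * g from m stored pairs
   s[j] = x_{j+1} - x_j and y[j] = g_{j+1} - g_j, ordered oldest first. */
void lbfgs_direction(size_t n, size_t m, const double *const *s, const double *const *y,
                     const double *g, double *d)
{
  double alpha[LMVM_MAX_PAIRS], rho[LMVM_MAX_PAIRS];

  assert(m <= LMVM_MAX_PAIRS);
  for (size_t i = 0; i < n; i++) d[i] = g[i];

  /* First loop: newest pair to oldest. */
  for (size_t j = m; j-- > 0;) {
    rho[j]   = 1.0 / dot(n, y[j], s[j]);
    alpha[j] = rho[j] * dot(n, s[j], d);
    for (size_t i = 0; i < n; i++) d[i] -= alpha[j] * y[j][i];
  }

  /* Scalar initial approximation taken from the newest pair (assumed choice). */
  if (m > 0) {
    double gamma = dot(n, s[m - 1], y[m - 1]) / dot(n, y[m - 1], y[m - 1]);
    for (size_t i = 0; i < n; i++) d[i] *= gamma;
  }

  /* Second loop: oldest pair to newest. */
  for (size_t j = 0; j < m; j++) {
    double beta = rho[j] * dot(n, y[j], d);
    for (size_t i = 0; i < n; i++) d[i] += (alpha[j] - beta) * s[j][i];
  }

  /* Flip the sign to obtain a descent direction. */
  for (size_t i = 0; i < n; i++) d[i] = -d[i];
}
```

With no stored pairs (`m == 0`) the routine returns the steepest-descent direction $-\nabla f(x_k)$; the recursion needs only $O(mn)$ work and storage per application.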
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe primary factors determining the behavior of this algorithm are the
*7f296bb3SBarry Smithtype of Hessian approximation used, the number of vectors stored for the
*7f296bb3SBarry Smithapproximation and the initialization/scaling of the approximation. These
*7f296bb3SBarry Smithoptions can be configured using the `-tao_lmvm_mat_lmvm` prefix. For
*7f296bb3SBarry Smithfurther detail, we refer the reader to the `MATLMVM` matrix type
*7f296bb3SBarry Smithdefinitions in the PETSc Manual.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe LMVM algorithm also allows the user to define a custom initial
*7f296bb3SBarry SmithHessian matrix $H_{0,k}$ through the interface function
*7f296bb3SBarry Smith`TaoLMVMSetH0()`. This user-provided initialization overrides any
*7f296bb3SBarry Smithother scalar or diagonal initialization inherent to the LMVM
*7f296bb3SBarry Smithapproximation. The provided $H_{0,k}$ must be a PETSc `Mat` type
*7f296bb3SBarry Smithobject that represents a positive-definite matrix. The approximation
*7f296bb3SBarry Smithprefers `MatSolve()` if the provided matrix has `MATOP_SOLVE`
*7f296bb3SBarry Smithimplemented. Otherwise, `MatMult()` is used in a KSP solve to perform
*7f296bb3SBarry Smiththe inversion of the user-provided initial Hessian.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIn applications where `TaoSolve()` on the LMVM algorithm is repeatedly
called to solve similar or related problems, the `-tao_lmvm_recycle` flag
*7f296bb3SBarry Smithcan be used to prevent resetting the LMVM approximation between
*7f296bb3SBarry Smithsubsequent solutions. This recycling also avoids one extra function and
*7f296bb3SBarry Smithgradient evaluation, instead re-using the values already computed at the
*7f296bb3SBarry Smithend of the previous solution.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
*7f296bb3SBarry Smiththe Bounded Quasi-Newton Line Search (BQNLS) algorithm that can solve
*7f296bb3SBarry Smithboth bound constrained and unconstrained problems.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Nonlinear Conjugate Gradient Method (CG)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe nonlinear conjugate gradient method can be viewed as an extension of
*7f296bb3SBarry Smiththe conjugate gradient method for solving symmetric, positive-definite
*7f296bb3SBarry Smithlinear systems of equations. This algorithm requires only function and
*7f296bb3SBarry Smithgradient evaluations as well as a line search. The TAO implementation
*7f296bb3SBarry Smithuses a Moré-Thuente line search to obtain the step length. The nonlinear
*7f296bb3SBarry Smithconjugate gradient method can be selected by using the TAO solver
*7f296bb3SBarry Smith`tao_cg`. For the best efficiency, function and gradient evaluations
*7f296bb3SBarry Smithshould be performed simultaneously when using this algorithm.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithFive variations are currently supported by the TAO implementation: the
Fletcher-Reeves method, the Polak-Ribière method, the Polak-Ribière-Plus
*7f296bb3SBarry Smithmethod {cite}`nocedal2006numerical`, the Hestenes-Stiefel method, and the
*7f296bb3SBarry SmithDai-Yuan method. These conjugate gradient methods can be specified by
*7f296bb3SBarry Smithusing the command line argument `-tao_cg_type <fr,pr,prp,hs,dy>`,
*7f296bb3SBarry Smithrespectively. The default value is `prp`.
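
The five variants differ only in the scalar $\beta$ used in the direction update $d_{k+1} = -g_{k+1} + \beta d_k$. The sketch below uses the standard textbook formulas for each variant; the enum `CGType` and the function `cg_beta` are hypothetical names for this example, and the TAO implementation may differ in details such as safeguards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels mirroring the fr, pr, prp, hs, and dy option values. */
typedef enum { CG_FR, CG_PR, CG_PRP, CG_HS, CG_DY } CGType;

/* Returns beta for the update d_new = -g_new + beta * d_old. */
double cg_beta(CGType type, size_t n, const double *g_new, const double *g_old,
               const double *d_old)
{
  double gg_new = 0.0, gg_old = 0.0, g_y = 0.0, d_y = 0.0;

  for (size_t i = 0; i < n; i++) {
    double yi = g_new[i] - g_old[i]; /* gradient difference y = g_new - g_old */
    gg_new += g_new[i] * g_new[i];
    gg_old += g_old[i] * g_old[i];
    g_y    += g_new[i] * yi;
    d_y    += d_old[i] * yi;
  }

  switch (type) {
  case CG_FR: return gg_new / gg_old;            /* Fletcher-Reeves */
  case CG_PR: return g_y / gg_old;               /* Polak-Ribiere */
  case CG_PRP: {                                 /* Polak-Ribiere-Plus */
    double b = g_y / gg_old;
    return b > 0.0 ? b : 0.0;
  }
  case CG_HS: return g_y / d_y;                  /* Hestenes-Stiefel */
  case CG_DY: return gg_new / d_y;               /* Dai-Yuan */
  }
  assert(0 && "unknown CG type");
  return 0.0;
}
```

The Polak-Ribière-Plus variant truncates a negative $\beta$ to zero, which amounts to a steepest-descent restart for that iteration.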
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe conjugate gradient method incorporates automatic restarts when
*7f296bb3SBarry Smithsuccessive gradients are not sufficiently orthogonal. TAO measures the
*7f296bb3SBarry Smithorthogonality by dividing the inner product of the gradient at the
*7f296bb3SBarry Smithcurrent point and the gradient at the previous point by the square of
*7f296bb3SBarry Smiththe Euclidean norm of the gradient at the current point. When the
*7f296bb3SBarry Smithabsolute value of this ratio is greater than $\eta$, the algorithm
*7f296bb3SBarry Smithrestarts using the gradient direction. The parameter $\eta$ can be
*7f296bb3SBarry Smithset by using the command line argument `-tao_cg_eta <real>`; 0.1 is
*7f296bb3SBarry Smiththe default value.
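
The restart test described above is small enough to write out directly. The function name `cg_should_restart` is hypothetical; the sketch only encodes the ratio test as stated in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when |g_new . g_old| / ||g_new||^2 exceeds eta, that is, when
   successive gradients are not sufficiently orthogonal; returns 0 otherwise. */
int cg_should_restart(size_t n, const double *g_new, const double *g_old, double eta)
{
  double inner = 0.0, norm2 = 0.0;

  for (size_t i = 0; i < n; i++) {
    inner += g_new[i] * g_old[i];
    norm2 += g_new[i] * g_new[i];
  }
  assert(norm2 > 0.0); /* a zero gradient means the solver has already converged */

  if (inner < 0.0) inner = -inner;
  return inner / norm2 > eta;
}
```

For example, with $g_{k+1} = (1, 1)$ and $g_k = (1, 0)$ the ratio is $1/2$, which exceeds the default $\eta = 0.1$, so the direction would be reset to the negative gradient.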
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis algorithm will be deprecated in the next version and replaced by
*7f296bb3SBarry Smiththe Bounded Nonlinear Conjugate Gradient (BNCG) algorithm that can solve
*7f296bb3SBarry Smithboth bound constrained and unconstrained problems.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Nelder-Mead Simplex Method (NM)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe Nelder-Mead algorithm {cite}`nelder.mead:simplex` is a
*7f296bb3SBarry Smithdirect search method for finding a local minimum of a function
*7f296bb3SBarry Smith$f(x)$. This algorithm does not require any gradient or Hessian
*7f296bb3SBarry Smithinformation of $f$ and therefore has some expected advantages and
*7f296bb3SBarry Smithdisadvantages compared to the other TAO solvers. The obvious advantage
*7f296bb3SBarry Smithis that it is easier to write an application when no derivatives need to
*7f296bb3SBarry Smithbe calculated. The downside is that this algorithm can be slow to
*7f296bb3SBarry Smithconverge or can even stagnate, and it performs poorly for large numbers
*7f296bb3SBarry Smithof variables.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis solver keeps a set of $N+1$ sorted vectors
$\{x_1,x_2,\ldots,x_{N+1}\}$ and their corresponding objective
*7f296bb3SBarry Smithfunction values $f_1 \leq f_2 \leq \ldots \leq f_{N+1}$. At each
*7f296bb3SBarry Smithiteration, $x_{N+1}$ is removed from the set and replaced with
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithx(\mu) = (1+\mu) \frac{1}{N} \sum_{i=1}^N x_i - \mu x_{N+1},
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $\mu$ can be one of
$\{\mu_0,2\mu_0,\frac{1}{2}\mu_0,-\frac{1}{2}\mu_0\}$ depending on
*7f296bb3SBarry Smiththe values of each possible $f(x(\mu))$.
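
The trial point $x(\mu)$ can be computed as in the following sketch. The function `nm_trial_point` and the row-major storage layout of the simplex are assumptions made for this example, not a description of TAO's internal data structures.

```c
#include <assert.h>
#include <stddef.h>

/* simplex holds N+1 vertices of length N in row-major order, sorted by
   objective value so that the last row is the worst vertex x_{N+1}.
   Writes x(mu) = (1 + mu) * centroid - mu * x_{N+1} into x_trial, where the
   centroid averages the N best vertices. */
void nm_trial_point(size_t N, const double *simplex, double mu, double *x_trial)
{
  assert(N > 0);
  const double *worst = simplex + N * N;

  for (size_t j = 0; j < N; j++) {
    double centroid = 0.0;
    for (size_t i = 0; i < N; i++) centroid += simplex[i * N + j];
    centroid /= (double)N;
    x_trial[j] = (1.0 + mu) * centroid - mu * worst[j];
  }
}
```

With $\mu_0 = 1$, the four candidate values of $\mu$ correspond to reflecting the worst vertex through the centroid ($\mu_0$), expanding beyond the reflection ($2\mu_0$), and contracting outside ($\frac{1}{2}\mu_0$) or inside ($-\frac{1}{2}\mu_0$) the simplex.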
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe algorithm terminates when the residual $f_{N+1} - f_1$ becomes
*7f296bb3SBarry Smithsufficiently small. Because of the way new vectors can be added to the
*7f296bb3SBarry Smithsorted set, the minimum function value and/or the residual may not be
*7f296bb3SBarry Smithimpacted at each iteration.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithTwo options can be set specifically for the Nelder-Mead algorithm:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-tao_nm_lambda <value>`
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith: sets the initial set of vectors ($x_0$ plus `value` in each
*7f296bb3SBarry Smith  coordinate direction); the default value is $1$.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-tao_nm_mu <value>`
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith: sets the value of $\mu_0$; the default is $\mu_0=1$.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bound)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Bound-Constrained Optimization
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBound-constrained optimization algorithms solve optimization problems of
*7f296bb3SBarry Smiththe form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll} \displaystyle
*7f296bb3SBarry Smith\min_{x} & f(x) \\
*7f296bb3SBarry Smith\text{subject to} & l \leq x \leq u.
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThese solvers use the bounds on the variables as well as objective
*7f296bb3SBarry Smithfunction, gradient, and possibly Hessian information.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithFor any unbounded variables, the bound value for the associated index
*7f296bb3SBarry Smithcan be set to `PETSC_INFINITY` for the upper bound and
*7f296bb3SBarry Smith`PETSC_NINFINITY` for the lower bound. If all bounds are set to
*7f296bb3SBarry Smithinfinity, then the bounded algorithms are equivalent to their
*7f296bb3SBarry Smithunconstrained counterparts.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBefore introducing specific methods, we will first define two projection
operations used by all bound-constrained algorithms.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith- Gradient projection:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith  $$
*7f296bb3SBarry Smith  \mathfrak{P}(g) = \left\{\begin{array}{ll}
  0 & \text{if} \; (x_i \leq l_i \land g_i > 0) \lor (x_i \geq u_i \land g_i < 0) \\
*7f296bb3SBarry Smith  g_i & \text{otherwise}
*7f296bb3SBarry Smith  \end{array}
*7f296bb3SBarry Smith  \right.
*7f296bb3SBarry Smith  $$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith- Bound projection:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith  $$
*7f296bb3SBarry Smith  \mathfrak{B}(x) = \left\{\begin{array}{ll}
*7f296bb3SBarry Smith  l_i & \text{if} \; x_i < l_i \\
*7f296bb3SBarry Smith  u_i & \text{if} \; x_i > u_i \\
*7f296bb3SBarry Smith  x_i & \text{otherwise}
*7f296bb3SBarry Smith  \end{array}
*7f296bb3SBarry Smith  \right.
*7f296bb3SBarry Smith  $$
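
Both projections act componentwise, as the following sketch on plain arrays shows. The function names `project_gradient` and `project_bounds` are hypothetical; in TAO these operations are applied to PETSc `Vec` objects rather than raw arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Gradient projection: zero the components that would push a variable
   sitting on a bound further outside the feasible box. */
void project_gradient(size_t n, const double *x, const double *g, const double *l,
                      const double *u, double *pg)
{
  for (size_t i = 0; i < n; i++) {
    int at_lower = x[i] <= l[i] && g[i] > 0.0;
    int at_upper = x[i] >= u[i] && g[i] < 0.0;
    pg[i] = (at_lower || at_upper) ? 0.0 : g[i];
  }
}

/* Bound projection: clip each component of x into [l_i, u_i] in place. */
void project_bounds(size_t n, const double *l, const double *u, double *x)
{
  for (size_t i = 0; i < n; i++) {
    assert(l[i] <= u[i]);
    if (x[i] < l[i]) x[i] = l[i];
    else if (x[i] > u[i]) x[i] = u[i];
  }
}
```

A positive gradient component at a lower bound is zeroed because a descent step would decrease that variable and leave the feasible region; the mirror-image argument applies at an upper bound.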
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bnk)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bounded Newton-Krylov Methods
*7f296bb3SBarry Smith
TAO features three algorithms in the bounded Newton-Krylov (BNK) class,
distinguished by their globalization methods: projected line search (BNLS),
trust region (BNTR), and trust region with a projected line search
fall-back (BNTL). They are available via the TAO solvers `TAOBNLS`,
`TAOBNTR`, and `TAOBNTL`, respectively, or with the command line argument
`-tao_type <bnls,bntr,bntl>`.
*7f296bb3SBarry Smith
The BNK class of methods uses an active-set approach to solve the
*7f296bb3SBarry Smithsymmetric system of equations,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithH_k p_k = -g_k,
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithonly for inactive variables in the interior of the bounds. The
*7f296bb3SBarry Smithactive-set estimation is based on Bertsekas
*7f296bb3SBarry Smith{cite}`bertsekas:projected` with the following variable
*7f296bb3SBarry Smithindex categories:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{rlll} \displaystyle
*7f296bb3SBarry Smith\text{lower bounded}: & \mathcal{L}(x) & = & \{ i \; : \; x_i \leq l_i + \epsilon \; \land \; g(x)_i > 0 \}, \\
\text{upper bounded}: & \mathcal{U}(x) & = & \{ i \; : \; x_i \geq u_i - \epsilon \; \land \; g(x)_i < 0 \}, \\
*7f296bb3SBarry Smith\text{fixed}: & \mathcal{F}(x) & = & \{ i \; : \; l_i = u_i \}, \\
*7f296bb3SBarry Smith\text{active-set}: & \mathcal{A}(x) & = & \{ \mathcal{L}(x) \; \bigcup \; \mathcal{U}(x) \; \bigcup \; \mathcal{F}(x) \}, \\
*7f296bb3SBarry Smith\text{inactive-set}: & \mathcal{I}(x) & = & \{ 1,2,\ldots,n \} \; \backslash \; \mathcal{A}(x).
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithAt each iteration, the bound tolerance is estimated as
*7f296bb3SBarry Smith$\epsilon_{k+1} = \text{min}(\epsilon_k, ||w_k||_2)$ with
*7f296bb3SBarry Smith$w_k = x_k - \mathfrak{B}(x_k - \beta D_k g_k)$, where the
*7f296bb3SBarry Smithdiagonal matrix $D_k$ is an approximation of the Hessian inverse
*7f296bb3SBarry Smith$H_k^{-1}$. The initial bound tolerance $\epsilon_0$ and the
*7f296bb3SBarry Smithstep length $\beta$ have default values of $0.001$ and can
be adjusted using the `-tao_bnk_as_tol` and `-tao_bnk_as_step` flags,
*7f296bb3SBarry Smithrespectively. The active-set estimation can be disabled using the option
*7f296bb3SBarry Smith`-tao_bnk_as_type none`, in which case the algorithm simply uses the
*7f296bb3SBarry Smithcurrent iterate with no bound tolerances to determine which variables
*7f296bb3SBarry Smithare actively bounded and which are free.
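
The index classification can be sketched as below for a given bound tolerance $\epsilon$. This is a minimal illustration with hypothetical names (`VarStatus`, `classify_variables`). It assumes that a variable counts as upper bounded when it lies within $\epsilon$ below its upper bound, that is, $x_i \geq u_i - \epsilon$, and it gives fixed variables precedence over the other categories, a choice made here only so that each index receives a single label.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the index categories described in the text. */
typedef enum { VAR_INACTIVE, VAR_LOWER, VAR_UPPER, VAR_FIXED } VarStatus;

/* Labels each variable using the current iterate x, gradient g, bounds l and u,
   and bound tolerance eps. Everything not labeled VAR_INACTIVE belongs to the
   active set. */
void classify_variables(size_t n, const double *x, const double *g, const double *l,
                        const double *u, double eps, VarStatus *status)
{
  assert(eps >= 0.0);
  for (size_t i = 0; i < n; i++) {
    if (l[i] == u[i]) {
      status[i] = VAR_FIXED;                       /* l_i = u_i */
    } else if (x[i] <= l[i] + eps && g[i] > 0.0) {
      status[i] = VAR_LOWER;                       /* near lower bound, pushed down */
    } else if (x[i] >= u[i] - eps && g[i] < 0.0) { /* assumed u_i - eps test */
      status[i] = VAR_UPPER;                       /* near upper bound, pushed up */
    } else {
      status[i] = VAR_INACTIVE;                    /* free variable */
    }
  }
}
```

Passing `eps = 0` reproduces the behavior described for `-tao_bnk_as_type none`, where only variables exactly on a bound are treated as active.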
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBNK algorithms invert the reduced Hessian using a Krylov iterative
*7f296bb3SBarry Smithmethod. Trust-region conjugate gradient methods (`KSPNASH`,
*7f296bb3SBarry Smith`KSPSTCG`, and `KSPGLTR`) are required for the BNTR and BNTL
algorithms, and recommended for the BNLS algorithm. The preconditioner
type can be changed using the command line argument
`-tao_bnk_pc_type <none,ilu,icc,jacobi,lmvm>`. The `lmvm` option, which
is also the default, preconditions the Krylov solution with a
`MATLMVM` matrix. The remaining supported preconditioner types are
default PETSc types. If Jacobi is selected, the diagonal values are
safeguarded to be positive. The `icc` and `ilu` options produce good
results for problems with dense Hessians. The LMVM and Jacobi
preconditioners are also used as the approximate inverse-Hessian in the
active-set estimation. If neither is available, or if the Hessian
*7f296bb3SBarry Smithmatrix does not have `MATOP_GET_DIAGONAL` defined, then the active-set
*7f296bb3SBarry Smithestimation falls back onto using an identity matrix in place of
*7f296bb3SBarry Smith$D_k$ (this is equivalent to estimating the active-set using a
*7f296bb3SBarry Smithgradient descent step).
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithA special option is available to *accelerate* the convergence of the BNK
*7f296bb3SBarry Smithalgorithms by taking a finite number of BNCG iterations at each Newton
*7f296bb3SBarry Smithiteration. By default, the number of BNCG iterations is set to zero and
*7f296bb3SBarry Smiththe algorithms do not take any BNCG steps. This can be changed using the
*7f296bb3SBarry Smithoption flag `-tao_bnk_max_cg_its <i>`. While this reduces the number
*7f296bb3SBarry Smithof Newton iterations, in practice it simply trades off the Hessian
*7f296bb3SBarry Smithevaluations in the BNK solver for more function and gradient evaluations
*7f296bb3SBarry Smithin the BNCG solver. However, it may be useful for certain types of
*7f296bb3SBarry Smithproblems where the Hessian evaluation is disproportionately more
*7f296bb3SBarry Smithexpensive than the objective function or its gradient.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bnls)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Bounded Newton Line Search (BNLS)
*7f296bb3SBarry Smith
BNLS safeguards the Newton step by falling back onto BFGS, scaled
gradient, or gradient steps based on descent direction verifications.
*7f296bb3SBarry SmithFor problems with indefinite Hessian matrices, the step direction is
*7f296bb3SBarry Smithcalculated using a perturbed system of equations,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith(H_k + \rho_k I)p_k = -g_k,
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $\rho_k$ is a dynamically adjusted positive constant. The
*7f296bb3SBarry Smithstep is globalized using a projected Moré-Thuente line search. If a
*7f296bb3SBarry Smithtrust-region conjugate gradient method is used for the Hessian
*7f296bb3SBarry Smithinversion, the trust radius is modified based on the line search step
*7f296bb3SBarry Smithlength.
*7f296bb3SBarry Smith
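The perturbed system above can be illustrated on a 2x2 symmetric Hessian. The following is a minimal sketch, not TAO's implementation: the function names, the starting shift, and the factor-of-ten growth are assumptions chosen for illustration, whereas TAO adjusts $\rho_k$ dynamically by its own rules.

```c
#include <assert.h>
#include <math.h>

/* Solve (H + rho*I) p = -g for a symmetric 2x2 H = [h00 h01; h01 h11]
   using Cramer's rule. Returns 0 on success, 1 if the shifted matrix
   is numerically singular. */
static int shifted_newton_step(double h00, double h01, double h11,
                               const double g[2], double rho, double p[2])
{
  const double a = h00 + rho, b = h01, d = h11 + rho;
  const double det = a * d - b * b;
  if (fabs(det) < 1e-14) return 1;
  p[0] = (-g[0] * d + b * g[1]) / det;
  p[1] = (-a * g[1] + b * g[0]) / det;
  return 0;
}

/* Grow rho until the step is a descent direction (g^T p < 0).
   The schedule 0, 1e-3, 1e-2, ... is an illustrative assumption.
   Returns the accepted shift, or -1 if none was found. */
static double find_descent_shift(double h00, double h01, double h11,
                                 const double g[2], double p[2])
{
  double rho = 0.0;
  for (int it = 0; it < 60; ++it) {
    if (shifted_newton_step(h00, h01, h11, g, rho, p) == 0 &&
        g[0] * p[0] + g[1] * p[1] < 0.0) return rho;
    rho = (rho == 0.0) ? 1e-3 : 10.0 * rho;
  }
  return -1.0;
}
```

For the indefinite matrix $H = \mathrm{diag}(1, -2)$ and $g = (0, 1)^T$, the unshifted Newton step points uphill, and the sketch keeps increasing the shift until $H + \rho I$ yields a descent direction.
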
*7f296bb3SBarry Smith(sec_tao_bntr)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Bounded Newton Trust Region (BNTR)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBNTR globalizes the Newton step using a trust region method based on the
*7f296bb3SBarry Smithpredicted versus actual reduction in the cost function. The trust radius
*7f296bb3SBarry Smithis increased only if the accepted step is at the trust region boundary.
The reduction check is safeguarded against reductions that fall below
machine epsilon scaled by the latest function value; in that case the full
Newton step is accepted without modification.
*7f296bb3SBarry Smith
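The logic of a reduction-ratio test can be sketched as follows. This is a generic trust-region update written for illustration only: the type and function names are hypothetical, and the thresholds (`1e-4`, `0.25`, `0.75`) and scale factors are assumptions rather than TAO's defaults. It does reflect the rule stated above that the radius grows only when the accepted step lies on the trust-region boundary.

```c
#include <assert.h>
#include <math.h>

typedef struct {
  int    accepted; /* 1 if the step is accepted, 0 if rejected */
  double radius;   /* trust radius for the next iteration */
} TrustUpdate;

/* actual    = f(x_k) - f(x_k + p_k)
   predicted = m_k(0) - m_k(p_k), assumed positive
   step_norm = ||p_k|| */
static TrustUpdate trust_region_update(double actual, double predicted,
                                       double step_norm, double radius)
{
  const double eta0 = 1e-4, eta1 = 0.25, eta2 = 0.75; /* illustrative values */
  TrustUpdate out = {0, radius};
  assert(radius > 0.0 && predicted > 0.0);
  const double kappa = actual / predicted;
  if (kappa < eta0) {            /* poor agreement: reject and shrink */
    out.radius = 0.25 * radius;
    return out;
  }
  out.accepted = 1;
  if (kappa < eta1) {
    out.radius = 0.25 * radius;  /* accept, but shrink the region */
  } else if (kappa > eta2 && step_norm >= radius * (1.0 - 1e-8)) {
    out.radius = 2.0 * radius;   /* grow only for boundary steps */
  }
  return out;
}
```

A step with good agreement that stays strictly inside the region leaves the radius unchanged, since the radius did not constrain it.
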
*7f296bb3SBarry Smith(sec_tao_bntl)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Bounded Newton Trust Region with Line Search (BNTL)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBNTL safeguards the trust-region globalization such that a line search
*7f296bb3SBarry Smithis used in the event that the step is initially rejected by the
*7f296bb3SBarry Smithpredicted versus actual decrease comparison. If the line search fails to
*7f296bb3SBarry Smithfind a viable step length for the Newton step, it falls back onto a
*7f296bb3SBarry Smithscaled gradient or a gradient descent step. The trust radius is then
*7f296bb3SBarry Smithmodified based on the line search step length.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bqnls)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bounded Quasi-Newton Line Search (BQNLS)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe BQNLS algorithm uses the BNLS infrastructure, but replaces the step
*7f296bb3SBarry Smithcalculation with a direct inverse application of the approximate Hessian
*7f296bb3SBarry Smithbased on quasi-Newton update formulas. No Krylov solver is used in the
*7f296bb3SBarry Smithsolution, and therefore the quasi-Newton method chosen must guarantee a
positive-definite Hessian approximation. This algorithm is available via
`-tao_type bqnls`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bqnk)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bounded Quasi-Newton-Krylov
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBQNK algorithms use the BNK infrastructure, but replace the exact
Hessian with a quasi-Newton approximation. The matrix-free forward
product operation based on quasi-Newton update formulas is used in
conjunction with Krylov solvers to compute step directions. The
*7f296bb3SBarry Smithquasi-Newton inverse application is used to precondition the Krylov
*7f296bb3SBarry Smithsolution, and typically helps converge to a step direction in
*7f296bb3SBarry Smith$\mathcal{O}(10)$ iterations. This approach is most useful with
*7f296bb3SBarry Smithquasi-Newton update types such as Symmetric Rank-1 that cannot strictly
*7f296bb3SBarry Smithguarantee positive-definiteness. The BNLS framework with Hessian
*7f296bb3SBarry Smithshifting, or the BNTR framework with trust region safeguards, can
*7f296bb3SBarry Smithsuccessfully compensate for the Hessian approximation becoming
*7f296bb3SBarry Smithindefinite.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithSimilar to the full Newton-Krylov counterpart, BQNK algorithms come in
*7f296bb3SBarry Smiththree forms separated by the globalization technique: line search
(BQNKLS), trust region (BQNKTR), and trust region with line search
fall-back (BQNKTL). These algorithms are available via
`-tao_type <bqnkls, bqnktr, bqnktl>`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bncg)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bounded Nonlinear Conjugate Gradient (BNCG)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBNCG extends the unconstrained nonlinear conjugate gradient algorithm to
*7f296bb3SBarry Smithbound constraints via gradient projections and a bounded Moré-Thuente
*7f296bb3SBarry Smithline search.
*7f296bb3SBarry Smith
Like its unconstrained counterpart, BNCG offers gradient descent and a
variety of CG updates: Fletcher-Reeves, Polak-Ribière,
Polak-Ribière-Plus, Hestenes-Stiefel, Dai-Yuan, Hager-Zhang, Dai-Kou,
Kou-Dai, and the Self-Scaling Memoryless (SSML) BFGS, DFP, and Broyden
methods. These methods can be specified by using the command line
argument
`-tao_bncg_type <gd,fr,pr,prp,hs,dy,hz,dk,kd,ssml_bfgs,ssml_dfp,ssml_brdn>`,
respectively. The default value is `ssml_bfgs`. Scalar
preconditioning is available for these methods and is controlled by the flag
`-tao_bncg_alpha`. To disable rescaling, use $\alpha = -1.0$;
otherwise, choose $\alpha \in [0, 1]$. BNCG is available via the TAO
solver `TAOBNCG` or the `-tao_type bncg` flag.
*7f296bb3SBarry Smith
Some individual methods also have their own parameters. The
Hager-Zhang and Dai-Kou methods have a parameter that determines the
minimum contribution of the previous search direction to
the next search direction. The flags are `-tao_bncg_hz_eta` and
`-tao_bncg_dk_eta`, which default to $0.4$ and
$0.5$, respectively. The Kou-Dai method has multiple parameters.
`-tao_bncg_zeta` serves the same purpose as the previous two and is set to
$0.1$ by default. A second parameter scales the
contribution of $y_k \equiv \nabla f(x_k) - \nabla f(x_{k-1})$ in
the search direction update. It is controlled by `-tao_bncg_xi` and
is equal to $1.0$ by default. At times it is preferable
to maximize the descent as measured by $\nabla f(x_k)^T d_k$, which
can be done by using a negative value of $\xi$; this achieves
better performance when not using the diagonal preconditioner described
next. This behavior is enabled by default and is controlled by
`-tao_bncg_neg_xi`. Finally, the Broyden method has a convex
combination parameter, set with `-tao_bncg_theta`. It is 1.0
by default, which yields the BFGS method. One can also
individually tweak the BFGS and DFP contributions using the
multiplicative constants `-tao_bncg_scale`; both are set to $1$
by default.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithAll methods can be scaled using the parameter `-tao_bncg_alpha`, which
*7f296bb3SBarry Smithcontinuously varies in $[0, 1]$. The default value is set
*7f296bb3SBarry Smithdepending on the method from initial testing.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBNCG also offers a special type of method scaling. It employs Broyden
*7f296bb3SBarry Smithdiagonal scaling as an option for its CG methods, turned on with the
*7f296bb3SBarry Smithflag `-tao_bncg_diag_scaling`. Formulations for both the forward
*7f296bb3SBarry Smith(regular) and inverse Broyden methods are developed, controlled by the
*7f296bb3SBarry Smithflag `-tao_bncg_mat_lmvm_forward`. It is set to True by default.
*7f296bb3SBarry SmithWhether one uses the forward or inverse formulations depends on the
*7f296bb3SBarry Smithmethod being used. For example, in our preliminary computations, the
*7f296bb3SBarry Smithforward formulation works better for the SSML_BFGS method, but the
*7f296bb3SBarry Smithinverse formulation works better for the Hestenes-Stiefel method. The
*7f296bb3SBarry Smithconvex combination parameter for the Broyden scaling is controlled by
*7f296bb3SBarry Smith`-tao_bncg_mat_lmvm_theta`, and is 0 by default. We also employ
*7f296bb3SBarry Smithrescaling of the Broyden diagonal, which aids the linesearch immensely.
*7f296bb3SBarry SmithThe rescaling parameter is controlled by `-tao_bncg_mat_lmvm_alpha`,
*7f296bb3SBarry Smithand should be $\in [0, 1]$. One can disable rescaling of the
*7f296bb3SBarry SmithBroyden diagonal entirely by setting
*7f296bb3SBarry Smith`-tao_bncg_mat_lmvm_sigma_hist 0`.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithOne can also supply their own preconditioner, serving as a Hessian
*7f296bb3SBarry Smithinitialization to the above diagonal scaling. The appropriate user
*7f296bb3SBarry Smithfunction in the code is `TaoBNCGSetH0(tao, H0)` where `H0` is the
*7f296bb3SBarry Smithuser-defined `Mat` object that serves as a preconditioner. For an
*7f296bb3SBarry Smithexample of similar usage, see `tao/tutorials/ex3.c`.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe active set estimation uses the Bertsekas-based method described in
*7f296bb3SBarry Smith{any}`sec_tao_bnk`, which can be deactivated using
*7f296bb3SBarry Smith`-tao_bncg_as_type none`, in which case the algorithm will use the
*7f296bb3SBarry Smithcurrent iterate to determine the bounded variables with no tolerances
*7f296bb3SBarry Smithand no look-ahead step. As in the BNK algorithm, the initial bound
*7f296bb3SBarry Smithtolerance and estimator step length used in the Bertsekas method can be
*7f296bb3SBarry Smithset via `-tao_bncg_as_tol` and `-tao_bncg_as_step`, respectively.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIn addition to automatic scaled gradient descent restarts under certain
*7f296bb3SBarry Smithlocal curvature conditions, we also employ restarts based on a check on
*7f296bb3SBarry Smithdescent direction such that
*7f296bb3SBarry Smith$\nabla f(x_k)^T d_k \in [-10^{11}, -10^{-9}]$. Furthermore, we
*7f296bb3SBarry Smithallow for a variety of alternative restart strategies, all disabled by
*7f296bb3SBarry Smithdefault. The `-tao_bncg_unscaled_restart` flag allows one to disable
*7f296bb3SBarry Smithrescaling of the gradient for gradient descent steps. The
*7f296bb3SBarry Smith`-tao_bncg_spaced_restart` flag tells the solver to restart every
*7f296bb3SBarry Smith$Mn$ iterations, where $n$ is the problem dimension and
*7f296bb3SBarry Smith$M$ is a constant determined by `-tao_bncg_min_restart_num` and
is 6 by default. A dynamic restart strategy checks whether the
function is locally quadratic and, if so, takes a gradient
descent step. It is enabled with `-tao_bncg_dynamic_restart` and is off by
default because the CG solver usually does better in those cases anyway.
The minimum number of quadratic-like steps before a restart is set using
`-tao_bncg_min_quad` and is 6 by default.
*7f296bb3SBarry Smith
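Two of the restart tests described above reduce to simple predicates. The sketch below uses hypothetical function names and is only meant to make the conditions concrete: the descent window is the interval quoted in the text, and the spaced restart fires every $Mn$ iterations.

```c
#include <assert.h>

/* Restart when gTd = grad f(x_k)^T d_k falls outside [-1e11, -1e-9].
   Written as a negated range test so that a NaN also triggers a restart. */
static int descent_restart_needed(double gTd)
{
  return !(gTd >= -1e11 && gTd <= -1e-9);
}

/* Spaced restart: fire every M*n iterations (n = problem dimension). */
static int spaced_restart_due(long iter, long n, long M)
{
  assert(n > 0 && M > 0);
  return iter > 0 && iter % (M * n) == 0;
}
```

With the default $M = 6$ and a problem of dimension $n = 2$, the spaced restart would fire at iterations 12, 24, and so on.
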
*7f296bb3SBarry Smith(sec_tao_constrained)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Generally Constrained Solvers
*7f296bb3SBarry Smith
Constrained solvers solve optimization problems that incorporate equality
constraints, inequality constraints, or both, and may optionally include bounds on
solution variables.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Alternating Direction Method of Multipliers (ADMM)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe TAOADMM algorithm is intended to blend the decomposability
*7f296bb3SBarry Smithof dual ascent with the superior convergence properties of the method of
*7f296bb3SBarry Smithmultipliers. {cite}`boyd` The algorithm solves problems in
*7f296bb3SBarry Smiththe form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{x} & f(x) + g(z) \\
*7f296bb3SBarry Smith\text{subject to} & Ax + Bz = c
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $x \in \mathbb R^n$, $z \in \mathbb R^m$,
*7f296bb3SBarry Smith$A \in \mathbb R^{p \times n}$,
*7f296bb3SBarry Smith$B \in \mathbb R^{p \times m}$, and $c \in \mathbb R^p$.
Essentially, ADMM is a wrapper over two TAO solvers, one for
$f(x)$ and one for $g(z)$. With the method of multipliers, one
can form the augmented Lagrangian
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithL_{\rho}(x,z,y) = f(x) + g(z) + y^T(Ax+Bz-c) + (\rho/2)||Ax+Bz-c||_2^2
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThen, ADMM consists of the iterations
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithx^{k+1} := \text{argmin}L_{\rho}(x,z^k,y^k)
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithz^{k+1} := \text{argmin}L_{\rho}(x^{k+1},z,y^k)
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithy^{k+1} := y^k + \rho(Ax^{k+1}+Bz^{k+1}-c)
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
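The three updates can be traced on a scalar problem where both subproblems have closed forms. The sketch below is an assumption-laden illustration, not the TAOADMM code: it takes $f(x) = \tfrac{1}{2}(x-a)^2$, $g(z) = \lambda |z|$, and the constraint $x - z = 0$ (so $A = 1$, $B = -1$, $c = 0$), for which the $z$-update is a soft-threshold. The function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Soft-threshold operator: the minimizer of kappa*|z| + 0.5*(z - v)^2. */
static double soft_threshold(double v, double kappa)
{
  if (v > kappa) return v - kappa;
  if (v < -kappa) return v + kappa;
  return 0.0;
}

/* ADMM for  min 0.5*(x - a)^2 + lambda*|z|  subject to  x - z = 0. */
static void admm_scalar_lasso(double a, double lambda, double rho, int iters,
                              double *x, double *z, double *y)
{
  assert(rho > 0.0 && lambda >= 0.0);
  *x = *z = *y = 0.0;
  for (int k = 0; k < iters; ++k) {
    /* x-update: (x - a) + y + rho*(x - z) = 0 */
    *x = (a - *y + rho * (*z)) / (1.0 + rho);
    /* z-update: soft-threshold of x + y/rho with threshold lambda/rho */
    *z = soft_threshold(*x + *y / rho, lambda / rho);
    /* dual update with the constraint residual x - z */
    *y += rho * (*x - *z);
  }
}
```

The iterates approach $x = z = \operatorname{soft}(a, \lambda)$, which is the known minimizer of $\tfrac{1}{2}(x-a)^2 + \lambda|x|$; for $a = 3$ and $\lambda = 1$ this is $2$.
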
In certain formulations of ADMM, the $z^{k+1}$ subproblem may have a
closed-form solution. Currently ADMM provides one default implementation
for $z^{k+1}$, which is soft-thresholding. It can be selected with either
`TaoADMMSetRegularizerType_ADMM()` or
`-tao_admm_regularizer_type <regularizer_soft_thresh>`. The user can also
set the spectral penalty value, $\rho$, with either
`TaoADMMSetSpectralPenalty()` or `-tao_admm_spectral_penalty`.
Currently, the user can provide the following routines:

- `TaoADMMSetMisfitObjectiveAndGradientRoutine()`
- `TaoADMMSetRegularizerObjectiveAndGradientRoutine()`
- `TaoADMMSetMisfitHessianRoutine()`
- `TaoADMMSetRegularizerHessianRoutine()`

Any other combination of routines is currently not supported. Hessian
matrices can be either constant or non-constant, which is
specified via `TaoADMMSetMisfitHessianChangeStatus()` and
`TaoADMMSetRegularizerHessianChangeStatus()`. In
certain cases the Hessian of the augmented Lagrangian may become nearly
singular depending on $\rho$, which may change when using
`-tao_admm_dual_update <update_adaptive>, <update_adaptive_relaxed>`.
This issue can be prevented with `TaoADMMSetMinimumSpectralPenalty()`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Augmented Lagrangian Method of Multipliers (ALMM)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe TAOALMM method solves generally constrained problems of the form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{x} & f(x) \\
*7f296bb3SBarry Smith\text{subject to} & g(x) = 0\\
*7f296bb3SBarry Smith                  & h(x) \geq 0 \\
*7f296bb3SBarry Smith                  & l \leq x \leq u
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $g(x)$ are equality constraints, $h(x)$ are inequality
*7f296bb3SBarry Smithconstraints and $l$ and $u$ are lower and upper bounds on
*7f296bb3SBarry Smiththe optimization variables, respectively.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithTAOALMM converts the above general constrained problem into a sequence
*7f296bb3SBarry Smithof bound constrained problems at each outer iteration
*7f296bb3SBarry Smith$k = 1,2,\dots$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{x} & L(x, \lambda_k) \\
*7f296bb3SBarry Smith\text{subject to} & l \leq x \leq u
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
where $L(x, \lambda_k)$ is the augmented Lagrangian merit function
and $\lambda_k$ are the Lagrange multiplier estimates at outer
iteration $k$.
*7f296bb3SBarry Smith
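The outer iteration can be illustrated on a one-dimensional equality-constrained problem whose inner subproblem is solved in closed form. This is a sketch under stated assumptions, not the TAOALMM implementation: it uses the textbook augmented Lagrangian $L_\rho(x,\lambda) = f(x) + \lambda g(x) + \tfrac{\rho}{2} g(x)^2$ with a fixed penalty $\rho$, no bounds, and no inequality constraints, and TAO's sign convention for the multipliers may differ. The function name is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Method of multipliers for
       min (x - 2)^2   subject to   g(x) = x - 1 = 0.
   The solution is x = 1 with multiplier lambda = 2. */
static void almm_1d(double rho, int outer_its, double *x, double *lambda)
{
  assert(rho > 0.0);
  *x = 0.0;
  *lambda = 0.0;
  for (int k = 0; k < outer_its; ++k) {
    /* inner solve: 2*(x - 2) + lambda + rho*(x - 1) = 0 */
    *x = (4.0 - *lambda + rho) / (2.0 + rho);
    /* first-order multiplier update using the constraint residual */
    *lambda += rho * (*x - 1.0);
  }
}
```

In this example the multiplier error shrinks by the factor $2/(2+\rho)$ per outer iteration, which shows why a larger penalty speeds up the outer loop at the price of a more poorly conditioned inner problem.
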
*7f296bb3SBarry SmithTAOALMM offers two versions of the augmented Lagrangian formulation: the
*7f296bb3SBarry Smithcanonical Hestenes-Powell augmented
*7f296bb3SBarry SmithLagrangian {cite}`hestenes1969multiplier` {cite}`powell1969method`
with inequality constraints converted to equality constraints via slack
*7f296bb3SBarry Smithvariables, and the slack-less Powell-Hestenes-Rockafellar
*7f296bb3SBarry Smithformulation {cite}`rockafellar1974augmented` that utilizes a
*7f296bb3SBarry Smithpointwise `max()` on the inequality constraints. For most
*7f296bb3SBarry Smithapplications, the canonical Hestenes-Powell formulation is likely to
*7f296bb3SBarry Smithperform better. However, the PHR formulation may be desirable for
*7f296bb3SBarry Smithproblems featuring very large numbers of inequality constraints as it
*7f296bb3SBarry Smithavoids inflating the dimension of the subproblem with slack variables.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe inner subproblem is solved using a nested bound-constrained
first-order TAO solver. By default, TAOALMM uses a quasi-Newton-Krylov
*7f296bb3SBarry Smithtrust-region method (TAOBQNKTR). Other first-order methods such as
*7f296bb3SBarry SmithTAOBNCG and TAOBQNLS are also appropriate, but a trust-region
*7f296bb3SBarry Smithglobalization is strongly recommended for most applications.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Primal-Dual Interior-Point Method (PDIPM)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe TAOPDIPM method (`-tao_type pdipm`) implements a primal-dual interior
*7f296bb3SBarry Smithpoint method for solving general nonlinear programming problems of the form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{x} & f(x) \\
*7f296bb3SBarry Smith\text{subject to} & g(x) = 0 \\
*7f296bb3SBarry Smith                  & h(x) \geq 0 \\
*7f296bb3SBarry Smith                  & x^- \leq x \leq x^+
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$ (eq_nlp_gen1)
*7f296bb3SBarry Smith
Here, $f(x)$ is the nonlinear objective function, $g(x)$ and
$h(x)$ are the equality and inequality constraints, and
$x^-$ and $x^+$ are the lower and upper bounds on decision
variables $x$.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithPDIPM converts the inequality constraints to equalities using slack variables
*7f296bb3SBarry Smith$z$ and a log-barrier term, which transforms {eq}`eq_nlp_gen1` to
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith    \text{min}~&f(x) - \mu\sum_{i=1}^{nci}\ln z_i\\
*7f296bb3SBarry Smith    \text{s.t.}& \\
*7f296bb3SBarry Smith        &ce(x) = 0 \\
*7f296bb3SBarry Smith        &ci(x) - z = 0 \\
*7f296bb3SBarry Smith    \end{aligned}
*7f296bb3SBarry Smith$$ (eq_nlp_gen2)
*7f296bb3SBarry Smith
Here, $ce(x)$ is the set of equality constraints that includes
$g(x)$ and fixed decision variables, i.e., $x^- = x = x^+$.
Similarly, $ci(x)$ are inequality constraints including
$h(x)$ and lower/upper/box-constraints on $x$. The barrier parameter
$\mu$ is driven to zero as the optimization progresses.
*7f296bb3SBarry Smith
The Lagrangian for {eq}`eq_nlp_gen2` is
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithL_{\mu}(x,\lambda_{ce},\lambda_{ci},z) = f(x) + \lambda_{ce}^Tce(x) - \lambda_{ci}^T(ci(x) - z) - \mu\sum_{i=1}^{nci}\ln z_i
*7f296bb3SBarry Smith$$ (eq_lagrangian)
*7f296bb3SBarry Smith
where $\lambda_{ce}$ and $\lambda_{ci}$ are the Lagrangian
multipliers for the equality and inequality constraints, respectively.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe first order KKT conditions for optimality are as follows
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\nabla L_{\mu}(x,\lambda_{ce},\lambda_{ci},z)    =
*7f296bb3SBarry Smith    \begin{bmatrix}
*7f296bb3SBarry Smith        \nabla f(x) + \nabla ce(x)^T\lambda_{ce} -  \nabla ci(x)^T \lambda_{ci} \\
*7f296bb3SBarry Smith        ce(x) \\
*7f296bb3SBarry Smith        ci(x) - z \\
*7f296bb3SBarry Smith        Z\Lambda_{ci}e - \mu e
*7f296bb3SBarry Smith    \end{bmatrix}
*7f296bb3SBarry Smith= 0
*7f296bb3SBarry Smith$$ (eq_nlp_kkt)
*7f296bb3SBarry Smith
{eq}`eq_nlp_kkt` is solved iteratively with Newton’s
method through PETSc’s SNES object. After each Newton iteration, a
line search is performed to update $x$ and enforce
$z,\lambda_{ci} \geq 0$. The barrier parameter $\mu$ is also
updated after each Newton iteration. The Newton update is obtained by
solving the second-order KKT system $Hd = -\nabla L_{\mu}$.
Here, $H$ is the Hessian matrix of the KKT system. For
interior-point methods such as PDIPM, the Hessian matrix tends to be
ill-conditioned, thus necessitating the use of a direct solver. We
recommend using the LU preconditioner (`-pc_type lu`) together with a direct
linear solver package such as `SuperLU_Dist` or `MUMPS`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### PDE-Constrained Optimization
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithTAO solves PDE-constrained optimization problems of the form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{u,v} & f(u,v) \\
*7f296bb3SBarry Smith\text{subject to} & g(u,v) = 0,
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere the state variable $u$ is the solution to the discretized
*7f296bb3SBarry Smithpartial differential equation defined by $g$ and parametrized by
*7f296bb3SBarry Smiththe design variable $v$, and $f$ is an objective function.
The Lagrange multipliers on the constraint are denoted by $y$.
These problems are solved with the linearly constrained augmented
Lagrangian TAO solver, selected with `-tao_type lcl`.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe make two main assumptions when solving these problems: the objective
*7f296bb3SBarry Smithfunction and PDE constraints have been discretized so that we can treat
*7f296bb3SBarry Smiththe optimization problem as finite dimensional and
*7f296bb3SBarry Smith$\nabla_u g(u,v)$ is invertible for all $u$ and $v$.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_lcl)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Linearly-Constrained Augmented Lagrangian Method (LCL)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithGiven the current iterate $(u_k, v_k, y_k)$, the linearly
*7f296bb3SBarry Smithconstrained augmented Lagrangian method approximately solves the
*7f296bb3SBarry Smithoptimization problem
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{u,v} & \tilde{f}_k(u, v) \\
*7f296bb3SBarry Smith\text{subject to} & A_k (u-u_k) + B_k (v-v_k) + g_k = 0,
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $A_k = \nabla_u g(u_k,v_k)$,
*7f296bb3SBarry Smith$B_k = \nabla_v g(u_k,v_k)$, and $g_k = g(u_k, v_k)$ and
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
\tilde{f}_k(u,v) = f(u,v) - g(u,v)^T y_k + \frac{\rho_k}{2} \| g(u,v) \|^2
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithis the augmented Lagrangian function. This optimization problem is
*7f296bb3SBarry Smithsolved in two stages. The first computes the Newton direction and finds
*7f296bb3SBarry Smitha feasible point for the linear constraints. The second computes a
*7f296bb3SBarry Smithreduced-space direction that maintains feasibility with respect to the
*7f296bb3SBarry Smithlinearized constraints and improves the augmented Lagrangian merit
*7f296bb3SBarry Smithfunction.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Newton Step
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe Newton direction is obtained by fixing the design variables at their
*7f296bb3SBarry Smithcurrent value and solving the linearized constraint for the state
*7f296bb3SBarry Smithvariables. In particular, we solve the system of equations
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithA_k du = -g_k
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithto obtain a direction $du$. We need a direction that provides
*7f296bb3SBarry Smithsufficient descent for the merit function
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\frac{1}{2} \|g(u,v)\|^2.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
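That is, we require $g_k^T A_k du < 0$. An exact solve gives
$g_k^T A_k du = -\|g_k\|^2$, which is negative whenever $g_k \neq 0$, so
the requirement can fail only when the linear system is solved inexactly.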
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIf the Newton direction is a descent direction, then we choose a penalty
*7f296bb3SBarry Smithparameter $\rho_k$ so that $du$ is also a sufficient descent
*7f296bb3SBarry Smithdirection for the augmented Lagrangian merit function. We then find
*7f296bb3SBarry Smith$\alpha$ to approximately minimize the augmented Lagrangian merit
*7f296bb3SBarry Smithfunction along the Newton direction.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\displaystyle \min_{\alpha \geq 0} \; \tilde{f}_k(u_k + \alpha du, v_k).
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe can enforce either the sufficient decrease condition or the Wolfe
*7f296bb3SBarry Smithconditions during the search procedure. The new point,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{lcl}
*7f296bb3SBarry Smithu_{k+\frac{1}{2}} & = & u_k + \alpha_k du \\
*7f296bb3SBarry Smithv_{k+\frac{1}{2}} & = & v_k,
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithsatisfies the linear constraint
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry SmithA_k (u_{k+\frac{1}{2}} - u_k) + B_k (v_{k+\frac{1}{2}} - v_k) + \alpha_k g_k = 0.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIf the Newton direction computed does not provide descent for the merit
*7f296bb3SBarry Smithfunction, then we can use the steepest descent direction
*7f296bb3SBarry Smith$du = -A_k^T g_k$ during the search procedure. However, the
*7f296bb3SBarry Smithimplication that the intermediate point approximately satisfies the
*7f296bb3SBarry Smithlinear constraint is no longer true.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Modified Reduced-Space Step
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe are now ready to compute a reduced-space step for the modified
*7f296bb3SBarry Smithoptimization problem:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{u,v} & \tilde{f}_k(u, v) \\
*7f296bb3SBarry Smith\text{subject to} & A_k (u-u_k) + B_k (v-v_k) + \alpha_k g_k = 0.
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe begin with the change of variables
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{du,dv} & \tilde{f}_k(u_k+du, v_k+dv) \\
*7f296bb3SBarry Smith\text{subject to} & A_k du + B_k dv + \alpha_k g_k = 0
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithand make the substitution
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithdu = -A_k^{-1}(B_k dv + \alpha_k g_k).
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithHence, the unconstrained optimization problem we need to solve is
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{dv} & \tilde{f}_k(u_k-A_k^{-1}(B_k dv + \alpha_k g_k), v_k+dv), \\
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhich is equivalent to
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{dv} & \tilde{f}_k(u_{k+\frac{1}{2}} - A_k^{-1} B_k dv, v_{k+\frac{1}{2}}+dv). \\
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
We apply one step of a limited-memory quasi-Newton method to this
problem. The direction is obtained by solving the quadratic problem
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{dv} & \frac{1}{2} dv^T \tilde{H}_k dv + \tilde{g}_{k+\frac{1}{2}}^T dv,
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $\tilde{H}_k$ is the limited-memory quasi-Newton
*7f296bb3SBarry Smithapproximation to the reduced Hessian matrix, a positive-definite matrix,
*7f296bb3SBarry Smithand $\tilde{g}_{k+\frac{1}{2}}$ is the reduced gradient.
*7f296bb3SBarry Smith
$$
\begin{array}{lcl}
\tilde{g}_{k+\frac{1}{2}} & = & \nabla_v \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}}) -
          \nabla_u \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}}) A_k^{-1} B_k \\
       & = & d_{k+\frac{1}{2}} - c_{k+\frac{1}{2}} A_k^{-1} B_k,
\end{array}
$$

where $c_{k+\frac{1}{2}} = \nabla_u \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}})$ and
$d_{k+\frac{1}{2}} = \nabla_v \tilde{f}_k(u_{k+\frac{1}{2}}, v_{k+\frac{1}{2}})$.
The reduced gradient is obtained from one linearized adjoint solve

$$
y_{k+\frac{1}{2}} = A_k^{-T}c_{k+\frac{1}{2}}
$$

and some linear algebra

$$
\tilde{g}_{k+\frac{1}{2}} = d_{k+\frac{1}{2}} - y_{k+\frac{1}{2}}^T B_k.
$$
*7f296bb3SBarry Smith
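The adjoint-based reduced gradient can be checked numerically in the scalar case. The sketch below is illustrative and makes several assumptions: $A_k$ and $B_k$ are nonzero scalars, the merit function is a hypothetical quadratic standing in for $\tilde{f}_k$, and all function names are invented for the example. With $c = \nabla_u \tilde{f}$ and $d = \nabla_v \tilde{f}$, it computes $d - B^T A^{-T} c$ and compares it with a central finite difference of $\phi(dv) = \tilde{f}(u - A^{-1} B\, dv,\; v + dv)$ at $dv = 0$.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical smooth merit function standing in for f~_k(u, v). */
static double merit(double u, double v)
{
  return (u - 1.0) * (u - 1.0) + u * v + v * v;
}

/* Reduced gradient from one adjoint solve (scalar case):
     y = A^{-T} c,   g~ = d - y^T B */
static double reduced_gradient_adjoint(double u, double v, double A, double B)
{
  const double c = 2.0 * (u - 1.0) + v; /* d merit / du */
  const double d = u + 2.0 * v;         /* d merit / dv */
  assert(A != 0.0);
  const double y = c / A;               /* adjoint solve */
  return d - y * B;
}

/* Central finite difference of phi(dv) = merit(u - (B/A) dv, v + dv). */
static double reduced_gradient_fd(double u, double v, double A, double B, double h)
{
  const double du = -(B / A) * h;
  assert(A != 0.0 && h > 0.0);
  return (merit(u + du, v + h) - merit(u - du, v - h)) / (2.0 * h);
}
```

For $u = 0.3$, $v = -0.7$, $A = 2$, and $B = 3$, both routines return $2.05$ up to rounding, which confirms the minus sign in front of the adjoint term.
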
*7f296bb3SBarry SmithBecause the Hessian approximation is positive definite and we know its
*7f296bb3SBarry Smithinverse, we obtain the direction
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
dv = -\tilde{H}_k^{-1} \tilde{g}_{k+\frac{1}{2}}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithand recover the full-space direction from one linearized forward solve,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithdu = -A_k^{-1} B_k dv.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithHaving the full-space direction, which satisfies the linear constraint,
*7f296bb3SBarry Smithwe now approximately minimize the augmented Lagrangian merit function
*7f296bb3SBarry Smithalong the direction.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{lcl}
*7f296bb3SBarry Smith\displaystyle \min_{\beta \geq 0} & \tilde{f_k}(u_{k+\frac{1}{2}} + \beta du, v_{k+\frac{1}{2}} + \beta dv)
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe enforce the Wolfe conditions during the search procedure. The new
*7f296bb3SBarry Smithpoint is
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{lcl}
*7f296bb3SBarry Smithu_{k+1} & = & u_{k+\frac{1}{2}} + \beta_k du \\
*7f296bb3SBarry Smithv_{k+1} & = & v_{k+\frac{1}{2}} + \beta_k dv.
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe reduced gradient at the new point is computed from
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{lcl}
*7f296bb3SBarry Smithy_{k+1} & = & A_k^{-T}c_{k+1} \\
*7f296bb3SBarry Smith\tilde{g}_{k+1} & = & d_{k+1} - y_{k+1}^T B_k,
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $c_{k+1} = \nabla_u \tilde{f}_k (u_{k+1},v_{k+1})$ and
*7f296bb3SBarry Smith$d_{k+1} = \nabla_v \tilde{f}_k (u_{k+1},v_{k+1})$. The
*7f296bb3SBarry Smithmultipliers $y_{k+1}$ become the multipliers used in the next
iteration of the code. The quantities $v_{k+\frac{1}{2}}$,
$v_{k+1}$, $\tilde{g}_{k+\frac{1}{2}}$, and
$\tilde{g}_{k+1}$ are used to update $\tilde{H}_k$ to obtain the
limited-memory quasi-Newton approximation to the reduced Hessian matrix
used in the next iteration of the code. The update is skipped if it
cannot be performed.
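
The text does not say when an update cannot be performed. A common safeguard in quasi-Newton methods is to skip the update when the curvature condition fails, that is, when the change in the design variables and the change in the reduced gradient do not have a positive inner product. The following C sketch illustrates that generic test only; it is an assumption, not the test used inside TAO, and the function name and the threshold `eps` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Generic curvature check for a quasi-Newton update (hypothetical helper).
   s  = v_new - v_old   change in the design variables
   dg = g_new - g_old   change in the reduced gradient
   Returns true when s^T dg > eps, so that the updated matrix
   can remain positive definite. */
bool qn_update_is_safe(const double *v_old, const double *v_new,
                       const double *g_old, const double *g_new,
                       size_t n, double eps)
{
  double s_dot_dg = 0.0;
  for (size_t i = 0; i < n; ++i) {
    double s  = v_new[i] - v_old[i];
    double dg = g_new[i] - g_old[i];
    s_dot_dg += s * dg;
  }
  return s_dot_dg > eps;
}
```

A caller would evaluate this with the half-step and new-point quantities and leave the stored approximation unchanged when it returns `false`.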
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_leastsquares)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Nonlinear Least-Squares
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithGiven a function $F: \mathbb R^n \to \mathbb R^m$, the nonlinear
*7f296bb3SBarry Smithleast-squares problem minimizes
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithf(x)= \| F(x) \|_2^2 = \sum_{i=1}^m F_i(x)^2.
*7f296bb3SBarry Smith$$ (eq_nlsf)
*7f296bb3SBarry Smith
The residual function $F$ should be specified with the function
`TaoSetResidualRoutine()`.
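
As a small illustration of ({eq}`eq_nlsf`), the objective value follows directly from an already-evaluated residual vector. The helper below is a plain C sketch with a hypothetical name; it does not use the TAO interface.

```c
#include <stddef.h>

/* Sum-of-squares objective f = sum_i F_i^2 computed from a residual
   vector F of length m that the caller has already evaluated. */
double sum_of_squares(const double *F, size_t m)
{
  double f = 0.0;
  for (size_t i = 0; i < m; ++i) f += F[i] * F[i];
  return f;
}
```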
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_pounders)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bound-constrained Regularized Gauss-Newton (BRGN)
*7f296bb3SBarry Smith
The TAOBRGN algorithm is a Gauss-Newton method that iteratively solves nonlinear least-squares
problems with the iterations
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithx_{k+1} = x_k - \alpha_k(J_k^T J_k)^{-1} J_k^T r(x_k)
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $r(x)$ is the least-squares residual vector,
*7f296bb3SBarry Smith$J_k = \partial r(x_k)/\partial x$ is the Jacobian of the
residual, and $\alpha_k$ is the step length parameter. In other
words, for the objective $\frac{1}{2}\|r(x)\|_2^2$ the Gauss-Newton method
approximates the Hessian as $H_k \approx J_k^T J_k$ and uses the gradient
$g_k = J_k^T r(x_k)$. The least-squares Jacobian, $J$,
should be provided to TAO using the `TaoSetJacobianResidualRoutine()` routine.
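
To make the iteration concrete, the following C sketch performs one Gauss-Newton step for a two-parameter straight-line fit, where the residual is $r_i(x) = x_0 + x_1 t_i - y_i$ and row $i$ of the Jacobian is $[1, t_i]$. The model, the function name, and the direct $2 \times 2$ solve are illustrative assumptions; TAOBRGN itself works with PETSc vectors and matrices and a subsolver.

```c
#include <stddef.h>

/* One Gauss-Newton step x <- x - alpha (J^T J)^{-1} J^T r(x)
   for the line model x[0] + x[1]*t (hypothetical example problem).
   Returns 0 on success, -1 if J^T J is singular. */
int gauss_newton_step(double x[2], const double *t, const double *y,
                      size_t m, double alpha)
{
  double a00 = 0.0, a01 = 0.0, a11 = 0.0; /* entries of J^T J */
  double g0 = 0.0, g1 = 0.0;              /* entries of J^T r */
  for (size_t i = 0; i < m; ++i) {
    double r = x[0] + x[1] * t[i] - y[i];
    a00 += 1.0;
    a01 += t[i];
    a11 += t[i] * t[i];
    g0 += r;
    g1 += t[i] * r;
  }
  double det = a00 * a11 - a01 * a01;
  if (det == 0.0) return -1;
  /* Solve the 2x2 normal equations (J^T J) d = J^T r by Cramer's rule. */
  double d0 = (a11 * g0 - a01 * g1) / det;
  double d1 = (a00 * g1 - a01 * g0) / det;
  x[0] -= alpha * d0;
  x[1] -= alpha * d1;
  return 0;
}
```

Because this residual is linear in $x$, a single step with $\alpha_k = 1$ reaches the least-squares solution; for a nonlinear residual the step would be repeated with a line search choosing $\alpha_k$.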
*7f296bb3SBarry Smith
The BRGN (`-tao_type brgn`) implementation adds a regularization term $\beta(x)$ so
that the problem becomes

$$
\min_{x} \; \frac{1}{2}\|r(x)\|_2^2 + \lambda\beta(x),
$$

where $\lambda$ is the scalar weight of the regularizer. BRGN
provides three default implementations for $\beta(x)$:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith- **L2-norm** - $\beta(x) = \frac{1}{2}||x_k||_2^2$
*7f296bb3SBarry Smith- **L2-norm Proximal Point** -
*7f296bb3SBarry Smith  $\beta(x) = \frac{1}{2}||x_k - x_{k-1}||_2^2$
*7f296bb3SBarry Smith- **L1-norm with Dictionary** -
*7f296bb3SBarry Smith  $\beta(x) = ||Dx||_1 \approx \sum_{i} \sqrt{y_i^2 + \epsilon^2}-\epsilon$
*7f296bb3SBarry Smith  where $y = Dx$ and $\epsilon$ is the smooth approximation
*7f296bb3SBarry Smith  parameter.
*7f296bb3SBarry Smith
The regularizer weight can be controlled with either
`TaoBRGNSetRegularizerWeight()` or the `-tao_brgn_regularizer_weight`
command line option, while the smooth approximation parameter can be set
with either `TaoBRGNSetL1SmoothEpsilon()` or
`-tao_brgn_l1_smooth_epsilon`. For the L1-norm term, the user can
supply a dictionary matrix with `TaoBRGNSetDictionaryMatrix()`. If no
dictionary is provided, the dictionary is assumed to be the identity
matrix and the regularizer reduces to a sparsity-promoting term on $x$ itself.
*7f296bb3SBarry Smith
The regularization selection can be made using the command line option
`-tao_brgn_regularization_type <l2pure, l2prox, l1dict, user>`, where the `user` option allows
the user to define a custom $\mathcal{C}^2$-continuous
regularization term. This custom term can be defined by using the
interface functions:

- `TaoBRGNSetRegularizerObjectiveAndGradientRoutine()` - Provide a user
  callback for evaluating the function value and gradient of the
  regularization term.
- `TaoBRGNSetRegularizerHessianRoutine()` - Provide a user callback
  for evaluating the Hessian of the regularization term.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### POUNDERS
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithOne algorithm for solving the least squares problem
*7f296bb3SBarry Smith({eq}`eq_nlsf`) when the Jacobian of the residual vector
*7f296bb3SBarry Smith$F$ is unavailable is the model-based POUNDERS (Practical
*7f296bb3SBarry SmithOptimization Using No Derivatives for sums of Squares) algorithm
*7f296bb3SBarry Smith(`tao_pounders`). POUNDERS employs a derivative-free trust-region
*7f296bb3SBarry Smithframework as described in {cite}`dfobook` in order to
*7f296bb3SBarry Smithconverge to local minimizers. An example of this version of POUNDERS
*7f296bb3SBarry Smithapplied to a practical least-squares problem can be found in
*7f296bb3SBarry Smith{cite}`unedf0`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Derivative-Free Trust-Region Algorithm
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIn each iteration $k$, the algorithm maintains a model
*7f296bb3SBarry Smith$m_k(x)$, described below, of the nonlinear least squares function
*7f296bb3SBarry Smith$f$ centered about the current iterate $x_k$.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIf one assumes that the maximum number of function evaluations has not
*7f296bb3SBarry Smithbeen reached and that $\|\nabla m_k(x_k)\|_2>$`gtol`, the next
*7f296bb3SBarry Smithpoint $x_+$ to be evaluated is obtained by solving the
*7f296bb3SBarry Smithtrust-region subproblem
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\min\left\{
*7f296bb3SBarry Smith m_k(x) :
 \|x-x_k\|_{p} \leq \Delta_k
 \right\},
*7f296bb3SBarry Smith$$ (eq_poundersp)
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $\Delta_k$ is the current trust-region radius. By default we
*7f296bb3SBarry Smithuse a trust-region norm with $p=\infty$ and solve
*7f296bb3SBarry Smith({eq}`eq_poundersp`) with the BLMVM method described in
*7f296bb3SBarry Smith{any}`sec_tao_blmvm`. While the subproblem is a
*7f296bb3SBarry Smithbound-constrained quadratic program, it may not be convex and the BQPIP
*7f296bb3SBarry Smithand GPCG methods may not solve the subproblem. Therefore, a bounded
*7f296bb3SBarry SmithNewton-Krylov Method should be used; the default is the BNTR
*7f296bb3SBarry Smithalgorithm. Note: BNTR uses its own internal
*7f296bb3SBarry Smithtrust region that may interfere with the infinity-norm trust region used
*7f296bb3SBarry Smithin the model problem ({eq}`eq_poundersp`).
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe residual vector is then evaluated to obtain $F(x_+)$ and hence
*7f296bb3SBarry Smith$f(x_+)$. The ratio of actual decrease to predicted decrease,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\rho_k = \frac{f(x_k)-f(x_+)}{m_k(x_k)-m_k(x_+)},
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
as well as an indicator, `valid`, of the model’s quality of
approximation on the trust region are then used to update the iterate,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithx_{k+1} = \left\{\begin{array}{ll}
*7f296bb3SBarry Smithx_+ & \text{if } \rho_k \geq \eta_1 \\
*7f296bb3SBarry Smithx_+ & \text{if } 0<\rho_k <\eta_1  \text{ and \texttt{valid}=\texttt{true}}
*7f296bb3SBarry Smith\\
*7f296bb3SBarry Smithx_k & \text{else},
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith\right.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithand trust-region radius,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\Delta_{k+1} = \left\{\begin{array}{ll}
*7f296bb3SBarry Smith \text{min}(\gamma_1\Delta_k, \Delta_{\max}) & \text{if } \rho_k \geq
*7f296bb3SBarry Smith\eta_1 \text{ and } \|x_+-x_k\|_p\geq \omega_1\Delta_k \\
*7f296bb3SBarry Smith\gamma_0\Delta_k & \text{if } \rho_k < \eta_1 \text{ and
*7f296bb3SBarry Smith\texttt{valid}=\texttt{true}} \\
*7f296bb3SBarry Smith\Delta_k &  \text{else,}
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith\right.
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithwhere $0 < \eta_1 < 1$, $0 < \gamma_0 < 1 < \gamma_1$,
*7f296bb3SBarry Smith$0<\omega_1<1$, and $\Delta_{\max}$ are constants.
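
The two update rules can be written as a pair of small functions. The following C sketch is illustrative only and is not taken from the TAO source; the struct `TRParams`, the function names, and any parameter values passed in are hypothetical.

```c
#include <stdbool.h>

/* Constants of the trust-region update (names are hypothetical). */
typedef struct {
  double eta1;      /* acceptance threshold, 0 < eta1 < 1 */
  double gamma0;    /* shrink factor, 0 < gamma0 < 1 */
  double gamma1;    /* growth factor, gamma1 > 1 */
  double omega1;    /* step-length fraction, 0 < omega1 < 1 */
  double delta_max; /* upper bound on the radius */
} TRParams;

/* Returns true when the trial point x_+ becomes the next iterate. */
bool tr_accept(double rho, bool valid, const TRParams *p)
{
  return rho >= p->eta1 || (rho > 0.0 && valid);
}

/* Returns the next radius given the current radius delta, the ratio rho,
   the step length ||x_+ - x_k||_p, and the model-quality indicator. */
double tr_update_radius(double delta, double rho, double step_norm,
                        bool valid, const TRParams *p)
{
  if (rho >= p->eta1 && step_norm >= p->omega1 * delta) {
    double grown = p->gamma1 * delta;
    return grown < p->delta_max ? grown : p->delta_max;
  }
  if (rho < p->eta1 && valid) return p->gamma0 * delta;
  return delta;
}
```

Note that a poor ratio shrinks the radius only when the model is flagged as valid; otherwise the radius is kept, so that the model can be improved first.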
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIf $\rho_k\leq 0$ and `valid` is `false`, the iterate and
*7f296bb3SBarry Smithtrust-region radius remain unchanged after the above updates, and the
*7f296bb3SBarry Smithalgorithm tests whether the direction $x_+-x_k$ improves the
*7f296bb3SBarry Smithmodel. If not, the algorithm performs an additional evaluation to obtain
*7f296bb3SBarry Smith$F(x_k+d_k)$, where $d_k$ is a model-improving direction.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe iteration counter is then updated, and the next model $m_{k}$
*7f296bb3SBarry Smithis obtained as described next.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Forming the Trust-Region Model
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithIn each iteration, POUNDERS uses a subset of the available evaluated
*7f296bb3SBarry Smithresidual vectors $\{ F(y_1), F(y_2), \cdots \}$ to form an
*7f296bb3SBarry Smithinterpolatory quadratic model of each residual component. The $m$
*7f296bb3SBarry Smithquadratic models
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithq_k^{(i)}(x) =
*7f296bb3SBarry Smith F_i(x_k) + (x-x_k)^T g_k^{(i)} + \frac{1}{2} (x-x_k)^T H_k^{(i)} (x-x_k),
*7f296bb3SBarry Smith \qquad i = 1, \ldots, m
*7f296bb3SBarry Smith$$ (eq_models)
*7f296bb3SBarry Smith
*7f296bb3SBarry Smiththus satisfy the interpolation conditions
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smithq_k^{(i)}(y_j) = F_i(y_j), \qquad i=1, \ldots, m; \, j=1,\ldots , l_k
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithon a common interpolation set $\{y_1, \cdots , y_{l_k}\}$ of size
*7f296bb3SBarry Smith$l_k\in[n+1,$`npmax`$]$.
*7f296bb3SBarry Smith
The gradients and Hessians of the models in
({eq}`eq_models`) are then used to construct the main
model,

$$
m_k(x) = f(x_k) +
2(x-x_k)^T \sum_{i=1}^{m} F_i(x_k) g_k^{(i)} + (x-x_k)^T \sum_{i=1}^{m} \left( g_k^{(i)} \left(g_k^{(i)}\right)^T +  F_i(x_k) H_k^{(i)}\right) (x-x_k).
$$ (eq_newton2)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe process of forming these models also computes the indicator
*7f296bb3SBarry Smith`valid` of the model’s local quality.
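
The main model can be evaluated directly from the component quantities. The C sketch below assumes dense row-major storage, with `g[i*n + a]` holding entry $a$ of $g_k^{(i)}$ and `H[(i*n + a)*n + b]` holding entry $(a,b)$ of $H_k^{(i)}$; the layout and the function name are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Evaluate m_k at the displacement s = x - x_k.
   F: m residual values F_i(x_k)
   g: m gradients of length n (row-major)
   H: m Hessians of size n x n (row-major) */
double master_model(const double *F, const double *g, const double *H,
                    const double *s, size_t m, size_t n)
{
  double f0 = 0.0, lin = 0.0, quad = 0.0;
  for (size_t i = 0; i < m; ++i) {
    double gs = 0.0, sHs = 0.0;
    for (size_t a = 0; a < n; ++a) {
      gs += g[i * n + a] * s[a];
      for (size_t b = 0; b < n; ++b) sHs += s[a] * H[(i * n + a) * n + b] * s[b];
    }
    f0 += F[i] * F[i];             /* f(x_k) = sum_i F_i^2 */
    lin += F[i] * gs;              /* s^T sum_i F_i g_i */
    quad += gs * gs + F[i] * sHs;  /* s^T sum_i (g_i g_i^T + F_i H_i) s */
  }
  return f0 + 2.0 * lin + quad;
}
```

The result differs from $\sum_i q_k^{(i)}(x)^2$ only by terms of third and fourth order in $x - x_k$, which the quadratic main model drops.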
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Parameters
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithPOUNDERS supports the following parameters that can be set from the
*7f296bb3SBarry Smithcommand line or PETSc options file:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-tao_pounders_delta <delta>`
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith: The initial trust-region radius ($>0$, real). This is used to
*7f296bb3SBarry Smith  determine the size of the initial neighborhood within which the
*7f296bb3SBarry Smith  algorithm should look.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-tao_pounders_npmax <npmax>`
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith: The maximum number of interpolation points used ($n+2\leq$
*7f296bb3SBarry Smith  `npmax` $\leq 0.5(n+1)(n+2)$). This input is made available
*7f296bb3SBarry Smith  to advanced users. We recommend the default value
*7f296bb3SBarry Smith  (`npmax`$=2n+1$) be used by others.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-tao_pounders_gqt`
*7f296bb3SBarry Smith
: Use the gqt algorithm to solve the
  subproblem ({eq}`eq_poundersp`) (uses $p=2$)
  instead of the default subproblem solver.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith`-pounders_subsolver`
*7f296bb3SBarry Smith
: If the default subproblem solver is used to solve the
  subproblem ({eq}`eq_poundersp`), the parameters of
  the subproblem solver can be accessed using the command line options
  prefix `-pounders_subsolver_`. For example,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith  ```
*7f296bb3SBarry Smith  -pounders_subsolver_tao_gatol 1.0e-5
*7f296bb3SBarry Smith  ```
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith  sets the gradient tolerance of the subproblem solver to
*7f296bb3SBarry Smith  $10^{-5}$.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithAdditionally, the user provides an initial solution vector, a vector for
*7f296bb3SBarry Smithstoring the separable objective function, and a routine for evaluating
*7f296bb3SBarry Smiththe residual vector $F$. These are described in detail in
*7f296bb3SBarry Smith{any}`sec_tao_fghj` and
*7f296bb3SBarry Smith{any}`sec_tao_evalsof`. Here we remark that because gradient
*7f296bb3SBarry Smithinformation is not available for scaling purposes, it can be useful to
*7f296bb3SBarry Smithensure that the problem is reasonably well scaled. A simple way to do so
*7f296bb3SBarry Smithis to rescale the decision variables $x$ so that their typical
*7f296bb3SBarry Smithvalues are expected to lie within the unit hypercube $[0,1]^n$.
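
One simple way to apply such a rescaling is an affine map between the user's variables and unit-cube variables, with the optimizer working on the latter. The C sketch below assumes the user can supply typical ranges `lo[i] < hi[i]` for each variable; these arrays and the function names are hypothetical.

```c
#include <stddef.h>

/* Map physical variables x to scaled variables z = (x - lo) / (hi - lo),
   so that typical values of z lie in [0,1]. Requires lo[i] < hi[i]. */
void scale_to_unit(const double *x, const double *lo, const double *hi,
                   double *z, size_t n)
{
  for (size_t i = 0; i < n; ++i) z[i] = (x[i] - lo[i]) / (hi[i] - lo[i]);
}

/* Inverse map, used inside the residual routine to recover x from z. */
void unscale_from_unit(const double *z, const double *lo, const double *hi,
                       double *x, size_t n)
{
  for (size_t i = 0; i < n; ++i) x[i] = lo[i] + z[i] * (hi[i] - lo[i]);
}
```

The residual routine would then call `unscale_from_unit()` on the incoming vector before evaluating $F$, and the initial solution vector would be passed through `scale_to_unit()`.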
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith##### Convergence Notes
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBecause the gradient function is not provided to POUNDERS, the norm of
*7f296bb3SBarry Smiththe gradient of the objective function is not available. Therefore, for
*7f296bb3SBarry Smithconvergence criteria, this norm is approximated by the norm of the model
*7f296bb3SBarry Smithgradient and used only when the model gradient is deemed to be a
reasonable approximation of the gradient of the objective. In practice,
the typical reason for termination on expensive derivative-free
problems is reaching the maximum number of function evaluations allowed.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_complementarity)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Complementarity
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithMixed complementarity problems, or box-constrained variational
*7f296bb3SBarry Smithinequalities, are related to nonlinear systems of equations. They are
*7f296bb3SBarry Smithdefined by a continuously differentiable function,
*7f296bb3SBarry Smith$F:\mathbb R^n \to \mathbb R^n$, and bounds,
*7f296bb3SBarry Smith$\ell \in \{\mathbb R\cup \{-\infty\}\}^n$ and
*7f296bb3SBarry Smith$u \in \{\mathbb R\cup \{\infty\}\}^n$, on the variables such that
*7f296bb3SBarry Smith$\ell \leq u$. Given this information,
$x^* \in [\ell,u]$ is a solution to
*7f296bb3SBarry SmithMCP($F$, $\ell$, $u$) if for each
*7f296bb3SBarry Smith$i \in \{1, \ldots, n\}$ we have at least one of the following:
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry SmithF_i(x^*) \geq 0 & \text{if } x^*_i = \ell_i \\
*7f296bb3SBarry SmithF_i(x^*) = 0 & \text{if } \ell_i < x^*_i < u_i \\
*7f296bb3SBarry SmithF_i(x^*) \leq 0 & \text{if } x^*_i = u_i.
*7f296bb3SBarry Smith\end{array}\end{aligned}
*7f296bb3SBarry Smith$$
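
These conditions translate into a componentwise check. The following C sketch tests whether a candidate point satisfies them up to a tolerance; the function name and the tolerance argument are hypothetical, infinite bounds are represented with `INFINITY`, and `F` holds the already-evaluated values $F_i(x)$.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if x satisfies the MCP(F, l, u) conditions within tol. */
bool mcp_is_solution(const double *x, const double *F,
                     const double *l, const double *u,
                     size_t n, double tol)
{
  for (size_t i = 0; i < n; ++i) {
    /* The point must lie in the box [l, u]. */
    if (x[i] < l[i] - tol || x[i] > u[i] + tol) return false;
    bool at_lower = fabs(x[i] - l[i]) <= tol; /* false when l[i] is -infinity */
    bool at_upper = fabs(u[i] - x[i]) <= tol; /* false when u[i] is +infinity */
    bool ok = fabs(F[i]) <= tol           /* F_i = 0 is acceptable anywhere in the box */
           || (at_lower && F[i] >= 0.0)   /* at the lower bound: F_i >= 0 */
           || (at_upper && F[i] <= 0.0);  /* at the upper bound: F_i <= 0 */
    if (!ok) return false;
  }
  return true;
}
```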
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithNote that when $\ell = \{-\infty\}^n$ and
*7f296bb3SBarry Smith$u = \{\infty\}^n$, we have a nonlinear system of equations, and
*7f296bb3SBarry Smith$\ell = \{0\}^n$ and $u = \{\infty\}^n$ correspond to the
*7f296bb3SBarry Smithnonlinear complementarity problem {cite}`cottle:nonlinear`.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithSimple complementarity conditions arise from the first-order optimality
*7f296bb3SBarry Smithconditions from optimization
*7f296bb3SBarry Smith{cite}`karush:minima` {cite}`kuhn.tucker:nonlinear`. In the simple
*7f296bb3SBarry Smithbound-constrained optimization case, these conditions correspond to
*7f296bb3SBarry SmithMCP($\nabla f$, $\ell$, $u$), where
*7f296bb3SBarry Smith$f: \mathbb R^n \to \mathbb R$ is the objective function. In a
*7f296bb3SBarry Smithone-dimensional setting these conditions are intuitive. If the solution
*7f296bb3SBarry Smithis at the lower bound, then the function must be increasing and
*7f296bb3SBarry Smith$\nabla f \geq 0$. If the solution is at the upper bound, then the
*7f296bb3SBarry Smithfunction must be decreasing and $\nabla f \leq 0$. If the solution
*7f296bb3SBarry Smithis strictly between the bounds, we must be at a stationary point and
*7f296bb3SBarry Smith$\nabla f = 0$. Other complementarity problems arise in economics
*7f296bb3SBarry Smithand engineering {cite}`ferris.pang:engineering`, game theory
*7f296bb3SBarry Smith{cite}`nash:equilibrium`, and finance
*7f296bb3SBarry Smith{cite}`huang.pang:option`.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithEvaluation routines for $F$ and its Jacobian must be supplied
*7f296bb3SBarry Smithprior to solving the application. The bounds, $[\ell,u]$, on the
*7f296bb3SBarry Smithvariables must also be provided. If no starting point is supplied, a
*7f296bb3SBarry Smithdefault starting point of all zeros is used.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Semismooth Methods
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithTAO has two implementations of semismooth algorithms
*7f296bb3SBarry Smith{cite}`munson.facchinei.ea:semismooth` {cite}`deluca.facchinei.ea:semismooth`
*7f296bb3SBarry Smith{cite}`facchinei.fischer.ea:semismooth` for solving mixed complementarity
*7f296bb3SBarry Smithproblems. Both are based on a reformulation of the mixed complementarity
*7f296bb3SBarry Smithproblem as a nonsmooth system of equations using the Fischer-Burmeister
*7f296bb3SBarry Smithfunction {cite}`fischer:special`. A nonsmooth Newton method
*7f296bb3SBarry Smithis applied to the reformulated system to calculate a solution. The
*7f296bb3SBarry Smiththeoretical properties of such methods are detailed in the
*7f296bb3SBarry Smithaforementioned references.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe Fischer-Burmeister function, $\phi:\mathbb R^2 \to \mathbb R$,
*7f296bb3SBarry Smithis defined as
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith\phi(a,b) := \sqrt{a^2 + b^2} - a - b.\end{aligned}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThis function has the following key property,
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith\begin{array}{lcr}
*7f296bb3SBarry Smith        \phi(a,b) = 0 & \Leftrightarrow & a \geq 0,\; b \geq 0,\; ab = 0,
*7f296bb3SBarry Smith\end{array}\end{aligned}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry Smithused when reformulating the mixed complementarity problem as the system
*7f296bb3SBarry Smithof equations $\Phi(x) = 0$, where
*7f296bb3SBarry Smith$\Phi:\mathbb R^n \to \mathbb R^n$. The reformulation is defined
*7f296bb3SBarry Smithcomponentwise as
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith\Phi_i(x) := \left\{ \begin{array}{ll}
*7f296bb3SBarry Smith   \phi(x_i - l_i, F_i(x)) & \text{if } -\infty < l_i < u_i = \infty, \\
*7f296bb3SBarry Smith   -\phi(u_i-x_i, -F_i(x)) & \text{if } -\infty = l_i < u_i < \infty, \\
*7f296bb3SBarry Smith   \phi(x_i - l_i, \phi(u_i - x_i, - F_i(x))) & \text{if } -\infty < l_i < u_i < \infty, \\
*7f296bb3SBarry Smith   -F_i(x) & \text{if } -\infty = l_i < u_i = \infty, \\
*7f296bb3SBarry Smith   l_i - x_i & \text{if } -\infty < l_i = u_i < \infty.
*7f296bb3SBarry Smith   \end{array} \right.\end{aligned}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithWe note that $\Phi$ is not differentiable everywhere but satisfies
*7f296bb3SBarry Smitha semismoothness property
*7f296bb3SBarry Smith{cite}`mifflin:semismooth` {cite}`qi:convergence` {cite}`qi.sun:nonsmooth`.
*7f296bb3SBarry SmithFurthermore, the natural merit function,
*7f296bb3SBarry Smith$\Psi(x) := \frac{1}{2} \| \Phi(x) \|_2^2$, is continuously
*7f296bb3SBarry Smithdifferentiable.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe two semismooth TAO solvers both solve the system $\Phi(x) = 0$
*7f296bb3SBarry Smithby applying a nonsmooth Newton method with a line search. We calculate a
*7f296bb3SBarry Smithdirection, $d^k$, by solving the system
*7f296bb3SBarry Smith$H^kd^k = -\Phi(x^k)$, where $H^k$ is an element of the
*7f296bb3SBarry Smith$B$-subdifferential {cite}`qi.sun:nonsmooth` of
*7f296bb3SBarry Smith$\Phi$ at $x^k$. If the direction calculated does not
*7f296bb3SBarry Smithsatisfy a suitable descent condition, then we use the negative gradient
*7f296bb3SBarry Smithof the merit function, $-\nabla \Psi(x^k)$, as the search
*7f296bb3SBarry Smithdirection. A standard Armijo search
*7f296bb3SBarry Smith{cite}`armijo:minimization` is used to find the new
*7f296bb3SBarry Smithiteration. Nonmonotone searches
*7f296bb3SBarry Smith{cite}`grippo.lampariello.ea:nonmonotone` are also available
*7f296bb3SBarry Smithby setting appropriate runtime options. See
*7f296bb3SBarry Smith{any}`sec_tao_linesearch` for further details.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe first semismooth algorithm available in TAO is not guaranteed to
*7f296bb3SBarry Smithremain feasible with respect to the bounds, $[\ell, u]$, and is
*7f296bb3SBarry Smithtermed an infeasible semismooth method. This method can be specified by
*7f296bb3SBarry Smithusing the `tao_ssils` solver. In this case, the descent test used is
*7f296bb3SBarry Smiththat
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{aligned}
*7f296bb3SBarry Smith\nabla \Psi(x^k)^Td^k \leq -\delta\| d^k \|^\rho.\end{aligned}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithBoth $\delta > 0$ and $\rho > 2$ can be modified by using
*7f296bb3SBarry Smiththe runtime options `-tao_ssils_delta <delta>` and
*7f296bb3SBarry Smith`-tao_ssils_rho <rho>`, respectively. By default,
*7f296bb3SBarry Smith$\delta = 10^{-10}$ and $\rho = 2.1$.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithAn alternative is to remain feasible with respect to the bounds by using
*7f296bb3SBarry Smitha projected Armijo line search. This method can be specified by using
*7f296bb3SBarry Smiththe `tao_ssfls` solver. The descent test used is the same as above
*7f296bb3SBarry Smithwhere the direction in this case corresponds to the first part of the
*7f296bb3SBarry Smithpiecewise linear arc searched by the projected line search. Both
*7f296bb3SBarry Smith$\delta > 0$ and $\rho > 2$ can be modified by using the
runtime options `-tao_ssfls_delta <delta>` and
`-tao_ssfls_rho <rho>`, respectively. By default,
*7f296bb3SBarry Smith$\delta = 10^{-10}$ and $\rho = 2.1$.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe recommended algorithm is the infeasible semismooth method,
*7f296bb3SBarry Smith`tao_ssils`, because of its strong global and local convergence
*7f296bb3SBarry Smithproperties. However, if it is known that $F$ is not defined
*7f296bb3SBarry Smithoutside of the box, $[\ell,u]$, perhaps because of the presence of
*7f296bb3SBarry Smith$\log$ functions, the feasibility-enforcing version of the
*7f296bb3SBarry Smithalgorithm, `tao_ssfls`, is a reasonable alternative.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Active-Set Methods
*7f296bb3SBarry Smith
TAO also contains two active-set semismooth methods for solving
*7f296bb3SBarry Smithcomplementarity problems. These methods solve a reduced system
*7f296bb3SBarry Smithconstructed by block elimination of active constraints. The
*7f296bb3SBarry Smithsubdifferential in these cases enables this block elimination.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe first active-set semismooth algorithm available in TAO is not guaranteed to
*7f296bb3SBarry Smithremain feasible with respect to the bounds, $[\ell, u]$, and is
*7f296bb3SBarry Smithtermed an infeasible active-set semismooth method. This method can be
*7f296bb3SBarry Smithspecified by using the `tao_asils` solver.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithAn alternative is to remain feasible with respect to the bounds by using
*7f296bb3SBarry Smitha projected Armijo line search. This method can be specified by using
*7f296bb3SBarry Smiththe `tao_asfls` solver.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_quadratic)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Quadratic Solvers
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithQuadratic solvers solve optimization problems of the form
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith\begin{array}{ll}
*7f296bb3SBarry Smith\displaystyle \min_{x} & \frac{1}{2}x^T Q x + c^T x \\
\text{subject to} & l \leq x \leq u
*7f296bb3SBarry Smith\end{array}
*7f296bb3SBarry Smith$$
*7f296bb3SBarry Smith
where $Q$, the Hessian of the objective, and $c$ are both constant, so the gradient $Qx + c$ is linear in $x$.
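
For a small dense problem, the objective, its gradient, and the projection onto the bounds can be written as follows. This C sketch uses a row-major array for $Q$ and hypothetical function names; it illustrates the problem data only and is not how the TAO quadratic solvers are implemented.

```c
#include <stddef.h>

/* Returns f(x) = 0.5 x^T Q x + c^T x and fills grad with Q x + c.
   Q is n x n, stored row-major. */
double quad_objective_gradient(const double *Q, const double *c,
                               const double *x, double *grad, size_t n)
{
  double f = 0.0;
  for (size_t i = 0; i < n; ++i) {
    double Qx_i = 0.0;
    for (size_t j = 0; j < n; ++j) Qx_i += Q[i * n + j] * x[j];
    grad[i] = Qx_i + c[i];
    f += x[i] * (0.5 * Qx_i + c[i]);
  }
  return f;
}

/* Project x componentwise onto the box l <= x <= u. */
void project_onto_bounds(double *x, const double *l, const double *u, size_t n)
{
  for (size_t i = 0; i < n; ++i) {
    if (x[i] < l[i]) x[i] = l[i];
    else if (x[i] > u[i]) x[i] = u[i];
  }
}
```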
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Gradient Projection Conjugate Gradient Method (GPCG)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe GPCG {cite}`more-toraldo` algorithm is much like the
*7f296bb3SBarry SmithTRON algorithm, discussed in Section {any}`sec_tao_tron`, except that
*7f296bb3SBarry Smithit assumes that the objective function is quadratic and convex.
*7f296bb3SBarry SmithTherefore, it evaluates the function, gradient, and Hessian only once.
*7f296bb3SBarry SmithSince the objective function is quadratic, the algorithm does not use a
*7f296bb3SBarry Smithtrust region. All the options that apply to TRON except for trust-region
options also apply to GPCG. It can be set by using the TAO solver
`tao_gpcg` or via the option flag `-tao_type gpcg`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_bqpip)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Interior-Point Newton’s Method (BQPIP)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe BQPIP algorithm is an interior-point method for bound constrained
quadratic optimization. It can be set by using the TAO solver
`tao_bqpip` or via the option flag `-tao_type bqpip`. Since it
*7f296bb3SBarry Smithassumes the objective function is quadratic, it evaluates the function,
*7f296bb3SBarry Smithgradient, and Hessian only once. This method also requires the solution
*7f296bb3SBarry Smithof systems of linear equations, whose solver can be accessed and
*7f296bb3SBarry Smithmodified with the command `TaoGetKSP()`.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith### Legacy and Contributed Solvers
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Bundle Method for Regularized Risk Minimization (BMRM)
*7f296bb3SBarry Smith
BMRM is a numerical approach to optimizing an
unconstrained objective of the form
$f(x) + \frac{1}{2} \lambda \| x \|^2$. Here $f$ is a convex
function that is finite on the whole space, $\lambda$ is a
positive weight parameter, and $\| x \|$ is the Euclidean norm of
$x$. The algorithm only requires a routine that, given an
$x$, returns the value of $f(x)$ and the gradient of
$f$ at $x$.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Orthant-Wise Limited-memory Quasi-Newton (OWLQN)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithOWLQN {cite}`owlqn` is a numerical approach to optimizing
*7f296bb3SBarry Smithan unconstrained objective in the form of
$f(x) + \lambda \|x\|_1$. Here $f$ is a convex and differentiable
*7f296bb3SBarry Smithfunction, $\lambda$ is a positive weight parameter, and
*7f296bb3SBarry Smith$\| x \|_1$ is the $\ell_1$ norm of $x$:
*7f296bb3SBarry Smith$\sum_i |x_i|$. The algorithm only requires evaluating the value
*7f296bb3SBarry Smithof $f$ and its gradient.
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith(sec_tao_tron)=
*7f296bb3SBarry Smith
*7f296bb3SBarry Smith#### Trust-Region Newton Method (TRON)
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe TRON {cite}`lin_c3` algorithm is an active-set method
*7f296bb3SBarry Smiththat uses a combination of gradient projections and a preconditioned
*7f296bb3SBarry Smithconjugate gradient method to minimize an objective function. Each
*7f296bb3SBarry Smithiteration of the TRON algorithm requires function, gradient, and Hessian
evaluations. In each iteration, the algorithm first applies several
gradient projection iterations. After these iterations, the TRON solver
momentarily ignores the variables that equal one of their bounds and
applies a preconditioned conjugate gradient method to a quadratic model
of the remaining set of *free* variables.
*7f296bb3SBarry Smith
*7f296bb3SBarry SmithThe TRON algorithm solves a reduced linear system defined by the rows
*7f296bb3SBarry Smithand columns corresponding to the variables that lie between the upper
*7f296bb3SBarry Smithand lower bounds. The TRON algorithm applies a trust region to the
*7f296bb3SBarry Smithconjugate gradients to ensure convergence. The initial trust-region
*7f296bb3SBarry Smithradius can be set by using the command
*7f296bb3SBarry Smith`TaoSetInitialTrustRegionRadius()`, and the current trust region size
*7f296bb3SBarry Smithcan be found by using the command `TaoGetCurrentTrustRegionRadius()`.
*7f296bb3SBarry SmithThe initial trust region can significantly alter the rate of convergence
*7f296bb3SBarry Smithfor the algorithm and should be tuned and adjusted for optimal
*7f296bb3SBarry Smithperformance.
*7f296bb3SBarry Smith
This algorithm will be deprecated in the next version in favor of the
Bounded Newton Trust Region (BNTR) algorithm.

(sec_tao_blmvm)=

#### Bound-constrained Limited-Memory Variable-Metric Method (BLMVM)

BLMVM is a limited-memory, variable-metric method and is the
bound-constrained variant of the LMVM method for unconstrained
optimization. It uses projected gradients to approximate the Hessian,
eliminating the need for Hessian evaluations. The method can be selected
by using the TAO solver type `blmvm` (`TAOBLMVM`). For more details,
please see the LMVM section in the unconstrained algorithms as well as
the LMVM matrix documentation in the PETSc manual.

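As an illustration of the projected gradient on which BLMVM relies, the following self-contained C sketch uses one common definition for minimizing $f(x)$ subject to $l \le x \le u$: a gradient component is set to zero when the variable sits at a bound and the steepest-descent direction would push it outside the feasible region. The function name `projected_gradient` is hypothetical, and the definition used inside TAO may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the projected gradient pg of g at the point x for the bounds
   lower <= x <= upper. All arrays have length n. */
void projected_gradient(size_t n, const double *x, const double *g, const double *lower, const double *upper, double *pg)
{
  assert(x && g && lower && upper && pg);
  for (size_t i = 0; i < n; i++) {
    if (x[i] <= lower[i] && g[i] > 0.0) pg[i] = 0.0;      /* -g would leave through the lower bound */
    else if (x[i] >= upper[i] && g[i] < 0.0) pg[i] = 0.0; /* -g would leave through the upper bound */
    else pg[i] = g[i];                                    /* free, or the step moves into the interior */
  }
}
```

At a first-order stationary point of the bound-constrained problem the projected gradient vanishes, so its norm can play the role that the gradient norm plays in the unconstrained case.
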
This algorithm will be deprecated in the next version in favor of the
Bounded Quasi-Newton Line Search (BQNLS) algorithm.

## Advanced Options

This section discusses options and routines that apply to most TAO
solvers and problem classes. In particular, we focus on linear solvers,
convergence tests, and line searches.

(sec_tao_linearsolvers)=

### Linear Solvers

One of the most computationally intensive phases of many optimization
algorithms involves the solution of linear systems of equations. The
performance of the linear solver may be critical to an efficient
computation of the solution. Since linear equation solvers often have a
wide variety of options associated with them, TAO allows the user to
access the linear solver with the

```
TaoGetKSP(Tao, KSP *);
```

command. With access to the KSP object, users can customize it for their
application to achieve improved performance. Additional details on the
KSP options in PETSc can be found in the {doc}`/manual/index`.

### Monitors

By default, the TAO solvers run silently without displaying information
about the iterations. The user can initiate monitoring with the command

```
TaoMonitorSet(Tao, PetscErrorCode (*mon)(Tao,void*), void*, PetscErrorCode (*destroy)(void**));
```

The routine `mon` indicates a user-defined monitoring routine, and
`void*` denotes an optional user-defined context for private data for
the monitor routine. The final argument is an optional routine that
destroys this context; it may be `NULL`.

The routine set by `TaoMonitorSet()` is called once during each
iteration of the optimization solver. Hence, the user can employ this
routine for any application-specific computations that should be done
after the solution update.

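The callback-plus-context pattern behind `TaoMonitorSet()` can be sketched in plain C without PETSc. In this hypothetical example, `MiniSolver`, `mini_monitor_set()`, and `mini_solve()` stand in for the `Tao` object and its routines. The loop simply halves a scalar iterate and calls the user monitor once per iteration, handing back the context pointer that was registered.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MiniSolver MiniSolver;
typedef int (*MiniMonitor)(MiniSolver *solver, void *ctx);

struct MiniSolver {
  double      x;       /* current iterate */
  int         iter;    /* current iteration number */
  MiniMonitor monitor; /* user callback, may be NULL */
  void       *ctx;     /* user context handed back to the callback */
};

/* Register a monitor and its private context */
void mini_monitor_set(MiniSolver *solver, MiniMonitor monitor, void *ctx)
{
  assert(solver);
  solver->monitor = monitor;
  solver->ctx     = ctx;
}

/* Run maxit iterations of x <- x/2, calling the monitor after each update */
int mini_solve(MiniSolver *solver, int maxit)
{
  assert(solver);
  for (solver->iter = 1; solver->iter <= maxit; solver->iter++) {
    solver->x *= 0.5;
    if (solver->monitor) {
      int err = solver->monitor(solver, solver->ctx);
      if (err) return err;
    }
  }
  return 0;
}

/* Example user context and monitor: count calls and remember the last iterate */
typedef struct {
  int    calls;
  double last_x;
} History;

int record_history(MiniSolver *solver, void *ctx)
{
  History *h = (History *)ctx;

  h->calls++;
  h->last_x = solver->x;
  return 0;
}
```

The context pointer lets the monitor keep its own state (here a call counter) without global variables, which is the purpose of the `void*` argument described above.
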
(sec_tao_convergence)=

### Convergence Tests

Convergence of a solver can be defined in many ways. The methods TAO
uses by default are mentioned in {any}`sec_tao_customize`.
These methods include absolute and relative convergence tolerances as
well as a maximum number of iterations or function evaluations. If these
choices are not sufficient, users can set their own customized
convergence tests of the form

```
PetscErrorCode  conv(Tao, void*);
```

The second argument is a pointer to a structure defined by the user.
Within this routine, the solver can be queried for the solution vector,
gradient vector, or other statistics at the current iteration through
routines such as `TaoGetSolutionStatus()` and `TaoGetTolerances()`.

To use this convergence test within a TAO solver, one uses the command

```
TaoSetConvergenceTest(Tao, PetscErrorCode (*conv)(Tao,void*), void*);
```

The second argument of this command is the convergence routine, and the
final argument denotes an optional user-defined context for private
data. The convergence routine receives the TAO solver and this private
data structure. The termination flag can be set by using the routine

```
TaoSetConvergedReason(Tao, TaoConvergedReason);
```

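The decision logic of a gradient-based convergence test can be sketched in standalone C. The example assumes three tolerances in the spirit of those returned by `TaoGetTolerances()`: one on the gradient norm, one on the gradient norm relative to the objective value, and one on the reduction from the initial gradient norm. The enum, struct, and function names are hypothetical, and the code is an illustration rather than TAO's default test.

```c
#include <assert.h>

typedef enum { CONTINUE_ITERATING = 0, CONVERGED_GATOL, CONVERGED_GRTOL, CONVERGED_GTTOL, DIVERGED_MAXITS } Reason;

typedef struct {
  double gatol;  /* tolerance on ||g|| */
  double grtol;  /* tolerance on ||g|| / |f| */
  double gttol;  /* tolerance on ||g|| / ||g0|| */
  int    max_it; /* iteration limit */
} Tolerances;

/* Decide whether to stop, given the iteration number, objective value f,
   current gradient norm gnorm, and initial gradient norm gnorm0. */
Reason check_convergence(const Tolerances *tol, int iter, double f, double gnorm, double gnorm0)
{
  double fabs_f = f < 0.0 ? -f : f;

  assert(tol && gnorm >= 0.0 && gnorm0 >= 0.0);
  if (gnorm <= tol->gatol) return CONVERGED_GATOL;
  if (fabs_f > 0.0 && gnorm / fabs_f <= tol->grtol) return CONVERGED_GRTOL;
  if (gnorm0 > 0.0 && gnorm / gnorm0 <= tol->gttol) return CONVERGED_GTTOL;
  if (iter >= tol->max_it) return DIVERGED_MAXITS;
  return CONTINUE_ITERATING;
}
```

In a TAO convergence routine, the same quantities would be obtained with `TaoGetSolutionStatus()` and the decision reported with `TaoSetConvergedReason()`.
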
(sec_tao_linesearch)=

### Line Searches

The line search can be selected by using the command line option
`-tao_ls_type`. Available line searches include Moré-Thuente
{cite}`more:92`, Armijo, gpcg, and unit.

The line search routines involve several parameters, which are set to
defaults that are reasonable for many applications. The user can
override the defaults by using the following options:

- `-tao_ls_max_funcs <max>`
- `-tao_ls_stepmin <min>`
- `-tao_ls_stepmax <max>`
- `-tao_ls_ftol <ftol>`
- `-tao_ls_gtol <gtol>`
- `-tao_ls_rtol <rtol>`

One should run a TAO program with the option `-help` for details.
Users may write their own customized line search codes by modeling them
after one of the defaults provided.

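To make the role of such parameters concrete, here is a minimal backtracking line search with the Armijo sufficient-decrease condition, written in standalone C for the one-dimensional function $\phi(t) = f(x + t s)$. The parameter names `ftol`, `stepmin`, and `max_funcs` echo the options above, but the function `armijo_backtrack` and its step-halving strategy are assumptions of this sketch, not TAO's Armijo implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*Phi)(double step, void *ctx); /* phi(step) = f(x + step*s) */

/* Return a step with phi(step) <= phi0 + ftol*step*dphi0, or 0.0 on failure.
   dphi0 is the directional derivative g^T s at step 0 and must be negative.
   nfuncs receives the number of function evaluations used. */
double armijo_backtrack(Phi phi, void *ctx, double phi0, double dphi0, double step, double ftol, double stepmin, int max_funcs, int *nfuncs)
{
  assert(phi && nfuncs && dphi0 < 0.0 && step > 0.0);
  *nfuncs = 0;
  while (step >= stepmin && *nfuncs < max_funcs) {
    double value = phi(step, ctx);

    (*nfuncs)++;
    if (value <= phi0 + ftol * step * dphi0) return step; /* sufficient decrease */
    step *= 0.5;                                          /* otherwise shrink the step */
  }
  return 0.0;
}

/* Example: f(x) = x^2 from x = 1 along s = -1, so phi(t) = (1 - t)^2 */
double phi_quadratic(double step, void *ctx)
{
  (void)ctx;
  return (1.0 - step) * (1.0 - step);
}
```

Starting from a trial step of 4 in the example, the steps 4 and 2 fail the sufficient-decrease test and the step 1 is accepted after three function evaluations.
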
(sec_tao_recyclehistory)=

### Recycling History

Some TAO algorithms can re-use information accumulated in the previous
`TaoSolve()` call to hot-start the new solution. This can be enabled
using the `-tao_recycle_history` flag, or in code via the
`TaoSetRecycleHistory()` interface.

For the nonlinear conjugate gradient solver (`TAOBNCG`), this option
re-uses the latest search direction from the previous `TaoSolve()`
call to compute the initial search direction of a new `TaoSolve()`. By
default, the feature is disabled and the algorithm sets the initial
direction as the negative gradient.

For the quasi-Newton family of methods (`TAOBQNLS`, `TAOBQNKLS`,
`TAOBQNKTR`, `TAOBQNKTL`), this option re-uses the accumulated
quasi-Newton Hessian approximation from the previous `TaoSolve()`
call. By default, the feature is disabled and the algorithm will reset
the quasi-Newton approximation to the identity matrix at the beginning
of every new `TaoSolve()`.

The option flag has no effect on other TAO solvers.

(sec_tao_addsolver)=

## Adding a Solver

One of the strengths of both TAO and PETSc is the ability to allow users
to extend the built-in solvers with new user-defined algorithms. It is
certainly possible to develop new optimization algorithms outside of the
TAO framework, but using TAO to implement a solver has many advantages:

1. TAO includes other optimization solvers with an identical interface,
   so application problems may conveniently switch solvers to compare
   their effectiveness.
2. TAO provides support for function evaluations and derivative
   information. It allows for the direct evaluation of this information
   by the application developer, contains limited support for finite
   difference approximations, and allows the use of matrix-free
   methods. The solvers can obtain this function and derivative
   information through a simple interface while the details of its
   computation are handled within the toolkit.
3. TAO provides line searches, convergence tests, monitoring routines,
   and other tools that are helpful in an optimization algorithm. The
   availability of these tools means that the developers of the
   optimization solver do not have to write these utilities.
4. PETSc offers vectors, matrices, index sets, and linear solvers that
   can be used by the solver. These objects are standard mathematical
   constructions that have many different implementations. The objects
   may be distributed over multiple processors, restricted to a single
   processor, have a dense representation, use a sparse data structure,
   or vary in many other ways. TAO solvers do not need to know how these
   objects are represented or how the operations defined on them have
   been implemented. Instead, the solvers apply these operations through
   an abstract interface that leaves the details to PETSc and external
   libraries. This abstraction allows solvers to work seamlessly with a
   variety of data structures while allowing application developers to
   select data structures tailored for their purposes.
5. PETSc provides the user a convenient method for setting options at
   runtime, performance profiling, and debugging.

(header_file_1)=

### Header File

TAO solver implementation files must include the TAO implementation file
`taoimpl.h`:

```
#include "petsc/private/taoimpl.h"
```

This file contains data elements that are generally kept hidden from
application programmers, but may be necessary for solver implementations
to access.

### TAO Interface with Solvers

TAO solvers must be written in C or C++ and include several routines
with a particular calling sequence. Two of these routines are mandatory:
one that initializes the TAO structure with the appropriate information
and one that applies the algorithm to a problem instance. Additional
routines may be written to set options within the solver, view the
solver, set up appropriate data structures, and destroy these data
structures. In order to implement the conjugate gradient algorithm, for
example, the following structure is useful.

```
typedef struct {
  PetscReal beta;
  PetscReal eta;
  PetscReal delta_min;
  PetscReal delta_max;
  PetscInt  cg_type;
  PetscInt  ngradsteps;
  PetscInt  nresetsteps;
  Vec       X_old;
  Vec       G_old;
} TAO_CG;
```

This structure contains the algorithm parameters, two counters, and two
work vectors. Vectors for the solution and gradient are not needed here
because the TAO structure has pointers to them.

#### Solver Routine

All TAO solvers have a routine that accepts a TAO structure and computes
a solution. TAO will call this routine when the application program uses
the routine `TaoSolve()` and will pass to the solver information about
the objective function and constraints, pointers to the variable vector
and gradient vector, and support for line searches, linear solvers, and
convergence monitoring. As an example, consider the following code that
solves an unconstrained minimization problem using the conjugate
gradient method.

```
PetscErrorCode TaoSolve_CG(Tao tao)
{
  TAO_CG  *cg = (TAO_CG *) tao->data;
  Vec x = tao->solution;
  Vec g = tao->gradient;
  Vec s = tao->stepdirection;
  PetscInt     iter=0;
  PetscReal  gnormPrev,gdx,f,gnorm,steplength=0;
  TaoLineSearchConvergedReason lsflag=TAO_LINESEARCH_CONTINUE_ITERATING;
  TaoConvergedReason reason=TAO_CONTINUE_ITERATING;

  PetscFunctionBegin;

  /* Evaluate the objective and gradient at the initial point */
  PetscCall(TaoComputeObjectiveAndGradient(tao,x,&f,g));
  PetscCall(VecNorm(g,NORM_2,&gnorm));

  PetscCall(VecSet(s,0));

  cg->beta=0;
  gnormPrev = gnorm;

  /* Enter loop */
  while (1){

    /* Test for convergence */
    PetscCall(TaoMonitor(tao,iter,f,gnorm,0.0,steplength,&reason));
    if (reason!=TAO_CONTINUE_ITERATING) break;

    /* Fletcher-Reeves direction: s = beta*s - g */
    cg->beta=(gnorm*gnorm)/(gnormPrev*gnormPrev);
    PetscCall(VecScale(s,cg->beta));
    PetscCall(VecAXPY(s,-1.0,g));

    PetscCall(VecDot(s,g,&gdx));
    if (gdx>=0){     /* If not a descent direction, use gradient */
      PetscCall(VecCopy(g,s));
      PetscCall(VecScale(s,-1.0));
      gdx=-gnorm*gnorm;
    }

    /* Line Search */
    gnormPrev = gnorm;
    steplength = 1.0;
    PetscCall(TaoLineSearchSetInitialStepLength(tao->linesearch,1.0));
    PetscCall(TaoLineSearchApply(tao->linesearch,x,&f,g,s,&steplength,&lsflag));
    PetscCall(TaoAddLineSearchCounts(tao));
    PetscCall(VecNorm(g,NORM_2,&gnorm));
    iter++;
  }

  PetscFunctionReturn(PETSC_SUCCESS);
}
```

The first line of this routine casts the solver-specific data stored in
`tao->data` to a pointer to a `TAO_CG` data structure. This structure
contains the parameters, counters, and work vectors needed by the
algorithm.

After declaring and initializing several variables, the solver lets TAO
evaluate the function and gradient at the current point using the
routine `TaoComputeObjectiveAndGradient()`. Other routines may be used
to evaluate the Hessian matrix or evaluate constraints. TAO may obtain
this information using direct evaluation or other means, but these
details do not affect our implementation of the algorithm.

The norm of the gradient is a standard measure used by unconstrained
minimization solvers to define convergence. This quantity is always
nonnegative and equals zero at the solution. The solver will pass this
quantity, the current function value, the current iteration number, and
a measure of infeasibility to TAO with the routine

```
PetscErrorCode TaoMonitor(Tao tao, PetscInt iter, PetscReal f,
               PetscReal res, PetscReal cnorm, PetscReal steplength,
               TaoConvergedReason *reason);
```

Most optimization algorithms are iterative, and solvers should include
this command somewhere in each iteration. This routine records this
information, and applies any monitoring routines and convergence tests
set by default or the user. In this routine, the second argument is the
current iteration number, and the third argument is the current function
value. The fourth argument is a nonnegative error measure associated
with the distance between the current solution and the optimal solution.
Examples of this measure are the norm of the gradient or the square root
of a duality gap. The fifth argument is a nonnegative error that usually
represents a measure of the infeasibility such as the norm of the
constraints or violation of bounds. This number should be zero for
unconstrained solvers. The sixth argument is a nonnegative steplength,
or the multiple of the step direction added to the previous iterate. The
results of the convergence test are returned in the last argument. If
the termination reason is `TAO_CONTINUE_ITERATING`, the algorithm
should continue.

After this monitoring routine, the solver computes a step direction
using the conjugate gradient algorithm and computations using Vec
objects. These methods include adding vectors together and computing an
inner product. A full list of these methods can be found in the manual
pages.

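The direction update used in the solver loop can be tried out in a standalone C sketch. To stay independent of PETSc, it minimizes the two-variable quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$, for which the exact step length along $s$ is $-(g^T s)/(s^T A s)$, so no line search routine is required. The Fletcher-Reeves formula for `beta` and the fallback to the negative gradient follow the TAO example; the function name `cg_quadratic_2d` and the closed-form step are assumptions of the sketch.

```c
#include <assert.h>

/* Minimize 0.5*x'Ax - b'x, where A is a symmetric positive definite 2x2
   matrix stored row-major in A[4]. x holds the starting point on entry and
   the final iterate on exit. Returns the number of iterations performed. */
int cg_quadratic_2d(const double *A, const double *b, double *x, double gtol, int max_it)
{
  double g[2], s[2] = {0.0, 0.0}, As[2];
  double gnorm2, gnorm2_prev, beta, gdx, sAs, step;
  int    iter = 0;

  assert(A && b && x && gtol > 0.0 && max_it >= 0);
  /* Gradient g = A x - b */
  for (int i = 0; i < 2; i++) g[i] = A[2 * i] * x[0] + A[2 * i + 1] * x[1] - b[i];
  gnorm2      = g[0] * g[0] + g[1] * g[1];
  gnorm2_prev = gnorm2;

  while (gnorm2 > gtol * gtol && iter < max_it) {
    /* Fletcher-Reeves: s = beta*s - g with beta = ||g||^2 / ||g_prev||^2 */
    beta = gnorm2 / gnorm2_prev;
    for (int i = 0; i < 2; i++) s[i] = beta * s[i] - g[i];

    /* If s is not a descent direction, fall back to the negative gradient */
    gdx = s[0] * g[0] + s[1] * g[1];
    if (gdx >= 0.0) {
      for (int i = 0; i < 2; i++) s[i] = -g[i];
      gdx = -gnorm2;
    }

    /* Exact line search for a quadratic objective */
    for (int i = 0; i < 2; i++) As[i] = A[2 * i] * s[0] + A[2 * i + 1] * s[1];
    sAs  = s[0] * As[0] + s[1] * As[1];
    step = -gdx / sAs;

    /* Update the iterate and the gradient */
    gnorm2_prev = gnorm2;
    for (int i = 0; i < 2; i++) {
      x[i] += step * s[i];
      g[i] += step * As[i];
    }
    gnorm2 = g[0] * g[0] + g[1] * g[1];
    iter++;
  }
  return iter;
}
```

Because the line search is exact and the objective is quadratic, the sketch reduces to the linear conjugate gradient method, which in exact arithmetic reaches the minimizer of a two-variable problem in at most two iterations. A general nonlinear objective needs an inexact line search such as the one applied through `TaoLineSearchApply()`.
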
Nonlinear conjugate gradient algorithms also require a line search. TAO
provides several line searches and support for using them. The routine

```
TaoLineSearchApply(TaoLineSearch ls, Vec x, PetscReal *f, Vec g,
                       Vec s, PetscReal *steplength,
                       TaoLineSearchConvergedReason *lsflag)
```

passes the current solution, gradient, and objective value to the line
search and returns a new solution, gradient, and objective value. More
details on line searches can be found in
{any}`sec_tao_linesearch`. The details of the
line search applied are specified elsewhere, when the line search is
created.

TAO also includes support for linear solvers using PETSc KSP objects.
Although this algorithm does not require one, linear solvers are an
important part of many algorithms. Details on the use of these solvers
can be found in the PETSc users manual.

#### Creation Routine

The TAO solver is initialized for a particular algorithm in a separate
routine. This routine sets default convergence tolerances, creates a
line search or linear solver if needed, and creates structures needed by
this solver. For example, the routine that creates the nonlinear
conjugate gradient algorithm shown above can be implemented as follows.

```
PETSC_EXTERN PetscErrorCode TaoCreate_CG(Tao tao)
{
  TAO_CG *cg;
  const char *morethuente_type = TAOLINESEARCHMT;

  PetscFunctionBegin;

  /* Allocate the solver-specific data and attach it to the Tao object */
  PetscCall(PetscNew(&cg));
  tao->data = (void*)cg;
  cg->eta = 0.1;
  cg->delta_min = 1e-7;
  cg->delta_max = 100;
  cg->cg_type = CG_PolakRibierePlus;

  tao->max_it = 2000;
  tao->max_funcs = 4000;

  /* Tell TAO which routines implement this solver */
  tao->ops->setup = TaoSetUp_CG;
  tao->ops->solve = TaoSolve_CG;
  tao->ops->view = TaoView_CG;
  tao->ops->setfromoptions = TaoSetFromOptions_CG;
  tao->ops->destroy = TaoDestroy_CG;

  PetscCall(TaoLineSearchCreate(((PetscObject)tao)->comm, &tao->linesearch));
  PetscCall(TaoLineSearchSetType(tao->linesearch, morethuente_type));
  PetscCall(TaoLineSearchUseTaoRoutines(tao->linesearch, tao));

  PetscFunctionReturn(PETSC_SUCCESS);
}
```

This routine declares some variables and then allocates memory for the
`TAO_CG` data structure. Notice that the `Tao` object now has a
pointer to this data structure (`tao->data`) so it can be accessed by
the other functions written for this solver implementation.

This routine also sets some default parameters particular to the
conjugate gradient algorithm, sets default limits on the number of
iterations and function evaluations, and creates a particular line
search. These defaults could be specified in the routine that solves the
problem, but specifying them here gives the user the opportunity to
modify these parameters either by using direct calls setting parameters
or by using options.

In addition, this routine passes to TAO all the other routines used by
the solver by storing pointers to them in `tao->ops`.

Note that the routine is declared with the `PETSC_EXTERN` macro. This
macro is required to preserve the name of this function without any
name-mangling from the C++ compiler (if used).

#### Destroy Routine

Another routine needed by most solvers destroys the data structures
created by earlier routines. For the nonlinear conjugate gradient method
discussed earlier, the following routine destroys the two work vectors
and the `TAO_CG` structure.

```
PetscErrorCode TaoDestroy_CG(Tao tao)
{
  TAO_CG *cg = (TAO_CG *) tao->data;

  PetscFunctionBegin;

  PetscCall(VecDestroy(&cg->X_old));
  PetscCall(VecDestroy(&cg->G_old));

  PetscCall(PetscFree(tao->data));
  tao->data = NULL;

  PetscFunctionReturn(PETSC_SUCCESS);
}
```

This routine is called from within the `TaoDestroy()` routine. Only
algorithm-specific data objects are destroyed in this routine; any
objects owned by the `Tao` object itself (`tao->linesearch`, `tao->ksp`,
`tao->gradient`, etc.) will be destroyed by TAO immediately after the
algorithm-specific destroy routine completes.

#### SetUp Routine

If the SetUp routine has been set by the initialization routine, TAO
will call it during the execution of `TaoSolve()`. While this routine
is optional, it is often provided to allocate the gradient vector, work
vectors, and other data structures required by the solver. It should
have the following form.

```
PetscErrorCode TaoSetUp_CG(Tao tao)
{
  TAO_CG *cg = (TAO_CG*)tao->data;
  PetscFunctionBegin;

  PetscCall(VecDuplicate(tao->solution,&tao->gradient));
  PetscCall(VecDuplicate(tao->solution,&tao->stepdirection));
  PetscCall(VecDuplicate(tao->solution,&cg->X_old));
  PetscCall(VecDuplicate(tao->solution,&cg->G_old));

  PetscFunctionReturn(PETSC_SUCCESS);
}
```

#### SetFromOptions Routine

The SetFromOptions routine should be used to check for any
algorithm-specific options set by the user and will be called when the
application makes a call to `TaoSetFromOptions()`. It should have the
following form.

```
PetscErrorCode TaoSetFromOptions_CG(Tao tao, void *solver)
{
  TAO_CG *cg = (TAO_CG*)solver;
  PetscFunctionBegin;
  PetscCall(PetscOptionsReal("-tao_cg_eta","restart tolerance","",cg->eta,&cg->eta,NULL));
  PetscCall(PetscOptionsReal("-tao_cg_delta_min","minimum delta value","",cg->delta_min,&cg->delta_min,NULL));
  PetscCall(PetscOptionsReal("-tao_cg_delta_max","maximum delta value","",cg->delta_max,&cg->delta_max,NULL));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

#### View Routine

The View routine should be used to output any algorithm-specific
information or statistics at the end of a solve. This routine will be
called when the application makes a call to `TaoView()` or when the
command line option `-tao_view` is used. It should have the following
form.

```
PetscErrorCode TaoView_CG(Tao tao, PetscViewer viewer)
{
  TAO_CG *cg = (TAO_CG*)tao->data;

  PetscFunctionBegin;
  PetscCall(PetscViewerASCIIPushTab(viewer));
  /* PetscInt may be 32 or 64 bits wide, so print it with PetscInt_FMT */
  PetscCall(PetscViewerASCIIPrintf(viewer,"Grad. steps: %" PetscInt_FMT "\n",cg->ngradsteps));
  PetscCall(PetscViewerASCIIPrintf(viewer,"Reset steps: %" PetscInt_FMT "\n",cg->nresetsteps));
  PetscCall(PetscViewerASCIIPopTab(viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

#### Registering the Solver

Once a new solver is implemented, TAO needs to know the name of the
solver and what function to use to create the solver. To this end, one
can use the routine

```
TaoRegister(const char *name,
                PetscErrorCode (*create) (Tao));
```

where `name` is the name of the solver (e.g., `blmvm`) and `create`
is a pointer to the routine that creates the solver (in our case,
`TaoCreate_CG`).

Once the solver has been registered, the new solver can be selected
either by using the `TaoSetType()` function or by using the
`-tao_type` command line option.

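The registration mechanism amounts to a table that maps solver names to creation routines. The standalone C sketch below shows the idea with hypothetical names (`MiniTao`, `mini_register`, `mini_set_type`). It uses a small fixed-size table and is not meant to describe TAO's internal data structures.

```c
#include <assert.h>
#include <string.h>

typedef struct {
  const char *type_name; /* name of the selected solver */
  int         max_it;    /* a default set by the creation routine */
} MiniTao;

typedef int (*MiniCreate)(MiniTao *tao);

#define MAX_SOLVERS 16
static struct {
  const char *name;
  MiniCreate  create;
} registry[MAX_SOLVERS];
static int nregistered = 0;

/* Associate a solver name with its creation routine; nonzero if the table is full */
int mini_register(const char *name, MiniCreate create)
{
  assert(name && create);
  if (nregistered == MAX_SOLVERS) return 1;
  registry[nregistered].name   = name;
  registry[nregistered].create = create;
  nregistered++;
  return 0;
}

/* Look up the name and run the matching creation routine; nonzero if unknown */
int mini_set_type(MiniTao *tao, const char *name)
{
  assert(tao && name);
  for (int i = 0; i < nregistered; i++) {
    if (strcmp(registry[i].name, name) == 0) {
      tao->type_name = registry[i].name;
      return registry[i].create(tao);
    }
  }
  return 1;
}

/* Example creation routine: sets solver-specific defaults */
int mini_create_cg(MiniTao *tao)
{
  tao->max_it = 2000;
  return 0;
}
```

With TAO itself, the corresponding steps are a call to `TaoRegister()` followed by `TaoSetType()` or the `-tao_type` option.
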
```{rubric} Footnotes
```

[^mpi]: For more on MPI and PETSc, see {any}`sec_running`.

```{eval-rst}
.. bibliography:: /petsc.bib
   :filter: docname in docnames
```