Brief Background on Accelerators

Each accelerator in the package is applied to the preconditioned system

\begin{displaymath}(Q_L^{-1}AQ_R^{-1})(Q_Ru) = Q_L^{-1}b\end{displaymath}

which we will represent abstractly as

\begin{displaymath}{\bf A}{\bf u}= {\bf b}\end{displaymath}
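
For example, with left preconditioning only ($Q_R=I$), the accelerator is simply applied to $Q_L^{-1}Au=Q_L^{-1}b$, so that ${\bf A}=Q_L^{-1}A$, ${\bf u}=u$ and ${\bf b}=Q_L^{-1}b$.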

The package contains two classes of accelerators. The first class, comprising the accelerators CG, SI, SRCG, SRSI and SOR, is best suited for the symmetrizable case, that is, when the matrices A and $Q\equiv Q_LQ_R$ are symmetric positive definite (SPD). These symmetric methods allow only left preconditioning, and they are designed to take full advantage of the symmetry of A and Q.

The second class comprises those accelerators designed to work with more general problems than the symmetrizable case. These accelerators in many cases allow for right and two-sided as well as left preconditioning. They are best understood and will be discussed here in terms of the system involving the abstract matrix ${\bf A}$.

Before giving practical usage considerations for the accelerators, we briefly describe the solution technique of each method. Each acceleration procedure is essentially defined by two criteria:

1.
The solution is to be in some space, typically a shifted Krylov space, and
2.
The solution is selected from this space by some condition, typically an orthogonality condition.
Here the Krylov space is given by

\begin{displaymath}K_n(v,A)={\rm span}\{ A^i v\} _{i=0}^{n-1}. \end{displaymath}
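
As a concrete illustration of this definition (a Python sketch only, not part of the package interface), the following generates the vectors $v, Av, \ldots, A^{n-1}v$ spanning $K_n(v,A)$; in a practical accelerator these vectors are orthogonalized as they are generated, since the raw powers $A^iv$ rapidly become nearly linearly dependent:

import numpy as np

def krylov_generators(A, v, n):
    # Return the n generating vectors v, A v, ..., A^(n-1) v of
    # K_n(v, A) as the columns of a matrix.  Illustrative sketch only.
    cols = [v]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)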

For certain conditions on A, $Q_L$ and $Q_R$, the orthogonality condition is equivalent to a minimization property over the space. That is, $u^{(n)}$ is chosen to minimize the quantity

\begin{displaymath}\langle u^{(n)}-\bar u,ZQ^{-1}A(u^{(n)}-\bar u)\rangle\end{displaymath}

where $\bar u$ denotes the true solution to the original problem $Au=b$, and $\langle\cdot ,\cdot \rangle$ denotes the standard inner product. The matrix Z depends on the method.
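
For example, for the accelerators having $Z=A^TQ_L^{-T}Q_R$ in the table below, $ZQ^{-1}A=A^TQ_L^{-T}Q_L^{-1}A$, so the minimized quantity is

\begin{displaymath}\langle u^{(n)}-\bar u,A^TQ_L^{-T}Q_L^{-1}A(u^{(n)}-\bar u)\rangle = \Vert Q_L^{-1}(b-Au^{(n)})\Vert _2^2 = \Vert {\bf r}^{(n)}\Vert _2^2 ,\end{displaymath}

the squared 2-norm of the abstract residual.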

The following table gives a summary of the two criteria for each accelerator. For simplicity, the effects of truncation and restarting are neglected.


\begin{tabular}{llll}
Accelerator & Space & Orthogonality Condition & $Z$ \\
CG, SI & $u^{(n)}\in u^{(0)}+K_n(Q^{-1}r^{(0)},Q^{-1}A)$ & $r^{(n)}\bot K_n(Q^{-1}r^{(0)},Q^{-1}A)$ & $Q$ \\
ME & ${\bf u}^{(n)}\in {\bf u}^{(0)}+K_n({\bf A}{\bf r}^{(0)},{\bf A})$ & ${\bf r}^{(n)}\bot K_n({\bf r}^{(0)},{\bf A})$ (${\bf A}$ symm.) & $Q_R^TQ_RA^{-1}Q$ \\
CGNR, LSQR & ${\bf u}^{(n)}\in {\bf u}^{(0)}+K_n({\bf A}^T{\bf r}^{(0)},{\bf A}^T{\bf A})$ & ${\bf A}^T{\bf r}^{(n)}\bot K_n({\bf A}^T{\bf r}^{(0)},{\bf A}^T{\bf A})$ & $A^TQ_L^{-T}Q_R$ \\
ORES, IOM & ${\bf u}^{(n)}\in {\bf u}^{(0)}+K_n({\bf r}^{(0)},{\bf A})$ & ${\bf r}^{(n)}\bot K_n({\bf r}^{(0)},{\bf A})$ & $Q_R^TQ_R$ \\
ODIR, OMIN, GMRES & ${\bf u}^{(n)}\in {\bf u}^{(0)}+K_n({\bf r}^{(0)},{\bf A})$ & ${\bf r}^{(n)}\bot {\bf A}K_n({\bf r}^{(0)},{\bf A})$ & $A^TQ_L^{-T}Q_R$ \\
USYMLQ & ${\bf u}^{(n)}\in {\bf u}^{(0)}+\widetilde K_n({\bf r}^{(0)},{\bf A}^T)$ & ${\bf r}^{(n)}\bot \widetilde K_n({\bf r}^{(0)},{\bf A})$ & $Q_R^TQ_R$ \\
USYMQR & ${\bf u}^{(n)}\in {\bf u}^{(0)}+\widetilde K_n({\bf r}^{(0)},{\bf A}^T)$ & ${\bf r}^{(n)}\bot {\bf A}\widetilde K_n({\bf r}^{(0)},{\bf A}^T)$ & $A^TQ_L^{-T}Q_R$ \\
LANDIR, LANMIN, LANRES & ${\bf u}^{(n)}\in {\bf u}^{(0)}+K_n({\bf r}^{(0)},{\bf A})$ & ${\bf r}^{(n)}\bot K_n({\bf r}^{(0)},{\bf A}^T)$ & $Q_R^TQ_R$ \\
\end{tabular}


Here the n-dimensional quasi-Krylov space is defined by

\begin{displaymath}\widetilde K_n(v,A) = {\rm span} \{ v,Av,AA^Tv,AA^TAv,\ldots \} \end{displaymath}
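
Equivalently, the generating vectors $v_0=v$, $v_1=Av$, $v_2=AA^Tv,\ldots$ satisfy the recurrence $v_{k+2}=AA^Tv_k$, so each new generator is obtained from the one two places back by applying $A^T$ and then $A$.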

The residual of the abstract system is given by ${\bf r}^{(n)}={\bf b}-{\bf A}{\bf u}^{(n)}$, whereas $r^{(n)}=b-Au^{(n)}$ is the residual of the original system. The two are related by ${\bf r}^{(n)}=Q_L^{-1}r^{(n)}$, since ${\bf b}-{\bf A}{\bf u}^{(n)}=Q_L^{-1}b-(Q_L^{-1}AQ_R^{-1})(Q_Ru^{(n)})=Q_L^{-1}(b-Au^{(n)})$.

A few comments are in order regarding the nonsymmetric accelerators. ORTHODIR and ORTHOMIN always have a minimization property, for any A, $Q_L$ and $Q_R$. ORTHORES, however, may not. In the case of ${\bf A}$ symmetric, ORTHODIR and ORTHOMIN reduce to the conjugate residual method for ${\bf A}$, and ORTHORES reduces to the 3-term conjugate gradient method applied to ${\bf A}$.

The Lanczos methods all utilize the same ``auxiliary vector'' $\tilde{\bf r}^{(0)} = {\bf r}^{(0)}$. Thus when ${\bf A}$ is SPD they all give the same iterates as the conjugate gradient method applied to ${\bf A}$.

Now we present guidelines for determining which accelerator to use in practical situations. If both A and Q are SPD, the symmetric accelerators listed above may be used. Otherwise we proceed as follows.

We consider four classes of matrix problem arising from the preconditioned matrix ${\bf A}$:

1.
${\bf A}$ is SPD.
2.
${\bf A}$ is symmetric indefinite.
3.
${\bf A}$ is definite (i.e., the symmetric part $({\bf A}+{\bf A}^T)/2$, or its negative, is SPD).
4.
The general case.

Unfortunately, when ${\bf A}$ is not symmetric (cases 3 and 4 above), the choice of best accelerator is not at all clear. However, we give some general guidelines below.


The SPD Case:


This case often arises from A and Q being SPD, in which case the symmetric accelerators such as CG may be used. Otherwise, one may use IOM (minimizing the ${\bf A}^{-1/2}$-norm of ${\bf r}^{(n)}$) or ORTHOMIN or GMRES (minimizing the 2-norm of ${\bf r}^{(n)}$). These should be used in their truncated forms, with variable NS2 [= IPARM(10)] set to ITMAX [= IPARM(2)], and NS1 [= IPARM(9)] set to 1 (ORTHOMIN) or 2 (IOM, GMRES).
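
To make the truncated regime concrete, here is a minimal Python sketch of the underlying mathematics (illustrative only, not the NSPCG calling sequence): with NS1=1, truncated ORTHOMIN applied to a symmetric positive definite ${\bf A}$ is exactly the classical two-term conjugate residual iteration.

import numpy as np

def orthomin1(A, b, x, iters):
    # Conjugate residual iteration, i.e., ORTHOMIN(1) for symmetric
    # positive definite A; minimizes the residual 2-norm over the
    # growing Krylov space.  Illustrative sketch only.
    r = b - A @ x
    p = r.copy()
    Ar = A @ r
    Ap = Ar.copy()
    rAr = r @ Ar
    for _ in range(iters):
        alpha = rAr / (Ap @ Ap)    # step length minimizing ||r||_2
        x = x + alpha * p
        r = r - alpha * Ap
        Ar = A @ r                 # one matrix-vector product per step
        rAr_new = r @ Ar
        beta = rAr_new / rAr       # two-term (truncated) recurrence
        rAr = rAr_new
        p = r + beta * p
        Ap = Ar + beta * Ap
    return x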


The Symmetric Indefinite Case:


In the symmetric indefinite case, the methods SYMMLQ and MINRES may be used. SYMMLQ may be run by using truncated IOM with NS1=2 and NS2=ITMAX. MINRES may be run by using truncated GMRES with NS1 and NS2 set the same way. Of the two, the MINRES algorithm is to be preferred, as it minimizes the 2-norm of the residual ${\bf r}^{(n)}$ at each iteration.
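
As an illustration outside the package (a sketch using SciPy's MINRES rather than the NSPCG interface; the small matrix here is a synthetic example), the following solves a symmetric indefinite system:

import numpy as np
from scipy.sparse.linalg import minres

A = np.array([[2.0,  1.0],
              [1.0, -3.0]])       # symmetric indefinite (det < 0)
b = np.array([1.0, 2.0])
x, info = minres(A, b)            # info == 0 indicates convergence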


The Definite Case:


If it is known that ${\bf A}$ is definite, ORTHOMIN and GMRES are guaranteed to converge. These methods minimize the 2-norm of the residual ${\bf r}^{(n)}$. The implementation of ORTHOMIN in the package runs the algorithm ORTHOMIN(NS1), the truncated method, and restarts it every NS2 iterations. The best settings for NS1 and NS2 are as follows:

The Mildly Nonsymmetric Case.
For a mildly nonsymmetric matrix, NS2 should be set to ITMAX, so that the pure truncated method is run. In this case, NS1=0 is the steepest descent method, NS1=1 is the classical 2-term conjugate residual method, and progressively larger choices of NS1 handle nonsymmetry better, but at the expense of greater computation time and storage. A value of NS1 $\geq 1$ is preferred here.

The Highly Nonsymmetric Case.
If the matrix is highly nonsymmetric, NS1 should be set to ITMAX, so that the pure restarted method is run. The variable NS2 denotes the restart frequency, and, again, larger values of NS2 perform better but require more time and workspace. If a purely restarted method is used (i.e., NS1 $\geq$ NS2), the GMRES accelerator should be preferred to ORTHOMIN, as it requires slightly less storage and CPU time and is slightly more stable numerically.
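
To illustrate the restarted regime (a hedged sketch using SciPy's GMRES rather than the NSPCG interface; the matrix here is a synthetic test problem), the restart parameter plays the role of NS2:

import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 50
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # nonsymmetric test matrix
b = rng.standard_normal(n)
x, info = gmres(A, b, restart=10, maxiter=100)     # restart every 10 steps (NS2 = 10)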


The General Case:


For the general nonsymmetric case, the ORTHOMIN and GMRES accelerators may work but have the potential of breaking down. The package contains other methods which are sometimes better. These include methods based on the normal equations, Lanczos methods, and the Biconjugate Gradient Squared method. Roughly speaking, in order of increasing reliability (and decreasing overall efficiency), the methods are: Lanczos-type methods (including Biconjugate Gradient Squared), ORTHOMIN-type methods, and normal equations methods.

Normal Equations.
For the case of a general matrix, methods based on the normal equations corresponding to ${\bf A}$ (i.e., solving ${\bf A}^T{\bf A}{\bf u}={\bf A}^T{\bf b}$) may be used. These methods are guaranteed to converge in exact arithmetic for any nonsingular system whatsoever; however, in practice convergence is usually slow, and roundoff error may preclude convergence altogether. The package contains two implementations of conjugate gradients applied to the normal equations, CGNR and LSQR, the latter of which is more stable numerically. These both minimize the norm of the residual ${\bf r}^{(n)}$ at each step.
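
One way to quantify the slow convergence: since the singular values of ${\bf A}^T{\bf A}$ are the squares of those of ${\bf A}$, forming the normal equations squares the condition number,

\begin{displaymath}\kappa _2({\bf A}^T{\bf A})=\kappa _2({\bf A})^2 .\end{displaymath}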

Lanczos Methods.
Another useful accelerator is the Lanczos (Biconjugate Gradient) method. This accelerator, which makes use of the transpose operator, has a short recurrence and will converge within N iterations if the iteration process does not break down. However, the iterates are not bounded over all choices of initial residual, and the method may break down for nonsymmetric matrices; indeed, little is known about the convergence properties of the Lanczos method. In spite of this, for some classes of problems the Lanczos method works quite efficiently. There are three Lanczos variants in the package, of which the LANMIN variant is to be preferred.
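
For experimentation outside the package, the same biconjugate gradient iteration is available in SciPy (illustrative only; NSPCG's Lanczos accelerators apply it to the preconditioned matrix ${\bf A}$):

import numpy as np
from scipy.sparse.linalg import bicg

rng = np.random.default_rng(1)
n = 50
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # nonsymmetric test matrix
b = rng.standard_normal(n)
x, info = bicg(A, b)   # uses products with both A and its transpose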

The Biconjugate Gradient Squared Method.
In many cases, the Biconjugate Gradient Squared (BCGS) method may converge faster than LANMIN. The BCGS method requires two matrix-vector products per step but does not require the transpose. In exact arithmetic, it breaks down for roughly the same matrix problems as LANMIN does. Its faster convergence arises from the fact that it generates the square of the error-reducing polynomial utilized by LANMIN, thus in some sense converging twice as fast as LANMIN.
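
In polynomial terms, if the Lanczos residual is ${\bf r}^{(n)}=\varphi _n({\bf A}){\bf r}^{(0)}$ for a residual polynomial $\varphi _n$ with $\varphi _n(0)=1$, then BCGS produces

\begin{displaymath}{\bf r}^{(n)}=\varphi _n({\bf A})^2{\bf r}^{(0)} ,\end{displaymath}

applying the same polynomial twice per iteration while using only products with ${\bf A}$, never ${\bf A}^T$.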
