NAMES AND
FORMATS
A.nr, A.nc: | number of rows and columns |
A.ia, A.ja: | pointer and indices in SPARSKIT compressed sparse row format, number starting with 1. |
A.a: | pointer of type void, referring to the associated numerical values in SPARSKIT compressed sparse row format. Thus you may pass any numerical pointer of type float, double, complex, doublecomplex. |
A.isreal: | set this to 1 if your matrix is real; 0 otherwise. |
A.issingle: | set this to 0 if your matrix is double precision (real or complex); 1 otherwise. |
A.issymmetric: | set this to 1 if your matrix is real or complex symmetric and make sure that only half of the matrix is passed (say, upper or lower triangular part); 0 otherwise. |
A.ishermitian: | set this to 1 if your matrix is Hermitian (real or complex) and make sure that only half of the matrix is passed (say, upper or lower triangular part); 0 otherwise. |
A.isdefinite: | set this to 1 if your matrix is already Hermitian and in addition positive definite (or already real symmetric and positive definite); 0 otherwise. |
S | real single precision |
D | real double precision |
C | complex single precision |
Z | complex double precision |
GNL | general matrices |
SPD | real symmetric positive definite matrices |
HPD | complex Hermitian positive definite matrices |
SYM | real/complex symmetric matrices |
HER | complex Hermitian matrices |
SSM | real/complex skew-symmetric matrices (not yet implemented) |
SHR | complex skew-Hermitian matrices (not yet implemented) |
EXAMPLES
General Case | Symmetric Positive Definite Case | Symmetric Indefinite Case | ||
SGNLAMGfactor(...) | SSPDAMGfactor(...) | SSYMAMGfactor(...) | ||
DGNLAMGfactor(...) | DSPDAMGfactor(...) | DSYMAMGfactor(...) | ||
CGNLAMGfactor(...) | CHPDAMGfactor(...) | CSYMAMGfactor(...) | CHERAMGfactor(...) | |
ZGNLAMGfactor(...) | ZHPDAMGfactor(...) | ZSYMAMGfactor(...) | ZHERAMGfactor(...) |
DATATYPES
A.nr, A.nc: | number of rows and columns |
A.ia, A.ja: | pointer and indices in SPARSKIT compressed sparse row format, numbers starting with 1. |
A.a: | pointer of type void, referring to the associated numerical values in SPARSKIT compressed sparse row format. Thus you may pass any numerical pointer of type float, double, complex, doublecomplex. |
A.isreal: | 1 real, 0 non-real |
A.issingle: | 0 double precision, 1 single precision. |
A.issymmetric: | 1 if real or complex symmetric (make sure that only half of the matrix is passed), 0 non-symmetric. |
A.ishermitian: | 1 if Hermitian (real or complex, make sure that only half of the matrix is passed), 0 non-Hermitian. |
A.isdefinite: | 1 Hermitian (or real+symmetric) and positive definite, 0 non-SPD case. |
A.nr, A.nc: | number of rows and columns |
A.ia, A.ja, A.a: | pointer, indices and numerical values in SPARSKIT compressed sparse row format. |
PRE.n,PRE.nB: | total size and size of the leading block |
PRE.LU: | sparse matrix continaing the ILU. |
PRE.LUperm: | permutation array (in case of ILUTP) |
PRE.E,PRE.F: | submatrices needed for multilevel ILU parts and . |
PRE.p, PRE.invq: | row and inverse column permutation |
PRE.rowscal, PRE.colscal: | numerical values for row and column scaling |
PRE.prev, PRE.next: | pointers to the previous and next level |
param.matching | use matching (1, default) or not (0) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.ordering | desired symmetric reordering
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.droptol | drop tolerance for the LU factors. By default, 1e-2 is chosen. Here you can overwrite the default values. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.droptolS | drop tolerance for the approximate Schur complement. By default, 0.1*param.droptol is chosen and recommended. Here you can overwrite the default values. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.droptolc |
threshold for dropping small entries from the constraint part,
if present (as indicated by negative entries in options.ind) default: 0 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.condest |
Norm of the inverse triangular factors. by default, 5 is chosen. Here you can overwrite the default values. As a rule of thumb, small CONDEST will allow more entries to be dropped (which may accelerate the computation and save memory) but at the same time, more levels will be necessary (which in turn may slow down the computation and increase the memory). Typically, values between 5 and 100 make sense. CONDEST=5 will make ILUPACK behave like AMG and select many coarse grid nodes. If you have a PDE-based problem, this might be the right choice. Otherwise, CONDEST=100 will safeguard the ILU computation and prevent the norm of the inverse triangular factors from becoming too large. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.restol | Residual tolerance for the iterative solver. The built-in iterative solver (CG/SQMR/restarted GMRES by default) will use this tolerance to terminate whenever the backward error (resp. relative energy norm) is less than this threshold. By default, eps^(3/4)~1e-12 is chosen for double precision, eps^(3/4)~1e-6 is chosen for single precision. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.elbow | elbow space for the ILU. Here please pass an estimate how much memory you are willing to spend. ILUPACK will try to keep the ILU inside the range you passed. The elbow space is a real number measuring the number of nonzeros of the ILU relative to the fill of the original matrix. By default, 10 is chosen. Note however, if your estimate is too small, ILUPACK will adapt elbow and overwrite this parameter. As long as enough memory is available, the ILU will be successfully computed. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.lfil | Maximum number of nonzeros per column in L (resp. per row in
U). By default n+1 is chosen, i.e. this option is disabled. You can limit the amount of memory by using some smaller value, e.g. A.ia[A.nc]-1 is the fill of A and ELBOW*(A.ia[A.nc]-1.0)/A.nc would restrict the maximum number of fill to the average number of nonzeros of A per column (or per row) times the ELBOW. Note however that this parameter cuts off the fill in L and U by brute force. It recommended NOT to use it. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.lfilS | Maximum number of nonzeros per row in S (approximate Schur
complement). By default n+1 is chosen, i.e. this option is disabled. You can limit the amount of memory by using some smaller value, e.g. A.ia[A.nc]-1 is the fill of A and ELBOW*(A.ia[A.nc]-1.0)/A.nc would restrict the maximum number of fill to the average number of nonzeros of A per column (or per row) times the ELBOW. Note however that this parameter cuts off the fill in S by brute force. It recommended NOT to use it. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.typetv |
Type of test vector. for some PDE-based problems it might be sensible to ensure that the ILU is exact when being applied to some given test vector. By default this option is disabled ("none"). If you want to use this feature you can either use "static" to pass a fixed test vector to the ILU. Or you can use any other string. In this case, using reverse communication principle, on every level you need to pass a test vector to the ILU. The ILU passes to you the current coarse grid system and an initial guess for the test vector. You have to return your own test vector. On entry to the first level, this initial guess is simply the test vector you prescribed. On any subsequent level, this will be your old test vector restricted to the coarse grid. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.tv | Test vector. If you decide to pass a test vector, then pass the associated pointer. ILUPACK will make its own copy inside AMGfactor, and you can release the memory if you like. In PDE-based applications, a typical guess is the vector with all ones. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.amg | type of algebraic multilevel method
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.npresmoothing | Number of pre-smoothing steps. If classical multigrid is selected (param.amg="mg";), then here you can set the number of pre-smoothing steps. default: 1 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.npostsmoothing | Number of post-smoothing steps. If classical multigrid is selected (param.amg="mg";), then here you can set the number of post-smoothing steps. default: 1 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.ncoarse | Number of coarse grid solves. Except for multilevel ILU (i.e. param.amg="amli"; or param.amg="mg";), here you define how often the coarse grid solve is performed. By default, only one coarse grid solve is used (V-cycle). The choice param.ncoarse=2; would correspond to a W-cycle. Note however, if a negative value is passed, a flexible solver is invoked, i.e. the number of coarse grid solves varies from one grid to another and from one step to the next one. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.presmoother | Type of pre-smoother. If full multigrid is used (param.amg="mg";), then here you can choose between built-in smoothers or your own hand-made smoother.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.postsmoother | Type of post-smoother. If full multigrid is used (param.amg="mg";), then here you can choose between built-in smoothers or your own hand-made smoother.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.FCpart | Pre-selection of coarse grid nodes. In some PDE-based applications it might be useful to select some coarse grid nodes in advance. Essentially this strategy uses a Ruge-Stueben-like heuristic strategy. If a test vector is available, the coarsening strategy is applied to the matrix, which is diagonally scaled from the right with the test vector.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.typecoarse | Type of coarse grid system. By default the coarse grid system S is computed from A and the ILU in
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.nrestart | Number of steps before GMRES is restarted. The iterative solver uses restarted GMRES (resp. fGMRES). By default, 30 steps are computed, before the method is restarted. Note that a smaller number reduces the memory, while a larger number can improve the convergence. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.mixedprecision | require the computation of the preconditioner in single precision | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.contraction | contraction factor < 1 of the residual for inner flexible solver when AMLI or classical multigrid is used and options.ncoarse<0 (flexible coarse grid solver) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.coarsereduce | If different from zero, then the L21 and the U12 block
are discarded solving with L,U is done implicitly via L11,U11 and
A21 (resp. A12). If set to zero, then L21, U12 are kept default: 1. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
param.decoupleconstraints | This allows for saddle point type problems to explictly
decouple the connections between the constraint part and the
free part. Applied on every level, this allows for smaller coarse
grid matrices. If set to zero, then the additional decoupling is
not applied. default: 1. |
GETTING STARTED: A SAMPLE
PROGRAM
// First variant of a SAMPLE program template for solving a real non-nsymmetric system using
// ILUPACK's real non-symmetric preconditioner
// structure for real double precision matrix in CSR format
SPARSEmat A;
// right hand side and solution, make sure to allocate sufficient memory
double *rhs, *sol;
// parameters for real double precision ILUPACK
ILUPACKparam options;
// structure for real double precision ILUPACK multilevel preconditioner
AMGlevelmat PRE;
/* read in a matrix. ILUPACK uses compressed sparse ROW format.
/3.5 -1.0 0 \
A matrix | 0 2.0 0 | is stored as follows
\ 0 0 1.5/
A.ia: 1 3 4 5 pointer to the start of every compressed
row plus pointer to the first space
behind the compressed rows
A.ja: 1 2 2 3 nonzero column indices
A.a: 3.5 -1.0 2.0 1.5 nonzero numerical values
The read part finally yields the following data structures
- A: matrix in compressed sparse row format
o A.nr, A.nc: number of rows and columns of A
o A.ia: pointer array
o A.ja: nonzero column index array
o A.a: nonzero numerical values
o A.isreal,A.issingle,A.issymmetric,A.ishermitian,A.isdefinite: matrix properties to be set
- rhs: right hand side(s) and additional data like exact solution
or initial guess
*/
// set properties
A.isreal=1;
A.issingle=A.issymmetric=A.ishermitian=A.isdefinite=0;
// init parameters to their default values
AMGinit(&A, &options);
// Now you may manipulate some of the parameters
// ...
// compute the preconditioner
ierr=AMGfactor(&A, &PRE, &options);
// if ierr!=0 then some error occurred
// solve the linear system
ierr=AMGsolver(&A, &PRE, &options, rhs, sol);
// release memory
AMGdelete(&A, &PRE, &options);
// Second variant of a SAMPLE program template for solving a real non-symmetric system using
// ILUPACK's real non-symmetric preconditioner
// structure for real double precision matrix in CSR format
Dmat A;
// right hand side and solution, make sure to allocate sufficient memory
double *rhs, *sol;
// parameters for real double precision ILUPACK
DILUPACKparam options;
// structure for real double precision ILUPACK multilevel preconditioner
DAMGlevelmat PRE;
// read input matrix and assign it to A in CSR format
// init parameters to their default values
DGNLAMGinit(&A, &options);
// compute the preconditioner
ierr=DGNLAMGfactor(&A, &PRE, &options);
// if ierr!=0 then some error occurred
// solve the linear system
ierr=DGNLAMGsolver(&A, &PRE, &options, rhs, sol);
// release memory
DGNLAMGdelete(&A, &PRE, &options);
More general template structure:
//
Smat,
Dmat,
Cmat,
Zmat,
SAMGlevelmat, ...
*mat A;
*AMGlevelmat PRE;
*ILUPACKparam param;
// read input matrix and assign it to A in CSR format
// init param with the default parameters
// SGNLAMGinit, DGNLAMGinit, CGNLAMGinit, ZGNLAMGinit, SSPDAMGinit, CHERAMGinit, ...
****AMGinit(&A, ¶m);
// main factorization routine for the multilevel ILU
ierr=****AMGfactor(&A, &PRE, ¶m);
// iterative solver to solve A*sol=rhs
ierr=****AMGsolver(&A, &PRE, ¶m, rhs, sol);
// release memory
****AMGdelete(&A, &PRE, ¶m);
TEMPLATES THAT FORM ILUPACK
Template 1. Static reordering schemes
Interfaces to many existing reordering schemes are provided, like
RCM (reverse Cuthill-McKee), MMD (minimum degree), AMD (approximate
minimum degree), MeTiS (multilevel nested dissection), MWM (maximum weight matching), ...
Template 2. partial Crout incomplete LU factorization with diagonal pivoting
After preordering and scaling the initial system, an inverse-based
ILU with diagonal pivoting is applied. By this template the matrix
is partially factored.
Template 3. Multilevel framework
For the reduced matrix that is left over from the second template,
a recursive multilevel strategy is applied that repeatedly applies
template 1 and 2.
Template 4. Iterative solver
After the incomplete LU decomposition is computed, the linear system
is solved iteratively
TEMPLATE 1: STATIC REORDERINGS AND THEIR USAGE IN ILUPACK
TEMPLATE 1: EXAMPLE
Level 1 | ||||
Initial system |
Initial system, reordered here: no permutation (no initial preprocessing) |
|||
Initial Level, reordered after inverse-based ILU has been applied | ||||
|
||||
Level 2 | ||||
Level 2 Initial system |
Level 2, reordered (Reverse Cuthill-McKee, regular reordering) |
|||
Level 2, reordered after inverse-based ILU has been applied | ||||
|
||||
Level 3 | ||||
Level 3 Initial system |
Level 3, reordered (Reverse Cuthill-McKee, regular reordering) |
|||
Level 3, reordered after inverse-based ILU has been applied | ||||
|
||||
Level 4 | ||||
Level 4 Initial system |
Level 4, reordered (Reverse Cuthill-McKee, regular reordering) |
|||
Level 4, reordered after inverse-based ILU has been applied | ||||
|
||||
Level 4 | ||||
Level 4,
reordered again.
here ddPQ (switched to final pivoting) |
||||
Level 4, reordered after inverse-based ILU has been applied |
TEMPLATE 1: STATIC REORDERINGS
Supported Symmetric Reorderings
RCM (Reverse Cuthill-McKee, SPARSPAK): banded systems, PDE-type problems.
MMD (Multiple Minimum Degree, SPARSPAK): almost symmetrically structured problems.
AMD (Approximate Minimum Degree, UMFPACK V4.3): almost symmetrically structured problems.
Multilevel Nested Dissection (by edges or by nodes, MeTiS 4.0): almost symmetrically structured problems.
INDSET (independent set, ARMS), symmetrically structured problems, PDE-type problems.
FC (fine grid/coarse grid partitioning similar to Ruge/Stüben AMG), symmetrically structured problems, PDE-type problems.
Supported Reorderings that improve (block) diagonal dominance
MC64 (Maximum Weight Matching, HSL): unstructured problems, improves diagonal dominance.
Symmetric MC64 (Symmetric Maximum Weight Matching, HSL): symmetrically structured indefinite problems, improves block diagonal dominance.
Maximum Weight Matching (MUMPS): unstructured problems, improves diagonal dominance.
Symmetric Maximum Weight Matching (MUMPS-based): symmetrically structured indefinite problems, improves block diagonal dominance.
ddPQ [Saad '03], permutation matrices P, Q as a compromise between diagonal
dominance and fill:
Applicable to all problems, in particular unstructured systems.
Combinations Matching and Symmetric Reorderings
The symmetric versions also use symmetric matchings.
MC64 (Maximum Weight Matching from HSL including symmetric matchings)
Maximum Weight Matching from MUMPS including symmetric matchings)
Any ordering offers or can be combined with column and/or row scaling
TEMPLATE 1: REORDERING
EXAMPLES
Initial system | ||||
System reordered by RCM (reverse Cuthill-McKee) | ||||
System reordered by MMD (Minimum Degree) | ||||
System reordered by MC64 (Maximum Weight Matching) | ||||
System reordered by ddPQ | ||||
System reordered by FC (fine/coarse grid) |
AVAILABLE INTERFACE DRIVERS
Driver for symmetric reorderings
RCM (Reverse Cuthill-McKee)
MMD (Minimum Degree)
AMD (Approx. Minimum Degree)
Multilevel Nested Dissection by edges or by nodes
Independent set
Fine/coarse partitioning
symmetrized version of PQ
(no permutation)
Driver that improve (block) diagonal dominance and fill
ddPQ
TEMPLATE 1. Preprocessing
Almost any permutation routine mentioned previously can be used in combination or without matchings. ILUPACKparam param is a structure of parameter settings. It controls the combinations.
TEMPLATE 2.
PILUC - INVERSE-BASED
ILU THAT CONTROLS
,
TEMPLATE 2.
PARTIAL CROUT ILU AS PART OF ILUPACK
diagonal pivoting (factor or skip strategy)
estimates of the inverse factors
diagonal compensation
TEMPLATE 3. MULTILEVEL SCHEME
TEMPLATE 3. APPROXIMATE
SCHUR COMPLEMENTS
ILUPACK MAIN AMG DRIVER
TEMPLATE 4.
ITERATIVE SOLVERS
COMMENTS
Default solvers are GMRES (for GNL), PCG (for SPD and HPD ) and SQMR (for SYM and HER ).
Initialization to standard settings supported by auxilliary routines
ILUPACK
Preconditioning software package (C/FORTRAN 77) using inverse-based multilevel ILUs
SPARSKIT
Iterative solvers GMRES, FGMRES
PCG adapted from SPARSKIT-CG
Preconditioners ILUTP and ILUT, now based on a binary search tree
Preconditioners and solvers adapted with respect to complex arithmetic (and BLAS)
Miscellaneous tools adapted (scaling, readmtc)
ARMS
Orderings indset and ddPQ
Multilevel strategy discarded)
Miscellaneous tools adapted (matvec, spartran)
BLAS
Iterative solvers GMRES, FGMRES and PCG now based on BLAS
Miscellaneous tools are using BLAS as well
INTERFACES
Interfaces with respect to several packages are provided
Combinations Matching and Symmetric Reorderings
MC64 & RCM, MC64 & MMD, MC64 & AMD, MC64 & Multilevel ND (edges or nodes)
MPS & RCM, MPS & MMD, MPS & AMD, MPS & Multilevel ND (edges or nodes)