HLIBpro
2.8.1
|
Given a sparse matrix \(A\) and a right hand hand side \(b\), the solution \(x\) of
\[Ax = b\]
is sought, whereby H-LU factorisation shall be used to improve the convergence of an iterative solver. No geometrical data is assumed and hence, algebraic clustering is performed for converting \(A\) into an H-matrix. Furthermore, nested dissection is applied to improve the efficiency of the H-LU factorisation.
The program starts with the inclusion of the needed header files and the initialisation of ๐ง๐ซ๐จ๐ก๐๐๐.
The next step is the definition of the sparse matrix, which shall be read from a file. ๐ง๐ซ๐จ๐ก๐๐๐ supports various foreign matrix formats, e.g. Matlab, SAMG, Harwell-Boeing or Matrixmarket. In this example, the matrix is stored in Matlab format:
The function read_matrix
performs autodetection of the file format and is therefore suited for all supported formats.
read_matrix
reads in the first matrix found in the file. If you have stored several matrices in one file, you have to use TMatlabMatrixIO and provide the name of the matrix to read.By using unique_ptr
, the corresponding object is automatically deleted when the current block is left, thereby reducing the danger of memory leaks. This technique is extensively used in ๐ง๐ซ๐จ๐ก๐๐๐.
Since any kind of matrix may be stored in the given file, e.g. a dense matrix, but in the following sparse matrices are expected, the type of the matrix is tested:
The macro IS_TYPE
returns true
, if the corresponding object is of the given type. This is an example of the runtime type information system (RTTI) in ๐ง๐ซ๐จ๐ก๐๐๐ and works similar to dynamic casts in C++. The advantage of the ๐ง๐ซ๐จ๐ก๐๐๐ RTTI is, that further information can be gathered, e.g. the type name through the function typestr()
, independent of the corresponding C++ compiler support.
Finally, since we know by now that M
contains a sparse matrix, we cast the pointer for easier use in the following. The macro ptrcast
is just an abbreviation of a C++ cast and was introduced in ๐ง๐ซ๐จ๐ก๐๐๐ for ease of use and for debugging purposes (ptrcast
is different depending on compilation options).
We now make use of the TSparseMatrix pointer S
and may output some information about it:
This includes the dimension of the matrix, i.e. the number of rows and columns, as well as the number of non zero elements. Furthermore, the value type, real of complex, and matrix properties are printed. Finally, the storage size and the Frobenius norm is computed.
Having a right hand side, the equation system above could now be solved using an iterative scheme. Unfortunately, most sparse matrices need some form of preconditioning for an acceptable convergence behaviour. The preconditioner shall be in the form of a H-LU factorisation of \(A\) and hence, of S
. For this, we need a cluster tree and a block cluster tree.
Since, by assumption, no geometrical data is available, the clustering is performed purely algebraically by using the connectivity relation of the indices stored in the sparse matrix itself. Furthermore, nested dissection is to be used, which needs some special clustering.
Algebraic clustering uses graph partitioning as the underlying technique. ๐ง๐ซ๐จ๐ก๐๐๐ implements two graph partitioning algorithms (TBFSAlgPartStart and TMLAlgPartStrat) and supports external algorithms, e.g. METIS and Scotch. Here we use the BFS-based partitioning strategy implemented by TBFSAlgPartStrat.
Nested dissection clustering is based on a standard bissection clustering algorithm, which therefore has to be defined explicitly in the form of TAlgCTBuilder, which on the other hand uses the graph partitioning strategy. Finally, TAlgNDCTBuilder computes the cluster tree for the index set defined by S:
Having the cluster tree, the block cluster tree can be computed:
Here, the weak admissibility condition for graphs, implemented by TWeakAlgAdmCond, is used. Since the block clusters to test are based on the internal ordering of the H-matrix but the sparse matrix is defined in the external ordering, the corresponding mappings between both is needed by the admissibility condition.
nmin
, cluster_level_mode
and sync_interface_depth
, which may be changed to optimise the partitioning.The printed sparsity constant gives some hint about the complexity of the H-matrix constructed over the block cluster tree, the smaller, the better.
The final conversion of the sparse matrix into H-format is performed by TSparseMBuilder. Here, again the mappings between internal and external ordering are needed. The accuracy can be usually neglected, since low rank blocks are usually empty (clusters of positive distance usually have no overlapping basis functions).
After matrix construction, the size of the H-matrix \(A\) and its norm is printed together with the (relative) norm of the difference \(\|S-A\|_2/\|S\|_2\), indicating any approximation error. For this, the matrix \(A\) should operator with the same ordering of the unknowns as \(S\) and is therefore wrapped in a TPermMatrix object.
The factorisation of A
may be computed differently, depending on the format of A
, e.g. if it is symmetric or not. In the latter case the standard \(LU\) factorisation is used, implemented in TLU, while in the symmetric case the \(LDL^T\) (or \(LDL^H\) for hermitian matrices) factorisation implemented in TLDL is computed.
The factorisation is performed in place, i.e. the data of A
is overwritten by the corresponding factores. This implies, that A
is actually not a matrix object after the factorisation since the stored data depends on special handling. For this, linear operators are used with special versions for each factorisation algorithm to permit evaluation of the factorised matrix or its inverse.
The factorisation is performed up to a block-wise accuracy of \(10^{-4}\). Depending on the given matrix this has to be changed to obtain a direct solver (single matrix-vector mult.) or a good preconditioner (small number of iterations).
Some properties of the factorisation are printed next, most notably the inversion error \(\|I-(LU)^{-1}S\|_2\) and the condition estimate \(\|LU\mathbf{1}\|_2\). However, to compare \((LU)^{-1}\) with \(S\), both need to have the same ordering of the indices. Instead of reordering \(S\), the factorised matrix may be wrapped in an object representing a permuted matrix:
Here, an operator was built, which first reorders w.r.t. the H-matrix, then applies the factorised matrix and finally reorders back.
Now, the inversion error maybe computed:
Finally, the equation system can be solved using the inverse of the matrix factorisation as a preconditioner. The right hand side is assumed to be available in Matlab format and is read in using read_vector
, which again performs file format auto detection. The solution vector is created using the col_vector
function of S
, which returns a vector defined over the column index set of S
.
With write_vector
the solution vector to a file, again in Matlab format.
The program is finished with the finalisation of ๐ง๐ซ๐จ๐ก๐๐๐ and the closing of the try block: