https://nlafet.github.io/StarNEig/
Depending on the computational step, StarNEig is either comparable to LAPACK and ScaLAPACK or significantly faster. Moreover, StarNEig realizes new parallel and blocked algorithms for computing eigenvectors without suffering from floating-point overflow. In LAPACK, the corresponding solvers are sequential scalar codes which compute eigenvectors one by one. In ScaLAPACK, the corresponding solvers are vulnerable to overflow.
Eigenvalue problems can be found in every field of natural science. Clear examples are supplied by the analysis of systems of ordinary differential equations. The stability analysis of first-order systems produces standard eigenvalue problems which are not necessarily symmetric, and the analysis of second-order systems produces quadratic eigenvalue problems which are equivalent to nonsymmetric generalized eigenvalue problems. Without the ability to solve eigenvalue problems rapidly and accurately, we would be unable to complete the calculations needed to maintain and advance our civilization. Therefore, it is important that we continue to develop new algorithms and software which maximize both performance and accuracy on existing and emerging hardware.
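The link between differential equations and eigenvalue problems mentioned above can be written out as a short derivation. The symbols A, M, C, K, q, x and y are not from the original text and are introduced here only for illustration. The companion form shown is one common linearization among several.

```latex
\begin{align*}
\dot{q} = A q,\quad q(t) = e^{\lambda t} x
  &\;\Longrightarrow\; A x = \lambda x, \\
M \ddot{q} + C \dot{q} + K q = 0,\quad q(t) = e^{\lambda t} x
  &\;\Longrightarrow\; (\lambda^2 M + \lambda C + K)\, x = 0, \\
y = \lambda x
  &\;\Longrightarrow\;
  \begin{bmatrix} 0 & I \\ -K & -C \end{bmatrix}
  \begin{bmatrix} x \\ y \end{bmatrix}
  = \lambda
  \begin{bmatrix} I & 0 \\ 0 & M \end{bmatrix}
  \begin{bmatrix} x \\ y \end{bmatrix}.
\end{align*}
```

The first block row of the last line restates y = λx, and the second block row reproduces the quadratic problem. The resulting pencil is in general nonsymmetric even when M, C and K are symmetric.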
StarNEig is one of very few libraries to offer support for nonsymmetric eigenvalue problems. It is built on top of the runtime system StarPU which is used to schedule the tasks. Currently, StarNEig applies to real problems which have real or complex eigenvalues and eigenvectors. By design, StarNEig applies to both shared and distributed memory machines and it has experimental support for GPU accelerators.
1: Carl Christian Kjelgaard Mikkelsen and Mirko Myllykoski: Parallel Robust Computation of Generalized Eigenvectors of Matrix Pencils
https://link.springer.com/chapter/10.1007/978-3-030-43229-4_6
https://arxiv.org/abs/2003.04776
In this paper, we consider the problem of computing generalized eigenvectors of a matrix pencil in real Schur form. In exact arithmetic, this problem can be solved using substitution. In practice, substitution is vulnerable to floating-point overflow. The robust solvers xtgevc in LAPACK prevent overflow by dynamically scaling the eigenvectors. These subroutines are scalar and sequential codes which compute the eigenvectors one by one. In this paper, we discuss how to derive robust algorithms which are blocked and parallel. The new StarNEig library contains a robust task-parallel solver Zazamoukh which runs on top of StarPU. Our numerical experiments show that Zazamoukh achieves a super-linear speedup compared with dtgevc for sufficiently large matrices.
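The idea of protecting substitution with dynamic scaling can be illustrated with a minimal sketch. It is not the algorithm used by LAPACK or StarNEig. It is a simplified scalar version for an upper triangular system, and the threshold `OMEGA` and the function names are assumptions made for this example. Instead of x with T x = b, the robust variant returns a scale factor and a vector x with T x = scale * b, so that the true solution x / scale is represented without overflow.

```python
import sys

# Assumed overflow threshold: a safety margin below the largest double.
OMEGA = sys.float_info.max / 4


def naive_back_substitution(T, b):
    """Solve the upper triangular system T x = b with no overflow protection."""
    n = len(b)
    x = list(b)
    for j in range(n - 1, -1, -1):
        x[j] = x[j] / T[j][j]
        for i in range(j):
            x[i] -= T[i][j] * x[j]
    return x


def robust_back_substitution(T, b):
    """Return (scale, x) such that T x = scale * b, with 0 < scale <= 1.

    Assumes T is upper triangular with a nonzero diagonal and |b[i]| <= OMEGA.
    The whole vector is scaled down whenever a division or an update
    could push an entry beyond OMEGA.
    """
    n = len(b)
    x = list(b)
    scale = 1.0

    def rescale(xi):
        # Apply the same factor to the solution and to the right-hand side.
        nonlocal scale
        if xi < 1.0:
            scale *= xi
            for k in range(n):
                x[k] *= xi

    half = OMEGA / 2
    for j in range(n - 1, -1, -1):
        # Protect the division x[j] / T[j][j].
        t = abs(T[j][j])
        if t < 1.0 and abs(x[j]) > t * OMEGA:
            rescale(t * OMEGA / abs(x[j]))
        x[j] = x[j] / T[j][j]
        if j == 0:
            break

        # Protect the update x[i] -= T[i][j] * x[j] for all i < j by
        # keeping both terms of the difference at or below OMEGA / 2.
        xmax = max(abs(x[i]) for i in range(j))
        cmax = max(abs(T[i][j]) for i in range(j))
        xi = 1.0
        if xmax > half:
            xi = min(xi, half / xmax)
        limit = half / cmax if cmax > 1.0 else half
        if abs(x[j]) > limit:
            xi = min(xi, limit / abs(x[j]))
        rescale(xi)
        for i in range(j):
            x[i] -= T[i][j] * x[j]
    return scale, x


if __name__ == "__main__":
    T = [[1e-200, 1.0], [0.0, 1e-200]]
    b = [1.0, 1.0]
    print(naive_back_substitution(T, b))   # first entry overflows
    print(robust_back_substitution(T, b))  # finite entries plus a scale factor
```

In this sketch a single scale factor is shared by the whole vector. A blocked solver has to reconcile different scale factors across blocks, which this example does not attempt.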
2: Mirko Myllykoski and Carl Christian Kjelgaard Mikkelsen: Introduction to StarNEig – A Task-based Library for Solving Nonsymmetric Eigenvalue Problems.
https://link.springer.com/chapter/10.1007/978-3-030-43229-4_7
https://arxiv.org/abs/2002.05024
In this paper, we present the StarNEig library for solving dense nonsymmetric (generalized) eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library support GPUs. The library is currently in an early beta state and only real arithmetic is supported. Support for complex data types is planned for a future release. This paper is aimed at potential users of the library. We describe the design choices and capabilities of the library, and contrast them to existing software such as ScaLAPACK. StarNEig implements a ScaLAPACK compatibility layer that should make it easy for new users to transition to StarNEig. We demonstrate the performance of the library with a small set of computational experiments.
3: Angelika Beatrix Schwarz and Carl Christian Kjelgaard Mikkelsen: Robust Task-Parallel Solution of the Triangular Sylvester Equation.
https://link.springer.com/chapter/10.1007/978-3-030-43229-4_8
https://arxiv.org/abs/1905.10574
The Bartels-Stewart algorithm is a standard approach to solving the dense Sylvester equation. It reduces the problem to the solution of the triangular Sylvester equation. The triangular Sylvester equation is solved with a variant of backward substitution. Backward substitution is prone to overflow. Overflow can be avoided by dynamic scaling of the solution matrix. An algorithm which prevents overflow is said to be robust. The standard library LAPACK contains the robust scalar sequential solver dtrsyl. This paper derives a robust, level-3 BLAS-based task-parallel solver. By adding overflow protection, our robust solver closes the gap between problems solvable by LAPACK and problems solvable by existing non-robust task-parallel solvers. We demonstrate that our robust solver achieves a performance similar to non-robust solvers.
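The substitution structure described above can be sketched as follows. The sign convention, the symbols, and the simplification to strictly upper triangular A and B (real Schur forms are in general only quasi-triangular) are assumptions made for this illustration and may differ from the paper.

```latex
\begin{align*}
A X + X B &= C, \qquad A \in \mathbb{R}^{m \times m},\; B \in \mathbb{R}^{n \times n}
  \text{ upper triangular}, \\
(A + b_{kk} I)\, x_k &= c_k - \sum_{j < k} b_{jk}\, x_j, \qquad k = 1, \dots, n, \\
A X + X B &= \alpha\, C, \qquad \alpha \in (0, 1].
\end{align*}
```

The second line shows that the columns x_k of X are obtained one after another, each from an upper triangular system solved by backward substitution. The third line is the scaled form a robust solver works with: it returns the factor α together with X, so that the true solution X / α is represented without overflow.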
D2.7 Eigenvalue solvers for nonsymmetric problems
D2.9 Novel SVD Algorithms
D5.3 Validation and evaluation
D6.3 Evaluation of software prototypes
D6.5 Evaluation of auto-tuning techniques
D6.7 Prototypes for tiled one-sided factorizations with algorithm-based fault tolerance
D7.7 Dissemination report. Period M19-M42
D7.8 Release of the NLAFET library
For the full list of released public deliverables, see this page: http://www.nlafet.eu/public-deliverables/
Robust algorithms do not suffer from overflow and always return a valid result. In LAPACK eigenvectors (standard and generalized) are computed using robust algorithms. The existing algorithms are scalar and sequential. This new work presents algorithms which are blocked and parallel. The analysis is supported by parallel software running on top of StarPU. Further improvements are possible, but the new software is already orders of magnitude faster than the existing software.
The authors are Carl Christian Kjelgaard Mikkelsen, Angelika Beatrix Schwarz, and Lars Karlsson.
Working Note 19 has been uploaded.
The full list of published Working Notes can be found on this page: http://www.nlafet.eu/working-notes/
The title of the poster is “Using GPU’s FP16 Tensor Cores Arithmetic to Accelerate Mixed-Precision Iterative Refinement Solvers and Reduce Energy Consumption”. The authors are Azzam Haidar, Stan Tomov, Ahmad Abdelfattah, Mawussi Zounon, and Jack Dongarra.
Several deliverables have been uploaded: D3.2, D3.6, D4.3, D6.2, and D6.6.
A full list, with links to all public deliverables posted so far, can be found here: http://www.nlafet.eu/public-deliverables/