Design algorithms and implement efficient, scalable, and robust parallel software for fundamental dense matrix problems that go well beyond what standard packages cover today. WP2 will advance the state of the art in three areas: (i) extreme-scale linear system solvers, by developing communication- and synchronization-reducing algorithms and software for LU, QR, and Cholesky factorizations; (ii) scalable transformation-based eigenvalue solvers (reduction to condensed form, QR/QZ, reordering, subspace computation); and (iii) scalable SVD methods, considering both standard methods and methods based on spectral dichotomy. In addition, WP2 will develop an API and a reference implementation for a set of hybrid BLAS (i.e., BLAS that use multi-core CPUs in combination with GPUs). The specification of the hybrid BLAS will be developed in collaboration with the stakeholder community.
The work in this work package is structured into four tasks:
- Linear System Solvers
- Hybrid BLAS
- Eigenvalue Problem Solvers
- Singular Value Decomposition Algorithms
We will use results from WP6 to address key cross-cutting issues, and we will layer the software on top of a runtime system to accomplish our objectives. To ensure that the software is easy to use and deploy, we take the PaRSEC runtime system as our starting point. Novel algorithmic theory developed in this work package will be used to guide the extension of PaRSEC and other similar systems.
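To illustrate what layering a dense factorization on top of a task-based runtime involves, the following is a minimal sketch of a tiled Cholesky factorization in plain Python. It is an illustration under stated assumptions, not the WP2 software and not the PaRSEC API: the names `potrf`, `trsm`, `gemm_update`, `split`, and `tiled_cholesky` are hypothetical helpers that only borrow the usual BLAS/LAPACK kernel names, and the tasks are executed sequentially and merely logged. In a runtime-based implementation, each logged entry would be a task whose dependencies are the tiles it reads and writes.

```python
import math

def potrf(a):
    """Unblocked Cholesky of one tile: return lower-triangular l with l @ l^T == a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        l[j][j] = math.sqrt(a[j][j] - sum(l[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l

def trsm(l, b):
    """Solve x @ l^T == b for x, where l is lower triangular."""
    n = len(l)
    x = [[0.0] * n for _ in b]
    for r, row in enumerate(b):
        for j in range(n):
            x[r][j] = (row[j] - sum(x[r][k] * l[j][k] for k in range(j))) / l[j][j]
    return x

def gemm_update(c, a, b):
    """Trailing update, in place: c -= a @ b^T."""
    for i in range(len(c)):
        for j in range(len(c[0])):
            c[i][j] -= sum(a[i][k] * b[j][k] for k in range(len(a[0])))

def split(a, b):
    """Cut a dense n x n matrix into (n/b) x (n/b) tiles of size b x b (copies the data)."""
    nt = len(a) // b
    return [[[row[j * b:(j + 1) * b] for row in a[i * b:(i + 1) * b]]
             for j in range(nt)] for i in range(nt)]

def tiled_cholesky(tiles):
    """Factor the lower tiles in place and return the tasks in submission order."""
    nt = len(tiles)
    tasks = []
    for k in range(nt):
        tiles[k][k] = potrf(tiles[k][k])              # factor the diagonal tile
        tasks.append(("POTRF", k, k))
        for i in range(k + 1, nt):
            tiles[i][k] = trsm(tiles[k][k], tiles[i][k])  # panel solve below it
            tasks.append(("TRSM", i, k))
        for i in range(k + 1, nt):
            for j in range(k + 1, i + 1):             # update the trailing lower tiles
                gemm_update(tiles[i][j], tiles[i][k], tiles[j][k])
                tasks.append(("SYRK" if i == j else "GEMM", i, j))
    return tasks

# Hypothetical 4 x 4 symmetric positive definite example, cut into 2 x 2 tiles.
A = [[4.0, 2.0, 2.0, 0.0],
     [2.0, 5.0, 3.0, 2.0],
     [2.0, 3.0, 6.0, 3.0],
     [0.0, 2.0, 3.0, 6.0]]
tiles = split(A, 2)
tasks = tiled_cholesky(tiles)
print(tasks)
# prints [('POTRF', 0, 0), ('TRSM', 1, 0), ('SYRK', 1, 1), ('POTRF', 1, 1)]
```

In this sketch, tasks that touch disjoint tiles within the same step (for example, the TRSM tasks of one panel) have no mutual dependencies, which is the kind of concurrency a runtime system can exploit without global synchronization between steps.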