
P42 - Towards a Sparse BLAS Standard for Triangular Solvers on ARM Architectures


Description

Sparse matrix computations are critical in scientific simulation and engineering, and the Sparse BLAS standard plays a growing role as a benchmark for performance and portability across diverse hardware, including x86 CPUs, GPUs, and Arm architectures. However, standardizing sparse matrix operations remains challenging due to differences in storage formats, accuracy requirements, and hardware-specific optimizations, and will therefore require an iterative refinement process. Recent updates to the Arm Performance Libraries, such as the introduction of functions for sparse triangular solves and sparse vector operations, reflect significant industry efforts towards such standardization. This poster contributes to these ongoing efforts by highlighting the benefits of supernodal sparse matrix representations. Supernodes group columns with identical sparsity patterns into dense blocks, enabling efficient use of dense BLAS/LAPACK operations and thereby delivering substantial performance gains. We are collaborating with Arm to integrate supernodal representations into the Arm Performance Libraries, showcasing improved performance on Arm systems powered by state-of-the-art processors from the Ampere Altra Max, Azure Cobalt, and AWS Graviton series.
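
For illustration, the sketch below shows how a supernodal panel maps onto dense BLAS kernels in a lower-triangular forward solve. The supernode_t layout and the function supernodal_lower_solve are hypothetical and are not part of the Sparse BLAS draft or the Arm Performance Libraries API; they only illustrate how the diagonal block of a supernode feeds a dense triangular solve (cblas_dtrsv) and the sub-diagonal block feeds a dense matrix-vector update (cblas_dgemv).

/* Illustrative sketch: supernodal forward solve L*x = b (single right-hand side).
 * The data layout is assumed for illustration only; it is not the ArmPL interface. */
#include <cblas.h>
#include <stdlib.h>

typedef struct {
    int first_col;        /* index of the supernode's first column                   */
    int ncols;            /* consecutive columns grouped into this supernode         */
    int nrows_below;      /* rows below the diagonal block (shared sparsity pattern) */
    const int *row_idx;   /* global row indices of those sub-diagonal rows           */
    const double *panel;  /* dense column-major panel of size
                             (ncols + nrows_below) x ncols; lower-triangular on top  */
} supernode_t;

/* Forward substitution: overwrite b with the solution of L*x = b. */
void supernodal_lower_solve(const supernode_t *snodes, int nsnodes, double *b)
{
    for (int s = 0; s < nsnodes; ++s) {
        const supernode_t *sn = &snodes[s];
        int ld = sn->ncols + sn->nrows_below;   /* leading dimension of the panel       */
        double *x1 = &b[sn->first_col];         /* rhs entries owned by this supernode  */

        /* Dense triangular solve with the ncols x ncols diagonal block. */
        cblas_dtrsv(CblasColMajor, CblasLower, CblasNoTrans, CblasNonUnit,
                    sn->ncols, sn->panel, ld, x1, 1);

        if (sn->nrows_below == 0)
            continue;

        /* Dense matrix-vector update with the sub-diagonal block, then
         * scatter the result into the remaining right-hand side entries. */
        double *y = malloc((size_t)sn->nrows_below * sizeof *y);
        cblas_dgemv(CblasColMajor, CblasNoTrans, sn->nrows_below, sn->ncols,
                    1.0, &sn->panel[sn->ncols], ld, x1, 1, 0.0, y, 1);
        for (int i = 0; i < sn->nrows_below; ++i)
            b[sn->row_idx[i]] -= y[i];
        free(y);
    }
}

Because each supernode is stored as one dense panel, the inner kernels are ordinary dense BLAS calls, which is where the gains over scalar CSR-style substitution come from; a multiple right-hand-side variant would replace cblas_dtrsv/cblas_dgemv with cblas_dtrsm/cblas_dgemm.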

Presenter(s)

Authors