PASC25 Conference

P42 - Towards a Sparse BLAS Standard for Triangular Solvers on ARM Architectures

Description

Sparse matrix computations are critical in scientific simulations and engineering, and the Sparse BLAS standard is playing a growing role as a benchmark for performance and portability across diverse hardware, including x86 CPUs, GPUs, and ARM architectures. Standardizing sparse matrix operations remains challenging, however, because of differences in storage formats, accuracy requirements, and hardware-specific optimizations; it will therefore require an iterative refinement process. Recent updates to the Arm Performance Libraries, such as the introduction of functions for sparse triangular solves and sparse vector operations, reflect significant industry efforts towards such standardization. This poster contributes to these ongoing efforts by highlighting the benefits of supernodal sparse matrix representations. Supernodes group columns with identical sparsity patterns into dense blocks, which lets a solver use dense BLAS/LAPACK operations efficiently and thereby delivers substantial performance gains. We are collaborating with Arm to integrate supernodal representations into the Arm Performance Libraries, and we showcase improved performance on ARM systems powered by state-of-the-art processors from the Ampere Altra Max, Azure Cobalt, and AWS Graviton series.
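To make the idea of a supernode concrete, the following is a minimal sketch of how consecutive columns of a lower-triangular factor might be grouped. It is not the poster's implementation and does not use the Arm Performance Libraries API. It assumes the matrix is stored in compressed sparse column (CSC) form with sorted row indices and the diagonal entry stored first in each column, and it uses the common convention that column j-1 and column j belong together when the pattern of column j-1 below its diagonal equals the full pattern of column j. The function name `find_supernodes` and the example matrix are hypothetical.

```python
def find_supernodes(col_ptr, row_idx):
    """Group consecutive columns of a lower-triangular CSC matrix into supernodes.

    Assumptions (illustrative, not taken from the poster):
      - row indices within each column are sorted,
      - every column stores its diagonal entry first.
    Returns a list of half-open column ranges (first, last_exclusive).
    """
    n = len(col_ptr) - 1
    supernodes = []
    start = 0
    for j in range(1, n + 1):
        if j < n:
            prev = row_idx[col_ptr[j - 1]:col_ptr[j]]
            curr = row_idx[col_ptr[j]:col_ptr[j + 1]]
            # Column j-1 without its diagonal must match column j exactly.
            same = prev[1:] == curr
        else:
            same = False  # past the last column: close the open supernode
        if not same:
            supernodes.append((start, j))
            start = j
    return supernodes


# Hypothetical 5x5 lower-triangular pattern, column by column:
#   col 0: rows 0,1,2,4   col 1: rows 1,2,4   col 2: rows 2,4
#   col 3: rows 3,4       col 4: row 4
col_ptr = [0, 4, 7, 9, 11, 12]
row_idx = [0, 1, 2, 4, 1, 2, 4, 2, 4, 3, 4, 4]

print(find_supernodes(col_ptr, row_idx))  # [(0, 3), (3, 5)]
```

Within each range the diagonal block is a dense triangle and the rows below it form a dense rectangle, which is what allows a triangular solve to hand those blocks to dense BLAS routines instead of traversing individual nonzeros.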

Presenter(s)

Marco Julian Solanki (ETH Zürich)

Authors

Universitat Politècnica de Catalunya

Olaf Schenk (Università della Svizzera italiana)
