Minisymposium Presentation
A GPU-Accelerated Unified API for Singular Values Enabling Reproducibility Across Architectures and Data Types
Description
We present a portable, GPU-accelerated implementation of a QR-based singular value algorithm in Julia that enables reproducible code across GPUs from several vendors. The Singular Value Decomposition (SVD) is a fundamental numerical tool in scientific computing and machine learning, providing optimal low-rank matrix approximations with applications ranging from dimensionality reduction to data compression and signal processing. Our implementation leverages Julia’s multiple dispatch and metaprogramming capabilities, integrating with the GPUArrays and KernelAbstractions frameworks to provide a unified, type- and hardware-agnostic API. It supports diverse GPU architectures, including Apple Metal, and diverse data types, including half precision. We benchmark the algorithm against several state-of-the-art linear algebra libraries and confirm that performance is reproducible through the unified API. We explore GPU kernel optimization through parameter tuning to enable efficient parallelism and improved memory locality. Performance results on multiple GPU backends and data types demonstrate scalability combined with reproducibility, highlighting Julia’s suitability for high-performance linear algebra in heterogeneous environments.
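
As a rough illustration of what a type- and hardware-agnostic kernel with a tunable launch parameter can look like in KernelAbstractions, consider the minimal sketch below. It is not taken from the presented implementation: the kernel `givens_rows!`, the wrapper `rotate_rows!`, and the `workgroupsize` keyword are hypothetical names chosen for this example. The kernel applies a Givens rotation to two rows of a matrix, a common building block of QR-type iterations, and is used here only as a stand-in. The example runs on the CPU backend with ordinary arrays; the assumption is that passing an array from a GPU backend package would dispatch the same code to that device.

```julia
using KernelAbstractions

# Hypothetical kernel (illustration only): rotate rows p and q of A by the
# Givens rotation with cosine c and sine s, one column per work-item.
@kernel function givens_rows!(A, p, q, c, s)
    j = @index(Global, Linear)
    @inbounds begin
        ap = A[p, j]
        aq = A[q, j]
        A[p, j] = c * ap + s * aq
        A[q, j] = c * aq - s * ap
    end
end

# Wrapper that is generic in the element type T and in the backend.
# `workgroupsize` is the kind of launch parameter one might tune per device.
function rotate_rows!(A::AbstractMatrix{T}, p, q, c, s; workgroupsize = 64) where {T}
    backend = get_backend(A)                        # CPU() for a plain Array
    kernel! = givens_rows!(backend, workgroupsize)  # specialize for this backend
    kernel!(A, p, q, T(c), T(s); ndrange = size(A, 2))
    KernelAbstractions.synchronize(backend)         # wait for the launch to finish
    return A
end

# The same call works for half, single, and double precision.
for T in (Float16, Float32, Float64)
    A = T[3 1; 4 2]
    rotate_rows!(A, 1, 2, 0.6, 0.8)  # A[2, 1] becomes zero up to rounding in T
end
```

The element type and the backend are both resolved from the array argument, so the calling code does not change when either one changes; only the work-group size would need retuning per device.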