Minisymposium Presentation
Celerity - SYCL-Based High-Productivity Development at HPC Scale
Description
The vendor-agnostic SYCL standard provides a C++-based foundation for the development of applications targeting heterogeneous systems. However, while SYCL supports multiple devices in a single shared memory host system or node, leveraging this support requires developers to manually take care of the issues of work splitting and data coherence across devices. Developing for an HPC environment, which features a large number of accelerators in a distributed memory cluster, is even more challenging and labor-intensive. As a consequence, algorithmic decisions are often difficult to change once taken, reducing the potential for experimenting with different approaches at scale.

This talk will briefly introduce a subset of SYCL, and demonstrate how the Celerity runtime system extends the applicability of single-GPU concepts to clusters, in a manner largely transparent to the programmer. The applicability of this system, and its potential for enabling the rapid evaluation of different implementation choices, will be illustrated with scalability studies on a set of real-world use cases. Additionally, some SYCL software engineering best practices based on the Celerity development experience will be shared.