P12 - Enhancing Productivity and Performance Analysis on Euro HPC Systems

Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science, Machine Learning, and Applied Mathematics
Applied Social Sciences and Humanities
Engineering
Life Sciences
Physics

Description

Large-scale EuroHPC high-performance computing (HPC) systems, such as Leonardo and LUMI, present significant challenges for developers. A key difficulty is adapting software to new architectures, accelerators, and compiler options in order to fully exploit the available resources. As a result, developers often spend substantial time manually running performance and scalability tests to ensure that changes neither degrade performance nor compromise portability across platforms. To remove these repetitive manual steps, we propose an automated framework that integrates HPC performance testing and benchmarking into a Continuous Integration (CI) pipeline. By extending ReFrame, a widely used HPC regression-testing and benchmarking framework, our approach automates the monitoring of application performance, enabling strong- and weak-scaling benchmarks and performance-portability tests across multiple architectures. The system collects metrics, generates visual reports, and alerts developers to performance regressions. With these capabilities, HPC developers can focus on scientific advances rather than on time-consuming testing and benchmarking. We demonstrate this workflow with the ECsim space-plasma physics application, showing how DevOps practices and ReFrame-driven automation can streamline HPC software development.
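
To make the quantities above concrete, the following is a minimal Python sketch of how strong-scaling efficiency, weak-scaling efficiency, and a simple regression check could be computed from measured run times. It is not the framework described in this abstract and it does not use ReFrame, which has its own mechanism for declaring reference values and thresholds. The names `Run`, `strong_scaling_efficiency`, `weak_scaling_efficiency`, and `is_regression`, the 10% tolerance, and all timings are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    nodes: int      # number of nodes used for the run
    seconds: float  # measured wall-clock time


def strong_scaling_efficiency(base: Run, run: Run) -> float:
    """Fixed total problem size: ideal speedup equals run.nodes / base.nodes."""
    speedup = base.seconds / run.seconds
    return speedup / (run.nodes / base.nodes)


def weak_scaling_efficiency(base: Run, run: Run) -> float:
    """Fixed problem size per node: ideal run time stays constant."""
    return base.seconds / run.seconds


def is_regression(reference_s: float, measured_s: float,
                  tolerance: float = 0.10) -> bool:
    """Flag a run slower than the reference by more than `tolerance` (assumed 10%)."""
    return measured_s > reference_s * (1.0 + tolerance)


# Hypothetical timings, made up for illustration only.
baseline = Run(nodes=1, seconds=100.0)
scaled = Run(nodes=4, seconds=40.0)

strong_eff = strong_scaling_efficiency(baseline, scaled)
print(f"strong-scaling efficiency on {scaled.nodes} nodes: {strong_eff:.3f}")
print("regression:", is_regression(reference_s=40.0, measured_s=46.0))
```

In a CI setting, a check like `is_regression` would typically run after each change, comparing the newly measured time against a stored reference for the same system and node count, so that a failing check can trigger the alert mentioned above.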

Presenter(s)

Authors