Minisymposium

MS4A - Empowering Interdisciplinary Collaboration through Reproducible Benchmarking

Fully booked
Tuesday, June 17, 2025
15:00 - 17:00 CEST
Room 5.0A52

Session Chair

Description

Porting and tuning the performance of scientific applications on heterogeneous and increasingly complex supercomputer architectures is currently a manual and arduous task. Benchmarks are therefore crucial for evaluating and improving the performance of applications on these systems: they serve as proxies for the execution behavior of scientific applications in a controlled, simpler environment, which makes it possible to extrapolate performance gains across new hardware, algorithm changes, and software updates. This minisymposium brings together the interdisciplinary communities using supercomputers to discuss the challenges and opportunities in tracking, analyzing, and tuning application performance through reproducible benchmarks. Specifically, we will address these critical topics: defining benchmarks that accurately capture scientific application behavior; broadening performance metrics beyond time to solution to reflect the impact of optimizations; modernizing the process of running benchmarks and analyzing their performance on different hardware architectures; proposing a standard for the definition of new benchmarks to improve reproducibility; identifying opportunities for software/hardware co-design across scientific applications and domains; and encouraging community contributions through open-source benchmark implementations.
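
One of the topics above, a standard for defining new benchmarks, can be made concrete with a small sketch. The schema below is purely illustrative and is not a proposal from the session: the field names, the example benchmark, and its parameters are assumptions chosen only to show what a reproducible benchmark record might capture (build, inputs, metrics, and hardware context).

# Illustrative sketch only: a minimal, hypothetical schema for a reproducible
# benchmark definition. All field names and values are assumptions.
import json
import platform
from dataclasses import dataclass, field, asdict

@dataclass
class BenchmarkSpec:
    name: str                  # benchmark identifier
    app_version: str           # e.g. git tag or commit of the application
    build_flags: list[str]     # compiler/toolchain options used for the build
    inputs: dict               # problem size and other runtime parameters
    metrics: list[str]         # what is recorded, beyond time to solution
    hardware: dict = field(default_factory=lambda: {
        "machine": platform.machine(),
        "node": platform.node(),
    })

spec = BenchmarkSpec(
    name="stream-like-triad",
    app_version="v1.2.0",
    build_flags=["-O3", "-march=native"],
    inputs={"array_elements": 2**27, "repetitions": 100},
    metrics=["time_to_solution_s", "bandwidth_GBs", "energy_J"],
)

# A record like this, stored alongside the results, lets a run be repeated
# and compared on a different system.
print(json.dumps(asdict(spec), indent=2))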

Presentations

15:00 - 15:30 CEST
Collaborative Continuous Benchmarking for HPC Applications

Benchmarking is integral to the procurement of HPC systems, to communicating HPC center workloads to HPC vendors, and to verifying the performance of delivered systems. Currently, HPC benchmarking is manual and challenging at every step, posing a high barrier to entry and hampering the reproducibility of benchmarks across different HPC systems. Collaborative continuous benchmarking can bring functional reproducibility, automation, and community collaboration to HPC benchmarking. Recent progress in HPC automation allows us to consider previously unimaginable large-scale improvements to the HPC ecosystem. Collaborative continuous benchmarking helps overcome the human bottleneck in HPC benchmarking, enabling better evaluation of our systems and more productive collaboration within the HPC community.

Olga Pearce (Lawrence Livermore National Laboratory, Texas A&M University)
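
As a purely illustrative companion to the abstract above, and not the speaker's tooling, the sketch below shows the kind of run-and-record step a continuous benchmarking pipeline automates: execute a benchmark, stamp the result with provenance such as the code revision and host, and append it to a log that can be compared across systems and over time. The command, file name, and record fields are assumptions.

# Illustrative sketch only: one automated run-and-record step of a
# (hypothetical) continuous benchmarking pipeline.
import json
import platform
import subprocess
import time
from datetime import datetime, timezone

def run_and_record(command, results_file="benchmark_log.jsonl"):
    """Run one benchmark command and append a provenance-stamped result."""
    commit = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        capture_output=True, text=True,
    ).stdout.strip()

    start = time.perf_counter()
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start

    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "commit": commit,
        "host": platform.node(),
        "command": command,
        "time_to_solution_s": round(elapsed, 3),
    }
    with open(results_file, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Example (hypothetical benchmark executable):
# run_and_record(["mpirun", "-np", "4", "./my_benchmark", "--size", "large"])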
15:30 - 16:00 CEST
Benchmarking the Three Ps: Performance, Portability, and Productivity

Our high-performance applications must be written to embrace the full ecosystem of supercomputer design. They need to take advantage of the hierarchy of concurrency on offer and utilise the whole processor. And writing these applications must be productive, because HPC software outlives any one system. Our applications need to address the “Three Ps” and be Performance Portable and Productive. Benchmarking for the Three Ps presents an acute challenge due to the complexity of rigorously and reproducibly testing such a large space. In this talk I will share a perspective on benchmarking for Performance, Portability, and Productivity, along with some of the tools and methodologies we are developing to make this easier in the future.

Tom Deakin (University of Bristol)
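
One widely cited way to put a number on performance portability, which may or may not be the methodology used in this talk, is the metric of Pennycook, Sewall, and Lee: the harmonic mean of an application's efficiency over a set of platforms, defined as zero if the application is unsupported on any platform in the set. The sketch below implements that formula with hypothetical efficiency values.

# Minimal sketch of the performance-portability metric PP(a, p, H):
# the harmonic mean of the efficiency of application a solving problem p
# over the platforms in set H, or 0 if any platform is unsupported.

def performance_portability(efficiencies):
    """efficiencies: platform -> efficiency in (0, 1], or None if unsupported."""
    values = list(efficiencies.values())
    if not values or any(e is None or e <= 0.0 for e in values):
        return 0.0  # unsupported on at least one platform in H
    return len(values) / sum(1.0 / e for e in values)

# Hypothetical architectural efficiencies (fraction of achievable peak).
measured = {"cpu_a": 0.80, "gpu_b": 0.65, "gpu_c": 0.50}
print(f"PP = {performance_portability(measured):.2f}")  # harmonic mean, about 0.63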
16:00 - 16:30 CEST
Benchmarking and Co-Design at System and Processor Level

Benchmarking provides insight into the behaviour of application codes and their kernels on HPC systems. This is useful for identifying bottlenecks and tuning performance on the application side, as well as for understanding how specific hardware features can impact, positively or negatively, the performance of those applications. It is therefore natural to use benchmarking as a vehicle for "co-design". Since this term also has different definitions, we would like to clarify that we refer to co-design in the sense of studying the interaction between application code, system software, and hardware components, in search of the modifications at each of these three levels that bring the best overall performance and energy efficiency. In this talk, we will present experiences with the use of benchmarking for co-design gathered in the EU-funded DEEP and EPI projects. While in the former we looked at the system level, in the latter the focus is on the processor and even the core level. We will describe the differences between the two and the challenges we encountered.

Estela Suarez (Forschungszentrum Jülich, University of Bonn)
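
To illustrate the kind of trade-off such co-design studies weigh, and without implying this is the methodology of the DEEP or EPI projects, the sketch below compares hypothetical configurations by time to solution, energy to solution, and energy-delay product. All configuration names and numbers are invented.

# Illustrative sketch only: comparing invented hardware/software
# configurations by runtime, energy to solution, and energy-delay product.

def energy_to_solution(avg_power_w, runtime_s):
    return avg_power_w * runtime_s                                  # joules

def energy_delay_product(avg_power_w, runtime_s):
    return energy_to_solution(avg_power_w, runtime_s) * runtime_s   # joule-seconds

candidates = {
    "baseline":        {"runtime_s": 120.0, "avg_power_w": 350.0},
    "tuned_kernel":    {"runtime_s":  95.0, "avg_power_w": 370.0},
    "low_freq_uncore": {"runtime_s": 110.0, "avg_power_w": 300.0},
}

for name, c in candidates.items():
    e = energy_to_solution(c["avg_power_w"], c["runtime_s"])
    edp = energy_delay_product(c["avg_power_w"], c["runtime_s"])
    print(f"{name:16s} time={c['runtime_s']:6.1f} s  "
          f"energy={e / 1000:6.1f} kJ  EDP={edp / 1e6:6.2f} MJ*s")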
16:30 - 17:00 CEST
Panel: Reproducible Benchmarking in Scientific Applications - Challenges and Opportunities

What are the current research challenges and opportunities in reproducible benchmarking for scientific applications? Join an interactive panel with the session's three speakers, Dr. Olga Pearce (LLNL, Texas A&M), Dr. Tom Deakin (University of Bristol), and Dr. Estela Suarez (Jülich Supercomputing Centre, University of Bonn), together with Dr. Osman Simsek (University of Basel), as they present the state of the art and their predictions for reproducible benchmarking. They will highlight potential impacts on the field from current and future work such as new AI/ML benchmarks, co-design, cloud infrastructure, and framework development. The panel will be moderated by Dr. Jens Domke (RIKEN) and will include ample time for questions and discussion from the audience.

Jens Domke (RIKEN) and Olga Pearce (Lawrence Livermore National Laboratory)