P01 - Achieving Performance Portability on ECMWF’s Open-Source Operational Wave Model ecWAM Using Source-To-Source Translation and GPU-Aware Data-Structures
Description
Adapting production numerical weather prediction (NWP) codes for GPU execution is challenging. These codes have typically been developed and optimised for multi-core CPUs and are continually updated by domain scientists. Additional complexity arises from the vast size of these codebases, the growing diversity of platform architectures with their native and derived programming models, and the need for vendor-specific modifications to achieve optimal performance. At ECMWF, we manage this complexity using Loki, our source-to-source translation toolchain, and FIELD API, a GPU-aware data-structures library. In this poster, we present how these two tools have been used to achieve performance portability for ECMWF’s operational wave model, ecWAM. Starting from the original CPU-optimised Fortran code, we present different GPU-capable variants that can be generated via Loki. The variants differ in their optimisation strategies and in the programming models they employ. As one highlight, Loki can translate the original Fortran kernels to kernels written in C-style languages such as CUDA for NVIDIA GPUs and HIP for AMD GPUs. With this, we not only present performance across multiple architectures but also show the potential performance benefits of translating to native kernel languages.
Presenter(s)

Presenter
I work as a computational scientist at ECMWF. My work focuses on the GPU adaptation of the physical parameterizations in the Integrated Forecasting System (IFS). Source-to-source translation is a key component of ECMWF's GPU adaptation strategy, and I am one of the core developers of Loki, ECMWF's in-house source-to-source translation tool. Previously, I worked on improving the vectorizability of HYDRA, the production computational fluid dynamics code of Rolls-Royce plc.