Minisymposium Presentation
Advancing Probabilistic Weather Forecast through Machine Learning at Scale
Presenter
Boris Bonev is a Senior Research Scientist at NVIDIA, where he works at the intersection of machine learning and classical numerical methods for scientific computing. He is interested in developing algorithms from first principles and scaling them on high-performance computing systems. Prior to NVIDIA, Boris received his PhD in applied mathematics from EPFL, where he worked on PDE solvers and numerical linear algebra algorithms for large-scale wave problems.
Description
We present recent advances in global weather modeling based on spherical neural operators. The approach demonstrates superior skill and reduced computational cost compared to current state-of-the-art models. Our model is trained as a probabilistic ensemble and respects spherical geometry and symmetries. This approach, inspired by first principles, and its probabilistic formulation ensure plausible dynamics and stable spectra in the model's output.
To scale training, we develop a novel paradigm for model parallelism inspired by domain decomposition. It reduces memory and IO requirements per GPU and enables training of larger models by splitting them across multiple GPUs. Combining model parallelism with data parallelism and ensemble parallelism enables massive parallelization, allowing the model to be trained on 1024 NVIDIA H100 GPUs.
The resulting model is highly efficient: it generates a full year's rollout in under 13 minutes on a single GPU while demonstrating improved skill over current operational models at 0.25 degrees global resolution.
These advances represent a significant step forward in ML weather forecasting, offering improved accuracy and computational efficiency. The model's ability to capture the uncertain dynamics of the weather system while maintaining realistic physical properties makes it a powerful tool for generating forecasts, providing early warnings for extreme events, and informing decision-making across various sectors.
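As a rough illustration of how model, data, and ensemble parallelism can be combined, the sketch below maps a flat GPU rank onto a three-dimensional process grid. It is not the presenters' implementation: the function names and the 64 x 4 x 4 split of the 1024 GPUs are hypothetical, chosen only so the factors multiply to the GPU count stated in the abstract.

```python
from typing import List, NamedTuple


class GridCoords(NamedTuple):
    """Position of one GPU rank on a (data, ensemble, model) process grid."""
    data: int
    ensemble: int
    model: int


def rank_to_coords(rank: int, data_size: int, ensemble_size: int,
                   model_size: int) -> GridCoords:
    """Map a flat rank to grid coordinates; the model axis varies fastest.

    Keeping the model axis fastest places the shards of one model replica
    on consecutive ranks (an assumption of this sketch, not a requirement).
    """
    world_size = data_size * ensemble_size * model_size
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside world of size {world_size}")
    model = rank % model_size
    ensemble = (rank // model_size) % ensemble_size
    data = rank // (model_size * ensemble_size)
    return GridCoords(data, ensemble, model)


def model_group(rank: int, data_size: int, ensemble_size: int,
                model_size: int) -> List[int]:
    """Ranks that together hold the shards of the same model replica."""
    coords = rank_to_coords(rank, data_size, ensemble_size, model_size)
    first = rank - coords.model  # rank holding shard 0 of this replica
    return list(range(first, first + model_size))


# Hypothetical split: 64 data groups x 4 ensemble members x 4 model shards.
DATA, ENSEMBLE, MODEL = 64, 4, 4
assert DATA * ENSEMBLE * MODEL == 1024
```

With these illustrative sizes, `rank_to_coords(5, DATA, ENSEMBLE, MODEL)` gives `GridCoords(data=0, ensemble=1, model=1)`, and `model_group(5, DATA, ENSEMBLE, MODEL)` gives `[4, 5, 6, 7]`, the four ranks that would jointly hold one sharded model replica.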