Minisymposium
MS1B - Advances in Foundation Models for Weather and Climate
Session Chair
Description
In this minisymposium, we will survey the state of the art in foundation models for weather and climate. These promise a second revolution for Earth system modeling, after the emergence of highly skilful machine learning-based weather forecasting models in the last two years. Foundation models aim to provide a rich, machine learning-based representation of the Earth system at many scales in space and time through training on many different datasets. As a result, they can be used for a wide range of tasks, not unlike conventional, equation-based Earth system models. The first large foundation models are becoming available now, and their applicability to a range of tasks is being explored. The three talks in the minisymposium cover model development, applications, and the physicality of the output, and will provide a comprehensive overview of the state of the field. In the panel discussion at the end of the minisymposium, the panelists will share their insights about the state of the field and where it is heading. The panel also gives the audience the opportunity to engage with the panelists and to share their own experiences and opinions.
Presentations
Foundation models have achieved remarkable accuracy in short- to medium-range weather forecasts, primarily focusing on atmospheric variables. However, predicting new physical variables typically requires training or fine-tuning the model with additional datasets, incurring significant costs. We show that new variables can be learned directly from the latent space of the Aurora foundation model. Our frugal extension involves training lightweight decoders using a small dataset, specifically a subset of ERA5 and MSWEP. These decoders accurately predict surface variables related to the water cycle, establishing the first baseline for many of these variables. For precipitation, our decoder achieves results comparable to those in the literature. This work presents an affordable method to extend foundation models beyond atmospheric predictions. It also suggests that Aurora captures an internal representation of the Earth system, contributing to a better definition and understanding of foundation models.
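As a rough illustration of the idea in the abstract above (a small decoder trained on frozen latent features to predict a new variable), the following is a minimal sketch under stated assumptions. It does not use the Aurora model or its API: the latents are synthetic random vectors, the target is a synthetic variable, and the decoder is a plain ridge regression swapped in as the simplest possible "lightweight decoder". All names (`fit_ridge_decoder`, `r2_score`, the array sizes) are hypothetical.

```python
import numpy as np

def fit_ridge_decoder(latents, targets, lam=1e-2):
    """Closed-form ridge regression from frozen latents to a new variable."""
    d = latents.shape[1]
    gram = latents.T @ latents + lam * np.eye(d)
    return np.linalg.solve(gram, latents.T @ targets)

def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins (assumption): in practice the latents would come from
# a frozen foundation model and the targets from an observational dataset.
rng = np.random.default_rng(0)
n_samples, latent_dim = 500, 16
latents = rng.normal(size=(n_samples, latent_dim))
true_weights = rng.normal(size=latent_dim)
targets = latents @ true_weights + 0.1 * rng.normal(size=n_samples)

# Only the decoder weights are fitted; the "backbone" latents stay fixed.
train, held_out = slice(0, 400), slice(400, None)
weights = fit_ridge_decoder(latents[train], targets[train])
score = r2_score(targets[held_out], latents[held_out] @ weights)
print(f"held-out R^2: {score:.3f}")
```

The point of the sketch is the cost profile: only a small number of decoder parameters are fitted, while the expensive representation is reused unchanged.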
We present recent advancements in global weather modeling based on spherical neural operators. This innovative approach demonstrates superior skill and reduced computational costs compared to current state-of-the-art models. Our model is trained as a probabilistic ensemble, respecting spherical geometry and symmetries. This first-principles-inspired approach and its probabilistic formulation ensure plausible dynamics and stable spectra in the model's output. To scale training, a novel paradigm for model-parallelism, inspired by domain decomposition, is developed. This reduces memory and IO requirements per GPU, enabling training of larger models by splitting them across multiple GPUs. Leveraging model-parallelism in conjunction with data-parallelism and ensemble-parallelism enables massive parallelization to train the model on 1024 NVIDIA H100 GPUs. The resulting model's efficiency is remarkable: it can generate a full year's rollout in under 13 minutes on a single GPU, while demonstrating skill improvement over current operational models at 0.25 degrees global resolution. These advancements represent a significant step forward in ML weather forecasting, offering improved accuracy and computational efficiency. The model's ability to capture uncertain dynamics of the weather system while maintaining realistic physical properties makes it a powerful tool for generating forecasts, providing early warnings for extreme events, and informing decision-making across various sectors.
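To make the memory argument behind domain-decomposition-style model-parallelism concrete, here is a minimal sketch, not the authors' implementation. It only shows how a global 0.25 degree field could be partitioned into latitude bands so that each rank holds a fraction of the data. It assumes a 721 x 1440 latitude-longitude grid, 4 channels, and 4 ranks; `shard_rows` is a hypothetical helper, and the communication that spherical transforms would require between ranks is deliberately left out.

```python
import numpy as np

def shard_rows(n_rows, n_ranks):
    """Split n_rows into n_ranks contiguous (start, stop) ranges of near-equal size."""
    base, extra = divmod(n_rows, n_ranks)
    ranges, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)  # first `extra` ranks get one more row
        ranges.append((start, start + size))
        start += size
    return ranges

# Assumed sizes: a 0.25 degree global grid with a hypothetical 4 channels.
n_channels, n_lat, n_lon, n_ranks = 4, 721, 1440, 4
rng = np.random.default_rng(0)
field = rng.standard_normal((n_channels, n_lat, n_lon), dtype=np.float32)

# Each "rank" holds only its own latitude band of the field.
shards = [field[:, lo:hi, :] for lo, hi in shard_rows(n_lat, n_ranks)]

for rank, shard in enumerate(shards):
    print(f"rank {rank}: shape {shard.shape}, {shard.nbytes / 1e6:.1f} MB")
print(f"full field: {field.nbytes / 1e6:.1f} MB")
```

Concatenating the shards along the latitude axis recovers the full field, so the partition loses no data while each rank stores roughly one quarter of it.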
AI-driven Numerical Weather Prediction (AI-NWP) models, trained on the ERA5 reanalysis, are currently our best representation of historical day-to-day weather evolution. They have demonstrated significant skill in forecasting present-day weather, outperforming traditional physics-based forecasting systems. Emerging evidence suggests that AI-NWP models do not merely replicate past atmospheric states but effectively learn the underlying physical dynamics of the atmosphere. We anticipate that, especially at short time scales (on the order of several days), AI-NWP models will outperform most existing climate models in simulating realistic weather conditions, including weather in future climate scenarios. This improved performance may result from the richer dynamics captured by high-resolution AI-NWP models (typically around 25 km) compared to conventional climate models (usually around 100 km), or simply from their superior representation of atmospheric processes learned from ERA5. Such capabilities could even help to correct biases in current climate models. In this study, we examine the applicability of AI-NWP models to various climate scenarios and demonstrate their potential benefits for contemporary climate research. Specifically, we highlight their capabilities for downscaling coarse-resolution climate simulations and explore their use in investigating extreme weather events through a storyline approach, reproducing present-day extreme events under altered climate conditions.
This session will be an open discussion on developments and future directions. Machine learning for Earth system modeling is a highly dynamic field, and this format will allow the panelists to share the latest, as yet unpublished developments and to put these into the context of the overall state of the field. The panel discussion will also engage the audience and, through this, broaden the impact of the minisymposium. Three of the panelists are early career scientists, of whom two are already well established. The panelists also come from a variety of backgrounds and institutions, such as climate science and machine learning, which will enrich the discussion.