Minisymposium
MS4G - Moving Towards a More Sustainable HPC Ecosystem: A Full Stack Approach
Live streaming
Description
Global challenges around climate change necessitate a move to a greener society. However, mathematical simulations on supercomputers and the recent rise in popularity of AI consume vast amounts of energy, with an associated environmental impact. Indeed, there are estimates that computing could consume up to 20% of the world’s electricity by the end of the decade. This is not only bad for the environment, because energy is a significant contributing factor to the CO2 emissions of computing, but it also limits the size of a supercomputer, because very few locations can deliver the necessary amount of energy. Moreover, recent studies predict that data centres may account for up to 8% of global carbon emissions by 2030. Despite these alarming numbers, moving HPC towards net zero has not seen the same urgency as in other sectors. Given the scale of the challenge, a full stack approach is required, identifying and exploring opportunities to reduce our carbon footprint across the entire ecosystem. The purpose of this minisymposium is to highlight opportunities for improved sustainability in key areas of the HPC ecosystem, and to explore how we can address these as a community and how these areas can interoperate to deliver a step change in sustainability.
Presentations
Earth System Models (ESMs) are crucial for simulating environmental processes but demand significant computational resources and energy. In this project we explore the potential of dataflow architectures to enhance both the computational and energy efficiency of ESMs. We will primarily discuss the Cerebras Wafer Scale Engine, examining its capabilities and evaluating its suitability for solving the shallow water equations.
As the demand for high-performance computing (HPC) in weather and climate grows, integrating machine learning (ML) techniques offers a promising pathway to enhance predictive skill while addressing sustainability. This talk presents an ML-driven approach to identify sources of long-term predictability for the El Niño-Southern Oscillation (ENSO), focusing on the interplay between ocean variables (sea surface temperature, SST, and heat content) and an atmosphere variable (near-surface zonal wind, U10). Our findings reveal that tropical SST serves as the primary source of predictability, while U10 alone exhibits predictive skill comparable to that of SST at lead times of 11 to 21 months, particularly from late fall to late spring. We uncover a long-lead signal originating from coupled wind-SST interactions in the Indian Ocean (IO), which propagates across the Pacific via an atmospheric bridge mechanism. By leveraging ML to optimize predictive models, we explore how such approaches can reduce computational costs and energy consumption in HPC systems, contributing to more sustainable climate prediction frameworks. This work aligns with the broader goal of integrating ML into climate modeling to enhance efficiency and scalability while minimizing environmental impact.
To move towards more sustainable HPC operations, it is important to understand and improve the power efficiency of data centers. In NCAR's Computational and Information Systems Lab (CISL), we have focussed heavily on reducing wasted energy and ultimately driving a more efficient mode of operations for our systems. In this talk I will highlight some of the key activities that we have undertaken and explore their impact.