
P29 - itwinai: Enabling Scalable AI Workflows on HPC for Digital Twins in Science

Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science, Machine Learning, and Applied Mathematics
Applied Social Sciences and Humanities
Engineering
Life Sciences
Physics

Description

The interTwin project is advancing the integration of Digital Twins across scientific domains, with a focus on physics and climate research. A key component of the project is itwinai, a Python library designed to streamline scalable AI workflows on High-Performance Computing (HPC) systems. Through a unified interface, itwinai simplifies the deployment and optimisation of AI models across leading frameworks for distributed machine learning. The library includes tooling for profiling scalability and monitoring GPU utilisation, so that scientists can better understand, and demonstrate, how efficiently their code is distributed. It also helps identify inefficiencies, which improves sustainability and supports the development of greener AI solutions. Recent advancements include support for large-model parallelism and distributed hyperparameter optimisation (HPO). By providing a uniform pipeline for running AI workflows easily and intuitively, itwinai lowers the barriers to these complex domains and helps scientists achieve reproducible, high-performance results on HPC infrastructure. Through integration with interLink, itwinai also facilitates offloading compute-intensive tasks from cloud to HPC. Validated on diverse use cases in physics and climate research, including collaborations with CMCC, EURAC and CERFACS, itwinai has shown potential to address challenges in renewable energy, climate modelling, and sustainable development.

Presenter(s)

Matteo Bunino, CERN

Matteo Bunino earned a double MSc degree in Data Science and Computer Engineering from the Polytechnic University of Turin (Italy) and EURECOM (France). He worked at Huawei's Munich Research Center (MRC) on AI-powered malware analysis, using reinforcement learning, NLP, and graph machine learning. Matteo is currently a fellow in the IT department at CERN, where he works on interTwin, a European project aimed at developing a unified digital twin engine (DTE) for science. In particular, he is the main developer of "itwinai", a toolkit for advanced MLOps on cloud and HPC that simplifies access to large-scale distributed ML and hyperparameter optimisation for scientific use cases. Matteo is also part of CERN openlab, where he is investigating digital twin applications with NVIDIA Omniverse and benchmarking heterogeneous computing.

Authors