Minisymposium
MS3B - Challenges and Opportunities for Next-Generation Research Applications and Workflows
Session Chair
Description
We are increasingly engaged in transdisciplinary research to address the complex challenges facing our world, including transitioning to renewable energy systems, advancing personalized medicine, utilizing digital twins, and accurately predicting climate change and its impacts on local and regional ecosystems. As we look toward a future shaped by computing, data, and AI, we aim to leverage a wide range of digital services and methodologies. In this context, application- and workflow-focused approaches can play a crucial role in advancing scientific frontiers by harnessing the potential of integration. Such approaches could also serve as a long-term strategy for upholding the principles of sustainability, openness, and transparency, particularly within federated ecosystems. Discussions about next-generation application workflows are therefore just as crucial for advancing research as conversations about the development of digital infrastructure. This minisymposium will convene experts from various domains, each focusing on a different aspect of the scientific research lifecycle. Speakers will critically examine the role of AI, explore performance and productivity beyond Moore's Law, and discuss how generative strategies can empower physics-based simulations. Representing early-career researchers, Dr. Filippo Gatti from France will discuss generative strategies for physics-based simulations.
Presentations
Field stations advance our understanding of the physical, biogeochemical, ecological, social, and economic interactions that constitute place. Society has sophisticated ‘open science’ (OS) cyberinfrastructure, and progress is being made toward digital twins of Earth systems; yet local communities often feel disconnected from place-based scientific information and its benefits. One reason is that metadata describing samples and data collected in situ, including the legal and social metadata that are vital for their reuse, can be stripped or lost in downstream applications. A novel publishing platform, iPlaces, creates a culture in which the common self-interest of all participants is clear to everyone. iPlaces enables field stations (and other anchor institutions) to publish project descriptions and related documentation in their station-branded journal. Using familiar peer-review processes, station directors act as editors in a collaborative ecosystem that leverages OS data services while empowering local communities to enter into dialogue with research teams. Benefits flow up and down value chains as: (1) place-based metadata are systematically layered onto research projects, (2) global OS infrastructure automatically applies this metadata to downstream research outputs, and (3) data trust services link outputs back to field stations and their communities. The power of this approach is discussed in a variety of contexts.
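As a purely hypothetical Python sketch of points (1)-(3) above (the actual iPlaces data model and services are not specified in this abstract, and all identifiers below are placeholders), place-based metadata can be attached once at the field station and then carried, rather than stripped, by every downstream output:

# Hypothetical illustration: place-based metadata attached to a project record
# travels with any research output derived from it.
project = {
    "id": "doi:10.xxxx/example-project",        # placeholder identifier
    "station": "Example Field Station",
    "place_metadata": {
        "legal": "collection permit reference",     # legal terms that must travel with the data
        "social": "community consultation notes",
        "location": {"lat": 46.5, "lon": 8.0},
    },
}

def derive_output(project, output_id):
    """Layer the project's place-based metadata onto a downstream research output."""
    return {
        "id": output_id,
        "derived_from": project["id"],                       # data-trust link back to the station
        "place_metadata": dict(project["place_metadata"]),   # metadata is copied, not stripped
    }

dataset = derive_output(project, "doi:10.xxxx/example-dataset")

In this toy picture, the "derived_from" link is what lets outputs be traced back to the field station and its community, while the copied metadata block keeps the legal and social context attached to reused data.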
In the post-Moore era, the quest for enhanced performance and reproducibility is more critical than ever. As researchers and engineers in high-performance computing (HPC) and scientific computing, reimagining key areas such as algorithms, hardware architecture, and software is essential to drive progress. In this talk, we will explore how performance engineering is evolving, focusing on checkpointing and the management of intermediate data in scientific workflows. We will first discuss the shift from traditional low-frequency checkpointing techniques to modern high-frequency approaches that require complete histories and efficient memory use. By breaking data into chunks, using hash functions to store only modified data, and leveraging Merkle-tree structures, we improve efficiency, scalability, and GPU utilization while addressing challenges like sparse data updates and limited I/O bandwidth. We will also examine the balance between performance and data persistence in workflows, where cloud infrastructures often sacrifice reproducibility for speed. To overcome this, we propose a persistent, scalable architecture that makes node-local data shareable across nodes. By rethinking checkpointing and cloud data architectures, we show how innovations in algorithms, hardware, and software can significantly advance both performance and reproducibility in the post-Moore era.
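A minimal Python sketch of the chunk-and-hash idea (an illustration under our own assumptions, not the speakers' implementation): the state is split into fixed-size chunks, each chunk is hashed, only chunks whose hashes changed since the previous checkpoint are stored, and a Merkle-style root summarizes the full state so that checkpoint histories can be compared cheaply.

import hashlib
import numpy as np

CHUNK = 4096  # bytes per chunk (illustrative)

def chunk_hashes(buf: bytes):
    """Hash each fixed-size chunk of a state buffer."""
    return [hashlib.sha256(buf[i:i + CHUNK]).digest()
            for i in range(0, len(buf), CHUNK)]

def merkle_root(hashes):
    """Pairwise-combine chunk hashes into a single Merkle root."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:          # duplicate the last node on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

class IncrementalCheckpointer:
    """Store only the chunks that changed since the last checkpoint."""
    def __init__(self):
        self.prev_hashes = []
        self.store = {}             # hash -> chunk bytes (deduplicated)

    def checkpoint(self, state: np.ndarray):
        buf = state.tobytes()
        hashes = chunk_hashes(buf)
        new_chunks = 0
        for i, h in enumerate(hashes):
            unchanged = i < len(self.prev_hashes) and self.prev_hashes[i] == h
            if not unchanged and h not in self.store:
                self.store[h] = buf[i * CHUNK:(i + 1) * CHUNK]
                new_chunks += 1
        self.prev_hashes = hashes
        return merkle_root(hashes), new_chunks

ckpt = IncrementalCheckpointer()
state = np.zeros(1_000_000, dtype=np.float64)
root0, n0 = ckpt.checkpoint(state)      # first checkpoint populates the deduplicated store
state[123_456] = 42.0                   # sparse update touching a single chunk
root1, n1 = ckpt.checkpoint(state)      # only the touched chunk is newly stored
print(n0, n1, root0 != root1)

Because the sparse update touches a single chunk, the second checkpoint stores almost no new data, which is how this style of checkpointing copes with sparse updates and limited I/O bandwidth.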
ODISSEI’s advanced scientific computing infrastructure demonstrates how high-performance computing (HPC) can transform social science research. By leveraging a national supercomputer, ODISSEI provides social scientists with a secure, powerful HPC environment to process massive longitudinal datasets and complex data linkages. Researchers can now apply complex models and simulations, such as large-scale network analysis, agent-based modeling, and deep neural networks, to sensitive, high-resolution data, thanks to ample memory, massive parallelism, and specialized hardware (GPUs) that accelerate computation. HPC yields significant computational efficiencies: tasks that once took months can run in a matter of days, greatly accelerating the research workflow and iterative discovery. ODISSEI’s infrastructure is highly scalable, accommodating the ever-growing volume and complexity of datasets drawn from administrative, experimental, and web sources while maintaining performance. Equally important, this integration of cutting-edge computing within social science fosters interdisciplinary collaboration between researchers, data providers, and computer scientists. Notably, the combination of extensive, well-annotated social science datasets with supercomputing capabilities that ODISSEI offers is unprecedented, positioning it at the forefront of data-intensive social research. In this presentation, I discuss a number of use cases in which exceptionally rich data sources have led to new and innovative research lines.
This study integrates the Multiple-Input Fourier Neural Operator (MIFNO) with the diffusion model of Gabrielidis et al. (2024) to address challenges in capturing mid-frequency details in synthetic earthquake ground motion. MIFNO, a computationally efficient surrogate model for seismic wave propagation, processes 3D heterogeneous geological data along with earthquake source characteristics and is trained to reproduce the three-component (3C) earthquake wavefield at the surface. The HEMEWS-3D database (Lehmann et al., 2024) is used, comprising 30,000 earthquake simulations across varying geologies with random source positions and orientations. These reference simulations were conducted with the high-performance SEM3D software (CEA et al., 2017), which excels at simulating fault-to-structure scenarios at a regional scale. While SEM3D provides accurate results at lower frequencies, the surrogate's performance degrades with increasing frequency, owing both to complex physical phenomena and to the known spectral bias of neural networks, which struggle with small-scale features. This limitation restricts MIFNO's applicability in earthquake engineering for nuclear applications. The proposed combination with the diffusion model aims to mitigate this issue and improve the accuracy of mid-frequency predictions in synthetic ground motion generation.
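As a rough, self-contained illustration of the Fourier-layer building block underlying neural operators such as MIFNO (a generic PyTorch sketch under our own assumptions, not the MIFNO architecture or the diffusion component described above):

import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Learn a linear map on a truncated set of Fourier modes (the core FNO idea)."""
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (channels * channels)
        # complex weights with shape (in_channels, out_channels, modes)
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, x):                                  # x: (batch, channels, n_points)
        x_ft = torch.fft.rfft(x)                           # to the frequency domain
        out_ft = torch.zeros_like(x_ft)
        # mix channels on the lowest `modes` frequencies only; higher modes are truncated
        out_ft[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weight)
        return torch.fft.irfft(out_ft, n=x.size(-1))       # back to the physical domain

layer = SpectralConv1d(channels=3, modes=16)   # e.g. a 3C trace with 16 retained modes
trace = layer(torch.randn(8, 3, 256))          # batch of 8 synthetic traces, 256 samples

The truncation of higher Fourier modes in such layers is one intuition for why operator surrogates remain accurate at low frequencies but lose small-scale detail, which is the gap the diffusion-based refinement in this study targets.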