Minisymposium Presentation
Sustainable, Trustworthy Coupled HPC+AI for Molecular Simulation and Materials Design: Energy Consumption, Correctness, and Efficient Training on Leadership Platforms
Presenter
Dr. Ada Sedova is a Research Scientist in the Molecular Biophysics Group at the Oak Ridge National Laboratory. Her research is in high-performance computing (HPC), scientific computing, physical chemistry, biophysics, bioinformatics, biochemistry and chemical physics. This includes HPC programming environments, scientific computing using HPC for simulation and analysis, and experimental work including neutron scattering, analytical biophysics and electrochemistry. Ada was a CSEEN Postdoctoral Research Associate in the Scientific Computing Group at NCCS, ORNL, and a Postdoctoral Research Associate in the Department of Chemistry, University at Albany. She received her PhD in Biomedical Sciences, with focuses on computational biophysics, biophysical chemistry and structural biology, from the joint program at the NY State Department of Health Wadsworth Center and University at Albany’s Biomedical Sciences Department. She also received a Master’s in Mathematics from the University at Albany’s Department of Mathematics and Statistics, with research in stochastic processes in physics and chemistry. Her research focuses on bridging gaps across technological areas.
Description
The promise of accelerating and advancing molecular simulation and materials design with coupled HPC and deep learning (DL) workflows has motivated an explosion of approaches. In particular, leadership computing facilities have supported a diverse set of large-scale efforts in this area. But model sizes are increasing and advanced active learning workflows for training are emerging, both in response to the need for more accurate and reliable model predictions, and with them come concerns about excessive energy consumption and the sustainability of HPC+AI simulation efforts for science. In this talk, I will describe experiences developing, deploying and assessing the results of leadership-scale HPC+DL efforts in modeling for molecular and materials sciences, from biosciences to advanced materials and nuclear energy, using several different national leadership supercomputing resources. I will describe successes, pain points, and lessons learned, along with tools being developed to help monitor robustness, correctness and reproducibility, as well as power and energy metrics, across software stack layers and parallel resources.