Minisymposium

MS4D - Biopreparedness at Scale via Context-Aware Agent-Based Models

Tuesday, June 17, 2025
15:00
-
17:00
CEST
Room 6.0D13


Session Chair

Description

A rapid response to the initial phase of the COVID-19 pandemic was hampered by decentralized data collection and analysis, and by the novelty of the virus itself; vital metrics for virus characteristics, such as its transmissibility and virulence, were unknown. Moreover, disease progression was spatially heterogeneous; different regions experienced waves at varying times and with differing intensities. To mitigate these challenges and better inform public health officials for the next pandemic, we are developing methods to assimilate real-world data into biologically informed agent-based models, facilitating biopreparedness at scale in near-real time. These models will allow for population stratification along multiple comorbidities or sociodemographic factors across diverse geospatial regions by incorporating decentralized data in a mathematically private way. By incorporating data from varied populations across a region, these models will assist public health agencies in mitigating an emerging outbreak and effectively managing hospital capacity. In this minisymposium, we will highlight computational methods designed to address these key biopreparedness challenges.
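The session centers on agent-based epidemic models. As a toy illustration only (not the speakers' models, which stratify by comorbidities and geography), a minimal agent-based S-I-R update over an explicit contact list might look like the following; all names and parameters here are illustrative:

```python
import random

def abm_sir_step(states, contacts, beta, gamma, rng=random.random):
    """One step of a minimal agent-based S-I-R model.

    states:   dict mapping agent id -> 'S' (susceptible), 'I' (infected),
              or 'R' (recovered)
    contacts: list of (a, b) agent-id pairs that interact this step
    beta:     per-contact transmission probability
    gamma:    per-step recovery probability
    """
    new = dict(states)
    # transmission: an S agent in contact with an I agent may become infected
    for a, b in contacts:
        for s, i in ((a, b), (b, a)):
            if states[s] == 'S' and states[i] == 'I' and rng() < beta:
                new[s] = 'I'
    # recovery: agents infected *before* this step may recover
    for agent, state in states.items():
        if state == 'I' and rng() < gamma:
            new[agent] = 'R'
    return new
```

Data assimilation, in this framing, amounts to calibrating beta, gamma, and the contact structure against observed case data.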

Presentations

15:00
-
15:30
CEST
Enhanced Uncertainty Quantification in Air Pollution Models and Impact on Epidemiological Risk

Advances in remote sensing, geospatial data, physicochemical and source-apportionment models, citizen-science networks, and machine learning have greatly improved our ability to predict air pollution at high spatiotemporal resolution over large domains and time periods. Air pollution models with high predictive performance and low uncertainty are critical for estimating population- and individual-level human exposures and their health risks, both in retrospective studies and for forecasting. This talk will present advances in air pollution modeling that integrate multi-modal data and deep learning and that estimate uncertainty in predictions, which can then inform, or be integrated into, epidemiological analyses. We will also discuss current challenges and future needs for advancing these models, which our team is pursuing for biopreparedness and health-risk applications.
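The abstract does not specify how uncertainty is estimated; one common, simple approach is a deep ensemble, where the spread across ensemble members serves as a predictive-uncertainty estimate. A minimal sketch, assuming per-member pollution predictions are already available:

```python
import numpy as np

def ensemble_uncertainty(predictions):
    """Ensemble-based uncertainty sketch.

    predictions: array-like of shape (members, locations), one row of
    pollution predictions per ensemble member. Returns the ensemble
    mean and standard deviation per location; the latter is a simple
    uncertainty estimate that a downstream epidemiological analysis
    could propagate into health-risk estimates.
    """
    preds = np.asarray(predictions, dtype=float)
    return preds.mean(axis=0), preds.std(axis=0)
```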

Rima Habre (University of Southern California)
15:30
-
16:00
CEST
Anomaly Detection with a Deep Abstaining Classifier Model Under Federated Learning

A deep abstaining classifier (DAC), first introduced for combating label noise, is a regular deep neural network (DNN) classifier with an additional (abstain) class and a custom loss function that permits abstention during training. This allows the DAC to identify and abstain on (i.e., decline to classify) confusing samples, without the need to manually label these cases, while continuing to learn and improve classification performance on the non-abstained samples. The resulting models have been shown to significantly improve accuracy compared to the original DNN, at the cost of reduced coverage. The DAC learns patterns in the data that make prediction unreliable, is more robust to feature noise, and constitutes a useful tool for diagnosing uncertain predictions at inference time. In this talk, we describe how we adapt the DAC to be combined with Federated Learning (FL), allowing distributed training on data from different silos that cannot be openly shared due to privacy concerns, as is frequently the case with health records. We also demonstrate how this DAC+FL model can be applied to anomaly detection and discuss how to configure abstention per silo.
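The custom loss mentioned above can be sketched from the published DAC literature (the speakers' exact formulation may differ): the cross-entropy over the renormalized non-abstain probabilities is down-weighted by the probability of not abstaining, and an alpha-weighted penalty discourages abstention. A minimal numpy version:

```python
import numpy as np

def dac_loss(probs, labels, alpha):
    """Sketch of a DAC-style abstention loss.

    probs:  (N, K+1) softmax outputs, where the last column is the
            abstain class
    labels: (N,) true class indices in [0, K)
    alpha:  abstention penalty weight; higher alpha discourages
            abstaining and increases coverage
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    p_abs = probs[:, -1]
    p_true = probs[np.arange(len(labels)), labels]
    # cross-entropy over the renormalized non-abstain mass, weighted
    # by the probability of not abstaining, plus an abstention penalty
    ce = -np.log(p_true / (1.0 - p_abs) + 1e-12)
    return (1.0 - p_abs) * ce + alpha * -np.log(1.0 - p_abs + 1e-12)
```

With zero abstain mass the loss reduces to ordinary cross-entropy, while for a confusing sample a small alpha makes abstaining cheaper than a confident wrong prediction.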

Cristina Garcia Cardona and Jamal Mohd-Yusof (Los Alamos National Laboratory)
16:00
-
16:30
CEST
Data-Driven Agent Based Modeling for Precision Public Health

High-quality, accurate, and real-time information about disease spread is critical for rapid response to a biothreat. Electronic health data are collected by approximately 90% of all physicians in the United States; however, their broad use for public health surveillance and monitoring is inhibited by a lack of organization around common, high-quality information. We use AI to automatically code unstructured clinical documents. During a biothreat scenario, these tools will help synthesize data across multiple medical-records systems and enable the rapid identification of vulnerable populations. In addition to extracting important information from clinical records, we are developing an autonomous biothreat agent that scans reputable public health surveillance reports and identifies emerging threats. Once an emerging threat is identified, our AI agents will be able to search the existing scientific literature to extract the disease-specific parameters needed for epidemiological modeling. AI is helping us develop the necessary infrastructure for near-real-time situational readiness, so that we will be prepared for the next COVID-like event.
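The abstract describes an AI agent pipeline, not a specific algorithm; purely to illustrate the triage stage of such a pipeline, a toy keyword filter over surveillance reports might look like the following (the real system uses AI agents, not keyword matching, and all names here are hypothetical):

```python
def flag_emerging_threats(reports, watch_terms):
    """Toy sketch of the scanning step: return surveillance reports
    that mention any watch-listed term. Stands in for the AI-based
    triage described in the talk; case-insensitive substring match."""
    return [r for r in reports
            if any(term in r.lower() for term in watch_terms)]
```

A flagged report would then be handed to downstream agents that search the literature for disease-specific modeling parameters.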

Heidi Hanson (Oak Ridge National Laboratory)
16:30
-
17:00
CEST
Speeding Up LLM Inference via Sequential Speculative Decoding

As Large Language Models (LLMs) grow in size and capability, their high computational cost poses a major challenge for real-time applications, making efficient inference a critical research problem. Speculative Decoding (SD) has emerged as a promising technique to accelerate LLM inference by leveraging a smaller draft model to generate candidate tokens, which are then verified in parallel by a larger target model to ensure statistical consistency. However, the need for frequent verification calls to the target LLM limits the potential speedup of SD. We propose SPRINTER, which utilizes a low-complexity verifier trained to predict whether tokens generated by the draft model would be accepted by the target LLM. By performing approximate sequential verification, SPRINTER eliminates the need for constant verification by the target LLM, which is invoked only when the verifier deems a token unacceptable. This significantly reduces the number of calls to the larger model, enabling further acceleration. We present a theoretical analysis of SPRINTER, examining the statistical properties of the generated tokens and the expected reduction in latency as a function of the verifier. Our evaluations on multiple datasets and model pairs demonstrate that approximate verification can maintain high-quality generation while achieving even greater speedups.
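The control flow described above can be sketched generically: a draft model proposes each token, a cheap verifier predicts acceptance, and the target model is called only on rejection. This is a schematic of the idea, not the authors' implementation; all three callables are hypothetical stand-ins:

```python
def sequential_speculative_decode(draft_next, verifier_accept,
                                  target_next, prompt, max_tokens):
    """Sketch of SPRINTER-style sequential speculative decoding.

    draft_next(tokens) -> token:        cheap draft-model proposal
    verifier_accept(tokens, tok) -> bool: low-cost acceptance predictor
    target_next(tokens) -> token:       expensive target-model call,
                                        made only on verifier rejection
    Returns the generated sequence and the number of target calls.
    """
    out = list(prompt)
    target_calls = 0
    for _ in range(max_tokens):
        tok = draft_next(out)            # cheap proposal
        if verifier_accept(out, tok):    # approximate verification
            out.append(tok)
        else:
            out.append(target_next(out)) # fall back to the target model
            target_calls += 1
    return out, target_calls
```

The speedup comes from target_calls growing with the verifier's rejection rate rather than with the sequence length, at the cost of approximate (rather than exact) consistency with the target model.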

Meiyu Zhong, Noel Teku, and Ravi Tandon (The University of Arizona)