Paper

AP1D - ACM Papers Session 1D

Monday, June 16, 2025, 17:00-18:00 CEST

Room 6.0D13

Session Chair

Nina Mujkanovic (ETH Zurich / CSCS)

Presentations

17:00-17:30 CEST

Scalable Genomic Context Analysis with GCsnap2 on HPC Clusters

GCsnap2 Cluster is a scalable, high-performance tool for genomic context analysis, developed to overcome the limitations of its predecessor, GCsnap1 Desktop. Leveraging distributed computing with mpi4py.futures, GCsnap2 Cluster achieves a 22× improvement in execution time and can now perform genomic context analysis for hundreds of thousands of input sequences on HPC clusters. Its modular architecture enables the creation of task-specific workflows and flexible deployment in various computational environments, making it well suited for bioinformatics studies of large-scale datasets. This work highlights the potential for applying similar approaches to solve scalability challenges in other scientific domains that rely on large-scale data analysis pipelines.

Reto Krummenacher and Osman Seckin Simsek (University of Basel); Michèle Leemann, Leila Alexander, and Torsten Schwede (University of Basel, Swiss Institute of Bioinformatics); Florina Ciorba (University of Basel); and Joana Pereira (University of Basel, Swiss Institute of Bioinformatics)
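
As a rough illustration of the executor-style parallelism that mpi4py.futures offers, the sketch below splits a list of sequence identifiers into batches and maps a task over them. It is not GCsnap2's code: `chunked`, `analyse_batch`, and `run_distributed` are hypothetical names, the per-batch task is a placeholder, and the standard library's `ThreadPoolExecutor` is used as a stand-in so the example runs without MPI.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def chunked(items: list[str], size: int) -> Iterable[list[str]]:
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyse_batch(batch: list[str]) -> dict[str, int]:
    """Hypothetical per-batch task (placeholder): map each ID to its length."""
    return {seq_id: len(seq_id) for seq_id in batch}


def run_distributed(
    seq_ids: list[str],
    make_executor: Callable[[], ThreadPoolExecutor],
    batch_size: int = 2,
) -> dict[str, int]:
    """Map the batch task over all batches and merge the partial results."""
    results: dict[str, int] = {}
    with make_executor() as pool:
        # Executor.map returns results in submission order.
        for partial in pool.map(analyse_batch, chunked(seq_ids, batch_size)):
            results.update(partial)
    return results


if __name__ == "__main__":
    print(run_distributed(["A1", "B22", "C333"], ThreadPoolExecutor))
```

Because `mpi4py.futures.MPIPoolExecutor` follows the same `concurrent.futures` executor interface, it could be passed in place of `ThreadPoolExecutor`, with the script then typically launched through `mpiexec -n <N> python -m mpi4py.futures script.py`.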

17:30-18:00 CEST

Accelerated CNN-based Scans for Traces of Positive Selection

Positive natural selection is the driving force that enables species to survive and reproduce in their environment. Localizing traces of positive selection has practical applications in studying virus evolution and designing more effective drug treatments. State-of-the-art methods for the detection of positive selection combine Convolutional Neural Networks (CNNs) with sliding-window algorithms to scan genomic sequences with high precision, but require prohibitively long execution times to process whole genomes with fine-grained resolution. We present an FPGA-accelerated system for efficiently scanning whole genomes with high granularity, implementing a quantized version of FAST-NN, a CNN that has been designed through a hardware-aware neural architecture search. FAST-NN employs a compact representation of genomic data as features, which eliminates potential I/O bottlenecks in hardware. Our accelerator architecture consists of a dedicated stage for each CNN layer in a pipelined datapath that integrates a specialized buffer design; this enables data reuse between overlapping sliding windows by leveraging the dilated convolutions in FAST-NN. A design point implemented on an Alveo U250 accelerator card achieves comparable accuracy to FAST-NN, with a maximum reduction of only 2.2% due to quantization, while producing a classification outcome in each clock cycle at a frequency of 100 MHz. Scanning the entire human genome (excluding the sex chromosomes), we observed between 19.51× and 28.61× faster processing than a PyTorch implementation on a 16-core CPU, and between 1.22× and 2.89× faster processing than a high-end GPU. The architecture is adaptable to other domains where CNNs are deployed in sliding-window algorithms for large-scale data processing.

Sjoerd van den Belt and Nikolaos Alachiotis (University of Twente)
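
To make the data-reuse idea concrete, the sketch below compares two ways of scanning a 1D signal with a dilated convolution: recomputing the layer for every window, and computing it once over the whole signal and slicing the shared result. It is a simplified, single-layer software illustration under assumed names (`dilated_conv1d`, `scan_naive`, `scan_reuse`) and is not the FAST-NN network or the accelerator's buffer design.

```python
def dilated_conv1d(signal: list[int], kernel: list[int], dilation: int) -> list[int]:
    """Valid-mode 1D dilated convolution (cross-correlation form)."""
    span = (len(kernel) - 1) * dilation  # distance covered by the kernel taps
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def scan_naive(signal, kernel, dilation, window, stride=1):
    """Recompute the convolution from scratch for every sliding window."""
    return [
        dilated_conv1d(signal[s:s + window], kernel, dilation)
        for s in range(0, len(signal) - window + 1, stride)
    ]


def scan_reuse(signal, kernel, dilation, window, stride=1):
    """Convolve the whole signal once, then slice the shared output per window."""
    full = dilated_conv1d(signal, kernel, dilation)
    out_len = window - (len(kernel) - 1) * dilation  # outputs per window
    return [
        full[s:s + out_len]
        for s in range(0, len(signal) - window + 1, stride)
    ]


if __name__ == "__main__":
    sig = [1, 2, 3, 4, 5, 6, 7, 8]
    # Both scans give identical per-window outputs; the second one computes
    # each output position only once instead of once per overlapping window.
    assert scan_naive(sig, [1, -1], 2, window=5) == scan_reuse(sig, [1, -1], 2, window=5)
```

Overlapping windows share most of their convolution outputs, which is why buffering intermediate results pays off. As a separate back-of-the-envelope figure, one classification per clock cycle at 100 MHz corresponds to a peak of 10^8 classifications per second.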
