Paper

Accelerated CNN-based Scans for Traces of Positive Selection

Monday, June 16, 2025

17:30–18:00 CEST

Climate, Weather and Earth Sciences

Chemistry and Materials

Computer Science and Applied Mathematics

Engineering

Life Sciences

Physics

Presenter

Sjoerd van den Belt

University of Twente

I am a PhD student at the University of Twente with an interest in model and hardware co-design for machine learning. My main research focuses on integrating novel analog processing elements within digital accelerators for energy-efficient computer vision. My work on accelerating the detection of positive selection in genomic data combines theoretical knowledge of selective sweeps with hardware-aware machine learning model design. This approach leads to fast and energy-efficient inference on genomic data that can be accelerated using dedicated hardware.

Description

Positive natural selection is the driving force that enables species to survive and reproduce in their environment. Localizing traces of positive selection has practical applications in studying virus evolution and designing more effective drug treatments. State-of-the-art methods for detecting positive selection combine Convolutional Neural Networks (CNNs) with sliding-window algorithms to scan genomic sequences with high precision, but they require prohibitively long execution times to process whole genomes at fine-grained resolution. We present an FPGA-accelerated system for efficiently scanning whole genomes with high granularity, implementing a quantized version of FAST-NN, a CNN designed through a hardware-aware neural architecture search. FAST-NN uses a compact representation of genomic data as features, which eliminates potential I/O bottlenecks in hardware. Our accelerator architecture consists of a dedicated stage for each CNN layer in a pipelined datapath that integrates a specialized buffer design; this enables data reuse between overlapping sliding windows by leveraging the dilated convolutions in FAST-NN. A design point implemented on an Alveo U250 accelerator card achieves accuracy comparable to FAST-NN, with a maximum reduction of only 2.2% due to quantization, while producing a classification outcome in each clock cycle at a frequency of 100 MHz. Scanning the entire human genome (excluding the sex chromosomes), we observed between 19.51× and 28.61× faster processing than a PyTorch implementation on a 16-core CPU, and between 1.22× and 2.89× faster processing than a high-end GPU. The architecture is adaptable to other domains where CNNs are deployed in sliding-window algorithms for large-scale data processing.
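The data-reuse idea mentioned above can be illustrated with a minimal software sketch. It is not the authors' buffer design or the FAST-NN architecture: the function names, the single-layer setup, and the window, stride, and dilation values are all hypothetical, chosen only to show why overlapping sliding windows over a dilated 1D convolution share intermediate results that need to be computed just once.

```python
def dilated_conv1d(x, weights, dilation):
    """Valid (no padding) dilated 1D cross-correlation over a list of numbers."""
    span = (len(weights) - 1) * dilation  # receptive field minus one
    return [
        sum(w * x[i + k * dilation] for k, w in enumerate(weights))
        for i in range(len(x) - span)
    ]


def scan_naive(x, weights, dilation, window, stride):
    """Recompute the convolution from scratch for every sliding window."""
    return [
        dilated_conv1d(x[s:s + window], weights, dilation)
        for s in range(0, len(x) - window + 1, stride)
    ]


def scan_reuse(x, weights, dilation, window, stride):
    """Convolve the whole sequence once, then slice out each window's outputs."""
    full = dilated_conv1d(x, weights, dilation)
    out_len = window - (len(weights) - 1) * dilation  # outputs per window
    return [
        full[s:s + out_len]
        for s in range(0, len(x) - window + 1, stride)
    ]


# Hypothetical toy input: 20 integer "features", 3-tap kernel, dilation 2.
sequence = [(7 * i) % 11 for i in range(20)]
kernel = [1, -2, 1]
naive = scan_naive(sequence, kernel, dilation=2, window=8, stride=1)
reused = scan_reuse(sequence, kernel, dilation=2, window=8, stride=1)
print(naive == reused)
```

Both scans return identical per-window outputs, but the second computes each convolution output only once, regardless of how many windows cover it. A hardware pipeline can exploit the same property by buffering layer outputs instead of recomputing them for each window position.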

