Accelerating Sequence Alignments Based on FM-Index Using the Intel KNL Processor

Herruzo, Jose M.; Gonzalez-Navarro, Sonia; Ibanez-Marin, Pablo; Vinals-Yufera, Victor; Alastruey-Benede, Jesus; Plata, Oscar

Publicación: IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
2020
VL / 17 - BP / 1093 - EP / 1104
abstract
FM-index is a compact data structure suitable for fast matches of short reads to large reference genomes. The matching algorithm using this index exhibits irregular memory access patterns that cause frequent cache misses, resulting in a memory bound problem. This paper analyzes different FM-index versions presented in the literature, focusing on those computing aspects related to the data access. As a result of the analysis, we propose a new organization of FM-index that minimizes the demand for memory bandwidth, allowing a great improvement of performance on processors with high-bandwidth memory, such as the second-generation Intel Xeon Phi (Knights Landing, or KNL), integrating ultra high-bandwidth stacked memory technology. As the roofline model shows, our implementation reaches 95 percent of the peak random access bandwidth limit when executed on the KNL and almost all of the available bandwidth when executed on other Intel Xeon architectures with conventional DDR memory. In addition, the obtained throughput in KNL is much higher than the results reported for GPUs in the literature.

Access level

MENTIONS DATA