Performance Comparison and Tuning of Virtual Machines For Sequence Alignment Software


Zachary John Estrada
Fei Deng
Zachary Stephens
Cuong Pham
Zbigniew Kalbarczyk
Ravishankar Iyer

Abstract


We explore the performance cost of virtualisation for the fast-growing application domain of genomics. Traditionally, scientific applications have been considered too performance-sensitive to tolerate the cost of virtualisation. However, as the demand for computing power in genomics keeps increasing, the cloud becomes an attractive way to meet the scaling challenge presented by Next-Generation Sequencing (NGS). We explore the feasibility of running an NGS pipeline in a cloud, and in doing so consider two prevalent short-read sequence alignment programs, BWA and Novoalign. We executed those applications under three separate open-source system virtualisation solutions: the KVM hypervisor, the Xen para-virtualised hypervisor, and Linux Containers. We compare the runtime in each environment against the runtime on the same system without virtualisation and measure the relative performance of each solution. We investigate the sources of overhead and reduce them as far as possible, presenting tuning suggestions for cloud implementers and users. Overall, we find that the overhead introduced by virtualisation can be reduced to low single-digit percentages, a cost we believe to be more than acceptable, especially given that two of the three solutions, Xen and Containers, exhibit near-zero overhead.
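
The comparison described in the abstract amounts to expressing each virtualised runtime relative to the bare-metal runtime of the same workload. A minimal sketch of that arithmetic follows. The function name `relative_overhead`, the environment labels, and the runtimes are hypothetical placeholders chosen for illustration; they are not the authors' tooling or measurements.

```python
def relative_overhead(virtual_s: float, native_s: float) -> float:
    """Return the fractional slowdown of a virtualised run versus bare metal.

    0.03 means the virtualised run took 3% longer than the native run.
    """
    if native_s <= 0:
        raise ValueError("native runtime must be positive")
    return (virtual_s - native_s) / native_s


# Hypothetical wall-clock runtimes in seconds for one alignment job.
# These are illustrative placeholders, not results from the paper.
native_runtime = 1000.0
virtual_runtimes = {"env_a": 1030.0, "env_b": 1000.0}

overheads = {
    name: relative_overhead(runtime, native_runtime)
    for name, runtime in virtual_runtimes.items()
}

for name, overhead in overheads.items():
    print(f"{name}: {overhead:.1%} overhead")
```

In practice each runtime would be averaged over several repeated runs before computing the ratio, since a single run can be distorted by caching or background activity.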

Article Details

Section
Special Issue Papers