Abstract
Sequencing technologies have developed rapidly in recent years. The new technologies produce data on the order of several gigabase-pairs per day, in the form of short and numerous sequences. These short sequences are often used for resequencing, in which sequenced DNA from an organism with a known genome sequence is aligned to a reference genome of that organism. Performing this alignment with traditional alignment tools such as BLAST has proved too time-consuming, and several new short sequence aligners have therefore been developed. These new tools are much faster than the traditional ones. I wanted to study whether a GPU could be used to create a faster tool for this task, because GPUs can speed up algorithms through massive parallelism.
I have developed GPUalign, a short read alignment tool. It uses a simple hash-based index algorithm and aligns the reads with massive parallelism on the GPU. Tests evaluated the speed and accuracy of GPUalign and compared it with the state-of-the-art tool BWA. GPUalign performed well and showed great potential for the use of GPUs in short sequence alignment. At the same time, GPUalign has much room for further improvement in speed and accuracy. GPUalign should also scale well with future improvements in GPU technology.