Burrows-Wheeler Aligner algorithm example

05.03.2019 Ciera

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries. The reference genome can be very large: the human genome, for example, is around 3 billion nucleotide bases.

The BWA package consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp.

The example implementation uses the D array to prune the tree search and speed up the inexact search algorithm. The search is case insensitive. It differs from the real BWA algorithm in that it does not implement any of the heuristic methods included in the actual BWA; it implements only the basic algorithm, which returns 100% accurate results. It has been coded for ease of understanding and ease of reading, with the goal of better understanding the basic BWA algorithm and its data structures.

The example implementation uses the Burrows-Wheeler transform (BWT), the suffix array (SA), and two other auxiliary data structures, C and Occ. BWA stands for Burrows-Wheeler Aligner, a popular software package for mapping low-divergent sequences against a large reference genome, such as the human genome.
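To make the four data structures concrete, here is a minimal Python sketch that builds them naively and uses them for exact backward search. It is an illustration under assumptions, not the code of the real BWA: the function names `build_index` and `exact_match` are hypothetical, the suffix array is built by plain sorting, and `$` is assumed as the end-of-text marker.

```python
def build_index(reference):
    """Build SA, BWT, C and Occ for `reference` (naive, for illustration only)."""
    text = reference.lower() + "$"   # '$' marks the end and sorts before a/c/g/t
    # Suffix array: start positions of all suffixes in lexicographic order.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around for suffix 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text) - {"$"})
    # C[a]: number of characters in the text that are strictly smaller than a.
    c = {a: sum(ch < a for ch in text) for a in alphabet}
    # Occ[a][i]: number of occurrences of a in bwt[:i].
    occ = {a: [0] for a in alphabet}
    for ch in bwt:
        for a in alphabet:
            occ[a].append(occ[a][-1] + (a == ch))
    return sa, bwt, c, occ


def exact_match(read, sa, c, occ):
    """Return the sorted start positions of `read` using backward search."""
    lo, hi = 0, len(sa)              # half-open suffix array interval [lo, hi)
    for ch in reversed(read.lower()):
        if ch not in c:              # character never occurs in the reference
            return []
        lo = c[ch] + occ[ch][lo]
        hi = c[ch] + occ[ch][hi]
        if lo >= hi:                 # interval is empty: no occurrence
            return []
    return sorted(sa[i] for i in range(lo, hi))
```

A call such as `sa, bwt, c, occ = build_index("ACGTACGT")` followed by `exact_match("cgt", sa, c, occ)` walks the read from its last character to its first, narrowing the suffix array interval at each step. Storing the full Occ table costs memory proportional to the reference length times the alphabet size, which is acceptable for a teaching sketch but not for a genome of 3 billion bases.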

The example code is not optimised in any way.
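The D-array pruning mentioned above can be sketched as follows. This is a simplified illustration, not the real BWA code: it allows mismatches only (no insertions or deletions), uses a recursive depth-first search, and all names (`build_index`, `step`, `calculate_d`, `inexact_match`) are hypothetical. D[i] is computed on an index of the reversed reference and gives a lower bound on the number of differences needed to align the first i + 1 characters of the read, so any branch whose remaining budget falls below D[i] can be abandoned early.

```python
def build_index(reference):
    """Naive SA, C and Occ for `reference`; '$' is the assumed end marker."""
    text = reference.lower() + "$"
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = [text[i - 1] for i in sa]
    alphabet = sorted(set(text) - {"$"})
    c = {a: sum(ch < a for ch in text) for a in alphabet}
    occ = {a: [0] for a in alphabet}
    for ch in bwt:
        for a in alphabet:
            occ[a].append(occ[a][-1] + (a == ch))
    return sa, c, occ


def step(ch, lo, hi, c, occ):
    """One backward-search step on the half-open SA interval [lo, hi)."""
    return c[ch] + occ[ch][lo], c[ch] + occ[ch][hi]


def calculate_d(read, rev_c, rev_occ, n):
    """D[i]: lower bound on the differences needed to align read[:i + 1]."""
    d, z = [], 0
    lo, hi = 0, n
    for ch in read:
        if ch in rev_c:
            lo, hi = step(ch, lo, hi, rev_c, rev_occ)
        else:
            lo, hi = 0, 0
        if lo >= hi:          # current substring is absent from the reference
            z += 1            # so at least one more difference is required
            lo, hi = 0, n     # start a new substring at the next character
        d.append(z)
    return d


def inexact_match(read, reference, max_diff):
    """Start positions where `read` matches with at most `max_diff` mismatches."""
    read = read.lower()       # the search is case insensitive
    sa, c, occ = build_index(reference)
    _, rev_c, rev_occ = build_index(reference[::-1])
    d = calculate_d(read, rev_c, rev_occ, len(sa))
    hits = set()

    def recurse(i, z, lo, hi):
        # Prune: the remaining budget z cannot cover the lower bound D[i].
        if z < (d[i] if i >= 0 else 0):
            return
        if i < 0:             # whole read consumed: report the interval
            hits.update(sa[k] for k in range(lo, hi))
            return
        for a in c:           # try every base at read position i
            nlo, nhi = step(a, lo, hi, c, occ)
            if nlo < nhi:
                recurse(i - 1, z - (a != read[i]), nlo, nhi)

    recurse(len(read) - 1, max_diff, 0, len(sa))
    return sorted(hits)
```

For example, `inexact_match("GATTTCA", "GATTACA", 1)` finds position 0, while the same call with a budget of 0 is rejected at the very first pruning check because D already shows that at least one difference is required.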