Reference-based multiple sequence alignment

SINA adds sequences to an existing multiple sequence alignment (MSA) without modifying the original alignment. Adding sequences with SINA is usually faster than re-computing the entire MSA from an augmented set of unaligned sequences, but the primary benefit is that it protects the investments made in the original MSA, such as manual curation of the alignment, compute-intensive phylogenetic tree reconstruction, and taxonomic annotation of the resulting phylogeny.

Additionally, SINA includes a homology search that uses the previously computed alignment to determine the reference sequences most similar to a query. Based on the search results, a lowest common ancestor (LCA) based classification of the query sequence can be computed from the taxonomic classifications assigned to the sequences in the reference MSA.
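To illustrate the idea behind an LCA-based classification, here is a minimal Python sketch. It is not SINA's implementation, which may differ, for example in how it treats hits that disagree. The function name `lca_classification`, the semicolon-separated taxonomy paths, and the example hits are all assumptions made for this illustration.

```python
from typing import Sequence


def lca_classification(taxonomies: Sequence[str], sep: str = ";") -> str:
    """Return the deepest taxonomy path shared by all input paths.

    Hypothetical helper for illustration only; not part of SINA.
    """
    # Split each path into its ranks, dropping empty entries
    # (e.g. from a trailing separator).
    paths = [
        [rank.strip() for rank in tax.split(sep) if rank.strip()]
        for tax in taxonomies
    ]
    if not paths:
        return ""

    common = []
    # zip stops at the shortest path, so the result can never be
    # deeper than the shallowest classification among the hits.
    for ranks in zip(*paths):
        if all(rank == ranks[0] for rank in ranks):
            common.append(ranks[0])
        else:
            break
    return sep.join(common)


# Made-up taxonomy paths standing in for the classifications of the
# reference sequences returned by a homology search.
hits = [
    "Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales",
    "Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales",
    "Bacteria;Proteobacteria;Alphaproteobacteria;Rhizobiales",
]
print(lca_classification(hits))  # Bacteria;Proteobacteria
```

The query is assigned only as deep as all of its closest reference sequences agree, so conflicting hits yield a shallower but safer classification.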

SINA is used to compute the small and large subunit ribosomal RNA alignments provided by the SILVA project and can use the ARB format reference databases released by the project.

An online version of SINA is provided by the SILVA project.

Publication

If you use SINA in your work, please cite:

Pruesse E, Peplies J, Glöckner FO. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics. 2012;28(14):1823-9. doi:10.1093/bioinformatics/bts252