VICTOR: genome-based phylogeny and classification of prokaryotic viruses

Meier-Kolthoff, Jan P.; Göker, Markus
Bioinformatics 2017 v.33 no.21 pp. 3396-3404
Archaea, Internet, biogeochemical cycles, bioinformatics, data collection, databases, genome, monophyly, nucleotide sequences, pathogens, therapeutics, viruses
Bacterial and archaeal viruses are crucial for global biogeochemical cycles and might well be game-changing therapeutic agents in the fight against multi-resistant pathogens. Nevertheless, it is still unclear how to best use genome sequence data for a fast, universal and accurate taxonomic classification of such viruses. We here present a novel in silico framework for phylogeny and classification of prokaryotic viruses, in line with the principles of phylogenetic systematics, and using a large reference dataset of officially classified viruses. The resulting trees revealed a high agreement with the classification. Except for low resolution at the family level, the majority of taxa were well supported as monophyletic. Clusters obtained with distance thresholds chosen for maximizing taxonomic agreement appeared phylogenetically reasonable, too. Analysis of an expanded dataset, containing >4000 genomes from public databases, revealed a large number of novel species, genera, subfamilies and families. The selected methods are available as the easy-to-use web service ‘VICTOR’. Supplementary data are available at Bioinformatics online.