Molecular complexity of successive bacterial epidemics deconvoluted by comparative pathogenomics

Beres, Stephen B., Carroll, Ronan K., Shea, Patrick R., Sitkiewicz, Izabela, Martinez-Gutierrez, Juan Carlos, Low, Donald E., McGeer, Allison, Willey, Barbara M., Green, Karen, Tyrrell, Gregory J., Goldman, Thomas D., Feldgarden, Michael, Birren, Bruce W., Fofanov, Yuriy, Boos, John, Wheaton, William D., Honisch, Christiane, Musser, James M.
Proceedings of the National Academy of Sciences of the United States of America 2010 v.107 no.9 pp. 4371-4376
Streptococcus, clones, disease outbreaks, genes, high-throughput nucleotide sequencing, infectious diseases, mass spectrometry, microarray technology, nucleotide sequences, phenotype, population structure, serotypes, single nucleotide polymorphism, transcriptome, Ontario
Understanding the fine-structure molecular architecture of bacterial epidemics has been a long-sought goal of infectious disease research. We used short-read-length DNA sequencing coupled with mass spectrometry analysis of SNPs to study the molecular pathogenomics of three successive epidemics of invasive infections involving 344 serotype M3 group A Streptococcus in Ontario, Canada. Sequencing the genomes of 95 strains from the three epidemics, coupled with analysis of 280 biallelic SNPs in all 344 strains, revealed an unexpectedly complex population structure composed of a dynamic mixture of distinct clonally related complexes. We discovered that each epidemic is dominated by micro- and macrobursts of multiple emergent clones, some with distinct strain genotype-patient phenotype relationships. On average, strains were differentiated from one another by only 49 SNPs and 11 insertion-deletion events (indels) in the core genome. Ten percent of SNPs are strain specific; that is, each strain has a unique genome sequence. We identified nonrandom temporal-spatial patterns of strain distribution within and between the epidemic peaks. The extensive full-genome data permitted us to identify genes with significantly increased rates of nonsynonymous (amino acid-altering) nucleotide polymorphisms, thereby providing clues about selective forces operative in the host. Comparative expression microarray analysis revealed that closely related strains differentiated by seemingly modest genetic changes can have significantly divergent transcriptomes. We conclude that enhanced understanding of bacterial epidemics requires a deep-sequencing, geographically centric, comparative pathogenomics strategy.