Microbial population genetic research has been crucial for understanding pathogen dynamics, virulence, host specificity, and many other topics, in many cases uncovering unexpected and transformative biological processes. However, conventional population genetic analyses are limited by the quantity of sequence data from each sample. The temporal, spatial, and evolutionary resolution of techniques that rely on single gene sequences or multi-locus sequence typing is often insufficient to study biological processes on fine scales, precisely the scales at which many evolutionary and mechanistic processes occur. Population genomics offers a vast quantity of sequence information for inferring evolutionary and ecological processes on very fine spatial and temporal scales, inferences that are critical to understanding and eventually controlling many infectious diseases. The promise of population genomics is tempered, however, by difficulties in isolating and preparing microbes for next-generation sequencing. We have developed the selective whole genome amplification (SWGA) technology to sequence microbial genomes from complex biological specimens without relying on labor-intensive laboratory culture, even if the focal microbial genome constitutes only a minuscule fraction of the natural sample. The primary hindrance to popular adoption of SWGA for microbial genomic studies is not its effectiveness in producing samples suitable for next-generation sequencing but the upfront investment needed to develop an effective protocol to amplify the genome of a specific microbial species. Identifying an SWGA protocol that consistently results in selective and even amplification across the target genome is currently hindered by computationally inefficient software that can evaluate only a very limited set of the potentially effective solutions. 
Further, this software uses marginally effective optimality criteria, as there is currently only a limited understanding of the true criteria that result in highly selective and even amplification of a target genome. As a result, SWGA protocol development is currently costly in both time and resources. A primary goal of the proposed research is to identify the criteria that result in optimal SWGA by analyzing next-generation sequencing data with advanced machine learning techniques. These optimality criteria will be integrated into a freely available, computationally efficient SWGA development program that will reduce the upfront investment in SWGA protocol development, thus allowing researchers to address medically and biologically important questions in any microbial species. In the near term, this project will also generate effective SWGA protocols for four microbial species, which can be used immediately to address fundamental questions in evolutionary biology, disease progression, and emerging infectious disease dynamics. From a global disease perspective, this work is imperative, as the majority of microbial species cannot easily be cultured and are in danger of becoming bystanders in the genomics revolution that is currently elucidating evolutionary processes and molecular mechanisms in cultivable microbial species.

Brisson, Dustin
University of Pennsylvania