Discovery of Molecular Targets for Identifying the Top Six Human Disease-Causing Shiga-Toxigenic E. coli Non-O157 Strains

Objective

Over 100 different serotypes of Shiga-toxigenic E. coli have been reported to cause disease in
humans. In North America, approximately half of human cases are caused by serotype O157:H7,
with the remaining cases caused by non-O157 serotypes. Among the non-O157 serotypes, six
are responsible for 70-80% of the reported cases. Because of the number of reported
cases of these six non-O157 serotypes and their ability to cause the same disease in humans as E. coli
O157, the Food Safety and Inspection Service has deemed the six non-O157 serotypes to be adulterants
in beef trim. Twenty years of E. coli O157:H7 research has led to the development of reliable
culture methods and molecular tests for detecting and identifying this pathogen. The molecular tests
for E. coli O157 have been developed because genomic sequencing has given researchers a better
understanding of the genome. However, little is known about the genomic content of other
non-O157 STECs. In order to design molecular markers for non-O157 STEC serotypes, additional
genomic sequence is needed for genome comparison. The more strains used when comparing
genomes, the better the chance of finding serotype-specific molecular markers. Currently, there is
genomic information in GenBank for one strain each from three non-O157 STEC serotypes: O26:H11, O111:H8,
and O103:H2. We sequenced the genomes of representative strains from six of the most reported
non-O157 serotypes that have caused disease in humans (O26, O111, O103, O121, O45 and O145).
This approach provided a more robust method for finding genomic variation that is distinct to each
non-O157 STEC serotype and provided candidate molecular markers for assay development that
distinguish STEC O26, O111, O103, O121, O45 and O145 serotypes.

1. Sequence genomes of 30 Shiga-toxigenic E. coli (STEC) non-O157 strains of the following serotypes: O26, O111, O103, O121, O45, and O145.
2. Identify serotype-specific DNA sequences that can be used to develop a molecular-based assay for the detection of the top six human disease-causing non-O157 STECs.

More information

Approach:
Genomic sequencing of non-O157 STEC and sequence analysis. We propose to sequence five strains from each of the non-O157 STEC serotypes O26, O45, O103, O111, O121, and O145. The U.S. Meat Animal Research Center has nearly finished (pending delivery of the computer cluster, estimated December 2011) developing an in-house integrated and automated microbial genome sequencing and annotation system that takes a microbial isolate as input and outputs an annotated, assembled genomic sequence suitable for GenBank submission. This system leverages our recent investment in a custom laboratory information system integrated with our new computer cluster and high-capacity storage to track every step in the sequencing pipeline, from strain metadata (source, sample processing protocols, etc.) to sequencing, assembly, and annotation. Genomic DNA will be extracted from each strain using a Qiagen DNA extraction kit, and shotgun DNA sequencing libraries will be made for both the PacBio and 454 DNA sequencers. We will combine PacBio single-molecule real-time (SMRT) sequencing and 454 reads to generate robust hybrid assemblies in which longer SMRT reads identify mis-assemblies and span gaps in 454-only assemblies, while 454 reads improve the base-call confidence in SMRT reads. Our automated genome assembly and annotation pipeline facilitates the routine comparison of genomes that have been uniformly assembled and annotated.

Genome comparison and nucleotide polymorphism validation. Annotated genomes will be compared using software that current collaborators are still developing. The software concatenates genes common to all the strains being compared and then aligns the concatenated sequences. Phylogeny software builds a tree from the concatenated sequences. Nucleotide polymorphisms responsible for the branches are exported to design assays for the Sequenom MassARRAY analyzer. Up to 40 nucleotide polymorphisms can be analyzed at a time using this instrument.
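
The idea of pulling serotype-specific polymorphisms out of an alignment of concatenated shared genes can be illustrated with a minimal Python sketch. This is not the collaborators' software, which the project describes only in outline. The function name `serotype_specific_sites`, the strain labels, and the six-base toy sequences are all hypothetical, and the sketch scans alignment columns directly rather than working from tree branches. It reports a column (0-based) for a serotype only when every strain of that serotype carries the same base and no strain of any other serotype carries it.

```python
from collections import defaultdict


def serotype_specific_sites(aligned, serotype_of):
    """Return {serotype: [(column, allele), ...]} for alignment columns where
    all strains of a serotype share one allele that no other strain carries."""
    strains = list(aligned)
    length = len(aligned[strains[0]])
    if any(len(aligned[s]) != length for s in strains):
        raise ValueError("all aligned sequences must have the same length")

    markers = defaultdict(list)
    for pos in range(length):
        # Collect the alleles observed at this column, grouped by serotype.
        alleles = defaultdict(set)
        for s in strains:
            alleles[serotype_of[s]].add(aligned[s][pos])
        for sero, seen in alleles.items():
            if len(seen) != 1:
                continue  # allele is not fixed within this serotype
            allele = next(iter(seen))
            if allele in "-N":
                continue  # ignore gaps and ambiguous base calls
            others = set().union(*(a for o, a in alleles.items() if o != sero))
            if allele not in others:
                markers[sero].append((pos, allele))
    return dict(markers)


# Hypothetical toy alignment (illustrative only, not real STEC sequence).
aligned = {
    "O26_a": "ACGTAC",
    "O26_b": "ACGTAC",
    "O111_a": "ACTTAC",
    "O111_b": "ACTTAG",
    "O103_a": "GCGTAC",
}
serotype_of = {
    "O26_a": "O26", "O26_b": "O26",
    "O111_a": "O111", "O111_b": "O111",
    "O103_a": "O103",
}

markers = serotype_specific_sites(aligned, serotype_of)
print(markers)
```

In the toy data, column 5 varies within O111 and is therefore rejected, which is why sequencing several strains per serotype matters: a site that looks diagnostic with one strain may turn out to be variable once more strains are added.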
We propose to interrogate between 60 and 80 nucleotide polymorphisms per serotype to identify nucleotide polymorphisms specific for each serotype. A total of 768 strains will be used to validate the nucleotide polymorphisms and will include 192 O157:H7, three O157:non-H7, four O55:H7, two O55:H6, 83 O111 STEC, 23 O111 non-STEC, 80 O26 STEC, 30 O26 non-STEC, nine O45 STEC, three O45 non-STEC, 24 O103 STEC, seven O103 non-STEC, five O145 STEC, one O145 non-STEC, six O121 STEC, six O121 non-STEC, 11 other E. coli STEC, 176 E. coli O-antigen standards, 61 Salmonella, and 42 other bacteria. (ARS Project No. 5438-42000-015-03)
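
As a quick arithmetic check on the validation plan, the short Python sketch below tallies the strain panel listed above and estimates how many MassARRAY runs a serotype would need at the stated limit of 40 polymorphisms per run. The dictionary keys, the constant name, and the helper `runs_needed` are illustrative labels introduced here; the counts and the 40-per-run limit are taken from the text.

```python
import math

MAX_SNPS_PER_RUN = 40  # per-run limit stated in the approach

# Validation panel counts as listed in the approach.
panel = {
    "O157:H7": 192, "O157:non-H7": 3, "O55:H7": 4, "O55:H6": 2,
    "O111 STEC": 83, "O111 non-STEC": 23,
    "O26 STEC": 80, "O26 non-STEC": 30,
    "O45 STEC": 9, "O45 non-STEC": 3,
    "O103 STEC": 24, "O103 non-STEC": 7,
    "O145 STEC": 5, "O145 non-STEC": 1,
    "O121 STEC": 6, "O121 non-STEC": 6,
    "other E. coli STEC": 11, "E. coli O-antigen standards": 176,
    "Salmonella": 61, "other bacteria": 42,
}

total_strains = sum(panel.values())


def runs_needed(n_snps, per_run=MAX_SNPS_PER_RUN):
    """Number of runs needed to cover n_snps at per_run polymorphisms each."""
    return math.ceil(n_snps / per_run)


low, high = runs_needed(60), runs_needed(80)
print(total_strains, low, high)
```

The listed groups sum to the stated 768 strains, and both ends of the 60 to 80 range work out to two runs per serotype. Incidentally, 768 equals eight 96-well plates or two 384-well plates, although the plate format is not stated in the project description.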

For complete project details, view the <a href="http://www.beefresearch.org/CMDocs/BeefResearch/FY%202011%20Discovery%2…" target="_blank">Project Summary</a>.

Investigators

Harhay, Gregory; Bono, James

Institution

USDA - Agricultural Research Service

Start date

2011

End date

2012

Funding Source

Nat'l. Cattlemen's Beef Assoc.

Project number

BC-2011-3