Discovery of Molecular Targets for Identifying the Top Six Human Disease-Causing Shiga-Toxigenic E. coli Non-O157 Strains

Harhay, Gregory; Bono, James
USDA - Agricultural Research Service
Over 100 different serotypes of Shiga-toxigenic E. coli (STEC) have been reported to cause disease in humans. In North America, approximately half of human cases are caused by serotype O157:H7, with the remaining cases caused by non-O157 serotypes. Among the non-O157 serotypes, six are responsible for 70-80% of reported cases. Because of the number of reported cases attributed to these six serotypes and their ability to cause the same disease in humans as E. coli O157, the Food Safety and Inspection Service has deemed them adulterants in beef trim. Twenty years of E. coli O157:H7 research has led to reliable culture methods and molecular tests for detecting and identifying this pathogen. These molecular tests could be developed because genomic sequencing gave researchers a better understanding of the O157 genome. However, little is known about the genomic content of non-O157 STEC. Designing molecular markers for non-O157 STEC serotypes requires additional genomic sequence for genome comparison: the more strains compared, the better the chance of finding serotype-specific molecular markers. Currently, GenBank holds genomic information for one strain each of three non-O157 STEC serotypes: O26:H11, O111:H8, and O103:H2. We sequenced the genomes of representative strains from the six most frequently reported non-O157 serotypes that have caused disease in humans (O26, O111, O103, O121, O45, and O145). This approach provided a more robust method for finding genomic variation distinct to each non-O157 STEC serotype and provides candidate molecular markers for developing assays that distinguish STEC serotypes O26, O111, O103, O121, O45, and O145.

1. Sequence genomes of 30 Shiga-toxigenic E. coli (STEC) non-O157 strains of the following serotypes: O26, O111, O103, O121, O45, and O145.

2. Identify serotype-specific DNA sequences that can be used to develop a molecular assay for detecting the top six human disease-causing non-O157 STEC serotypes.

Approach: Genomic sequencing of non-O157 STEC and sequence analysis. We propose to sequence five strains from each of the non-O157 STEC serotypes O26, O45, O103, O111, O121, and O145. The U.S. Meat Animal Research Center has nearly finished developing an in-house, integrated, and automated microbial genome sequencing and annotation system (completion is pending delivery of a computer cluster, estimated December 2011). The system takes a microbial isolate as input and outputs an annotated, assembled genomic sequence suitable for GenBank submission. It leverages our recent investment in a custom laboratory information system, integrated with our new computer cluster and high-capacity storage, to track every step in the sequencing pipeline, from strain metadata (source, sample processing protocols, etc.) through sequencing, assembly, and annotation. Genomic DNA will be extracted from each strain using a Qiagen DNA extraction kit, and shotgun DNA sequencing libraries will be made for both the PacBio and 454 DNA sequencers. We will combine PacBio single-molecule real-time (SMRT) sequencing reads with 454 reads to generate robust hybrid assemblies: the longer SMRT reads identify misassemblies and span gaps in 454-only assemblies, while the 454 reads improve base-call confidence in the SMRT reads. Our automated genome assembly and annotation pipeline facilitates the routine comparison of genomes that have been uniformly assembled and annotated.

Genome comparison and nucleotide polymorphism validation. Annotated genomes will be compared using software that our current collaborators are still developing. The software concatenates the genes common to all the strains being compared and then aligns the concatenated sequences. Phylogeny software builds a tree from the aligned sequences. The nucleotide polymorphisms responsible for the branches are exported to design assays for the Sequenom MassARRAY analyzer, which can analyze up to 40 nucleotide polymorphisms at a time.
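The core idea of the genome comparison step, finding polymorphisms that separate one serotype from all the others, can be illustrated with a minimal Python sketch. This is not the collaborators' software, which is not described in detail here. It skips the phylogenetic tree entirely and simply scans the columns of an already aligned, concatenated sequence for bases that are fixed within one serotype and absent from every other. The function name `serotype_specific_sites`, the strain labels, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def serotype_specific_sites(aligned, serotype_of):
    """Return {serotype: [(position, base), ...]} for alignment columns where
    every strain of the serotype carries `base` and no other strain does.

    aligned     -- {strain_name: aligned concatenated sequence}, equal lengths
    serotype_of -- {strain_name: serotype label}
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()

    # Group strain names by serotype.
    strains_by_serotype = defaultdict(list)
    for strain in aligned:
        strains_by_serotype[serotype_of[strain]].append(strain)

    markers = defaultdict(list)
    for pos in range(length):
        for serotype, members in strains_by_serotype.items():
            inside = {aligned[s][pos] for s in members}
            if len(inside) != 1:
                continue  # base is not fixed within this serotype
            base = next(iter(inside))
            if base not in "ACGT":
                continue  # ignore gaps and ambiguous calls
            outside = {
                aligned[s][pos] for s in aligned if serotype_of[s] != serotype
            }
            if base not in outside:
                markers[serotype].append((pos, base))
    return dict(markers)


# Toy alignment (hypothetical data, two strains per serotype).
aligned = {
    "O26_a": "ACGTAC",
    "O26_b": "ACGTAC",
    "O111_a": "ATGTAC",
    "O111_b": "ATGTAG",
    "O103_a": "ACGAAC",
    "O103_b": "ACGAAC",
}
serotype_of = {name: name.split("_")[0] for name in aligned}

markers = serotype_specific_sites(aligned, serotype_of)
for serotype, sites in sorted(markers.items()):
    print(serotype, sites)
```

In this toy data, O26 has no column that distinguishes it from both other serotypes, so it yields no candidates. This is the situation the proposal addresses by sequencing more strains and comparing more of the genome.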
We propose to interrogate between 60 and 80 nucleotide polymorphisms per serotype to identify those specific to each serotype. The polymorphisms will be validated against a panel of 768 strains: 192 O157:H7, three O157:non-H7, four O55:H7, two O55:H6, 83 O111 STEC, 23 O111 non-STEC, 80 O26 STEC, 30 O26 non-STEC, nine O45 STEC, three O45 non-STEC, 24 O103 STEC, seven O103 non-STEC, five O145 STEC, one O145 non-STEC, six O121 STEC, six O121 non-STEC, 11 other E. coli STEC, 176 E. coli O-antigen standards, 61 Salmonella, and 42 other bacteria. (ARS Project No. 5438-42000-015-03)
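The figures in the validation plan can be checked with a few lines of Python. The sketch below tallies the panel and estimates how many MassARRAY multiplexes each serotype would need, given the stated limit of 40 polymorphisms per run. It assumes that polymorphisms for different serotypes are not pooled into the same multiplex, which the plan does not specify. The dictionary keys and variable names are illustrative only.

```python
import math

# Validation panel as listed in the approach (strain group -> count).
panel = {
    "O157:H7": 192, "O157:non-H7": 3, "O55:H7": 4, "O55:H6": 2,
    "O111 STEC": 83, "O111 non-STEC": 23,
    "O26 STEC": 80, "O26 non-STEC": 30,
    "O45 STEC": 9, "O45 non-STEC": 3,
    "O103 STEC": 24, "O103 non-STEC": 7,
    "O145 STEC": 5, "O145 non-STEC": 1,
    "O121 STEC": 6, "O121 non-STEC": 6,
    "other E. coli STEC": 11,
    "E. coli O-antigen standards": 176,
    "Salmonella": 61,
    "other bacteria": 42,
}
total_strains = sum(panel.values())

MAX_PLEX = 40  # polymorphisms analyzed at a time, per the approach


def multiplexes_needed(n_polymorphisms):
    """Number of runs needed if each run holds at most MAX_PLEX polymorphisms."""
    return math.ceil(n_polymorphisms / MAX_PLEX)


# Assumption: each serotype's polymorphisms are run separately.
low = multiplexes_needed(60)
high = multiplexes_needed(80)

print(f"panel size: {total_strains}")
print(f"multiplexes per serotype: {low} to {high}")
```

The listed groups sum to the stated 768 strains. Under the no-pooling assumption, both 60 and 80 polymorphisms fit in two multiplexes per serotype.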

Funding Source
Nat'l. Cattlemen's Beef Assoc.
Escherichia coli
Bacterial Pathogens