- Wanner, Barry
- Purdue University
- Start date
- End date
- Basic knowledge of the biochemistry, molecular biology, and genetics of Escherichia coli K-12 may be more advanced than for any other organism. This information ranges from environmental physiologic responses down to the atomic level, and its sheer volume makes it exceedingly difficult, even for experts, to fully grasp and utilize. While many databases, websites, and computational tools enable access to and interrogation of these biological data, none is comprehensive and many gaps exist. The quality and quantity of these resources vary; many have redundant, conflicting, and out-of-date information. Since data typically cannot be traced back to sources, reliability is constantly in question. Existing information systems are disorganized, overwhelmed by current knowledge, and not prepared for handling new information from high-throughput experimentation (genome-wide transcription, translation, and metabolite profiling, large-scale structural biology, enhanced imaging of living cells, and the like).
The proposed integrated 'one-stop-shopping' E. coli community information Resource, EcoliHub, will permit full use of existing knowledge and enable new discoveries for a deeper understanding of life processes. These tools and the advances they allow will greatly impact human health, both through application to pathogenic bacteria (especially enteropathogenic E. coli, Shigella, and Salmonella species) and because many cellular processes are universal.
This EcoliHub Resource builds on our prototype database designed for seamless and transparent bi-directional connections with cooperating interoperable resources (EcoCyc, Genobase, and the OU microarray DB) and with major biological databases (e.g., ERIC, NCBI Entrez Gene, UniProt, and KEGG). EcoliHub will provide a chat room to facilitate community-driven developments that meet the needs of the users. It will offer on-the-fly computational tools for molecular and structural analysis, such as 3D-PSSM, BLAST/PSI-BLAST, BLOCKS, Pfam, and PredictProtein.
The Resource will maintain a depository of related E. coli informatics from specialized databases including BRENDA, GeneMark, GenProtEC, GO, LipoP, MEROPS, PORES, RegulonDB, Superfamily, and TransportDB. Schemata will be designed to annotate and track changes within the information sources. A module for biosamples (strains, plasmids, etc.) will facilitate materials acquisition from national repositories (the Coli Genetic Stock Center and others). Interoperation and provision of information will be achieved with web services, and a web-based architecture will provide public access to the Resource, including intuitive software for data mining and high-throughput functional studies.
- Funding Source
- Nat'l. Inst. of General Medical Sciences
- Project number
- Escherichia coli