RiceGenes - An Information System for Rice Research

Published in Probe Volume 4(3-4): August 1994-January 1995


Edyth Paul, Makoto Goto and Susan McCouch
Department of Plant Breeding and Biometry
Cornell University

The RiceGenes database came into existence in April 1993, funded by a grant from the USDA Plant Genome Research Program. Housed at Cornell University, the database contains a variety of information related to the rice genome and rice germplasm. It is our hope that the construction and free distribution of the database will provide an information tool that is useful to the international rice research community.

For a number of years, rice DNA probes developed and maintained at Cornell University have been freely distributed to rice researchers around the globe. Not only were these probes used to create the original and the expanded high-density molecular maps at Cornell (McCouch et al., 1988; Causse et al., 1994), but they have also been used in molecular mapping efforts in many other rice-producing countries. The initial focus of the database development effort was to make available all information relating to these maps and markers in order to facilitate their efficient use throughout the world.

Among the data included are detailed descriptions of close to 700 DNA markers, as well as the raw mapping data from the interspecific backcross population that is the basis of the Cornell map. A set of molecular maps developed by other programs using subsets of these markers but based on different mapping populations is in the process of being loaded. In addition, the classical genetic map based on known genes and morphological mutant markers is also available in RiceGenes. Estimates of the molecular weights of RFLP bands detected by over 1200 probe/enzyme combinations have been calculated, and images of parental survey blots have been included for over 250 of the probes. We have recently incorporated an extensive data set which includes the molecular characterization of a large sample of rice germplasm. Approximately 150 diverse rice accessions were surveyed with 150 probes, and the molecular weights and banding patterns that resulted were catalogued. Each accession record includes a list of probe/enzyme combinations and the corresponding molecular weights of the bands they produce. Each molecular allele has a link to all accessions in which it was detected, and each probe/enzyme polymorphism record groups the total set of accessions according to like banding patterns. This allows the user to see all other accessions with a similar profile for a given probe/enzyme, to assess how "rare" a particular band is, and to compare the alleles contained in two different accessions.
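The relationships among accessions, probe/enzyme combinations and bands described above can be pictured with a small sketch. It is an illustration only, not the RiceGenes (ACEDB) schema: the accession names, probe names, enzymes and band sizes are invented, and bands are matched by exact size, whereas real molecular weight estimates would need a tolerance.

```python
from collections import defaultdict

# Hypothetical survey records: (accession, probe, enzyme, band sizes in kb).
# Every name and value here is invented for illustration.
SURVEY = [
    ("Acc-A", "Probe-1", "EcoRI", (4.2, 2.1)),
    ("Acc-B", "Probe-1", "EcoRI", (4.2, 2.1)),
    ("Acc-C", "Probe-1", "EcoRI", (6.5,)),
    ("Acc-A", "Probe-2", "HindIII", (3.0,)),
    ("Acc-C", "Probe-2", "HindIII", (3.0,)),
]


def group_by_pattern(records, probe, enzyme):
    """Group accessions sharing a banding pattern for one probe/enzyme."""
    groups = defaultdict(list)
    for accession, p, e, bands in records:
        if (p, e) == (probe, enzyme):
            # Sort the bands so the pattern does not depend on listing order.
            groups[tuple(sorted(bands))].append(accession)
    return dict(groups)


def band_frequency(records, probe, enzyme, band):
    """Fraction of accessions surveyed with this probe/enzyme that carry a band."""
    surveyed = [bands for _, p, e, bands in records if (p, e) == (probe, enzyme)]
    if not surveyed:
        return 0.0
    return sum(band in bands for bands in surveyed) / len(surveyed)


def compare_accessions(records, first, second):
    """For each probe/enzyme scored in both accessions, list the shared bands."""
    profiles = {first: {}, second: {}}
    for accession, p, e, bands in records:
        if accession in profiles:
            profiles[accession][(p, e)] = set(bands)
    common = profiles[first].keys() & profiles[second].keys()
    return {key: profiles[first][key] & profiles[second][key] for key in common}


if __name__ == "__main__":
    print(group_by_pattern(SURVEY, "Probe-1", "EcoRI"))
    print(band_frequency(SURVEY, "Probe-1", "EcoRI", 6.5))
    print(compare_accessions(SURVEY, "Acc-A", "Acc-C"))
```

The three functions correspond to the three uses mentioned in the text: finding accessions with a like profile, judging how rare a band is, and comparing the alleles of two accessions.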

We will soon have available a maize/rice comparative map containing over 650 markers and showing homologous regions between these two genomes. Images are already available which show the polymorphisms resulting from select probe/enzyme combinations on barley, rice, oat, wheat and sugar cane. Comparative maps of rice, oat, wheat, barley, maize, sorghum and millet are currently under development using rice as the basis for the comparison of these genomes. These maps will also be available through RiceGenes. Work is underway in collaboration with the Japanese Rice Genome Research Program to produce an integrated map which will significantly increase the level of molecular detail available.

To further add to the molecular data, links are being made to rice sequences deposited in the public sequence databases. Links are also being made to the GRIN system, providing access to performance records for over 50,000 rice accessions. Both of these links will allow RiceGenes users to seamlessly access information from physically discrete databases via the World Wide Web (described below). Finally, we hope to establish collaborations with groups interested in extending RiceGenes, particularly in terms of pest and pathogen information.

For users who have network connections, RiceGenes is accessible via the Internet in four formats. The first is a graphical user interface based on the ACEDB software, which is available to users with direct TCP/IP network connections and X11 graphics capability (this includes most UNIX workstations, or personal computers with some additional, inexpensive software). The ACEDB format provides live graphics, photographic images and text displays, with links between data objects that can be activated by clicking with the mouse. This format is distributed as a compressed .tar file via anonymous ftp from the site probe.nalusda.gov, directory pub/ricegenes. The second format is Gopher, which is menu-driven and has more limited searching powers, but is very easy to use and requires no special graphics capabilities. Duplicate Gopher servers are maintained at nightshade.cit.cornell.edu port 70 and probe.nalusda.gov port 7007 to ensure continuous service. Gopher access requires only a modem connection to an Internet host. The third method of access is through the World Wide Web. The USDA/NAL provides access to all plant genome databases, and much more, through a Web server located at the following URL (electronic address): http://probe.nalusda.gov:8000. Depending on the viewer software used, the Web can be a graphical interface (e.g., using Mosaic) or text-only (e.g., using Lynx). In either case, the Web provides hot links between databases as well as within them. The final interface is an electronic mail query system for users who do not have full Internet access but do have Internet mail service. Send a message to waismail@probe.nalusda.gov with only the word "help" in the body of the message to receive instructions on using this service.
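For readers who prefer to script the electronic mail query described above, the request can be composed as in the minimal sketch below. It only builds the message; the sender address is a placeholder, and actually sending it would need a mail server (for example through Python's smtplib), which is assumed and not shown.

```python
from email.message import EmailMessage

# Address of the mail query service as given in the text.
QUERY_ADDRESS = "waismail@probe.nalusda.gov"


def build_help_request(sender):
    """Compose a message whose body is only the word "help"."""
    msg = EmailMessage()
    msg["From"] = sender          # placeholder; use your own mail address
    msg["To"] = QUERY_ADDRESS
    msg.set_content("help")       # the body must contain nothing else
    return msg


if __name__ == "__main__":
    print(build_help_request("researcher@example.org"))
```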

In addition to the full contents of the database (both text and images), the gopher and Web interfaces have access to electronic versions of the Rice Genetics Newsletter. We currently have three volumes of the Newsletter available as text documents which are preindexed for fast searching by any word, and will ultimately include all twelve volumes. The electronic, word-searchable versions of such documents are expected to be more useful to a wider group of people than the printed versions.

For non-Internet sites, the National Agricultural Library has made available a CD-ROM which contains all the USDA-funded genome databases. More information on the CD-ROM and other information services provided by the USDA can be obtained from pgenome@nalusda.gov.

Perhaps to a greater extent than the other USDA genome databases, RiceGenes must address an international user group. Broadly speaking, the goal of the project is to provide an accessible source of information on rice genetic resources, and in particular to provide the type of information that may be costly or impossible for many labs to produce or accumulate by themselves. Researchers around the world are strongly encouraged to participate in the expansion of the database by making their results publicly available through this forum. Because much of the research being done on rice is carried out in countries which do not have access to the Internet, the distribution of the plant genome databases on CD-ROM was strongly lobbied for and has been key in the dissemination of the RiceGenes information.

For further details regarding the database or how to access it, please contact: Edie Paul, 252 Emerson, Ithaca, NY 14853, or via electronic mail to epaul@nightshade.cit.cornell.edu.

Literature Cited:

McCouch SR, Kochert G, Yu ZH, Wang ZY, Khush GS, Coffman WR, Tanksley SD (1988) Molecular mapping of rice chromosomes. Theor Appl Genet 76:815-829.

Causse M, Fulton TM, Cho YG, Ahn SN, Chunwongse J, Wu K, Xiao J, Yu Z, Ronald PC, Harrington SB, Second GA, McCouch SR, Tanksley SD (1994) Saturated molecular map of the rice genome based on an interspecific backcross population. Genetics (in press).