Published in Probe Volume 1(1-2): Spring-Summer 1991
Lois Blaine, Head, Bioinformatics Department
American Type Culture Collection
Rockville, Maryland
CODATA Convenes Workshop To Address Problems
Formulating a plan to improve access to standardized terminology for biological database producers and users was the goal of a workshop held May 14-16 in Nancy, France, by the CODATA Commission on Standardized Terminology for Access to Biological Data.
The workshop, jointly sponsored by the U.S. National Center for Biotechnology Information and the Commission of the European Communities DG XII, was attended by representatives of the Biological Unions of the International Council of Scientific Unions (ICSU), producers of bibliographic and factual databases, and professional terminologists. This combination of participants, drawn from disparate subdisciplines of biological and information science, provided an excellent blend of talents to address the multifaceted problems of standardizing terminology.
A primary goal of the Commission, re-emphasized during the Nancy Workshop, is to raise awareness within the biological community of the need to communicate across disciplines. A major benefit of today's computer technology is that it provides the means to integrate data in ways that will lead to new scientific insights. Artificial intelligence, innovative programming, massive data storage capabilities, and vastly improved communication technology will inevitably draw diverse data sources together. If data is to be integrated, exchanged, and searched efficiently, the intellectual input to make this possible must come from biologists now.
Although problems surrounding the "standardization" of nomenclature and terminology have been with us for centuries, information technology demands that we take a fresh look at these problems and devise new methods of solving them that take full advantage of today's technological tools.
Workshop Segments
The workshop program consisted of three segments. The first segment included formal presentations that provided a background and overview of the perceived problems in interdisciplinary access to biological terminology. Dr. Andrzej Elzanowski, from the Max-Planck-Institut für Biochemie, articulately summarized the problems faced by database producers in "translating" the nomenclature and terminology used by authors who write for scientific journals into some type of "standard" that can be used for consistency of retrieval of identical concepts.
During the workshop, it became apparent that authors, editors, publishers, and database producers all face a similar situation: the lack of clear guidelines on the nomenclature and taxonomy of organisms and on the terminology used to describe their characteristics.
In the second segment, representatives of the Biounions, in a series of round-table discussions, presented the "state of the art" within each union regarding nomenclature and terminology standards. Broad topics covered by the union representatives included botanical and zoological nomenclature and taxonomy, biochemistry, microbiology, pharmacology, physiology, nutrition, food science, and clinical medicine. Participants discovered that the nomenclature and terminology committees of the ICSU unions face as many problems in providing standards as database producers and users do in locating standardized terminology for biological concepts. Their ability to provide wide access to standardized terminology in formats and electronic media desired by database producers is limited by the fact that almost all of their work is performed on a "volunteer basis" and is primarily designed for intradisciplinary use. Additional resources would be required to expand the work of the Biounion nomenclature and terminology committees.
During the third segment, participants heard from terminological specialists who discussed existing standards for terminology. There are general principles that must be applied when developing terminological databases, regardless of their scope and content. The Commission agreed to work with the specialists to educate biologists on the ICSU committees in the implementation of these principles. Several documents recognized by the International Organization for Standardization (ISO) were recommended for use in the educational campaign.
Increasing Information Exchange
The workshop discussions raised the participants' awareness that the interdisciplinary nature of many present-day scientific activities creates a greater need for the exchange of information among their component disciplines. The impact of multinational projects, such as HUGO, similarly imposes added demands for clarity and standardization of expression. The integration of international, interdisciplinary databases will require precision in defining terminology so that scientific principles are interpreted uniformly.
Workshop participants agreed that the first step in providing wider access to standardized terminological references is to expand efforts initiated by the U.S. National Library of Medicine (NLM) in establishing a "Nomenclature File" in its Directory of Biotechnology Information Resources. All agreed that the NLM file is a useful beginning, but that a broader international inventory of terminological resources and their relationships to one another is required. A steering committee composed of Commission members will be established to design procedures for developing this international terminological inventory.
"Term Bank" Needed
The ultimate goal of the Commission would be to catalyze the development of an international "Term Bank." This, of course, would have to be developed in modules and would necessitate preliminary studies to determine feasibility and user requirements. The Commission would also seek the cooperation of other international organizations such as the International Council on Scientific and Technical Information (ICSTI) and the International Federation of Scientific Editors (IFSE), both represented at the Nancy workshop. An enormous effort would be required to make such a Term Bank available to the international scientific community. While the computer and communication technology is available to link subsets of such a database, issues including copyright, cost recovery, coordination, updating responsibilities, and funding were recognized as potential barriers to achieving the goal.
Dr. Leslie Sobin, a representative from the International Union Against Cancer (UICC), aptly set to verse the precautions that must be taken by the Commission in approaching a project of this size:
When you're looking for the answer
how to classify all cancer,
proteins, microbes, fish and succulent legumes,
You must know a little Latin
tell a round fish from a flat one
and have memory with lots and lots of room.
But, before we start alinking
we should sit back and be thinking
on our methods, clientele and on our goal,
Lest we make a mammoth bank
rarely used and rarely thanked
just consuming funds and efforts: A BLACK HOLE.
Leslie Sobin
Members of the CODATA Commission will keep these words of wisdom in the forefront as they launch their campaign to improve access to standardized biological terminology.
For further information on the CODATA Commission on Standardized Terminology for Access to Biological Data, please contact the CODATA Secretariat, 51 bd. de Montmorency, 75016 Paris, France.