Improving End-User Computer Searches

Published in Probe Volume 2(1): Spring 1992


Vincent Caccese, Librarian
Biological & Agricultural Sciences Reference Department
Shields Library
University of California, Davis, CA

Since their appearance in the early 1970s, computerized databases have been a boon to searching the scientific literature. Increasing numbers of end-users conduct their own computer searches without the direct assistance of an intermediary, and they are generally satisfied with the results. For individuals less pleased with their searching skills, the following suggestions may prove helpful.

Principles of Searching

Many texts are available that explain the principles of searching.1 In brief, users can improve search results if they can determine (1) the scope of the database being used, dates of coverage, and the frequency of its update; (2) the database producer's policy of selecting source publications and the extent to which those publications are indexed selectively or cover to cover; (3) the correct format of an author's name being searched; (4) the system default for a Boolean operator when none is entered between terms by the user; (5) the optimal number of concepts to intersect with AND; and (6) whether controlled search terms or codes are used.

A good database manual will address the above points and clarify whether a database is appropriate for a given question. If a manual is not available or is incomplete, a convenient but less exhaustive alternative is the help screens in CD-ROM systems, which provide summary statements about the database and the time periods covered. Other excellent sources are the data sheets from online vendors that describe the databases available through them. Online systems also have limited help screens associated with a particular database, as well as online directory files devoted to database descriptions.2 Because of per-minute costs, consulting these online sources can be expensive. Several other print database directories give additional guidance on database coverage and are available in larger libraries.3

Scope and Coverage

After selecting the likely databases to use, determine the scope and depth of coverage. Periodicals are generally the largest component of scientific bibliographic databases. A database's "list of periodicals indexed" can supplement the database manual in assessing source publication coverage. However, few databases attempt to represent the contents of periodicals cover to cover; fewer still will index to the depth expected by any given user. News items, letters, book reviews, and shorter communications are commonly excluded. Dissertations, patents, and books are covered unevenly among databases.

Databases also vary in the currency of their records compared with the dates of the source publications. Because of the labor-intensive nature of indexing, it is not unusual for the online database to contain a reference to a source publication 3 months behind the date of the original. This potentially long delay should alert the user not to rely solely on one or even two databases for literature coverage.

Author names, in particular, are troublesome for searching. They appear deceptively easy to search, but authors themselves are inconsistent in how they present their names, and each database has its own formatting requirements. The rule of thumb is to always examine the index or expand features in online and CD-ROM systems for the author format. If the format varies even slightly from what the user entered, it pays to re-enter the name and search again.
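The point about author formats can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the name formats, and the sample index entries are all hypothetical and do not reflect the rules of any particular database. The idea is to list the plausible forms of a name and then compare them against what the index actually displays.

```python
def author_variants(last, first, middle=""):
    """Return common ways one author's name may appear in an index.

    The formats are illustrative assumptions, not the rules of any
    particular database.
    """
    f = first[0]
    forms = [
        f"{last}, {first}",   # Smith, John
        f"{last}, {f}.",      # Smith, J.
        f"{last} {f}",        # Smith J
    ]
    if middle:
        m = middle[0]
        forms += [
            f"{last}, {first} {m}.",  # Smith, John A.
            f"{last}, {f}.{m}.",      # Smith, J.A.
            f"{last} {f}{m}",         # Smith JA
        ]
    return forms


def forms_in_index(variants, index_entries):
    """Keep only the index entries that match one of the variants."""
    wanted = set(variants)
    return [entry for entry in index_entries if entry in wanted]


# Hypothetical entries as they might appear when browsing an author index.
index_entries = ["Smith J", "Smith JA", "Smith JB", "Smithers R"]
found = forms_in_index(author_variants("Smith", "John", "Allen"), index_entries)
print(found)  # ['Smith J', 'Smith JA']
```

In this made-up index the same author may be filed under two forms, so both would need to be searched to retrieve all of that author's records.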

Searching Approach

Conceptualizing a search beforehand is a good practice. A common and effective approach is to divide a question into concepts, usually two or three for bibliographic text files. How a user dissects a topic depends on a number of factors, including individual needs, subject expertise, and the database itself. The more concepts that are intersected with AND, the more likely the number of results will approach zero. Removing one concept requirement is a simple way to retrieve more results.
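The effect of adding or removing a concept can be seen in a minimal Python sketch that models AND as set intersection. The four records, their titles, and the helper names `hits` and `and_all` are invented for illustration; a real retrieval system works on a far larger inverted index.

```python
# Hypothetical record numbers and titles, invented for this illustration.
RECORDS = {
    1: "nitrogen fertilizer effects on wheat yield",
    2: "wheat yield under drought stress",
    3: "nitrogen uptake in maize",
    4: "drought and nitrogen fertilizer interactions in wheat",
}


def hits(term):
    """Return the set of record numbers whose title contains the word."""
    return {rid for rid, title in RECORDS.items() if term in title.split()}


def and_all(terms):
    """Intersect the hit sets of all terms, as the AND operator does."""
    result = set(RECORDS)
    for term in terms:
        result &= hits(term)
    return result


four = and_all(["wheat", "nitrogen", "drought", "yield"])
three = and_all(["wheat", "nitrogen", "drought"])
two = and_all(["wheat", "nitrogen"])

print(sorted(four))   # []
print(sorted(three))  # [4]
print(sorted(two))    # [1, 4]
```

Each concept dropped from the intersection can only keep the result the same size or enlarge it, which is why removing a requirement is a safe first step when a search retrieves nothing.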

For each concept, the user lists the terms, synonyms, near synonyms, antonyms, and any codes assigned by the database to represent that concept. The terms in each conceptual grouping are joined by the union operator OR, and the result is intersected with the remaining concepts by means of other logical operators such as AND or NOT. Many database systems process AND before OR: terms directly connected by AND are combined first, and OR is applied afterward. For this reason, these operators have to be carefully separated, either by nesting the like terms that are ORed, or by creating a separate set for each concept before ANDing it with another. The user also should be aware of any logical operators supplied by default by the software.
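The consequence of AND being processed before OR can be shown with a small Python sketch. Python's set operators happen to follow the same order (`&` is evaluated before `|`), so they serve as a convenient stand-in. The record numbers, the search terms, and the `build_query` helper are hypothetical, and the parenthesized query syntax it produces is generic rather than that of any specific vendor.

```python
# Hypothetical hit sets: record numbers retrieved by each term.
corn = {1, 2}
maize = {3, 4}
irrigation = {2, 4, 5}

# Unnested: "corn OR maize AND irrigation".
# AND is applied first, so this means corn OR (maize AND irrigation).
unnested = corn | maize & irrigation

# Nested: "(corn OR maize) AND irrigation", which is what the user intends.
nested = (corn | maize) & irrigation

print(sorted(unnested))  # [1, 2, 4]
print(sorted(nested))    # [2, 4]


def build_query(concepts):
    """OR the terms inside each concept, nest them, then AND the concepts."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in concepts]
    return " AND ".join(groups)


query = build_query([["corn", "maize"], ["irrigation", "watering"]])
print(query)  # (corn OR maize) AND (irrigation OR watering)
```

In the unnested form, record 1 is retrieved even though it has nothing to do with irrigation. Nesting the ORed terms, or building one set per concept first, avoids this kind of false hit.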

Selecting Entry Terms

When given a choice, users commonly search for title words. Title words in a database originate, at the very least, from the title provided by the author, and may include words added by the indexer.

Some databases record title words only. Others offer a continuum of additional search points. For example, there may be other subject-like words, sometimes called "identifiers," which include new jargon or other kinds of terms not under standardized control.

At the other end of the continuum are database-controlled terms, which include hierarchies and standardized language preferred by the database to express alternate words or concepts. The only sure way to determine whether the database is using controlled terms for searching is to consult the database description mentioned earlier.

Good databases have various search aids for user guidance, including thesauri, hierarchical structures, lists of reference works used as authorities, and the previously mentioned lists of periodicals indexed.

REFERENCES

  1. Harter, Stephen F. Online Information Retrieval: Concepts, Principles and Techniques. Orlando, Florida: Academic Press, 1986.
  2. Computer Readable Databases: A Directory and Data Sourcebook. 8th edition. Detroit: Gale Publishing Co., 1991, also online with Dialog Information Services as File 230; and Directory of Online Databases and Directory of Portable Databases. Detroit: Gale Publishing Co., 1992, also online through Data-Star, Orbit, and Questel.
  3. Information Industry Directory, 12th edition. Detroit: Gale Research, Inc., 1992. 2 vols. Reviews of recent directories have appeared in Information Intelligence Online Newsletter. 13(2):10-13, February 1992.