Survey of Synonymous Codon Usage in Nuclear Genes of Arabidopsis, Soybean, and Maize


Julia Bailey-Serres and Sheila L. Fennoy
Department of Botany and Plant Sciences
University of California
Riverside, CA 92521-0124

The overall bias in synonymous codon usage of a genome is species-specific. Analysis of the protein coding regions of small samples of plant genes from a number of species revealed codon usage biases.1 The synonymous codon usage of nuclear genes of plants varies mainly in the bias toward C or G versus A or U in the silent third nucleotide position. Nuclear gene coding regions of monocots are enriched in codons ending in C and G, whereas those of dicots have a higher frequency of codons ending in A and U.
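The third-position bias described above can be measured per gene as the fraction of synonymous codons ending in G or C. The following is a minimal sketch, not the procedure used for the cited analyses; the function name gc3_fraction and the choice to skip Met, Trp, and stop codons (which have no synonymous alternative) are assumptions made for illustration.

```python
# Codons with no synonymous alternative (Met, Trp) and the stop codons
# carry no information about silent-site bias, so this sketch skips them.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3_fraction(cds: str) -> float:
    """Return the fraction of synonymous codons whose third base is G or C.

    `cds` is an in-frame coding sequence; RNA input (U) is accepted.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    informative = [c for c in codons if c not in NON_SYNONYMOUS]
    if not informative:
        raise ValueError("no synonymous codons found in sequence")
    return sum(c[2] in "GC" for c in informative) / len(informative)


# Toy sequence (not a real gene): Met, Ala (GCU), Ala (GCG), stop.
print(gc3_fraction("AUGGCUGCGUAA"))  # 0.5
```

Under this measure, a typical monocot coding region would be expected to score higher than a typical dicot coding region, in line with the trend described above.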

We used a multivariate statistical analysis to examine codon usage in maize. Codon usage was more biased among highly expressed genes and more random among genes expressed at lower levels. Our work indicates that the overall codon usage patterns in maize reflect the G+C content of the genome. Codon usage bias of individual genes may not solely reflect the nucleotide compositional bias of a chromosomal region, but may be affected by selection on the silent third nucleotide.2

The accumulation of DNA sequence data for a large number of nuclear genes of plants provided an opportunity to further examine synonymous codon usage. Table 1 shows a summary of codon usage for three plant species: maize (Zea mays L.), soybean (Glycine max L.), and Arabidopsis (Arabidopsis thaliana). Non-duplicate protein coding sequences were obtained from the September 1992 releases of the GenBank and EMBL databases and from the literature, and the relative synonymous codon usage was determined. The synonymous codons used at a higher frequency in these data sets are indicated with an asterisk.
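For readers who wish to compute relative synonymous codon usage (RSCU) for their own sequence sets, the sketch below assumes the commonly used definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The text above does not spell out the exact formula used for Table 1, so this definition, the standard genetic code table, and the function name rscu are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences.

    RSCU = observed count / (family total / number of synonymous codons).
    Met, Trp, and stop codons are omitted, as are amino acids that never occur.
    """
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:          # skips codons with ambiguous bases
                counts[codon] += 1

    families = defaultdict(list)              # amino acid -> synonymous codons
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)        # count under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy input (not real data): three GCC and one GCT alanine codons.
table = rscu(["GCCGCCGCCGCT"])
print(table["GCC"], table["GCT"], table["GCA"])  # 3.0 1.0 0.0
```

A value above 1 indicates a codon used more often than expected under equal usage. The values within each amino acid family sum to the number of synonymous codons in that family.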

Information on codon usage is useful for the design of degenerate oligonucleotide primers for PCR amplification of regions encoding conserved proteins. In addition, consideration of G+C content or codon usage appears to be important for high levels of expression of bacterial genes in plants.3,4 Further systematic analyses are needed to determine the role of G+C content and codon usage in regulating gene expression.
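As an illustration of how codon usage can inform primer design, the sketch below back-translates a short conserved peptide into a degenerate oligonucleotide written in IUPAC ambiguity codes, and optionally restricts each amino acid to codons above a chosen RSCU threshold to reduce degeneracy. The function name degenerate_primer, the parameters rscu and min_rscu, and the RSCU values in the usage example are hypothetical; the values are invented for illustration and are not taken from Table 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC ambiguity codes, keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_primer(peptide, rscu=None, min_rscu=0.0):
    """Back-translate `peptide` into (degenerate oligo, degeneracy).

    If `rscu` (a {codon: value} mapping) is given, only codons with
    RSCU >= min_rscu are used; if none qualify, all codons are kept.
    """
    oligo, degeneracy = [], 1
    for aa in peptide.upper():
        codons = [c for c, a in CODON_TABLE.items() if a == aa]
        if aa == "*" or not codons:
            raise ValueError(f"not an amino acid code: {aa!r}")
        if rscu is not None:
            preferred = [c for c in codons if rscu.get(c, 0.0) >= min_rscu]
            codons = preferred or codons
        for pos in range(3):
            bases = "".join(sorted({c[pos] for c in codons}))
            oligo.append(IUPAC[bases])
            degeneracy *= len(bases)          # sequences in the oligo pool
    return "".join(oligo), degeneracy


print(degenerate_primer("MKF"))               # ('ATGAARTTY', 4)

# Hypothetical RSCU values, invented for this example only.
made_up_rscu = {"AAG": 1.6, "AAA": 0.4}
print(degenerate_primer("MKF", made_up_rscu, min_rscu=1.0))  # ('ATGAAGTTY', 2)
```

Note that the degeneracy reported is that of the oligonucleotide pool, which can exceed the number of true codons: leucine back-translates to YTN (8 sequences) although only 6 of them encode leucine.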

References

  1. Campbell, W.H. and Gowri, G. (1990) Plant Physiol., 92, 1-11.
  2. Fennoy, S.L. and Bailey-Serres, J. (1993) Nucl. Acids Res., 21, 5294-5300.
  3. Perlak, F.J., Fuchs, R.L., Dean, D.A., McPherson, S.L. and Fischhoff, D.A. (1991) Proc. Natl. Acad. Sci. USA, 88, 3324-3328.
  4. Koziel, M.G., Beland, G.L., Bowman, C., Carozzi, N.B., Crenshaw, R., Crossland, L., Dawson, J., Desai, N., Hill, M., Kadwell, S., Launis, K., Lewis, K., Maddox, D., McPherson, K., Meghji, M.R., Merlin, E., Rhodes, R., Warren, G.W., Wright, M. and Evola, S.V. (1993) Bio/Technology, 11, 194-200.