1 Center for Biological Sequence Analysis, Department of Biotechnology, Building 208, The Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
2 Department of Molecular Biology, Institute of Medical Microbiology, University of Oslo, The National Hospital, NO-0027 Oslo, Norway
3 Molecular Microbiology and Genomics Consultants, Tannestrasse 7, D-55576 Zotzenheim, Germany
Correspondence
David W. Ussery
(dave{at}cbs.dtu.dk)
Genomes of the month: microbial genome evolution
Eight microbial genomes have been published in the four weeks since the last Genome Update was written (Ussery et al., 2004). They represent five bacterial and three eukaryotic organisms, and illustrate several interesting aspects of genome evolution. A very brief overview of the new genomes is presented below; this is meant merely to whet the appetite of the reader and to provide pointers to the relevant recent literature.
Two spirochaete genomes have been published this month, bringing the total number of sequenced genomes for this phylum from three to five. The genome of Treponema denticola strain ATCC 35405 (Seshadri et al., 2004) is more than twice the size of the previously sequenced genome of Treponema pallidum (2·8 Mbp vs 1·1 Mbp), although the numbers of tRNAs and rRNAs are about the same in both genomes. The difference in genome size appears to be the result of a combination of three types of evolution: genome reduction, lineage-specific recombination and horizontal gene transfer (Seshadri et al., 2004). The other newly sequenced spirochaete genome, of Leptospira interrogans serovar Copenhageni strain Fiocruz L1-130 (Nascimento et al., 2004), has two chromosomes and encodes 3728 genes, two rRNA operons and 37 tRNAs, as shown in Table 1. This genome is nearly identical in size to that of L. interrogans serovar Lai (Ren et al., 2003), which has 4727 annotated genes, or nearly 1000 extra genes. This is perhaps because the two groups used different cut-off values for gene-finding.
The thermophilic and halotolerant bacterium Thermus thermophilus has become a model organism for structural biology, as many of its proteins have been crystallized and their structures determined. Examination of the genome of Thermus thermophilus strain HB27, which can grow at temperatures up to 85 °C, has revealed some clues as to what it might take to live in a hot-spring environment (Henne et al., 2004). Based on its genome sequence, it looks like this bacterium is a scavenger which lives on solid surfaces and takes up nutrients as they pass by.
The genome of the parasite Wolbachia pipientis wMel is unusual in that it is streamlined yet also contains high levels of repeats and mobile DNA elements (Wu et al., 2004). Thus, for this bacterium, natural selection appears to be somewhat inefficient, probably due to repeated population bottlenecks (Wu et al., 2004).
Three eukaryotic genomes have also been sequenced this month. Unfortunately, as usual, the quality of the eukaryotic sequences is not as good as that of the prokaryotic genomes; there are many gaps in the sequences, and the annotation (when present) is patchy at best (in our opinion). According to Kellis et al. (2004), the genome sequence of the yeast Kluyveromyces waltii strain NCYC 2644 compared to that of Saccharomyces cerevisiae provides the first comparison across an ancient whole genome duplication event and offers the opportunity to study the long-term fate of a genome after duplication. The intracellular pathogen Cryptosporidium parvum type II isolate has a genome of about 9·1 Mbp in length and encodes a mere 3800 proteins (Abrahamsen et al., 2004). (Note that this is about the size of a medium to small bacterial proteome!) This parasite has undergone massive genome reduction and streamlining, even losing all of its mitochondrial DNA, which has been incorporated into the main chromosome. Finally, the genome of the alga Cyanidioschyzon merolae 10D (Matsuzaki et al., 2004) is 16·5 Mbp long and spread over 20 chromosomes. There are very few introns, and only three rRNA operons (see Table 1). This genome provides a model system with a simple gene composition for studying the origin, evolution and fundamental mechanisms of [photosynthetic] eukaryotic cells (Matsuzaki et al., 2004).
Method of the month: comparison of tRNA genes in sequenced genomes
The number of tRNA genes in bacterial genomes ranges from 126 in Vibrio parahaemolyticus to 29 in Mycoplasma pulmonis. Since there are a maximum of 61 possible sense codons (and hence at most 61 different tRNA genes), some genomes obviously have missing tRNAs, although all of the genomes can code for the use of all 20 amino acids. The use of base wobble in the third position allows a given tRNA gene to utilize certain codons which differ only in the third position. Thus, for example, in the case of the Mycoplasma pulmonis genome, even though there are only 29 tRNA genes, all 61 codons are found within the protein-coding sequences. However, the frequency of usage within the coding regions varies considerably; for example, of the six possible codons for leucine, UUA is used 13 272 times, whilst CUG is only used 165 times, or about 80-fold less.
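The codon counts quoted above come from tallying in-frame triplets over all protein-coding sequences. The following is a minimal Python sketch of such a tally, not the method used for the figures in this article; the function name `count_codons` and the toy sequence are assumptions made for illustration, and only the two leucine counts (13 272 and 165) are taken from the text.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count in-frame codons in a coding sequence, reported as RNA triplets."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


# Hypothetical toy sequence, not taken from any genome: AUG UUA UUA CUG UAA
toy_cds = "ATGTTATTACTGTAA"
toy_counts = count_codons(toy_cds)

# Leucine codon counts quoted in the text for M. pulmonis
uua_count, cug_count = 13272, 165
fold_difference = uua_count / cug_count

print(toy_counts["UUA"], toy_counts["CUG"])  # 2 1
print(round(fold_difference, 1))             # 80.4
```

Summing the counters returned for every annotated coding sequence in a genome would give a genome-wide codon usage table of the kind discussed here.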
Codon usage plots for three different phyla are shown in Fig. 1(A). Note that some codons (such as AAA and GAA) are used frequently in all phyla, whilst other codons, such as UAA, UAC and UAU, are used infrequently. A change in the third position in the codon often will code for the same amino acid, and bias in this position is correlated with changes in the AT content of the genome. For example, in the M. pulmonis genome mentioned above, the CUN codon usage is strongly biased towards U or A (CUC is only used 399 times, compared to 7523 for CUU and 3932 for CUA). Thus, an AT-rich genome and a GC-rich genome might code for a similar amino acid composition, but each genome would have a different third position bias, as can be seen in Fig. 1(B). Finally, the overall amino acid composition of the genomes from the three different phyla looks quite similar, with the amino acids leucine, alanine, glycine and serine being most abundant, and tryptophan, cysteine, histidine and methionine being used infrequently.
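The third-position bias described for the CUN leucine codons can be made concrete with a short calculation. This is an illustrative Python sketch only: the four counts are those quoted in the text for M. pulmonis (CUG from the preceding paragraph), while the function name `third_position_at_fraction` is an assumption introduced here.

```python
# CUN leucine codon counts quoted in the text for M. pulmonis
cun_counts = {"CUU": 7523, "CUC": 399, "CUA": 3932, "CUG": 165}


def third_position_at_fraction(counts: dict) -> float:
    """Fraction of codon occurrences whose third base is A or U."""
    total = sum(counts.values())
    at_ending = sum(n for codon, n in counts.items() if codon[2] in "AU")
    return at_ending / total


at_fraction = third_position_at_fraction(cun_counts)
print(round(at_fraction, 3))  # 0.953
```

On these numbers, roughly 95 % of CUN codons end in A or U, which is the kind of skew expected in an AT-rich genome.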
Next month, the number of genes per genome will be discussed. At the time of writing, the bacterial genome with the fewest genes is that of Mycoplasma genitalium, with a mere 480 genes, whilst the largest is that of Bradyrhizobium japonicum, with 8317 genes.
Supplemental web pages
Web pages containing supplemental material related to this article can be accessed from the following URL: http://www.cbs.dtu.dk/services/GenomeAtlas/suppl/GenUp005/
Acknowledgements
This work was supported by a grant from the Danish Center for Scientific Computing.
REFERENCES
Abrahamsen, M. S., Templeton, T. J., Enomoto, S. & 17 other authors (2004). Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 304, 441–445.
Grosjean, H. & Björk, G. R. (2004). Enzymatic conversion of cytidine to lysidine in anticodon of bacterial tRNAIle: an alternative way of RNA editing. Trends Biochem Sci 29, 165–168.
Henne, A., Bruggemann, H., Raasch, C. & 17 other authors (2004). The genome sequence of the extreme thermophile Thermus thermophilus. Nat Biotechnol Epub ahead of print, DOI: 10.1038/nbt956
Horn, M., Collingro, A., Schmitz-Esser, S. & 10 other authors (2004). Illuminating the evolutionary history of Chlamydiae. Science Epub ahead of print, DOI: 10.1126/science.1096330
Kellis, M., Birren, B. W. & Lander, E. S. (2004). Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428, 617–624.
Matsuzaki, M., Misumi, O., Shin, I. T. & 39 other authors (2004). Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428, 653–657.
Nascimento, A. L., Verjovski-Almeida, S., Van Sluys, M. A. & 9 other authors (2004). Genome features of Leptospira interrogans serovar Copenhageni. Braz J Med Biol Res 37, 459–477.
Ren, S. X., Fu, G., Jiang, X. G. & 36 other authors (2003). Unique physiological and pathogenic features of Leptospira interrogans revealed by whole-genome sequencing. Nature 422, 888–893.
Seshadri, R., Myers, G. S., Tettelin, H. & 36 other authors (2004). Comparison of the genome of the oral pathogen Treponema denticola with other spirochete genomes. Proc Natl Acad Sci U S A 101, 5646–5651.
Srinivasan, G., James, C. M. & Krzycki, J. A. (2002). Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA. Science 296, 1459–1462.
Ussery, D. W., Hallin, P. F., Lagesen, K. & Coenye, T. (2004). Genome Update: rRNAs in sequenced microbial genomes. Microbiology 150, 1113–1115.
Wassenaar, T. M. & Meinersmann, R. J. (2003). The TGA stop codon and the phylogeny of the selenocysteine pathway. Genome Lett 2, 127–138.
Wu, M., Sun, L. V., Vamathevan, J. & 27 other authors (2004). Phylogenomics of the reproductive parasite Wolbachia pipientis wMel: a streamlined genome overrun by mobile genetic elements. PLoS Biol 2, E69.
Yamao, F., Muto, A., Kawauchi, Y., Iwami, M., Iwagami, S., Azumi, Y. & Osawa, S. (1985). UGA is read as tryptophan in Mycoplasma capricolum. Proc Natl Acad Sci U S A 82, 2306–2309.
Zinoni, F., Birkmann, A., Leinfelder, W. & Bock, A. (1987). Cotranslational insertion of selenocysteine into formate dehydrogenase from Escherichia coli directed by a UGA codon. Proc Natl Acad Sci U S A 84, 3156–3160.