Department of Biology, University of Michigan, Ann Arbor, Michigan;
DOE Joint Genome Institute and Lawrence Livermore National Laboratory, Walnut Creek, California
Abstract |
---|
Introduction |
---|
Complete mitochondrial genome sequences are available for 127 animals (see Boore [1999] and the Mitochondrial Genomics link at http://www.jgi.doe.gov). Comparisons among them have demonstrated variation among lineages in many genome features, including gene arrangement, nucleotide composition and skew, and replication and transcription signaling elements. The density of sampling within each phylum varies greatly; that of Chordata is greatest (80 genomes, 77 of them from vertebrates), followed by Arthropoda (16 genomes), with the remaining 31 genomes representing only 8 of the more than 30 other phyla. Sampling within the lophophorate phyla has begun with two articulate brachiopods, Terebratulina retusa (Stechmann and Schlegel 1999) and Laqueus rubellus (Noguchi et al. 2000).
The differences between the two previously sequenced articulate brachiopod mtDNAs (gene rearrangements, differences in gene lengths, disparity in numbers of noncoding nucleotides, and differing nucleotide biases) are radical and warrant the examination of another articulate brachiopod mtDNA. We determined the mtDNA sequence of the articulate brachiopod Terebratalia transversa and compared it with both the closely related L. rubellus and the more distantly related T. retusa in order to make a comprehensive analysis of the evolution of mtDNA features within the articulate brachiopod clade. The T. transversa genome is relatively small, with size reductions in tRNA, rRNA, and protein-encoding genes. The genome contains the 37 genes commonly found in metazoan mtDNA, all encoded by one G+T-rich strand. Its gene arrangement, when compared with those of T. retusa and L. rubellus, suggests that further comparisons of gene arrangements might provide useful characters for resolving articulate brachiopod relationships.
Materials and Methods |
---|
Sequences were assembled and analyzed using Sequence Analysis and Sequence Navigator (ABI) and MacVector 6.5 (Oxford Molecular Group). Protein and ribosomal RNA genes were identified by their sequence similarities to the Lumbricus terrestris mtDNA homologs (Boore and Brown 1995). tRNA genes were identified either by using tRNAscan-SE (version 1.1, http://www.genetics.wustl.edu/eddy/tRNAscan-SE; Lowe and Eddy 1997) with a cove cutoff score of 0.1 or, in many cases where tRNAs were not found using this program, by recognizing potential secondary structures by eye.
The 5' ends of protein genes were inferred to be at the first legitimate in-frame start codon (ATN, GTG, TTG, GTT; Wolstenholme 1992) that did not overlap the preceding gene, except that overlap with an upstream tRNA gene was limited to the most 3' nucleotide of the tRNA.
Protein gene termini were inferred to be at the first in-frame stop codon unless that codon was located within the sequence of a downstream gene. In that case, a truncated stop codon (T or TA) adjacent to the beginning of the downstream gene (with the exception of atp8) was designated as the termination codon and was assumed to be completed by polyadenylation after transcript cleavage (Ojala, Montoya, and Attardi 1981). The 5' and 3' ends of rrnL and rrnS were assumed to be adjacent to the ends of bordering tRNA genes.
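The boundary rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the authors ran (they relied on sequence similarity and inspection): the function names `find_start` and `find_stop`, the coordinates, and the toy sequences are hypothetical, and the caller is assumed to pass the end of the preceding gene as `search_from` so that the inferred start does not overlap it.

```python
# Hypothetical helper names; the rules themselves follow the text above.
START_CODONS = {"ATA", "ATC", "ATG", "ATT", "GTG", "TTG", "GTT"}  # ATN, GTG, TTG, GTT
STOP_CODONS = {"TAA", "TAG"}  # complete stop codons reported for this genome


def find_start(seq, search_from, frame_anchor):
    """Index of the first in-frame start codon at or after search_from.

    frame_anchor is any coordinate assumed to be the first base of a codon
    (for example, taken from an alignment with a homologous gene).
    """
    pos = search_from
    while (pos - frame_anchor) % 3 != 0:  # step forward into the reading frame
        pos += 1
    for i in range(pos, len(seq) - 2, 3):
        if seq[i:i + 3] in START_CODONS:
            return i
    return None


def find_stop(seq, start, next_gene_start):
    """Return (end_exclusive, stop) for a gene beginning at start.

    A complete stop codon is accepted only if it lies entirely upstream of
    next_gene_start. Failing that, a truncated stop (T or TA) that abuts the
    downstream gene is accepted, on the assumption that polyadenylation
    completes it to TAA. Returns None if neither rule applies.
    """
    for i in range(start, next_gene_start - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return i + 3, codon
    leftover = (next_gene_start - start) % 3  # bases left after the last full codon
    if leftover == 1 and seq[next_gene_start - 1] == "T":
        return next_gene_start, "T"
    if leftover == 2 and seq[next_gene_start - 2:next_gene_start] == "TA":
        return next_gene_start, "TA"
    return None


if __name__ == "__main__":
    # Toy example: the gene ATG AAA CCC T abuts a downstream gene at index 10.
    print(find_stop("ATGAAACCCTGCATTT", 0, 10))
```

In the toy example the only candidate terminus is the single T immediately before the downstream gene, so the sketch reports an abbreviated stop rather than reading into the next gene.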
Results and Discussion |
---|
Start and stop codons were inferred for each of T. transversa's 13 protein-encoding genes (table 1). There are five start codons used in the T. transversa mtDNA: ATG (cox2), ATT (nad4, nad4L, nad5, atp8), GTG (nad6, cox1, cox3, cob), GTT (nad2), and TTG (nad1, nad3, atp6). Five genes end on a complete stop codon, either TAA (nad2, atp6) or TAG (nad4L, cox1, cox3). The other eight genes end on an abbreviated stop codon that is presumably completed by polyadenylation of the mRNA (Ojala, Montoya, and Attardi 1981); this is simply T for all except nad6, which ends with TA. It is notable that for six of the eight genes that we interpret as ending at abbreviated stop codons (cob, nad5, nad6, nad1, cox2, and nad3), there is a complete stop codon farther downstream that, if used, would cause overlap of the downstream tRNA gene by 1-26 nt. As has been previously suggested (Boore and Brown 2000), the downstream stop codons could possibly act as "backups" in cases of improper transcript cleavage.
Among the three brachiopod species for which complete mtDNA sequences are available, the most closely related pair is T. transversa and L. rubellus, both members of the family Laqueidae. The mtDNA gene arrangements are not well conserved between these two taxa and are even less well conserved between either of these taxa and the mtDNA of the cancellothyridid T. retusa (fig. 2). This latter genome retains a number of features inferred to be primitive for Brachiopoda, since they are also found in the distantly related polyplacophoran mollusk Katharina tunicata. The large number of gene rearrangements among these taxa suggests that gene arrangement data will be useful for inferring phylogeny at lower taxonomic levels within the brachiopod clade Articulata.
Genome Size
The sizes reported for completely sequenced animal mtDNAs range from 13.5 kb (Taenia crassiceps; Le et al. 2000) to 19.5 kb (Drosophila melanogaster; Lewis, Farr, and Kaguni 1995). Terebratalia transversa mtDNA, at 14,291 nt, is near the low end of this range and is comparable in size to that of L. rubellus (14,017 nt; Noguchi et al. 2000) or to those of some nematodes (Okimoto et al. 1991, 1992; Keddie, Higazi, and Unnasch 1998), those of gastropods (Hatzoglou, Rodakis, and Lecanidou 1995; Terrett, Miles, and Thomas 1996; Yamazaki et al. 1997), and those of parasitic flatworms (Le et al. 2000).
Each of the two laqueid mtDNAs is reduced in size in similar ways compared with the larger T. retusa mtDNA (table 2). There are fewer nucleotides in noncoding regions, the rRNA genes appear to be shorter, the tRNA genes are somewhat smaller (due especially to a decreased number of nucleotides in the TψC arms; fig. 3), and protein gene sizes have been drastically reduced. There is uncertainty regarding rRNA gene sizes, and precise end assignment is not possible from sequence information alone (e.g., the 45 nt at the 3' end of rrnL of L. rubellus do not align well with rrnL of T. transversa [data not shown], suggesting that some nucleotides assigned to the L. rubellus rrnL may actually be noncoding). The size reduction of the T. transversa protein-encoding genes is striking; there are ca. 1,500 fewer nucleotides (500 fewer codons) in L. rubellus and T. transversa than in T. retusa protein-encoding genes, a surprisingly large reduction.
Interestingly, T. transversa's skew values are very similar to those of the other sequenced laqueid, L. rubellus (AT skew = -0.29, GC skew = +0.27), but opposite to those of the cancellothyridid T. retusa (AT skew = +0.03, GC skew = -0.29) (table 2). Because all genes are transcribed in the same direction in each of these mtDNAs, mutation (deamination) pressure should be the same on the displaced strand, and the three mtDNAs should show the same or similar skews relative to the coding strand if transcription is the most important cause of strand-biased nucleotide composition. That this expectation is not borne out indicates that mutational events during transcription cannot be the cause of these differences and suggests instead that replication is more important. It is possible that the origins of replication are oppositely oriented in these two families of brachiopods and that deaminations during replication are the primary cause of skew. In support of this hypothesis, the sequence of a portion of the noncoding region of each of the two laqueid mtDNAs is similar to the reverse complement of a portion of the noncoding region of T. retusa (fig. 5). The noncoding region in vertebrate and fruit fly mtDNA is known to contain sequence elements that mediate DNA replication, and it is possible that the noncoding region in these brachiopod mtDNAs contains functionally similar sequences. Alternatively, it is possible that the similarities seen here are themselves due to the differential biasing of base compositions of the two strands among these mtDNAs.
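The skew comparison above can be reproduced from raw sequence with a short script. The sketch below assumes the conventional definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), computed on the coding strand (Perna and Kocher 1995); the function names and the toy sequence are illustrative only and are not taken from the original analysis.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def skews(seq):
    """Return (AT skew, GC skew) for one strand.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Characters other than A, C, G, and T are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


if __name__ == "__main__":
    toy = "AATTTGGGC"  # toy sequence, not real data
    print(skews(toy))                      # skews of the given strand
    print(skews(reverse_complement(toy)))  # same magnitudes, opposite signs
```

Because the two strands of a duplex have skews of equal magnitude and opposite sign, a reversal of which strand carries a given mutational bias is enough to flip the sign of the skew measured on the coding strand, which is the pattern that separates the laqueids from T. retusa.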
For the most part, the bias in usage of synonymous codons in the proteins of T. transversa mtDNA follows the same pattern of nucleotide frequency (T > G > A > C) as the mtDNA coding strand as a whole (fig. 6 and tables 2 and 3). This bias is evident in both fourfold- and twofold-degenerate codon families, suggesting that third codon positions mostly reflect mutational bias. However, six codons, all of which have an unexpectedly high homodimer frequency at codon positions 2 and 3, are exceptions to this: TCC, CCC, GCC, CGG, AGG, and GGG. The most extreme variation is found in the two homotrimeric codons, CCC and GGG. The hypothesis that this reflects mutation pressure is bolstered by the observation that homopolymer runs ranging from 2 to 11 nt in length are more common than expected throughout the mtDNA, given the nucleotide frequencies. We are unable to determine whether this effect increases the frequency of NTT codons, since T is already the most common nucleotide, or of NAA codons, because all of these are in twofold-degenerate codon families and thus can be compared only with NAG codons (all of which are more common than NAA, as expected). For L. rubellus, a similar effect is apparent for only four codons, GCC, CGG, AGG, and GGG; all are more common than expected. Otherwise, third codon positions deviate from the expected ranked frequency T > G > A > C only because of the infrequency of NCG codons.
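The two tallies discussed above, third-position usage within a synonymous codon family and the number of homopolymer runs of each length, are straightforward to compute. The following is a minimal sketch with hypothetical function names and toy input; it does not reproduce the expected-frequency calculation used in the paper, which is not described in this section.

```python
from collections import Counter
from itertools import groupby


def codon_counts(cds):
    """Count codons in a coding sequence, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def family_usage(counts, prefix):
    """Fraction of each third-position base among codons sharing the first two bases.

    For a fourfold-degenerate family such as GCN, prefix would be "GC".
    """
    family = {base: counts.get(prefix + base, 0) for base in "TCAG"}
    total = sum(family.values())
    return {base: (n / total if total else 0.0) for base, n in family.items()}


def run_length_counts(seq):
    """Map run length -> number of homopolymer runs of exactly that length."""
    return Counter(len(list(group)) for _, group in groupby(seq.upper()))


if __name__ == "__main__":
    counts = codon_counts("GCTGCTGCGGCAGGG")  # toy sequence, not real data
    print(family_usage(counts, "GC"))
    print(run_length_counts("AATTTGC"))
```

Comparing `family_usage` output with the overall third-position base frequencies is one way to flag codons, such as the NCC and NGG codons listed above, whose usage departs from the general T > G > A > C ranking.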
To what extent is amino acid composition determined by natural selection and to what extent by mutational bias? Clearly, purifying selection must eliminate certain mutations that would change amino acids at essential sites in mitochondrial proteins, and directional selection may sometimes create novel amino acid identities at some sites. At other, less constrained, positions, amino acids with similar physical or chemical traits can substitute acceptably, with some sites apparently tolerating even very radical amino acid substitutions; for these, substitutions would be expected to reflect mutational bias.
To address this question, we analyzed the extent to which the base composition at first and second codon positions is similar to that at third codon positions, which are much freer to vary in conformity with mutational bias. The amino acid compositions of the mitochondrial proteins of three brachiopods can be sorted into four physicochemical groups and are listed in descending order relative to their changes in frequency between T. transversa and T. retusa (table 4). The laqueids T. transversa and L. rubellus are very similar in this respect, differing only in their relative usages of the leucine codon TTR or CTN. Both differ greatly, however, from T. retusa. Within each of the four amino acid groupings, codons rich in T and G (e.g., TTR, GTN, and TTY in the nonpolar group) are much more common in the laqueids, and those rich in A and C (e.g., ATR, ATY, CCN, GCN, and CTN in the nonpolar group) are much rarer, suggesting that mutational bias is a prime factor influencing amino acid composition, at least within each physicochemical group.
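The comparison of base composition across codon positions described above can be illustrated as follows. This is a sketch under simple assumptions (an in-frame coding sequence and a hypothetical function name), not the authors' code.

```python
from collections import Counter


def position_composition(cds):
    """Base frequencies at first, second, and third codon positions.

    Returns a list of three dicts (one per codon position) mapping each of
    A, C, G, and T to its frequency. A trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    result = []
    for pos in range(3):
        counts = Counter(cds[pos:usable:3])  # every third base, starting at pos
        total = sum(counts[base] for base in "ACGT")
        result.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGT"}
        )
    return result


if __name__ == "__main__":
    first, second, third = position_composition("ATGTTTGTG")  # toy sequence
    print(first, second, third, sep="\n")
```

If first and second positions track the third-position frequencies across genomes, as the T- and G-rich codons of the laqueids suggest, that is consistent with mutational bias shaping amino acid composition.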
Like all metazoan mitochondrial genomes, that of T. transversa encodes two rRNAs. The size of its rrnL is 1,105 nt, with A+T = 62.6% (the highest of any region in the genome) and AT and GC skews of -0.08 and 0.23, respectively. rrnS is 762 nt in size, with A+T = 59.2% and AT and GC skews of -0.06 and 0.24.
Noncoding Nucleotides
Terebratalia transversa mtDNA has 202 noncoding nucleotide pairs; 149 are in three noncoding regions of 35 (between atp8 and cox3), 42 (between nad2 and cox1), and 69 (between cox1 and trnC) nucleotides, and 53 are dispersed throughout the genome, either as single nucleotide pairs or in blocks ranging in size from five to nine pairs.
The enzymatic removal of tRNAs from a polycistronic transcript is necessary to release adjacent, gene-specific messages (Battey and Clayton 1980; Ojala et al. 1980; Rossmanith 1997). When two protein-encoding genes abut directly, in some cases both are translated from the same bicistronic message (Ojala et al. 1980); in some others, sequences with the potential to form stem-loop structures are present at the junctions and may mediate transcript cleavage (e.g., Boore and Brown 1994). Three protein-encoding gene pairs abut directly in T. transversa mtDNA: atp8-cox3, nad2-cox1, and nad4L-cox2. Both nad2 and nad4L have complete stop codons, so it is possible that each forms a bicistronic mRNA with cox1 and cox2, respectively. However, atp8 in the atp8-cox3 pair lacks a complete stop codon; in the absence of editing or processing, a single product would result from translation of the bicistronic transcript. For the atp8-cox3 and nad2-cox1 pairs, adjacent genes are separated by the 35- and 42-nt-long noncoding regions described above; each of the noncoding sequences has the potential to form an RNA stem-loop structure (fig. 7). No such potential could be identified for the nad4L-cox2 junction or for the 69-nt-long region following cox1.
The third and longest (69 nt) of the noncoding regions is similar in length to that of L. rubellus (54 nt). Each of these noncoding regions is much shorter than the longest noncoding region in T. retusa (794 nt), although in T. retusa only 287 of the 794 nucleotides are unique, with the remainder being in repeated sequences. Attempts at aligning the noncoding regions to search for possibly conserved functional elements revealed that the plus strand of the noncoding regions of the laqueids aligned best with the minus strand of the noncoding region of T. retusa (fig. 5). It is possible that this alignment is due to some conserved regulatory function, perhaps for DNA replication. If this is the case, then the opposite strand biases among these mtDNAs might be explained by a reversal in the direction of first-strand replication.
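The strand comparison described above amounts to asking whether a query matches a target better as given or as its reverse complement. The alignment method behind fig. 5 is not stated in this section, so the sketch below substitutes a crude ungapped sliding-window identity as a stand-in; the function names and toy sequences are hypothetical, and a real analysis would use a gapped alignment program.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def best_ungapped_identity(query, target):
    """Highest fraction of matching bases when the shorter sequence is slid
    along the longer one without gaps."""
    query, target = query.upper(), target.upper()
    if len(query) > len(target):
        query, target = target, query
    if not query:
        return 0.0
    best = 0.0
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        matches = sum(q == t for q, t in zip(query, window))
        best = max(best, matches / len(query))
    return best


def best_strand(query, target):
    """Report which strand of target ('plus' or 'minus') matches query best."""
    plus = best_ungapped_identity(query, target)
    minus = best_ungapped_identity(query, reverse_complement(target))
    return ("plus", plus) if plus >= minus else ("minus", minus)


if __name__ == "__main__":
    # Toy sequences: the query occurs only on the minus strand of the target.
    print(best_strand("AAGGT", "TTACCTTGG"))
```

A result of "minus" for a laqueid noncoding region against the T. retusa region would correspond to the observation reported above, although short, compositionally biased sequences can also match by chance, as the authors caution.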
The near uniformity of proteins encoded by metazoan mtDNAs makes this DNA an excellent model system for the study of molecular evolution. Comparisons of mtDNAs from closely related species, such as these brachiopods, are of general utility for future studies of genome and protein evolution. Here, for example, we see mutational bias as well as natural selection determining amino acid usage. Further descriptions and comparisons of mtDNAs, along with functional studies, will allow us to address questions such as which amino acids in each of the proteins are selected, what the roles of these amino acids are, which portions of the proteins encoded by larger mtDNAs (e.g., that of T. retusa) have been lost by smaller genomes (e.g., those of the laqueids), and what functions those portions serve in the larger proteins.
Acknowledgements |
---|
Footnotes |
---|
1 Keywords: Terebratalia transversa, mitochondria, genome, evolution, codon usage, nucleotide skew
2 Address for correspondence and reprints: Kevin G. Helfenbein, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598. kgh{at}umich.edu.
References |
---|
Asakawa S., Y. Kumazawa, T. Araki, H. Himeno, K. Miura, K. Watanabe, 1991 Strand specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes J. Mol. Evol 32:511-520
Battey J., D. A. Clayton, 1980 The transcription map of human mitochondrial-DNA implicates transfer-RNA excision as a major processing event J. Biol. Chem 255:1599-1606
Beagley C. T., R. Okimoto, D. R. Wolstenholme, 1998 The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near-standard genetic code Genetics 148:1091-1108
Boore J. L., 1999 Animal mitochondrial genomes Nucleic Acids Res 27:1767-1780
Boore J. L., 2000 The duplication/random loss model for gene rearrangement exemplified by mitochondrial genomes of deuterostome animals Pp. 133-147 in D. Sankoff and J. Nadeau, eds. Comparative genomics, computational biology series. Vol. 1. Kluwer Academic Publishers, Dordrecht, the Netherlands
Boore J. L., W. M. Brown, 1994 Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata Genetics 138:423-443
Boore J. L., W. M. Brown, 1995 Complete DNA sequence of the mitochondrial genome of the annelid worm, Lumbricus terrestris Genetics 141:305-319
Boore J. L., W. M. Brown, 2000 Mitochondrial genomes of Galatheolinum, Helobdella, and Platynereis: sequence and gene arrangement comparisons show that Pogonophora is not a phylum and Annelida and Arthropoda are not sister taxa Mol. Biol. Evol 17:87-106
Brusca R. C., G. J. Brusca, 1990 Invertebrates Sinauer, Sunderland, Mass
Clayton D. A., 1982 Replication of animal mitochondrial DNA Cell 28:693-705
Fearnley I. M., J. E. Walker, 1986 Two overlapping genes in bovine mitochondrial DNA encode membrane components of ATP synthase EMBO J 5:2003-2008
Foury F., T. Roganti, N. Lecrenier, B. Purnelle, 1999 The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae FEBS Lett 440:325-331
Francino M. P., H. Ochman, 1997 Strand asymmetries in DNA evolution Trends Genet 13:240-245
Hatzoglou E., G. C. Rodakis, R. Lecanidou, 1995 Complete sequence and gene organization of the mitochondrial genome of the land snail Albinaria coerulea Genetics 140:1353-1366
Hoffmann R. J., J. L. Boore, W. M. Brown, 1992 A novel mitochondrial genome organization for the blue mussel, Mytilus edulis Genetics 131:397-412
Ikemura T., 1981 Correlations between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system J. Mol. Biol 151:389-409
Ikemura T., 1982 Correlation between the abundance of yeast transfer RNAs and the occurrence of respective codons in protein genes J. Mol. Biol 158:573-597
Keddie E. M., T. Higazi, T. R. Unnasch, 1998 The mitochondrial genome of Onchocerca volvulus: sequence, structure and phylogenetic analysis Mol. Biochem. Parasitol 95:111-127
Le T. H., D. Blair, T. Agatsuma, et al. (14 co-authors) 2000 Phylogenies inferred from mitochondrial gene orders - a cautionary tale from the parasitic flatworms Mol. Biol. Evol 17:1123-1125
Lewis D. L., C. L. Farr, L. S. Kaguni, 1995 Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons Insect Mol. Biol 4:263-278
Moritz C., T. E. Dowling, W. M. Brown, 1987 Evolution of animal mitochondrial DNA: relevance for population biology and systematics Annu. Rev. Ecol. Syst 18:269-292
Noguchi Y., K. Endo, F. Tajima, R. Ueshima, 2000 The mitochondrial genome of the brachiopod Laqueus rubellus Genetics 155:245-259
Ojala D., C. Merkel, R. Gelfand, G. Attardi, 1980 The transfer RNA genes punctuate the reading of genetic information in human mitochondrial DNA Cell 22:393-403
Ojala D., J. Montoya, G. Attardi, 1981 tRNA punctuation model of RNA processing in human mitochondria Nature 290:470-474
Okimoto R., H. M. Chamberlin, J. L. Macfarlane, D. R. Wolstenholme, 1991 Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification Nucleic Acids Res 19:1619-1626
Okimoto R., J. L. MacFarlane, D. O. Clary, D. R. Wolstenholme, 1992 The mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum Genetics 130:471-498
Perna N. T., T. D. Kocher, 1995 Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes J. Mol. Evol 41:353-358
Reyes A., C. Gissi, G. Pesole, C. Saccone, 1998 Asymmetrical directional mutation pressure in the mitochondrial genome of mammals Mol. Biol. Evol 15:957-966
Rossmanith W., 1997 Processing of human mitochondrial tRNA-Ser(AGY); a novel pathway in tRNA biosynthesis J. Mol. Biol 265:365-371
Sasuga J., S.-I. Yokobori, M. Kaifu, T. Ueda, K. Nishikawa, K. Watanabe, 1999 Gene contents and organization of a mitochondrial DNA segment of the squid Loligo bleekeri J. Mol. Evol 48:692-702
Stechmann A., M. Schlegel, 1999 Analysis of the complete mitochondrial DNA sequence of the brachiopod Terebratulina retusa places Brachiopoda within the protostomes Proc. R. Soc. Lond. B Biol. Sci 266:2043-2052
Terrett J., S. Miles, R. Thomas, 1996 Complete DNA sequence of the mitochondrial genome of Cepaea nemoralis (Gastropoda: Pulmonata) J. Mol. Evol 42:160-168
Wolstenholme D. R., 1992 Animal mitochondrial DNA: structure and evolution Int. Rev. Cytol 141:173-216
Yamazaki N., R. Ueshima, J. Terrett, et al. (12 co-authors) 1997 Evolution of pulmonate gastropod mitochondrial genomes: comparisons of gene organizations of Euhadra, Cepaea and Albinaria and implications of unusual tRNA secondary structures Genetics 145:749-758
Yokobori S., S. Pääbo, 1997 Polyadenylation creates the discriminator nucleotide of chicken mitochondrial tRNA(Tyr) J. Mol. Biol 265:95-99
Yokobori S., T. Ueda, G. Feldmaier-Fuchs, S. Pääbo, R. Ueshima, A. Kondow, K. Nishikawa, K. Watanabe, 1999 Complete DNA sequence of the mitochondrial genome of the ascidian Halocynthia roretzi (Chordata, Urochordata) Genetics 153:1851-1862