The Complete Mitochondrial Genome of the Articulate Brachiopod Terebratalia transversa

Kevin G. Helfenbein, Wesley M. Brown and Jeffrey L. Boore

Department of Biology, University of Michigan, Ann Arbor, Michigan;
DOE Joint Genome Institute and Lawrence Livermore National Laboratory, Walnut Creek, California


    Abstract
 
We sequenced the complete mitochondrial DNA (mtDNA) of the articulate brachiopod Terebratalia transversa. The circular genome is 14,291 bp in size, relatively small compared with other published metazoan mtDNAs. The 37 genes commonly found in animal mtDNA are present; the size decrease is due to the truncation of several tRNA, rRNA, and protein genes, to some nucleotide overlaps, and to a paucity of noncoding nucleotides. Although the gene arrangement differs radically from those reported for other metazoans, some gene junctions are shared with two other articulate brachiopods, Laqueus rubellus and Terebratulina retusa. All genes in the T. transversa mtDNA, unlike those in most metazoan mtDNAs reported, are encoded by the same strand. The A+T content (59.1%) is low for a metazoan mtDNA, and there is a high propensity for homopolymer runs and a strong base-compositional strand bias. The coding strand is quite G+T-rich, a skew that is shared by the confamilial (laqueid) species L. rubellus but is the opposite of that found in T. retusa, a cancellothyridid. These compositional skews are strongly reflected in the codon usage patterns and the amino acid compositions of the mitochondrial proteins, with markedly different usages being observed between T. retusa and the two laqueids. This observation, plus the similarity of the laqueid noncoding regions to the reverse complement of the noncoding region of the cancellothyridid, suggests that an inversion that resulted in a reversal in the direction of first-strand replication has occurred in one of the two lineages. In addition to the presence of one noncoding region in T. transversa that is comparable with those in the other brachiopod mtDNAs, there are two others with the potential to form secondary structures; one or both of these may be involved in the process of transcript cleavage.


    Introduction
 
The lophophorates are a group of animals that includes three phyla: Brachiopoda, Phoronida, and Ectoprocta (see Brusca and Brusca 1990). Their collective name is derived from the most conspicuous morphological characteristic shared among the three groups, a ciliated, tentacular feeding apparatus called a lophophore. One lophophorate phylum, the brachiopods, has a rich fossil record that dates back 600 Myr and contains more than 12,000 species; today, about 335 species are known (Brusca and Brusca 1990). Of these, most are in the class Articulata and have two valves (shells) connected by a tooth and socket hinge; the remainder are in the class Inarticulata and have unhinged valves that are connected by muscles alone.

Complete mitochondrial genome sequences are available for 127 animals (see Boore [1999] and the Mitochondrial Genomics link at http://www.jgi.doe.gov). Comparisons among them have demonstrated variation among lineages in many genome features, including gene arrangement, nucleotide composition and skew, and replication and transcription signaling elements. The density of sampling within each phylum varies greatly; that of Chordata is greatest (80, composed primarily of 77 vertebrate genomes) followed by Arthropoda (16 genomes), with the remaining 31 genomes representing only 8 of the more than 30 other phyla. Sampling within the lophophorate phyla has begun with two articulate brachiopods, Terebratulina retusa (Stechmann and Schlegel 1999) and Laqueus rubellus (Noguchi et al. 2000).

The differences between the two previously sequenced articulate brachiopod mtDNAs—gene rearrangements, differences in gene lengths, disparity in numbers of noncoding nucleotides, differing nucleotide biases—are radical and warrant the examination of another articulate brachiopod mtDNA. We determined the mtDNA sequence of the articulate brachiopod Terebratalia transversa and compared it with both the closely related L. rubellus and the more distantly related T. retusa in order to make a comprehensive analysis of the evolution of mtDNA features within the articulate brachiopod clade. The T. transversa genome is relatively small, with size reductions in tRNA, rRNA, and protein-encoding genes. The genome contains the 37 genes commonly found in metazoan mtDNA, all encoded by one G+T-rich strand. Its gene arrangement, when compared with those of T. retusa and L. rubellus, suggests that further comparisons of gene arrangements might provide useful characters for resolving articulate brachiopod relationships.


    Materials and Methods
 
Total DNA of the articulate brachiopod Terebratalia transversa was a gift from James M. Turbeville. Pairs of PCR primers were first used to amplify short (500–700 nt) regions within rrnL, cox1, and cox3. The primers used for rrnL were 16SARL (CGC CTG TTT ATC AAA AAC AT) and 16SBRH (CCG GTC TGA ACT CAG ATC ACG T); the primers used for cox1 were LCO1490 (GGT CAA CAA ATC ATA AAG ATA TTG G) and HCO2198 (TAA ACT TCA GGG TGA CCA AAA AAT CA); the primers used for cox3 were CO3-DL1 (TGG TGG CGA GAT GTK KTN CGN CGN GA) and CO3-DL2 (ACW ACG TCK ACG AAG TGT CAR TAT CA). Amplifications employed 40 cycles of 94°C for 15 s, 50°C for 45 s, and 72°C for 120 s, and each yielded a single band when visualized with ethidium bromide staining and UV irradiation on a 1.2% agarose gel. The products were purified by three serial passages of ultrafiltration (Millipore Ultrafree spin columns, 30,000 NMWL), subjected to cycle sequencing reactions according to the supplier's (Perkin-Elmer) instructions, and analyzed on an ABI 377 automated DNA sequencer. From these sequences, primers were designed that were used to amplify the T. transversa mitochondrial genome in three pieces with the GeneAmp XL PCR kit. These six primers were tried in all possible pairwise combinations. We initially used reaction conditions of 94°C for 45 s, followed by 37 cycles of 94°C for 15 s, 63°C for 20 s, and 68°C for 8 min, followed by a final extension of 72°C for 12 min. This generated single fragments of approximately 6.4 kb and 3.0 kb with the primer pairs 16S forward/COI reverse and COI forward/COIII reverse. The primer pair 16S reverse/COIII forward generated a fragment of approximately 5.4 kb, but with two smaller, faint bands even when the annealing temperature was raised to 63°C. Purification and sequence determinations were as above, with additional primers used to "primer walk" through each fragment. These three long-PCR products, together with the three shorter ones, jointly compose the entire T. transversa mitochondrial genome in overlapping segments. Both strands of all amplification products were sequenced.

Sequences were assembled and analyzed using Sequence Analysis and Sequence Navigator (ABI) and MacVector 6.5 (Oxford Molecular Group). Protein and ribosomal RNA genes were identified by their sequence similarities to the Lumbricus terrestris mtDNA homologs (Boore and Brown 1995). tRNA genes were identified either by using tRNAscan-SE (version 1.1, http://www.genetics.wustl.edu/eddy/tRNAscan-SE; Lowe and Eddy 1997) with a cove cutoff score of 0.1 or, in many cases where tRNAs were not found using this program, by recognizing potential secondary structures by eye.

The 5' ends of protein genes were inferred to be at the first legitimate in-frame start codon (ATN, GTG, TTG, GTT; Wolstenholme 1992) that did not overlap the preceding gene, except that overlap with an upstream tRNA gene was limited to the most 3' nucleotide of the tRNA.

Protein gene termini were inferred to be at the first in-frame stop codon unless that codon was located within the sequence of a downstream gene. In that case, a truncated stop codon (T or TA) adjacent to the beginning of the downstream gene (with the exception of atp8) was designated as the termination codon and was assumed to be completed by polyadenylation after transcript cleavage (Ojala, Montoya, and Attardi 1981). The 5' and 3' ends of rrnL and rrnS were assumed to be adjacent to the ends of bordering tRNA genes.
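The rules above for assigning protein gene ends can be sketched in code. The following is a minimal illustration, not the procedure actually used (annotation here was done by similarity searches and by eye). The function names, the 0-based coordinates, and the simplification that a complete stop codon must end at or before the start of the downstream gene are assumptions made for this example.

```python
# Start codons accepted in the text: ATN, GTG, TTG, GTT.
START_CODONS = {"ATA", "ATC", "ATG", "ATT", "GTG", "TTG", "GTT"}
# Complete stop codons reported for this genome.
STOP_CODONS = {"TAA", "TAG"}


def first_start(seq, search_from, frame):
    """Index of the first start codon at or after search_from whose
    position satisfies index % 3 == frame, or None (hypothetical helper)."""
    i = search_from + (frame - search_from) % 3
    while i + 3 <= len(seq):
        if seq[i:i + 3] in START_CODONS:
            return i
        i += 3
    return None


def infer_stop(seq, start, next_gene_start):
    """Return (end_exclusive, stop) for a gene beginning at `start`.

    A complete in-frame stop is used if it ends before the downstream
    gene begins; otherwise a T or TA abutting the downstream gene is
    reported as an abbreviated stop (assumed to be completed to TAA by
    polyadenylation). Returns None if neither is found."""
    for i in range(start, next_gene_start - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3, seq[i:i + 3]
    leftover = (next_gene_start - start) % 3
    tail = seq[next_gene_start - leftover:next_gene_start]
    if tail in ("T", "TA"):
        return next_gene_start, tail
    return None
```

For a toy sequence such as ATGAAACCCT followed directly by a downstream gene at position 10, infer_stop reports the abbreviated stop T; this mirrors the interpretation applied to the eight genes that lack a complete stop codon.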


    Results and Discussion
 
Gene Content and Arrangement
The T. transversa mtDNA encodes the 37 genes (those for 13 proteins, 22 tRNAs, and two rRNAs) most commonly found in animal mitochondrial genomes (Boore 1999). Figure 1 shows a circular map of these genes and the largest noncoding region.



 
Fig. 1.—Circular gene map of Terebratalia transversa mtDNA (GenBank accession number AF331161). All 37 genes are encoded on the same strand and transcribed clockwise, as indicated by the arrow, although the origin(s) remains unknown. Gene scaling is approximate. "NC" marks the longest (69 nt) noncoding region. One-letter amino acid abbreviations are used to label the corresponding tRNAs. The anticodons of both leucine tRNAs and both serine tRNAs are indicated in parentheses to distinguish members of each pair

 
There are seven cases in which genes appear to overlap, although each has an alternative interpretation. Six of them involve only the discriminator nucleotide of an upstream tRNA and the first nucleotide of the adjacent gene, suggesting that the tRNA gene may not actually encode this nucleotide. The seventh is a 3-nt overlap between trnQ and trnW. However, it is possible that trnQ is actually shorter by these 3 nt and that the structure of tRNA(Q) is completed by polyadenylation, which would provide A's to match the T's at the 5' end of the acceptor stem, as has been described for other systems (Yokobori and Pääbo 1997).

Start and stop codons were inferred for each of T. transversa's 13 protein-encoding genes (table 1). There are five start codons used in the T. transversa mtDNA: ATG (cox2), ATT (nad4, nad4L, nad5, atp8), GTG (nad6, cox1, cox3, cob), GTT (nad2), and TTG (nad1, nad3, atp6). Five genes end on a complete stop codon, either TAA (nad2, atp6) or TAG (nad4L, cox1, cox3). The other eight genes end on an abbreviated stop codon that is presumably completed by polyadenylation of the mRNA (Ojala, Montoya, and Attardi 1981); this is simply T for all except nad6, which ends with TA. It is notable that for six of the eight genes that we interpret as ending at abbreviated stop codons (cob, nad5, nad6, nad1, cox2, and nad3), there is a complete stop codon farther downstream that, if used, would cause overlap of the downstream tRNA gene by 1–26 nt. As has been previously suggested (Boore and Brown 2000), the downstream stop codons could possibly act as "backups" in cases of improper transcript cleavage.


 
Table 1 Terebratalia transversa Start and Stop Codons

 
All of the genes are transcribed from the same strand, a relatively uncommon state among the 127 described animal mtDNAs (see Boore [1999] and the Mitochondrial Genomics link at http://www.jgi.doe.gov), previously found only in the two other articulate brachiopods (Stechmann and Schlegel 1999; Noguchi et al. 2000), the blue mussel Mytilus edulis (Hoffmann, Boore, and Brown 1992), two annelids (Boore and Brown 1995, 2000; GenBank record NC_000931), four nematodes (Okimoto et al. 1991, 1992; Keddie, Higazi, and Unnasch 1998), a tunicate (Yokobori et al. 1999), six parasitic flatworms (Le et al. 2000), and the hexacoral Metridium senile (Beagley, Okimoto, and Wolstenholme 1998).

Among the three brachiopod species for which complete mtDNA sequences are available, the most closely related pair is T. transversa and L. rubellus, both members of the family Laqueidae. The mtDNA gene arrangements are not well conserved between these two taxa and are even less well conserved between either of these taxa and the mtDNA of the cancellothyridid T. retusa (fig. 2). This latter genome retains a number of features inferred to be primitive for Brachiopoda, since they are also found in the distantly related polyplacophoran mollusk Katharina tunicata. The large number of gene rearrangements among these taxa suggests that gene arrangement data will be useful for inferring phylogeny at lower taxonomic levels within the brachiopod clade Articulata.



 
Fig. 2.—Comparison of the mitochondrial gene arrangements of the mollusk Katharina tunicata (Boore and Brown 1994) and the three sequenced articulate brachiopods. These circular genomes have been graphically linearized at the 3' end of the arbitrarily chosen cox3 gene. Thick lines below the K. tunicata genome indicate genes encoded by the opposite strand. Illustrated are all the gene junctions shared between K. tunicata and Terebratulina retusa (inferred, therefore, to be primitive for these brachiopods) and between Laqueus rubellus and each of the other two brachiopods. Those genes that are found in a common arrangement in the laqueids, L. rubellus and Terebratalia transversa, but are shared in a different arrangement between the mollusk and T. retusa are inferred to be derived for Laqueidae. Genes are abbreviated as in figure 1, except that L1, L2, S1, and S2 are those tRNAs with anticodons uag, uaa, ucu, and uga, respectively.

 
The genes atp8 and atp6 are adjacent in all arthropod and deuterostome mtDNAs and in that of the yeast Saccharomyces cerevisiae (Foury et al. 1999), sometimes in overlapping reading frames. It has been demonstrated that atp8 and atp6 are translated from a bicistronic mRNA in mammals (Fearnley and Walker 1986), and this is also a possible, although undemonstrated, phenomenon in other animals having atp8/atp6 adjacency. These genes are not adjacent in the mtDNAs of annelids (Boore and Brown 2000), of many mollusks, and of the two laqueid brachiopods. As has been suggested (Boore 1999), this loss of adjacency, allowed by the loss of the co-translation of these messages, may be a derived feature that characterizes a clade containing several phyla, a suggestion that awaits further phylogenetic and experimental analysis. However, if this should prove to be true, then the primitive state of gene arrangement (but presumably not that of translation) must have been retained in the squid Loligo bleekeri (Sasuga et al. 1999), the chiton K. tunicata (Boore and Brown 1994), and T. retusa (Stechmann and Schlegel 1999).

Genome Size
The sizes reported for completely sequenced animal mtDNAs range from 13.5 kb (Taenia crassiceps; Le et al. 2000) to 19.5 kb (Drosophila melanogaster; Lewis, Farr, and Kaguni 1995). Terebratalia transversa mtDNA, at 14,291 nt, is near the low end of this range and is comparable in size to that of L. rubellus (14,017 nt; Noguchi et al. 2000) or to those of some nematodes (Okimoto et al. 1991, 1992; Keddie, Higazi, and Unnasch 1998), those of gastropods (Hatzoglou, Rodakis, and Lecanidou 1995; Terrett, Miles, and Thomas 1996; Yamazaki et al. 1997), and those of parasitic flatworms (Le et al. 2000).

Each of the two laqueid mtDNAs is reduced in size in similar ways compared with the larger T. retusa mtDNA (table 2). There are fewer nucleotides in noncoding regions, the rRNA genes appear to be shorter, the tRNA genes are somewhat smaller (due especially to a decreased number of nucleotides in the TΨC arms; fig. 3), and protein gene sizes have been drastically reduced. There is uncertainty regarding rRNA gene sizes, and precise end assignment is not possible from sequence information alone (e.g., the 45 nt at the 3' end of rrnL of L. rubellus do not align well with rrnL of T. transversa [data not shown], suggesting that some nucleotides assigned to the L. rubellus rrnL may actually be noncoding). The size reduction of the T. transversa protein-encoding genes is striking; there are ca. 1,500 fewer nucleotides (500 fewer codons) in L. rubellus and T. transversa than in T. retusa protein-encoding genes, a surprisingly large reduction.


 
Table 2 Nucleotide Composition

 


 
Fig. 3.—The potential secondary structures of the 22 inferred tRNAs. The DHU stem and loop of tRNA(T) may form, but if so, there would be 2 nt rather than 1 nt between the DHU arm and the anticodon arm. The lack of TΨC and DHU arms of tRNA(R) and tRNA(I), respectively, is an unusual state for these two tRNAs relative to other published animal mtDNAs. We assume a minimum number of 3 nt in DHU and TΨC loops and thus do not show potential base pairs that would leave <3 nt in a loop.

 
Nucleotide Composition
Terebratalia transversa mtDNA (fig. 4) has an A+T composition of 59.1%, a value that is low compared with many other invertebrate mtDNAs. The base composition of the individual strands is biased and can be described by skewness (Perna and Kocher 1995), which measures the relative number of A's to T's (AT skew = [A-T]/[A+T]) and G's to C's (GC skew = [G-C]/[G+C]) on a strand; for the coding strand of T. transversa (table 2), AT skew = -0.33 and GC skew = +0.34. The skewness is even more extreme in protein-encoding sequences (AT skew = -0.41, GC skew = +0.37). AT skew is virtually absent outside of protein-coding genes, although GC skew remains positive. Among tRNAs and rRNAs (and in functionally important paired secondary structures in noncoding regions), a similar number of A's and T's is required for stem structure formation and perhaps accounts for the lower magnitude of AT skew; however, this requirement probably has less effect on GC skew, due to the lack of appreciable helix-destabilization by GT pairs.
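The skew statistics defined above are simple to compute from a single strand. The sketch below is illustrative only: the function name and the toy sequence are assumptions, and applying it to the real data would require the coding-strand sequence from GenBank accession AF331161.

```python
from collections import Counter


def composition_skews(strand):
    """Return (A+T fraction, AT skew, GC skew) for one DNA strand.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Characters other than A, C, G, and T are ignored."""
    counts = Counter(strand.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    at_fraction = (a + t) / (a + c + g + t)
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_fraction, at_skew, gc_skew


# Toy strand with more T than A and more G than C.
print(composition_skews("AATTTTGGGC"))
```

A strand with negative AT skew and positive GC skew, as reported here for the coding strand, is one in which T outnumbers A and G outnumbers C.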



 
Fig. 4.—Abbreviated mtDNA sequence of Terebratalia transversa. Numbers within the slash marks indicate omitted nucleotides. Noncoding nucleotides, including the large noncoding region, are underlined. The two secondary structures inferred to be formed by noncoding nucleotides are marked by hyphens (for stem nucleotides) and plus signs (for loop nucleotides). A dart (>) marks the last nucleotide for each gene and indicates the direction of transcription. Nucleotides of termination codons, complete or abbreviated (see text), of protein-coding genes are underscored with carets (^). All genes are inferred to initiate with formyl-methionine regardless of the actual start codon shown here, and so this amino acid is shown in parentheses when not conforming to the genetic code.

 
These skews may be generated during either transcription or replication, processes that expose C's and A's on the displaced (unpaired) strand to deamination (C->U and A->hypoxanthine, resulting in C->T and A->G transitions), whereas those on the other strand are protected by pairing with the bases of the nascent RNA or DNA strand (Francino and Ochman 1997; Reyes et al. 1998). Thus, one strand (the leading strand of replication and the nontemplate strand of transcription) may become GT-rich, while the other becomes AC-rich. MtDNA replication is highly asymmetric in mammals and in fruit flies (Clayton 1982), and this asymmetry may be general; if so, this model for the generation of a strand-biased nucleotide composition is especially important. Although the relative importance of the two contributing processes (i.e., transcription vs. replication) remains to be assessed, our nucleotide skew data suggest that transcription does not play a role in the T. transversa system (discussed below).

Interestingly, T. transversa's skew values are very similar to those of the other sequenced laqueid, L. rubellus (AT skew = -0.29, GC skew = +0.27), but opposite to those of the cancellothyridid T. retusa (AT skew = +0.03, GC skew = -0.29) (table 2). Because all genes are transcribed in the same direction in each of these mtDNAs, mutation (deamination) pressure should be the same on the displaced strand, and the three mtDNAs should show the same or similar skewnesses relative to the coding strand if transcription is the most important cause of strand-biased nucleotide composition. That this expectation is not borne out indicates that mutational events during transcription cannot be causative of these differences, and suggests instead that replication is more important. It is possible that the origins of replication might be oppositely oriented in these two families of brachiopods and that deaminations during replication are the primary cause of skewness. In support of this hypothesis, the sequence of a portion of the noncoding region of each of the two laqueid mtDNAs is similar to the reverse complement of a portion of the noncoding region of T. retusa (fig. 5). The noncoding region in vertebrate and fruit fly mtDNA is known to contain sequence elements that mediate DNA replication, and it is possible that the noncoding region in these brachiopod mtDNAs contains functionally similar sequences. Alternatively, it is possible that the similarities seen here are themselves due to the differential biasing of base compositions of the two strands among these mtDNAs.
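The comparison described above, in which one noncoding region is tested against both the forward sequence and the reverse complement of another, can be sketched as follows. This is a deliberately crude, ungapped illustration under stated assumptions: the function names are hypothetical, the sequences are toy strings rather than the brachiopod noncoding regions, and a real comparison would use a gapped alignment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ungapped_identity(a, b):
    """Fraction of matching positions over the shorter sequence length."""
    pairs = list(zip(a.upper(), b.upper()))
    return sum(x == y for x, y in pairs) / len(pairs)


def best_orientation(query, target):
    """Report which orientation of target matches query better."""
    forward = ungapped_identity(query, target)
    reverse = ungapped_identity(query, reverse_complement(target))
    label = "reverse complement" if reverse > forward else "forward"
    return label, forward, reverse


print(best_orientation("AAGGT", "ACCTT"))
```

A higher score for the reverse complement than for the forward orientation is the pattern that motivates the inversion hypothesis; as noted in the text, such similarity could also arise from opposite compositional biases alone.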



 
Fig. 5.—Alignment of the complete sequence of the longest noncoding regions in Terebratalia transversa (positions 1537–1605), Laqueus rubellus (positions 13906–13959), and the reverse complement of a segment (positions 7243–7175) of the nonrepetitive portion of the long noncoding region in Terebratulina retusa. Bold typeface indicates conservation of a nucleotide at a particular position in at least two species

 
Codon Usage and Amino Acid Composition
The extent to which synonymous codon usage is determined by selection is not clear, although in some systems it appears that certain codons are used more frequently in highly expressed genes for greater translational efficiency (Ikemura 1981, 1982). The extent to which, or even whether, this plays a role in animal mitochondrial systems is unknown, since there is, at present, no evidence for differential protein gene regulation in animal mitochondrial systems. Given that the protein products of these genes all act in concert in coupled oxidative phosphorylation/electron transport reactions, we regard translational regulation as an unlikely possibility.

For the most part, the bias in usage of synonymous codons in the proteins of T. transversa mtDNA follows the same pattern of nucleotide frequency (T > G > A > C) as the mtDNA coding strand as a whole (fig. 6 and tables 2 and 3). This bias is evident in both fourfold- and twofold-degenerate codon families, suggesting that third codon positions mostly reflect mutational bias. However, six codons, all of which have an unexpectedly high homodimer frequency at codon positions 2 and 3, are exceptions to this: TCC, CCC, GCC, CGG, AGG, and GGG. The most extreme variation is found in the two homotrimeric codons, CCC and GGG. The hypothesis that this reflects mutation pressure is bolstered by the observation that homopolymer runs ranging from 2 to 11 nt in length are more common than expected throughout the mtDNA, given the nucleotide frequencies. We are unable to determine whether this effect increases the frequency of NTT codons, since T is already the most common nucleotide, or of NAA codons, because all of these are in twofold-degenerate codon families and thus can be compared only with NAG codons (all of which are more common than NAA, as expected). For L. rubellus, a similar effect is apparent for only four codons, GCC, CGG, AGG, and GGG; all are more common than expected. Otherwise, third codon positions deviate from the expected ranked frequency T > G > A > C only because of the infrequency of NCG codons.
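Two of the tallies discussed above, nucleotide frequency at third codon positions and the number of homopolymer runs of each length, can be sketched as follows. The function names and the toy sequence are assumptions for illustration; no attempt is made here to compute the expected run frequencies against which the observed counts were judged.

```python
from collections import Counter
from itertools import groupby


def third_position_counts(cds):
    """Count nucleotides at the third position of each complete codon."""
    cds = cds.upper()
    return Counter(cds[i + 2] for i in range(0, len(cds) - 2, 3))


def homopolymer_runs(seq, min_len=2):
    """Count runs of identical nucleotides, keyed by (base, run length)."""
    runs = Counter()
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs[(base, length)] += 1
    return runs


toy = "TTTGGGCCA"
print(third_position_counts(toy))
print(homopolymer_runs(toy))
```

Ranking the third-position counts for a full set of protein genes would show whether they follow the T > G > A > C order described in the text.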



 
Fig. 6.—Frequency of each nucleotide by codon position for all protein-encoding genes. Order of bars: Terebratalia transversa, Laqueus rubellus, Terebratulina retusa

 

 
Table 3 Codon Usage of All Mitochondrially Encoded Genes of Three Articulate Brachiopods

 
Codon usage in T. retusa shows no such strikingly consistent biases. However, T. retusa's codon usage supports the hypothesis that selection maintains a large number of hydrophobic, nonpolar amino acids in membrane-associated proteins (Asakawa et al. 1991). The T's, more numerous than A's in T. retusa's protein genes as indicated by the negative skew value (table 2), are overrepresented at second codon positions (see fig. 6), and the amino acids specified by these codons (phenylalanine, leucine, isoleucine, methionine, and valine) are nonpolar and hydrophobic.

To what extent is amino acid composition determined by natural selection and to what extent by mutational bias? Clearly, purifying selection must eliminate certain mutations that would change amino acids at essential sites in mitochondrial proteins, and directional selection may sometimes create novel amino acid identities at some sites. At other, less constrained, positions, amino acids with similar physical or chemical traits can substitute acceptably, with some sites apparently tolerating even very radical amino acid substitutions; for these, substitutions would be expected to reflect mutational bias.

To address this question, we analyzed the extent to which the base composition at first and second codon positions is similar to that at third codon positions, which are much freer to vary in conformity with mutational bias. The amino acid compositions of the mitochondrial proteins of three brachiopods can be sorted into four physicochemical groups and are listed in descending order relative to their changes in frequency between T. transversa and T. retusa (table 4). The laqueids T. transversa and L. rubellus are very similar in this respect, differing only in their relative usages of the leucine codons TTR and CTN. Both differ greatly, however, from T. retusa. Within each of the four amino acid groupings, codons rich in T and G (e.g., TTR, GTN, and TTY in the nonpolar group) are much more common in the laqueids, and those rich in A and C (e.g., ATR, ATY, CCN, GCN, and CTN in the nonpolar group) are much rarer, suggesting that mutational bias is a prime factor influencing amino acid composition, at least within each physicochemical group.


 
Table 4 Comparison of the Amino Acid Compositions of the 13 Mitochondrially Encoded Proteins of the Articulate Brachiopods Terebratalia transversa (Ttr), Laqueus rubellus (Lru), and Terebratulina retusa (Tre)

 
tRNA and rRNA Genes
Proposed secondary structures for all 22 tRNAs are shown in figure 3. Many of the tRNAs have reduced TΨC arms: there is no TΨC arm in tRNA(R), there is potential for only a single base pair in tRNA(K) and tRNA(Y), and there is potential for only 2 bp in 10 others. The tRNAs S1, S2, and I lack a paired DHU arm, a condition commonly found among serine, but not isoleucine, tRNAs. There is potential for a paired DHU arm in tRNA(T), but only if there are two nucleotides (AG) rather than one between the anticodon and DHU stems. Among the other tRNAs, nine have A's, five have G's, three have T's, and one has a C at this position. Between the DHU and acceptor stems, 14 tRNAs have the dinucleotide TA, 2 have GA, 2 have AA, and 1 has TG. All tRNAs have a "variable arm" with 4 nt. Thirteen of the anticodons are flanked by T and A, eight by T and G, and one by C and A.

Like all metazoan mitochondrial genomes, that of T. transversa encodes two rRNAs. The size of its rrnL is 1,105 nt, with A+T = 62.6% (the highest of any region in the genome) and AT and GC skews of -0.08 and 0.23, respectively. rrnS is 762 nt in size, with A+T = 59.2% and AT and GC skews of -0.06 and 0.24.

Noncoding Nucleotides
Terebratalia transversa mtDNA has 202 noncoding nucleotide pairs; 149 are in three noncoding regions of 35 (between atp8 and cox3), 45 (between nad2 and cox1), and 69 (between cox1 and trnC) nucleotides, and 53 are dispersed throughout the genome, either as single nucleotide pairs or in blocks ranging in size from five to nine pairs.

The enzymatic removal of tRNAs from a polycistronic transcript is necessary to release adjacent, gene-specific messages (Battey and Clayton 1980; Ojala et al. 1980; Rossmanith 1997). When two protein-encoding genes abut directly, in some cases both are translated from the same bicistronic message (Ojala et al. 1980); in some others, sequences with the potential to form stem-loop structures are present at the junctions and may mediate transcript cleavage (e.g., Boore and Brown 1994). Three protein-encoding gene pairs abut directly in T. transversa mtDNA: atp8-cox3, nad2-cox1, and nad4L-cox2. Both nad2 and nad4L have complete stop codons, so it is possible that each forms a bicistronic mRNA with cox1 and cox2, respectively. However, atp8 in the atp8-cox3 pair lacks a complete stop codon; in the absence of editing or processing, a single product would result from translation of the bicistronic transcript. For the atp8-cox3 and nad2-cox1 pairs, adjacent genes are separated by the 35- and 45-nt-long noncoding regions described above; each of the noncoding sequences has the potential to form an RNA stem-loop structure (fig. 7). No such potential could be identified for the nad4L-cox2 junction or for the 69-nt-long region following cox1.



 
Fig. 7.—Inferred secondary structures of two noncoding regions. A, The 35 nt between atp8 and cox3 can potentially form a secondary structure similar to that of a tRNA anticodon stem-loop. Nucleotides marked with asterisks are the same as those in the corresponding position of the tRNA(P) anticodon stem-loop. B, The inferred secondary structure of the 45 nt between nad2 and cox1

 
Notably, the proposed stem-loop structure between atp8 and cox3 is identical to the anticodon stem and loop of tRNA(P) at 13 of 17 nucleotide positions; these are marked with asterisks in figure 7. A plausible explanation for the similarity of this stem-loop sequence to that of trnP is that this stem-loop is a duplicate of trnP that has mostly been deleted (Moritz, Dowling, and Brown 1987; Boore 2000). In this scenario, a duplicate copy of trnP could have been selectively maintained to the extent necessary to preserve its function in processing the atp8-cox3 transcript. An additional observation regarding the noncoding region between atp8 and cox3 (fig. 7A) is that the 9-nt sequence GAGGCAGCT appears twice (at positions 3008–3016 and 3023–3031); whether this sequence, which appears nowhere else in the T. transversa mtDNA, serves any functional role awaits experimental analysis.

The third and longest (69 nt) of the noncoding regions is similar in length to that of L. rubellus (54 nt). Each of these noncoding regions is much shorter than the longest noncoding region in T. retusa (794 nt), although in T. retusa only 287 of the 794 nucleotides are unique, with the remainder being in repeated sequences. Attempts at aligning the noncoding regions to search for possibly conserved functional elements revealed that the plus strand of the noncoding regions of the laqueids aligned best with the minus strand of the noncoding region of T. retusa (fig. 5). It is possible that this alignment is due to some conserved regulatory function, perhaps for DNA replication. If this is the case, then the opposite strand biases among these mtDNAs might be explained by a reversal in the direction of first-strand replication.

The near uniformity of proteins encoded by metazoan mtDNAs makes this DNA an excellent model system for the study of molecular evolution. Comparisons of mtDNAs from closely related species, such as these brachiopods, are of general utility for future studies of genome and protein evolution. Here, for instance, both mutational bias and natural selection appear to determine amino acid usage. Further descriptions and comparisons of mtDNAs, along with functional studies, will allow us to address questions such as which amino acids in each of the proteins are under selection, what the roles of these amino acids are, which portions of the proteins encoded by larger mtDNAs (e.g., that of T. retusa) have been lost from the proteins of smaller genomes (e.g., those of the laqueids), and what functions those portions serve in the larger proteins.


    Acknowledgements
 
We gratefully acknowledge D. Lavrov for his comments, which improved this manuscript. This work was supported by grant DEB9807100 from the National Science Foundation. Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract W-7405-Eng-48.


    Footnotes
 
Ross Crozier, Reviewing Editor

1 Keywords: Terebratalia transversa mitochondria genome evolution codon usage nucleotide skew

2 Address for correspondence and reprints: Kevin G. Helfenbein, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598. kgh{at}umich.edu.


    References
 

    Asakawa S., Y. Kumazawa, T. Araki, H. Himeno, K. Miura, K. Watanabe. 1991. Strand specific nucleotide composition bias in echinoderm and vertebrate mitochondrial genomes. J. Mol. Evol. 32:511-520.

    Battey J., D. A. Clayton. 1980. The transcription map of human mitochondrial-DNA implicates transfer-RNA excision as a major processing event. J. Biol. Chem. 255:1599-1606.

    Beagley C. T., R. Okimoto, D. R. Wolstenholme. 1998. The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near-standard genetic code. Genetics 148:1091-1108.

    Boore J. L. 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27:1767-1780.

    ———. 2000. The duplication/random loss model for gene rearrangement exemplified by mitochondrial genomes of deuterostome animals. Pp. 133–147 in D. Sankoff and J. Nadeau, eds. Comparative genomics, computational biology series. Vol. 1. Kluwer Academic Publishers, Dordrecht, the Netherlands.

    Boore J. L., W. M. Brown. 1994. Complete DNA sequence of the mitochondrial genome of the black chiton, Katharina tunicata. Genetics 138:423-443.

    ———. 1995. Complete DNA sequence of the mitochondrial genome of the annelid worm, Lumbricus terrestris. Genetics 141:305-319.

    ———. 2000. Mitochondrial genomes of Galathealinum, Helobdella, and Platynereis: sequence and gene arrangement comparisons show that Pogonophora is not a phylum and Annelida and Arthropoda are not sister taxa. Mol. Biol. Evol. 17:87-106.

    Brusca R. C., G. J. Brusca. 1990. Invertebrates. Sinauer, Sunderland, Mass.

    Clayton D. A. 1982. Replication of animal mitochondrial DNA. Cell 28:693-705.

    Fearnley I. M., J. E. Walker. 1986. Two overlapping genes in bovine mitochondrial DNA encode membrane components of ATP synthase. EMBO J. 5:2003-2008.

    Foury F., T. Roganti, N. Lecrenier, B. Purnelle. 1999. The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae. FEBS Lett. 440:325-331.

    Francino M. P., H. Ochman. 1997. Strand asymmetries in DNA evolution. Trends Genet. 13:240-245.

    Hatzoglou E., G. C. Rodakis, R. Lecanidou. 1995. Complete sequence and gene organization of the mitochondrial genome of the land snail Albinaria coerulea. Genetics 140:1353-1366.

    Hoffmann R. J., J. L. Boore, W. M. Brown. 1992. A novel mitochondrial genome organization for the blue mussel, Mytilus edulis. Genetics 131:397-412.

    Ikemura T. 1981. Correlations between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409.

    ———. 1982. Correlation between the abundance of yeast transfer RNAs and the occurrence of respective codons in protein genes. J. Mol. Biol. 158:573-597.

    Keddie E. M., T. Higazi, T. R. Unnasch. 1998. The mitochondrial genome of Onchocerca volvulus: sequence, structure and phylogenetic analysis. Mol. Biochem. Parasitol. 95:111-127.

    Le T. H., D. Blair, T. Agatsuma, et al. (14 co-authors). 2000. Phylogenies inferred from mitochondrial gene orders—a cautionary tale from the parasitic flatworms. Mol. Biol. Evol. 17:1123-1125.

    Lewis D. L., C. L. Farr, L. S. Kaguni. 1995. Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons. Insect Mol. Biol. 4:263-278.

    Moritz C., T. E. Dowling, W. M. Brown. 1987. Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.

    Noguchi Y., K. Endo, F. Tajima, R. Ueshima. 2000. The mitochondrial genome of the brachiopod Laqueus rubellus. Genetics 155:245-259.

    Ojala D., C. Merkel, R. Gelfand, G. Attardi. 1980. The transfer RNA genes punctuate the reading of genetic information in human mitochondrial DNA. Cell 22:393-403.

    Ojala D., J. Montoya, G. Attardi. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470-474.

    Okimoto R., H. M. Chamberlin, J. L. Macfarlane, D. R. Wolstenholme. 1991. Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification. Nucleic Acids Res. 19:1619-1626.

    Okimoto R., J. L. Macfarlane, D. O. Clary, D. R. Wolstenholme. 1992. The mitochondrial genomes of two nematodes, Caenorhabditis elegans and Ascaris suum. Genetics 130:471-498.

    Perna N. T., T. D. Kocher. 1995. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41:353-358.

    Reyes A., C. Gissi, G. Pesole, C. Saccone. 1998. Asymmetrical directional mutation pressure in the mitochondrial genome of mammals. Mol. Biol. Evol. 15:957-966.

    Rossmanith W. 1997. Processing of human mitochondrial tRNA-Ser(AGY); a novel pathway in tRNA biosynthesis. J. Mol. Biol. 265:365-371.

    Sasuga J., S.-I. Yokobori, M. Kaifu, T. Ueda, K. Nishikawa, K. Watanabe. 1999. Gene contents and organization of a mitochondrial DNA segment of the squid Loligo bleekeri. J. Mol. Evol. 48:692-702.

    Stechmann A., M. Schlegel. 1999. Analysis of the complete mitochondrial DNA sequence of the brachiopod Terebratulina retusa places Brachiopoda within the protostomes. Proc. R. Soc. Lond. B Biol. Sci. 266:2043-2052.

    Terrett J., S. Miles, R. Thomas. 1996. Complete DNA sequence of the mitochondrial genome of Cepaea nemoralis (Gastropoda: Pulmonata). J. Mol. Evol. 42:160-168.

    Wolstenholme D. R. 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141:173-216.

    Yamazaki N., R. Ueshima, J. Terrett, et al. (12 co-authors). 1997. Evolution of pulmonate gastropod mitochondrial genomes: comparisons of gene organizations of Euhadra, Cepaea and Albinaria and implications of unusual tRNA secondary structures. Genetics 145:749-758.

    Yokobori S., S. Pääbo. 1997. Polyadenylation creates the discriminator nucleotide of chicken mitochondrial tRNA(Tyr). J. Mol. Biol. 265:95-99.

    Yokobori S., T. Ueda, G. Feldmaier-Fuchs, S. Pääbo, R. Ueshima, A. Kondow, K. Nishikawa, K. Watanabe. 1999. Complete DNA sequence of the mitochondrial genome of the ascidian Halocynthia roretzi (Chordata, Urochordata). Genetics 153:1851-1862.

Accepted for publication May 18, 2001.