Dramatic Mitochondrial Gene Rearrangements in the Hermit Crab Pagurus longicarpus (Crustacea, Anomura)

M. J. Hickerson and C. W. Cunningham

Department of Zoology, Duke University

Abstract

The entire mitochondrial gene order of the crustacean Pagurus longicarpus was determined by sequencing all but approximately 300 bp of the mitochondrial genome. We report the first major gene rearrangements found in the clade including Crustacea and Insecta. At least eight mitochondrial gene rearrangements have dramatically altered the gene order of the hermit crab P. longicarpus relative to the putatively ancestral crustacean gene order. These include two rearrangements of protein-coding genes, the first reported for any nonchelicerate arthropod. Codon usage and amino acid sequences do not deviate substantially from those reported for other crustaceans. Investigating the phylogenetic distribution of these eight rearrangements will provide additional characters to help resolve decapod phylogeny.

Introduction

The metazoan mitochondrial genome is a circular, double-stranded DNA molecule that is highly variable in DNA sequence but conservative in gene content (Wolstenholme 1992). Mitochondrial gene order has been found to vary across the Metazoa, generating interest in using mitochondrial DNA (mtDNA) gene order for phylogenetic inference (Boore, Lavrov, and Brown 1998). Although the apparent rarity of gene rearrangements provides few characters for phylogenetic reconstruction (Curole and Kocher 1999), useful characters will accumulate as more mitochondrial genomes are sequenced. Of the nine arthropods whose complete mitochondrial gene order has been determined, gene rearrangements have been restricted to tRNA genes, with the exception of protein-coding gene movements in two lineages of ticks. The two complete crustacean mitochondrial genomes sequenced to date have revealed no major deviation from the gene order found in Drosophila yakuba, aside from two tRNA gene movements in Artemia franciscana.

The phylogenetic position of crustaceans within the arthropods is not agreed upon, and the monophyletic status of the Crustacea has even been called into question (Ballard et al. 1992; Friedrich and Tautz 1995; Regier and Shultz 1997). Here, we describe the mitochondrial gene order of the hermit crab Pagurus longicarpus and report the first major gene rearrangements in any nonchelicerate arthropod. Comparisons with partial crustacean genomes suggest that the rearrangements in P. longicarpus arose within the Decapoda. Because they are restricted to the Decapoda, these rearrangements will not help resolve the issue of crustacean monophyly. On the other hand, exploring their phylogenetic distribution may provide additional characters to help discern relationships among decapod groups.

Materials and Methods

DNA was extracted from a live P. longicarpus individual obtained from Beaufort, N.C., using the Genome kit (BIO 101) according to the manufacturer's instructions. With the exception of approximately 300 bp of an 850-bp AT-rich region, 15,630 bp of the mtDNA molecule was sequenced directly from a combination of overlapping 0.5–2.0-kb PCR products and a long PCR 7.5-kb product cloned into TOPO XL vectors (Invitrogen). The overlapping 0.5–2.0-kb fragments were PCR-amplified in 50 mM Tris HCl (pH 9.0), 20 mM ammonium sulfate, 0.005% BSA, and 2.5 mM MgCl2, with 200 µM of each dNTP, 0.5 mM of each primer, and 1.25 U Amplitaq Gold polymerase (Perkin Elmer) in a total volume of 50 µl.

The 7.5-kb fragment was initially amplified using the Boehringer Mannheim Expand Long Template PCR system with conditions specified for system 3. PCR products were inserted into the pCR®XL plasmid vector using the Invitrogen TOPO XL cloning kit. Plasmid DNA was purified from each clone using the QIAGEN plasmid miniprep kit. The 7.5-kb fragment was sequenced with M13 and T7 reverse primers and then subsequently with newly designed internal primers. As a check against PCR error, three different clones were sequenced, and a two-out-of-three rule was used to choose between base ambiguities. The 0.5–2.0-kb fragments were cycle-sequenced with the PCR amplification primers and internal primers when necessary. All cycle-sequencing was performed with ABI PRISM BigDye terminator chemistry and analyzed on ABI 373 or 377 automated sequencers (ABI/Perkin-Elmer).
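The two-out-of-three rule described above can be illustrated with a short Python sketch. It assumes the three clone sequences have already been aligned to equal length; the function name and the toy reads are hypothetical and are not data from this study.

```python
from collections import Counter


def consensus_two_of_three(clones):
    """Majority-rule consensus of three aligned clone sequences."""
    if len(clones) != 3 or len({len(c) for c in clones}) != 1:
        raise ValueError("expected three aligned sequences of equal length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        # Accept a base only when at least two clones agree; otherwise flag it.
        consensus.append(base if count >= 2 else "N")
    return "".join(consensus)


# Toy reads (hypothetical): the second clone carries a single PCR error.
clones = ["ATGCTA", "ATGTTA", "ATGCTA"]
print(consensus_two_of_three(clones))
```

Positions at which all three clones disagree are reported as N here; how such positions were handled in practice is not stated in the text.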

Mitochondrial protein and RNA gene sequences were initially identified using BLAST searches. Gene sequences were then confirmed by alignment with Drosophila yakuba mitochondrial sequences using CLUSTAL W, version 1.4. The amino acid sequences of the protein-coding genes were inferred using the Drosophila mitochondrial genetic code. A preliminary screen for tRNA genes was done using tRNAscan-SE 1.1 (Lowe and Eddy 1997). The default search mode was used (cove cutoff score = 15), specifying mitochondrial/chloroplast DNA as the source and using the invertebrate mitochondrial genetic code for tRNA structure prediction. tRNA secondary structures were detected in this manner for 20 of the 22 tRNAs; the two remaining tRNAs were detected by inspecting noncoding sequences by eye for tRNA-like secondary structures.
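As an illustration of how amino acid sequences can be inferred from a mitochondrial reading frame, the following Python sketch translates DNA with the invertebrate mitochondrial code. It assumes that the code used here corresponds to NCBI translation table 5, in which ATA encodes Met, TGA encodes Trp, and AGA/AGG encode Ser; the function name and the test sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in the order TTT, TTC, TTA, TTG, TCT, ...
# (assumed: NCBI translation table 5; '*' marks a termination codon).
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence, stopping at the first termination codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequence: ATA (Met), AGA (Ser), TGA (Trp), TAA (stop).
print(translate("ATAAGATGATAA"))
```

A trailing incomplete codon is ignored by this sketch, so a gene ending in a bare T would need its termination codon handled separately.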

Contiguous sequences were obtained for the entire mitochondrial genome with the exception of approximately 300 bp within the AT-rich noncoding region. This sequence is deposited in GenBank under accession number AF150756.

Results and Discussion

The Mitochondrial Genome of the Hermit Crab P. longicarpus
Aside from approximately 300 bp within the AT-rich region that proved difficult to sequence, we obtained the entire mitochondrial genome of P. longicarpus. Our sequences include both rRNA genes, all 22 tRNA genes, and all 13 protein-coding genes commonly found in the mitochondria of metazoan animals (Wolstenholme 1992). All orientations and gene sequence junctions are shown in figure 1. Each of the 22 putative tRNA sequences can be folded into cloverleaf structures similar to those found in other metazoan animals (fig. 2). The weakest inferred structures are tRNASer(AGN) and tRNAVal, with six base mispairings each (fig. 2). Although technical difficulties (most likely secondary structure) prevented sequencing of the entire AT-rich region (estimated to be approximately 850 bp in size), we obtained the sequence of its flanking regions (38 bp 5' and 500 bp 3'). Before discussing the gene rearrangements, we describe the general characteristics of the mitochondrial genome.



 
Fig. 1.—A schematic map of the mtDNA sequence of Pagurus longicarpus. Numbers within brackets are the numbers of nucleotides of each gene or noncoding region. With the exception of approximately 300 bp of the noncoding region, the entire circular mtDNA was sequenced and is represented here as a linear molecule. The direction in which each gene is transcribed is indicated by the direction of darts within the braces that delineate the extent of each gene. The inferred amino acids at the amino- and carboxy-termini of protein genes are given below the corresponding DNA sequence. Noncoding regions are labeled NCR if they are longer than 28 bp.

 


 
Fig. 2.—Putative cloverleaf structures for the mitochondrial tRNAs from Pagurus longicarpus. Lines denote standard Watson-Crick pairs; dots denote suboptimal base pairings.

 
Base Composition and Gene Content
For the 13 protein-coding genes, the nucleotide composition is as follows: A = 3,239 (29.1%), T = 4,509 (40.5%), G = 1,716 (15.4%), and C = 1,669 (15.0%). The 30.4% GC content falls within the range found in arthropod mtDNA genomes so far. The lower end of this range is represented by the insect Apis mellifera (14.7%; Crozier and Crozier 1993) and the chelicerates Ixodes hexagonus (27.4%) and Rhipicephalus sanguineus (22.0%; Black and Roehrdanz 1998), while the higher end is represented by the crustaceans Artemia franciscana (35.5%; Valverde et al. 1994) and Daphnia pulex (37.7%; Crease 1999).
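The percentages above follow directly from the reported base counts, as the following Python sketch shows. The variable names are illustrative; only the four counts come from the text.

```python
# Base counts reported for the 13 protein-coding genes.
counts = {"A": 3239, "T": 4509, "G": 1716, "C": 1669}

total = sum(counts.values())
percent = {base: round(100 * n / total, 1) for base, n in counts.items()}
gc_content = round(100 * (counts["G"] + counts["C"]) / total, 1)

print(total)       # total nucleotides in the protein-coding genes
print(percent)     # per-base percentages
print(gc_content)  # combined G + C percentage
```

The total of 11,133 nucleotides equals three times the 3,711 codons given in the caption of table 2, which serves as a consistency check on the counts.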

Translation Initiation and Termination Signals
All termini of the 13 protein-coding genes are shown in figure 1 and summarized in table 1. As with the Drosophila genetic code, ATT, ATA, and ATG initiate translation in 12 of the 13 protein-coding genes (table 1). For the initiation of COI in P. longicarpus, we propose a novel 4-bp initiation codon (ATCA) based on the locations of COI initiation in D. yakuba and D. pulex (Clary and Wolstenholme 1985; Crease 1999) and the fact that there are no other initiation codons near the beginning of COI in P. longicarpus. A 4-bp initiation codon (ATTA) has also been proposed for the COI of Daphnia pulex (Van Raay and Crease 1994). For 11 of the 13 protein-coding genes, the 3' sequence regions revealed a termination codon of TAA, while COI used TAG. For ND4, we could find no complete termination codon. Following Crease (1999), we inferred an incomplete termination codon beginning with the base T. This association between incomplete termination codons and tRNA genes is common in metazoan mtDNA (Wolstenholme 1992). One possible explanation is that complete termination codons are formed by polyadenylation of transcripts, as has been observed for some human mtDNA genes (Ojala, Montoya, and Attardi 1981).


 
Table 1 Comparisons of the Mitochondrial Protein-Coding Genes of the Hermit Crab (Pagurus longicarpus) with Those of the Brine Shrimp (Artemia franciscana) and the Fruit Fly Drosophila yakuba

 
As is common in many mtDNA genomes, a number of the genes overlap (Wolstenholme 1992). Of the eight overlapping genes reported here, two are encoded on the opposite strand (fig. 1). The A8-A6 overlap falls into the pattern consistent with both proteins being translated from a single bicistronic message by initiation at a 5'-terminal start site for A8 and at an internal start site for A6 (Wolstenholme 1992). The terminal end of the ND4L transcript overlaps with the initiation end of ND4 by 7 bp, although they are in different reading frames. The other five overlaps were found to be no longer than 3 bp. All 13 protein-coding genes for P. longicarpus were close in size to those for Drosophila and Daphnia, although the gene sizes of Artemia differed markedly from those of these three taxa in five cases. Each of nine protein-coding genes in Pagurus was closer in amino acid sequence to Drosophila and Daphnia than to Artemia (table 1), consistent with the large amount of divergence found in branchiopod crustaceans (e.g., Crease 1999).

tRNAs and Codon Usage
Twenty tRNAs have anticodons identical to those of D. yakuba and D. pulex, while two differ: tRNALys (TTT instead of CTT) and tRNASer(AGN) (TCT instead of GCT). This usage of TTT for tRNALys is consistent with most metazoan mtDNA systems. As with tRNALys, the difference for tRNASer(AGN) lies at the wobble position of the anticodon, which pairs with the third base of the codon. The presence of these two anticodons in P. longicarpus could be due to codon usage. According to the codon usage inferred for the 13 P. longicarpus protein-coding genes, lysine is encoded by AAA (rather than AAG) in 85.0% of its codons, while serine (AGN) is encoded by AGA 52.0% of the time. Given this codon usage, the TTT anticodon of tRNALys matches the predominant lysine codon exactly, making it more efficient than the CTT anticodon found in D. yakuba (table 2).
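The relationship between each anticodon and the codon it matches without wobble pairing is simply a reverse complement, as sketched below in Python. The function name is hypothetical, and anticodons are written 5' to 3' in DNA notation, as in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def matched_codon(anticodon):
    """Return the codon that pairs with an anticodon by strict Watson-Crick rules."""
    return anticodon.upper().translate(COMPLEMENT)[::-1]


# Anticodons discussed in the text: P. longicarpus vs. D. yakuba / D. pulex.
for name, pagurus, other in [("tRNALys", "TTT", "CTT"),
                             ("tRNASer(AGN)", "TCT", "GCT")]:
    print(name, matched_codon(pagurus), matched_codon(other))
```

Under this strict pairing, the P. longicarpus anticodons TTT and TCT match AAA and AGA, whereas CTT and GCT match AAG and AGC.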


 
Table 2 Number of Occurrences of the 3,711 Codons in the 13 Protein-Encoding Genes of Pagurus longicarpus mtDNA

 
Fourteen anticodons have T in the wobble position, seven have G, and tRNAMet has C. All nine tRNAs with anticodons that recognize fourfold-degenerate codons have T in the wobble position. Of the 11 anticodons that recognize twofold-degenerate codons, six have G and five have T in the first position. Comparisons of codon usage (table 2) indicate that for 6 of the 11 twofold-degenerate codons, the tRNAs match the less used codon. All six of these cases involve G/T mispairings, which are the most allowable type of base mispairings (Wolstenholme 1992).

Phylogenetic Utility of Mitochondrial Gene Rearrangements in the Decapoda
To demonstrate a mitochondrial gene rearrangement, we must know the original gene order. Inferring the primitive gene order in crustaceans is relatively straightforward, since several insects (e.g., D. yakuba) have a gene order identical to that of one of the two previously reported crustacean genomes (D. pulex; Crease 1999). The other previously reported crustacean gene order (the brine shrimp A. franciscana; Valverde et al. 1994) shows two tRNA rearrangements that are unique relative to all other arthropods. This suggests that the A. franciscana tRNA rearrangements are likely autapomorphies with respect to the taxa sampled to date.

In contrast, we found a total of eight rearrangements in the decapod hermit crab P. longicarpus relative to the putatively primitive D. pulex/D. yakuba gene order (fig. 3). Two of these are the first reported rearrangements of arthropod protein-coding genes outside of the Chelicerata (Black and Roehrdanz 1998; Campbell and Barker 1998, 1999). These two rearrangements involve ND2 and ND3 (regions 2 and 3 in fig. 3). Only one rearrangement in P. longicarpus involved an inversion of the direction of transcription (tRNAVal, fig. 3).
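One simple way to compare a derived gene order with a reference order is to count the gene adjacencies of the reference that are missing from the derived genome. The Python sketch below does this for circular, signed gene orders. It is not the procedure used to identify the eight rearranged blocks in this study, a breakpoint count is not equivalent to a count of rearrangement events, and the gene labels are placeholders rather than the actual P. longicarpus order.

```python
def adjacencies(order):
    """Set of adjacencies for a circular, signed gene order.

    `order` is a list of (gene, strand) pairs with strand +1 or -1.
    Each gene has a tail ("t") and a head ("h"); an adjacency is the
    unordered pair of gene ends that touch between neighboring genes.
    """
    def right_end(gene, strand):
        return (gene, "h") if strand > 0 else (gene, "t")

    def left_end(gene, strand):
        return (gene, "t") if strand > 0 else (gene, "h")

    pairs = set()
    for i, (gene, strand) in enumerate(order):
        next_gene, next_strand = order[(i + 1) % len(order)]  # wrap around
        pairs.add(frozenset([right_end(gene, strand),
                             left_end(next_gene, next_strand)]))
    return pairs


def breakpoints(reference, derived):
    """Number of reference adjacencies that are absent from the derived order."""
    return len(adjacencies(reference) - adjacencies(derived))


# Placeholder genomes: g3 is inverted in one, g2 is translocated in the other.
reference = [("g1", 1), ("g2", 1), ("g3", 1), ("g4", 1), ("g5", 1)]
inverted = [("g1", 1), ("g2", 1), ("g3", -1), ("g4", 1), ("g5", 1)]
translocated = [("g1", 1), ("g3", 1), ("g4", 1), ("g2", 1), ("g5", 1)]
print(breakpoints(reference, inverted), breakpoints(reference, translocated))
```

Because adjacencies are stored as unordered pairs of gene ends, reading the same circular genome from the opposite strand yields zero breakpoints.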



 
Fig. 3.—Gene arrangements in several arthropods. For the sections shown, the gene order of Penaeus notialis is identical to the Drosophila yakuba/Daphnia pulex gene order. Rearrangements between the putatively primitive D. yakuba/D. pulex gene order and the hermit crab Pagurus longicarpus are shown for eight contiguous blocks of genes: (1) tRNALeu(UAA) (L); (2) tRNAGly (G) ND3 tRNAAla (A); (3) tRNAIle (I) tRNAMet (M) ND2; (4) tRNATyr (Y); (5) the AT-rich region; (6) tRNAPro (P); (7) tRNAVal (V); and (8) tRNAGln (Q). Only tRNAs that took part in rearrangements are shown

 
All eight of these rearrangements appear to have arisen within the Decapoda. Not only are they absent in the crustaceans D. pulex and A. franciscana, but the partial sequence of the decapod shrimp Penaeus notialis (Garcia-Machado et al. 1996) is sufficient to show that it possesses none of the eight rearrangements (fig. 3). The presence of so many rearrangements increases the number of characters available to resolve decapod phylogeny. As with any character data, identically derived gene orders can arise through homoplasy. Homoplasy in mitochondrial gene order has been found in a number of taxa (Flook, Rowell, and Gellissen 1995; Mindell, Sorenson, and Dimcheff 1998; Dowtin and Austin 1999). The strongest inferences will be made when mitochondrial gene rearrangements are congruent with other sources of phylogenetic data (Curole and Kocher 1999).

With eight rearrangements, it may be possible to demonstrate the intermediate gene orders that led to the arrangement found in P. longicarpus. Using this information in conjunction with other molecular data will help to evaluate the probabilities of homology and convergence in gene order. An investigation of the phylogenetic distribution of these rearrangements, together with a multigene sequencing effort, is currently in progress.

Acknowledgements

We must thank K. Tieu, who first discovered that there is something unusual about the hermit crab mitochondrial genome. We also thank C. Morrison for help in the cloning procedure and D. Snyder for running the ABI 373 and 377 automated sequencers (ABI/Perkin-Elmer). This work was supported by NSF grant DEB-9615461 to C.W.C.

Footnotes

Ross Crozier, Reviewing Editor

1 Keywords: crustacean, mitochondrial DNA, gene order, gene rearrangements, hermit crab, Pagurus longicarpus.

2 Address for correspondence and reprints: Cliff Cunningham, Department of Zoology, Duke University, Durham, North Carolina 27708-0325. E-mail: cliff@duke.edu

Literature Cited

    Ballard, J. W. O., G. J. Olsen, D. P. Faith, W. A. Odgers, D. M. Rowell, and P. W. Atkinson. 1992. Evidence from 12S ribosomal-RNA sequences that onychophorans are modified arthropods. Science 258:1345–1348.

    Black, W. C. I., and R. L. Roehrdanz. 1998. Mitochondrial gene order is not conserved in Arthropods: prostriate and metastriate tick mitochondrial genomes. Mol. Biol. Evol. 15:1772–1785.

    Boore, J. L., D. V. Lavrov, and W. M. Brown. 1998. Gene translocation links insects and crustaceans. Nature 392:667–668.

    Campbell, N. J. H., and S. C. Barker. 1998. An unprecedented major rearrangement in an arthropod mitochondrial genome. Mol. Biol. Evol. 15:1786–1787.

    ———. 1999. The novel mitochondrial gene arrangement of the cattle tick, Boophilus microplus: fivefold tandem repetition of a coding region. Mol. Biol. Evol. 16:732–740.

    Clary, D. O., and D. R. Wolstenholme. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization and genetic code. J. Mol. Evol. 22:252–271.

    Crease, T. J. 1999. The complete sequence of the mitochondrial genome of Daphnia pulex (Cladocera: Crustacea). Gene 233:89–99.

    Crozier, R. H., and Y. C. Crozier. 1993. The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics 133:97–117.

    Curole, A. P., and T. D. Kocher. 1999. Mitogenomics: digging deeper with complete mitochondrial genomes. Trends Ecol. Evol. 14:394–398.

    Dowtin, M., and A. D. Austin. 1999. Evolutionary dynamics of a mitochondrial rearrangement "hot spot" in the Hymenoptera. Mol. Biol. Evol. 16:298–309.

    Flook, P., H. Rowell, and G. Gellissen. 1995. Homoplastic rearrangements of insect mitochondrial transfer-RNA genes. Naturwissenschaften 82:336–337.

    Friedrich, M., and D. Tautz. 1995. Ribosomal DNA phylogeny of the major extant arthropod classes and the evolution of myriapods. Nature 376:165–167.

    Garcia-Machado, E., N. Dennebouy, M. O. Suarez, J. C. Mounolou, and M. Monnerot. 1996. Partial sequence of the shrimp Penaeus notialis mitochondrial genome. C. R. Acad. Sci. D Sci. Nat. 319:473–486.

    Lowe, T., and S. R. Eddy. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:955–964.

    Mindell, D. P., M. D. Sorenson, and D. E. Dimcheff. 1998. Multiple independent origins of mitochondrial gene order in birds. Proc. Natl. Acad. Sci. USA 95:10693–10697.

    Ojala, D., J. Montoya, and G. Attardi. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470–474.

    Regier, J. C., and J. W. Shultz. 1997. Molecular phylogeny of arthropods and the significance of the Cambrian "explosion" for molecular systematics. Am. Zool. 38:918–928.

    Valverde, J. R., B. Batuecas, C. Moratilla, R. Marco, and R. Garesse. 1994. The complete mitochondrial DNA sequence of the crustacean Artemia franciscana. J. Mol. Evol. 39:400–408.

    Van Raay, T. J., and T. J. Crease. 1994. Partial mitochondrial DNA sequence of the crustacean Daphnia pulex. Curr. Genet. 25:66–72.

    Wolstenholme, D. R. 1992. Animal mitochondrial DNA: structure and evolution. Academic Press, New York.

Accepted for publication December 21, 1999.