The Complete Nucleotide Sequence of the Mitochondrial DNA of the Agnathan Lampetra fluviatilis: Bearings on the Phylogeny of Cyclostomes

Christiane Delarbre*, Hector Escriva†, Cyril Gallut*‡, Véronique Barriel‡, Philippe Kourilsky*, Philippe Janvier§, Vincent Laudet† and Gabriel Gachelin*

*Département d'Immunologie, Unité de Biologie Moléculaire du Gène, Institut Pasteur, Paris, France;
†Laboratoire de Biologie Moléculaire et Cellulaire, Ecole Normale Supérieure Lyon, France;
‡Service de Systématique Moléculaire, Institut de Systématique, Centre National de la Recherche Scientifique, Muséum National d'Histoire Naturelle, Paris, France; and
§Laboratoire de Paléontologie, Muséum National d'Histoire Naturelle, Paris, France

Abstract

There are two competing theories about the interrelationships of craniates: the cyclostome theory assumes that lampreys and hagfishes are a clade, the cyclostomes, whose sister group is the jawed vertebrates (gnathostomes); the vertebrate theory assumes that lampreys and gnathostomes are a clade, the vertebrates, whose sister group is hagfishes. The vertebrate theory is best supported by a number of unique anatomical and physiological characters. Molecular sequence data from 18S and 28S rRNA genes rather support the cyclostome theory, but mtDNA sequence of Myxine glutinosa rather supports the vertebrate theory. Additional molecular data are thus needed to elucidate this three-taxon problem. We determined the complete nucleotide sequence of the mtDNA of the lamprey Lampetra fluviatilis. The mtDNA of L. fluviatilis possesses the same genomic organization as Petromyzon marinus, which validates this gene order as a synapomorphy of lampreys. The mtDNA sequence of L. fluviatilis was used in combination with relevant mtDNA sequences for an approach to the hagfish/lamprey relationships using the maximum-parsimony, neighbor-joining, and maximum-likelihood methods. Although trees compatible with our present knowledge of the phylogeny of craniates can be reconstructed by using the three methods, the data collected do not support the vertebrate or the cyclostome hypothesis. The present data set does not allow the resolution of this three-taxon problem, and new kinds of data, such as nuclear DNA sequences, need to be collected.

Introduction

The relationships between hagfishes, lampreys, and jawed vertebrates (Gnathostomata) are one of the still-unresolved three-taxon problems in craniate phylogeny. Since Dumeril (1806) classified hagfishes and lampreys in the taxon Cyclostomi, characterized by horny teeth, a large notochord and pouch-shaped gills, the monophyly of this group has rarely been questioned, despite the subsequent discovery of apparently unique characters shared by the lampreys and the gnathostomes only. The apparent primitiveness of hagfishes has long been regarded as a consequence of "degeneracy" due to burrowing habits, a theory which still has many adherents (Fernholm 1985; Yalden 1985). Lovtrup (1977) was the first to propose that, according to character distribution, lampreys should be regarded as the sister group of the gnathostomes and, thus, that lampreys and hagfishes are paraphyletic. Janvier (1978) coined the name Myopterygii ("muscularized fins") for the group including lampreys and the gnathostomes, but later considered it a synonym of Vertebrata, since only lampreys and gnathostomes have vertebral elements (Janvier 1981). This theory implies that the horny teeth and complex "rasping tongues" of hagfishes and lampreys are either homoplastic or, more likely, a general craniate character, lost in jawed vertebrates. Thus, there exist two competing theories about the interrelationships of craniates, i.e., animals with a skull. The "cyclostome theory" assumes that lampreys and hagfishes are a clade, the cyclostomes, whose sister group is the gnathostomes. The "vertebrate theory" assumes that lampreys and gnathostomes are a clade, the vertebrates, whose sister group is hagfishes.

The vertebrate theory is supported by a large number (about 50) of unique anatomical and physiological characters (Lovtrup 1977; Hardisty 1982; Janvier 1996). Although only a few DNA sequences related to the phylogeny of cyclostomes have been determined, sequence data from rRNA support the cyclostome, rather than the vertebrate, theory (Stock and Whitt 1992; Mallatt and Sullivan 1998). Sequence data from the mtDNA of some species including the hagfish Myxine glutinosa tend to support the vertebrate theory (Rasmussen, Janke, and Arnason 1998), although the data set was limited in size. Several phylogenetically important mtDNA sequences have been determined since the sequence of the mtDNA of the hagfish was published. They include two lancelets, Branchiostoma lanceolatum (Spruyt et al. 1998) and Branchiostoma floridae (Boore, Daehler, and Brown 1999); two chondrichthyans, Scyliorhinus canicula (Delarbre et al. 1998) and Mustelus manazo (Cao et al. 1998); and a hemichordate, Balanoglossus carnosus (Castresana, Feldmaier-Fuchs, and Pääbo 1998). Also, since the genomic organization of the mtDNA of the lamprey Petromyzon marinus (Lee and Kocher 1995) was found to be different from the consensus organization of the mtDNA of vertebrates, it was of interest to determine the nucleotide sequence and genomic organization of the mtDNA of another lamprey so as to ascertain the homogeneity of the zoological group and make available an extended data set with which to approach the hagfish-lamprey-gnathostome relationships.

We thus determined the entire mtDNA sequence of another lamprey, Lampetra fluviatilis. The mtDNA of L. fluviatilis possesses the same genomic organization as P. marinus. This finding validates this distinctive gene order as a synapomorphy of lampreys. The mtDNA sequence was used, in combination with relevant mtDNA sequences, to approach the present three-taxon problem using three different computational methods. In contrast with previous conclusions (Naylor and Brown 1998), the present analysis shows that complete coding regions of the mtDNA may be used to study the phylogenetic relationships between recent but anciently rooted animals but does not, however, settle the problem of the monophyly/paraphyly of agnathans. Finally, the present study points to the need for additional mtDNA and nuclear DNA sequences, such as those of urochordates and cyclostomes.

Materials and Methods

Animals
A specimen of L. fluviatilis was caught in the Atlantic Ocean on the shores of the estuary of the Garonne river. The animal was anesthetized, killed, and dissected. Organs were immediately frozen in liquid nitrogen and stored at -78°C.

Preparation of DNA
Total DNA was prepared from lamprey muscles by Proteinase K digestion according to conventional procedures (Hogan et al. 1994).

Isolation and Sequencing of mtDNA
Overlapping fragments of mtDNA were obtained by PCR run on total genomic DNA and using degenerate primers. The sequences of the primers used for PCR amplification are described in table 1. Two strategies for obtaining PCR-amplified DNA were used.


Table 1 PCR Primers Used in the Determination of the Nucleotide Sequence of the Lampetra fluviatilis Mitochondrial Genome

 
For PCR aimed at isolating the 16S rRNA to ND4 sequence, the conditions were as follows: 300 ng total DNA, 200 µM dNTP, and 500 nM primers; the enzyme was the Pfu Exo + polymerase (Stratagene); the thermal cycles were 3 min at 94°C followed by 50 cycles of 1 min at 94°C, 1 min at 48°–55°C depending upon the primers, and 5 min at 72°C. The last cycle was ended by incubating for 10 min at 72°C. These PCR products were purified by electroelution, phosphorylated using the polynucleotide kinase (Pharmacia), and ligated at the EcoRV site of dephosphorylated KS Bluescript vector (Stratagene) using the Rapid DNA Ligation kit (Boehringer). XL1Blue-competent bacteria (Stratagene) were transformed with the ligation products. Several recombinant clones were selected for each PCR product, and plasmid DNA was recovered using the ClearCut Miniprep kit (Stratagene). The cloned PCR products were sequenced first using the M13 (-40) and KS reverse primers, and subsequently using primers derived from the sequence determined. Thus, primer walking was used throughout with primers located on both strands. Three hundred base pairs were read using each primer. The entire sequence was determined on both strands. All overlapping sequences were found to be identical.

For PCR aimed at the isolation of the ND4-16S rRNA segment of the mtDNA, the Expand Long Template PCR system (Boehringer) was used. The PCR conditions were as follows: 500 ng DNA, 350 µM dNTP, and 300 nM primers. Denaturation of the template for 2 min at 92°C was followed by 10 cycles of 10 s at 92°C, 30 s at 60°C, and 8 min at 68°C, followed by 20 cycles with an elongation step increased by 20 s/cycle. The last cycle was ended by incubating for 7 min at 68°C. The 8-kb-long fragment was cloned in the TOPO vector (Invitrogen), digested with EcoRI, and subcloned in pBCKS (Stratagene). The subcloned fragments were sequenced using forward and reverse primers. The two strands were sequenced.

Phylogenetic Analyses
The 13 protein-coding sequences of the mtDNA of 12 taxa, namely Paracentrotus lividus (Cantatore et al. 1989), Asterina pectinifera (Asakawa et al. 1995), B. carnosus, B. lanceolatum, B. floridae (we used the nucleotide sequence corrected by Boore, Daehler, and Brown [1999]), M. glutinosa, L. fluviatilis, P. marinus, S. canicula, M. manazo, Cyprinus carpio (Chang, Huang, and Lo 1994), and Oncorhynchus mykiss (Zardoya, Garrido-Pertierra, and Bautista 1995), were first aligned manually following translation of the nucleotide sequences using the mitochondrial genetic code specific for each taxon. The alignment was refined using the physical amino acid similarity criterion. All inserted gaps were triplets. The 5' parts of the ND5 genes (about 200 bp) were not amenable to alignment and were thus omitted. Whenever alignment problems due to deletion or insertion events were locally encountered, a parsimony analysis search using PAUP was carried out on that zone in order to define the "optimal" alignment that would (1) minimize the number of inferred mutations (number of steps), (2) test the number of weighted mutations (one transition [Ts] preferred to one transversion [Tv]), (3) minimize the number of variable sites, and (4) minimize the a priori phylogenetic implications of the alignments (Barriel 1994a). Thus, each alignment was tested with Tv = 1 or Tv = 2 for three different codings of the gaps (gap = missing data ?, gap = new state, and gap = ID). Indeed, standard procedures for coding gaps offer alternative strategies which suffer from several weaknesses: either the different sites are analyzed independently (gap = new state) so that each gap is artificially weighted relative to the number of sites, or each site is coded "?" (gap = missing data) and the optimization procedure makes the whole zone devoid of phylogenetic information.
Our new coding strategy (gap = ID) is aimed at expressing the potential phylogenetic information contained in complex zones with internested insertions/deletions (indels) and substitutions (Barriel 1994b). It precludes loss or distortion of information in all of those cases in which gaps are present in the aligned sequences. Gaps are coded as multiple-state characters; following an indel, shared subsequent mutations (substitutions as well as indels) should be interpreted in terms of common descent. According to the hierarchy of internested states of characters, this strategy introduces question marks in the data matrix, which are optimized in fine in cladograms based on all data. They are thus not missing data, but rather methodological codes, neutral to any a priori phylogenetic analysis (Barriel 1994a, 1994b). The accession numbers of the aligned sequences are 39776, 39786, 39787, 39789, and 39795–39803.
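The constraint that all inserted gaps are triplets can be illustrated with a minimal sketch. It assumes that a protein alignment already exists and threads the unaligned coding sequence back onto it, so that every gap in the protein becomes a three-nucleotide gap. The function name `codon_align` and the "-" gap symbol are illustrative assumptions; the alignments in this study were built manually and this is not the procedure used by the authors.

```python
def codon_align(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its aligned protein.

    Each residue consumes one codon of `cds`; each protein gap becomes a
    triplet gap, so the reading frame is preserved across the alignment.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("coding sequence must hold exactly one codon per residue")

    pieces = []
    pos = 0  # index of the next unread nucleotide in cds
    for aa in aligned_protein:
        if aa == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(cds[pos:pos + 3])
            pos += 3
    return "".join(pieces)


# One gapped residue yields one triplet gap in the nucleotide alignment.
print(codon_align("M-K", "ATGAAA"))
```

Because the protein alignment drives the nucleotide alignment, the translation step (with the genetic code appropriate to each taxon) stays outside this helper.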

Three different analyses were carried out. The maximum-parsimony (MP) method was applied, using the branch-and-bound search implemented in PAUP, version 3.1.1, to determine the most parsimonious tree. The robustness of the MP trees was tested using the bootstrap method with 500 replications each, as implemented in PHYLIP (Felsenstein 1985, 1993), and using the Bremer (1988) support. The analysis of nucleotide and amino acid sequences was also performed using the neighbor-joining (NJ) algorithm (BioNJ; Gascuel 1997) with Kimura's (1980) two-parameter model for nucleotides and Kimura's (1983) model for amino acids. Finally, the fastDNAml program was used for maximum-likelihood (ML) analysis of DNA sequences (Felsenstein 1981; Olsen et al. 1994). The protml program was used for ML analysis of amino acid sequences using the Jones, Taylor and Thornton model (JTT; Adachi and Hasegawa 1996).
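For readers unfamiliar with the two-parameter model, the following sketch computes a pairwise Kimura (1980) distance, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` and the choice to skip any column containing a gap or ambiguous base are assumptions made for illustration; this is not the code of dnadist or BioNJ.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: columns with gaps or ambiguity codes are skipped
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0.0 or w2 <= 0.0:
        raise ValueError("distance undefined: sequences too divergent")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)
```

A matrix of such distances for all pairs of taxa is the input that a neighbor-joining program expects.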

Results

Comparison of the mtDNA of L. fluviatilis with the mtDNA of P. marinus
Overall Genomic Organization
The mtDNA of L. fluviatilis (accession number Y18683) is 16,159 bp long, thus slightly smaller (42 bp) than the mtDNA of P. marinus, a size difference mostly due to a shorter noncoding region II (see below). The mtDNA of L. fluviatilis harbors 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and a noncoding region (including the control region). The molecular map is identical to that of P. marinus (table 2) and is thus characteristic of the lampreys and different from the map of the mtDNA of all other vertebrates.


Table 2 Localization of the Genes in the Mitochondrial Genome of Lampetra fluviatilis

 
Control Regions
As in P. marinus, the control region of the mtDNA of L. fluviatilis is located between the ND6 and CYTb genes, instead of being located between the CYTb and 12S rRNA genes. It is split into two parts (noncoding regions I and II) which are 491 and 151 bp long, respectively, in L. fluviatilis, separated by the tRNA-Thr and tRNA-Glu genes. The percentage of similarity between the nucleotide sequences of the noncoding regions of the two lampreys reaches 90.6%. The noncoding region II of Lampetra is only 151 bp long, whereas that of Petromyzon is 199 bp long. This difference is due to the absence of two repeats in Lampetra. The percentage of identity is 86.7% in the part of this region common to the two species.
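Percentages such as those quoted above are computed from aligned sequences. A minimal sketch of a percent-identity calculation follows; the function name `percent_identity` is hypothetical, and the decision to exclude gapped columns from the denominator is an assumption, since the text does not state how gapped positions were counted.

```python
def percent_identity(seq1: str, seq2: str, gap: str = "-") -> float:
    """Percentage of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are not counted.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = matches = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Counting gapped columns as mismatches instead would lower the value for regions that differ in length, such as noncoding region II.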

We have previously shown that the origin of replication of the light chain of the mtDNA of L. fluviatilis is not located between the ND1 and ND2 genes (Delarbre et al. 1997). No other obvious origin of replication of the light chain can be identified elsewhere in the noncoding and control regions of the mtDNA of L. fluviatilis or P. marinus.

RNAs
The percentages of identity of the 12S and 16S rRNA genes of Petromyzon and Lampetra are 96.3% and 93.9%, respectively.

The locations of all tRNA genes are found to be identical in both species. However, the sizes of the tRNA-Trp, -Cys, -Asp, -Phe, and -Ser (AGY) genes differ by a single (either missing or additional) nucleotide. The sequences of the tRNAs are slightly divergent, with an average 94.7% similarity, with tRNA-Arg and tRNA-Gln genes being strictly identical in the two species. The overall 80 nucleotide differences are mainly located in the DHU (13) or in the TΨC (34) loops, although 22 are located in the stems (table 3). The most divergent tRNA genes are the tRNA-Cys, -Pro, and -Trp genes. The absence of a loop in the T arm of the tRNA-Phe gene of Lampetra is worthy of note: all five nucleotides of the T arm can pair in Lampetra, whereas only three nucleotides pair in Petromyzon, and an additional nucleotide yields a loop at the end of the T arm.


Table 3 Distribution of Substitutions in the tRNAs of Lampetra and Petromyzon

 
Protein-Coding Genes
The sizes of the genes coding for the 13 proteins are the same in the two species. The size of the cytochrome b protein is the same in both species but is 12 amino acids longer than in fishes. The initiation codons have no distinctive features. Two termination codons are found to be different between Lampetra and Petromyzon. They are TAG for the ND3 gene of Lampetra (TAA for Petromyzon) and AGG for the ND5 gene of Lampetra (AGA for Petromyzon). We verified the identity of the stop codons used by ATPase 6 and COI mRNAs by cloning the corresponding cDNAs and sequencing the 3' part. In Petromyzon, Lee and Kocher (1995) found the ATPase 6 amino acid sequence to be 11 amino acids longer (with an overlap of 35 nt with the COIII gene and a stop codon AGA) than in other animals. In L. fluviatilis, the mRNA coded by the ATPase 6 gene is polyadenylated after the T located immediately before the ATG of the COIII gene, giving a stop codon TAA and an ATPase 6 protein the same size as those in other animals. We assume that the same process is used to terminate the ATPase 6 coding sequence in Petromyzon. The mRNA coding for the COI protein (using the stop codon AGA) shows that the tRNA-Ser (TCN) gene which follows the coding sequence of COI is transcribed on the same RNA, with the polyadenylation occurring after the tRNA-Ser (TCN) gene sequence. The COI gene overlaps the tRNA-Ser gene by 10 nt. Incidentally, a similar situation is found for M. glutinosa: AGG is used as the stop codon, and the tRNA-Ser (TCN) gene which immediately follows the COI gene is transcribed on the same RNA, with the polyadenylation occurring after the tRNA-Ser (TCN) gene. In M. glutinosa, the overlap between the COI gene and the tRNA-Ser (TCN) gene is 13 nt long. A similar finding has been reported for humans except for the absence of overlap between the COI and tRNA-Ser genes (Anderson et al. 1981).

Lampetra fluviatilis employs essentially the same codon usage as P. marinus. However, codons ending in G are more frequently used. In particular, the GCG codon is used eight times by Lampetra and is not used at all in the mitochondrial genes of Petromyzon. In the ND6 gene, which is located on the L strand of the mitochondrial DNA, the usage of C and A in the third nucleotides of fourfold-degenerate codons differs between Petromyzon and Lampetra (table 4).
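A tally of third-nucleotide usage in fourfold-degenerate codons can be sketched as follows. The eight codon families listed (CTN, GTN, TCN, CCN, ACN, GCN, CGN, GGN) are those that are fourfold degenerate in the vertebrate mitochondrial genetic code; the function name `third_position_usage` is hypothetical, and the input is assumed to be an in-frame coding sequence given in the sense orientation (for ND6, the reverse complement of the H-strand sequence).

```python
from collections import Counter

# First two nucleotides of the fourfold-degenerate codon families
# (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def third_position_usage(cds: str) -> Counter:
    """Count A, C, G and T at the third position of fourfold-degenerate codons."""
    cds = cds.upper()
    counts = Counter()
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            counts[codon[2]] += 1
    return counts


# Codons GCG, GCA, ATG, CTT: ATG is not fourfold degenerate and is skipped.
print(third_position_usage("GCGGCAATGCTT"))
```

Running the function on each gene of each species and comparing the resulting counts would give a tabulation of the same kind as table 4.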


Table 4 Third-Nucleotide Usage in the Fourfold-Degenerate Codons

 
The amino acid compositions of the proteins coded by the mitochondrial DNA are identical in P. marinus and L. fluviatilis. The deduced amino acid sequences were easy to align at both the amino acid and the nucleotide levels. The percentage of similarity at the nucleotide level was 82.5%–87.9% (average 85.8%) depending on the genes (table 5). The sequence of the COI gene is the most conserved as usual, with the least conserved being the gene coding for ATPase 6. At the amino acid level, a high percentage of similarity is noted; it ranges from 97.9% for ND4L to 83.6% for ATPase 8, with an average of 92.5%.


Table 5 Comparison Between Mitochondrial Protein-Coding Sequences of Lampetra and Petromyzon

 
Substitutions
When the nucleotide sequence of Petromyzon was compared with that of Lampetra, the C-T or T-C substitutions were found to be the most frequent, with an average of 55.7% of total substitutions in the 13 genes. The average of the percentages of A-G or G-A substitutions was 16.8%. The frequency of G-C or C-G substitutions was only 1.2% of the total number of substitutions. The ratio of nonsilent to total substitutions varies with the genes, ranging from 8.1% to 42.9% with peaks above 30% for the ATPase 6, ATPase 8, and ND6 genes. The passage of C to T or T to C is usually synonymous (particularly in the COI gene, where only 0.9% of the mutations are nonsynonymous), in contrast to the ND6 gene, for which 44% of the C-T or T-C substitutions are nonsynonymous. In the ATP8 and ATP6 genes, the percentage of nonsilent substitutions is high for nearly all kinds of substitutions.
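The substitution frequencies described above can be tabulated with a short sketch that pools each substitution with its reverse (C-T with T-C, A-G with G-A, and so on) and reports each class as a percentage of all observed differences. The function name `substitution_spectrum` is an assumption, and the sketch does not distinguish silent from nonsilent changes, which would additionally require translating the codons.

```python
from collections import Counter


def substitution_spectrum(seq1: str, seq2: str) -> dict:
    """Percentage of each unordered substitution class between aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    counts = Counter()
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a != b and a in "ACGT" and b in "ACGT":
            # Sorting makes C->T and T->C fall into the same class "C-T".
            counts["-".join(sorted((a, b)))] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: 100.0 * n / total for pair, n in counts.items()}


# Three C/T differences and one A/G difference out of four substitutions.
print(substitution_spectrum("CCCAG", "TTTGG"))
```

Pyrimidine transitions appear under the key "C-T" and purine transitions under "A-G"; the remaining four keys are transversions.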

Comparison of the mtDNA of the Lampreys to that of the Hagfish
At this stage, it can be concluded that few differences exist between the mtDNA and the deduced amino acid sequences of Petromyzon and Lampetra, making them a homogeneous group. In contrast, the data available so far concerning the hagfish (i.e., the protein-coding sequences) point to a remarkable divergence between the lampreys and a third representative of cyclostomes, the hagfish M. glutinosa. First, the gene map of the mtDNA of M. glutinosa is identical to that of the gnathostomes in general (Rasmussen, Janke, and Arnason 1998) and thus differs from that of the lampreys. Second, the nucleotide sequences of the protein-coding genes of the hagfish were found to be difficult to align for phylogenetic analysis (see below) with the corresponding genes of the lampreys, and the sizes of the genes were also not always identical. More importantly, the average percentages of similarity of the 13 mitochondrial proteins were found to be only 52.6% (Petromyzon compared with Myxine) and 53% (Lampetra vs. Myxine), whereas the percentage of similarity between Petromyzon and Lampetra was 92.5% and that between Petromyzon and the chondrichthyan Scyliorhinus canicula was 62.4%. Some additional differences between the mtDNA of the lampreys and the hagfish could also be noted: the codon usage of M. glutinosa was markedly different, and serine and phenylalanine were more frequently used by the lampreys than by the hagfish, in contrast with threonine and alanine, which exhibited the opposite behavior.

Analysis of the Lamprey/Hagfish Relationships
The high divergence between hagfishes and lampreys does not tell much about their phylogenetic relationships. Thus, a phylogenetic analysis was carried out on the mtDNA of 12 taxa "flanking" the hagfish/lamprey node: P. lividus, A. pectinifera, B. carnosus, B. lanceolatum, B. floridae, M. glutinosa, L. fluviatilis, P. marinus, S. canicula, M. manazo, C. carpio, and O. mykiss. The analysis was carried out on the complete data set and on a subset of it, as defined below.

Analysis of Nucleotide Sequences
Use of All Protein-Coding Mitochondrial Genes
The complete data set (13 genes) was composed of 12 taxa, and 11,583 nt were used as sites (3,424 invariant and 7,266 parsimony-informative sites). Three taxa were used as outgroups: B. carnosus, A. pectinifera, and P. lividus. Using the MP method without weighting (equal weight given to transitions and transversions) and gaps treated as missing data, a single most-parsimonious tree was obtained in which cyclostomes are paraphyletic (tree A) (fig. 1; length = 26,979 steps, consistency index [CI] = 0.577, and retention index [RI] = 0.498). All the nodes were supported by a 100% bootstrap value, except the craniate node (69% bootstrap value) and the vertebrate node (64% bootstrap value). If gaps were recoded as in Barriel (1994a, 1994b) (Gap = ID), the same unique resulting tree was computed (length = 27,114 steps, CI = 0.578, and RI = 0.500). The same result was obtained when transversions were given double the weight of transitions and when gaps were treated as missing data (length = 40,918 steps).
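The site counts quoted above rest on a three-way classification of alignment columns: invariant, variable but parsimony-uninformative, and parsimony-informative (at least two character states, each present in at least two taxa). If the quoted figures are read as such a partition, 11,583 - 3,424 - 7,266 = 893 sites would be variable but uninformative. The sketch below shows the classification itself; the function name `classify_sites` and the treatment of "-" and "?" as missing data are assumptions and do not reproduce PAUP's internals.

```python
from collections import Counter


def classify_sites(alignment, missing="-?"):
    """Count invariant, uninformative-variable and parsimony-informative columns.

    `alignment` is a list of equal-length sequences, one per taxon.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")

    tally = Counter()
    for column in zip(*(seq.upper() for seq in alignment)):
        states = Counter(c for c in column if c not in missing)
        # Number of states carried by at least two taxa.
        shared_states = sum(1 for n in states.values() if n >= 2)
        if len(states) <= 1:
            tally["invariant"] += 1
        elif shared_states >= 2:
            tally["informative"] += 1
        else:
            tally["variable_uninformative"] += 1
    return tally


# Column 1 is invariant, column 2 informative, columns 3 and 4 are singletons.
print(classify_sites(["AAAA", "AACA", "AGCA", "AGCT"]))
```

Only the informative columns can favor one tree topology over another under unweighted parsimony, which is why that count is reported alongside the tree length.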



Fig. 1.—Phylogenetic position of agnathans determined by using the maximum-parsimony strategy. The most parsimonious tree (pattern A) was computed out of the 11,583 aligned positions using PAUP, version 3.1.1 (Swofford 1993), with branch-and-bound search, unweighted parsimony, and gaps treated as missing data. The values above the branches indicate both the minimum and the maximum possible lengths (synapomorphies or autapomorphies) of the branch, obtained by optimizing the informative sites according to the table of linkage present in the "Describe Trees" dialog box of PAUP. Numbers below the branches indicate the Bremer (1994) support, i.e., the number of steps needed for a node to disappear in a most-parsimonious tree (italics), and the bootstrap proportions (bold type).

 
We used two additional methods to approach the hagfish/lamprey relationships. The same data set was used in a neighbor-joining analysis of Kimura distances computed using dnadist, as implemented in PHYLIP. All coding nucleotides were used, with a weighting of 2 for transversions. A unique tree denominated "B" was obtained (fig. 2) in which cyclostomes appeared to be monophyletic, with bootstrap values of 64.5% for the cyclostome node and 100% for all other nodes. Upon ML analysis using dnaml, the same data set also resulted in a unique tree of the B type in which cyclostomes were monophyletic. All nodes were at P < 0.01. Bootstrap values were 100% for all nodes except for the node supporting the monophyly of cyclostomes (98.5%).



Fig. 2.—Phylogenetic position of agnathans determined using the neighbor-joining (NJ) and maximum-likelihood (ML) analyses. The same data set as in figure 1 was used, generating the unique B tree. The values above the lines are bootstrap values obtained using ML analysis. The values below the lines are bootstrap values obtained using NJ analysis.

 
Use of a Selected Subset of Data
Naylor and Brown (1998) suggested that subsets of genes should be selected prior to computational analysis and preferably used in place of the complete mtDNA-coding sequences. In order to select the proper set of genes to be used, every individual gene in each taxon was first studied using MP and NJ strategies and using the same weighting and coding as for the complete nucleotide sequence. Several different topologies were generated. Among them, trees A and B were the most commonly observed and were also in agreement with the generally admitted phylogenetic relationships, as well as with the cladograms obtained using the complete mtDNA-coding sequences. The nucleotide sequences of the genes which generated mostly A or B patterns (the subset comprised COI, COII, COIII, ND1, ND4, and ND5, with a total of 7,239 nucleotides, of which 2,376 were invariant and 4,321 were parsimony-informative) were then assembled into a unique nucleotide sequence and analyzed as above. Upon parsimony analysis, the subset of data yielded a unique parsimonious tree (length = 15,685 steps, CI = 0.577, and RI = 0.504) identical to that obtained using unselected mtDNA sequences and in which cyclostomes appeared paraphyletic. The cyclostomes also appeared paraphyletic when the NJ method was used but became monophyletic when the ML strategy was used with all nodes at P < 0.01.

Naylor and Brown (1998) also suggested usage of only the first and second codon positions of sites modally coding for proline, cysteine, methionine, glutamine, and asparagine. Using the first and second codon positions of these sites in the above defined subset of genes (a total of 776 sites, 359 invariant and 338 parsimony-informative), MP analysis yielded a unique A-type tree, with cyclostomes being paraphyletic. Interestingly enough, the consistency and retention indices were higher (length = 996 steps, CI = 0.719, and RI = 0.685). Bootstrap values were 100%/97%. The same result (tree A) was obtained by using the three nucleotides of each codon encoding modal amino acids.
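The site selection described above can be sketched as a filter over matched protein and codon alignments: a codon column is retained when the most frequent (modal) amino acid across taxa is P, C, M, Q, or N, and only its first two nucleotides are kept. The function name `modal_sites_first_second`, the dictionary-based input, and the arbitrary resolution of ties for the modal residue are all assumptions made for illustration.

```python
from collections import Counter

TARGET_RESIDUES = set("PCMQN")  # Pro, Cys, Met, Gln, Asn


def modal_sites_first_second(protein_aln, codon_aln, target=TARGET_RESIDUES, gap="-"):
    """Keep codon positions 1 and 2 at columns whose modal residue is in `target`.

    protein_aln: dict taxon -> aligned protein sequence
    codon_aln:   dict taxon -> codon alignment matching protein_aln (3 nt per column)
    """
    taxa = list(protein_aln)
    n_columns = len(protein_aln[taxa[0]])

    kept_columns = []
    for i in range(n_columns):
        residues = [protein_aln[t][i] for t in taxa if protein_aln[t][i] != gap]
        if not residues:
            continue
        # Ties are resolved by first occurrence, which is an arbitrary choice.
        modal_residue = Counter(residues).most_common(1)[0][0]
        if modal_residue in target:
            kept_columns.append(i)

    return {
        t: "".join(codon_aln[t][3 * i:3 * i + 2] for i in kept_columns)
        for t in taxa
    }


proteins = {"a": "MPK", "b": "MPR", "c": "MAK"}
codons = {"a": "ATGCCAAAA", "b": "ATGCCTCGA", "c": "ATGGCAAAA"}
print(modal_sites_first_second(proteins, codons))
```

Slicing `3 * i:3 * i + 3` instead would retain all three nucleotides of each selected codon, as in the last analysis mentioned in the paragraph above.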

Analysis of Amino Acid Sequences
Analysis of All Protein Sequences
The amino acid data set is less prone to saturation effects than the nucleotide data set. Thus, the amino acid sequences already aligned as a requisite for the alignment of the nucleotide sequences were also analyzed using the same three methods. MP analysis (3,848 sites, 1,276 constant and 2,183 parsimony-informative) yielded a single B-type parsimonious tree (cyclostomes are monophyletic) (length = 8,729 steps, CI = 0.812, and RI = 0.720). Bootstrap values were 98% for the cyclostome node and 100% for the other nodes. NJ analysis yielded a single tree of the A type (cyclostomes appear to be paraphyletic). Bootstrap values were 63% for the vertebrate node and 100% for the others. ML analysis yielded two equally possible trees (A and B, with log-likelihoods [ln L] of -54,568.0 and -54,567.3, respectively).

Analysis of Protein Subsets
The amino acid sequences of the individual genes were analyzed using the MP analysis (2,407 sites, 910 constant and 1,256 parsimony-informative). Again, several topologies were obtained, with the A and B patterns predominating in the proteins whose genes had yielded A and B patterns in the above section, thus defining the same subset of proteins as the nucleotide sequences. MP analysis of the subset yielded a single parsimonious tree belonging to the B type in which cyclostomes are monophyletic (length = 4,734 steps, CI = 0.815, and RI = 0.734). Bootstrap values were 85% for the cyclostome node and 100% for the other nodes. NJ analysis of the same data set yielded a single tree of the A type in which cyclostomes appear to be paraphyletic. Bootstrap values were 100%/62%. ML analysis yielded the A and B patterns with equal probability.

Discussion
The mtDNA of L. fluviatilis is highly similar in all respects to that of P. marinus. It displays the same genomic organization as P. marinus, a finding which validates this particular gene order as a synapomorphy of lampreys.

In contrast, the mtDNA of the lampreys profoundly differs from that of M. glutinosa. Thus, the mtDNA sequence of L. fluviatilis was used in combination with the mtDNAs of taxa which, to the best of our present knowledge, flank the rooting under study to construct cladograms aimed at better defining the lamprey/hagfish relationships. The use of complete mitochondrial DNA to approach the phylogeny of chordates has recently been questioned by Naylor and Brown (1998). These authors concluded that only a small fraction of the DNA sequence could be retained for such studies and excluded the use of the entire coding sequence. They retained as informative the first and second positions of the triplets coding for proline, cysteine, methionine, glutamine, and asparagine in a subset of mitochondrial genes to be defined for each data set studied. In contrast to these conclusions, the present analysis shows that, at least within the range of taxa we have selected, the complete protein-coding mitochondrial DNA sequences can generate cladograms compatible with our present knowledge of the phylogeny of craniates, whether using the MP, NJ, or ML method. Moreover, the subsets of data selected according to Naylor and Brown's (1998) criteria resulted in cladograms identical to those obtained using complete mtDNA. Thus, there appears to be no need to select subsets of data; complete coding mtDNA sequences can be used to construct cladograms in which the accepted phylogenetic relationships are retrieved. Although difficult to ascertain on a statistical basis, the influence of different evolution rates of the genes coding for different proteins is most probably minimized if a sum of coding genes is used, provided a maximum of phylogenetic information is extracted from the sequences.
The use of all positions in the triplets is justified by different codon usages in different species and also by the preferential nucleotide usage at the third position, observations which both reflect distinctive properties of the translation machinery in the different taxa.

However, concerning the precise problem of the lamprey/hagfish relationships, the cladograms generated from the nucleotides and amino acids using the MP, NJ, and ML analyses support either the cyclostome or the vertebrate hypothesis (table 6); the question remains unanswered, since the conclusions produced out of the same data set by using the same outgroups are markedly dependent on the kind of mathematical analysis used.


Table 6 Summary of the Results Obtained Using Different Methods and Data Sets

 
Using a limited number and a less focused choice of taxa, Rasmussen, Janke, and Arnason (1998) deduced the paraphyly of agnathans. However, earlier analyses (Stock and Whitt 1992) based on 1,631 nt of the 18S rDNA with tunicates and cephalochordates as outgroups, as well as a more recent analysis of the 28S and 18S rDNA sequences (Mallatt and Sullivan 1998), supported the cyclostome theory, but weakly and with some ambiguity. These different conclusions most probably reflect the number of informative sites used and intrinsic differences in the uses of rRNA-coding and protein-coding DNA sequences in phylogenetic studies: rDNA consists of highly conserved sequences which are likely to be poorly informative, surrounded by highly variable stretches of DNA which are not amenable to alignment. The choice and number of taxa, the quality of the alignments, the coding strategies, and the choice of the outgroups all obviously influence the conclusions reached as well.

Part of these difficulties may be due to the small number of sequences that are directly relevant to the problem under study. Additional mitochondrial and nuclear DNA sequences, such as those of other agnathans and urochordates, must be determined. They may still be insufficient to definitively elucidate the present three-taxon problem, and additional morphological data, including information on the embryonic development of hagfishes (Wicht and Tusch 1998), will be needed.

Acknowledgements

C.G. acknowledges receipt of a fellowship of the Ministère de l'Education Nationale de la Recherche et de la Technologie. The work carried out in the Unité de Biologie Moléculaire du Gène has been supported by the EEC, the Institut National de la Santé et de la Recherche Médicale, and the Institut Pasteur. The work carried out in the Unité Mixte de Recherche CNRS/ENS 49 was supported in part by the Centre National de la Recherche Scientifique and the Ecole Normale Supérieure de Lyon. C.D. and H.E. contributed equally to the sequencing of the mtDNA of L. fluviatilis.

Footnotes

Axel Meyer, Reviewing Editor

1 Keywords: Lampetra fluviatilis, mitochondrial DNA, phylogeny, cyclostomes

2 Address for correspondence and reprints: Gabriel Gachelin, Unité de Biologie Moléculaire du Gène, Département d'Immunologie, Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris cedex 15, France. E-mail: ggachel@pasteur.fr

Literature Cited

    Adachi, J., and M. Hasegawa. 1996. MOLPHY version 2.3: programs for molecular phylogenetics. I. PROTML: maximum likelihood inference of protein phylogeny. Comput. Sci. Monogr. 28:1–150.

    Anderson, S., A. T. Bankier, B. G. Barrell et al. (11 co-authors). 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457–465.

    Asakawa, S., H. Himeno, K.-I. Miura, and K. Watanabe. 1995. Nucleotide sequence and gene organization of the starfish Asterina pectinifera mitochondrial genome. Genetics 140:1047–1060.

    Barriel, V. 1994a. Phylogénies moléculaires et insertion délétion de nucléotides. C. R. Acad. Sci. III 317:693–701.

    ———. 1994b. Relations de parenté au sein des Hominoidea et la place de Pan paniscus. Comparaison et analyse méthodologique des phylogénies morphologique et moléculaire. Ph.D. thesis, University of Paris VI, Paris.

    Boore, J. L., L. L. Daehler, and W. M. Brown. 1999. Complete sequence, gene arrangements and genetic code of the mitochondrial DNA of the cephalochordate Branchiostoma floridae (Amphioxus). Mol. Biol. Evol. 16:410–418.

    Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42:795–803.

    ———. 1994. Branch support and tree stability. Cladistics 10:295–304.

    Cao, Y., P. J. Waddell, N. Okada, and M. Hasegawa. 1998. The complete mitochondrial DNA sequence of the shark Mustelus manazo: evaluating rooting contradictions to living bony vertebrates. Mol. Biol. Evol. 15:1637–1646.

    Cantatore, P., M. Roberti, G. Rainaldi, M. N. Gadaleta, and C. Saccone. 1989. The complete nucleotide sequence, gene organization and genetic code of the mitochondrial genome of Paracentrotus lividus. J. Biol. Chem. 264:10965–10975.

    Castresana, J., G. Feldmaier-Fuchs, and S. Pääbo. 1998. Codon reassignment and amino acid composition in hemichordate mitochondria. Proc. Natl. Acad. Sci. USA 95:3703–3707.

    Chang, Y.-C., F.-L. Huang, and T.-B. Lo. 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38:138–155.

    Delarbre, C., V. Barriel, S. Tillier, P. Janvier, and G. Gachelin. 1997. The main features of the craniate mitochondrial DNA between the ND1 and the COI genes were established in the common ancestor to the lancelet. Mol. Biol. Evol. 14:807–813.

    Delarbre, C., N. Spruyt, C. Delmarre, C. Gallut, V. Barriel, P. Janvier, V. Laudet, and G. Gachelin. 1998. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula. Genetics 150:331–344.

    Dumeril, A. M. C. 1806. Zoologie analytique ou méthode naturelle de classification des animaux. Didot, Paris.

    Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368–376.

    ———. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

    ———. 1993. PHYLIP (phylogeny inference package). Version 3.5c. Distributed by the author, Department of Genetics, University of Washington, Seattle.

    Fernholm, B. 1985. Evolutionary biology of primitive fishes. Pp. 113–192 in R. E. Foreman, A. Gorbman, J. M. Dodd, and R. Olsson, eds. Plenum Press, New York.

    Gascuel, O. 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14:685–695.

    Hardisty, M. W. 1982. Lampreys and hagfishes: analysis of cyclostome relationships. Pp. 166–260 in M. W. Hardisty and I. C. Potter, eds. The biology of lampreys. Vol. 4B. Academic Press, London.

    Hogan, B., R. Beddington, F. R. Costantini, and E. Lacy. 1994. Isolating high molecular weight DNA from mouse tails. Manipulating the mouse embryo, a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

    Janvier, P. 1978. Les nageoires paires des ostéostracés et la position systématique des céphalaspidomorphes. Ann. Paleontol. 64:113–142.

    ———. 1981. The phylogeny of the Craniata, with particular reference to the significance of the fossil agnathans. J. Vertebr. Paleontol. 1:121–159.

    ———. 1996. The dawn of the vertebrates: characters versus common ascent in the rise of current vertebrate phylogenies. Paleontology 39:259–287.

    Kimura, M. 1980. A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.

    ———. 1983. The neutral theory of evolution. Cambridge University Press, Cambridge, England.

    Lee, W.-J., and T. D. Kocher. 1995. Complete sequence of a sea lamprey (Petromyzon marinus) mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139:873–887.

    Lovtrup, S. 1997. The phylogeny of Vertebrata. Wiley, New York.

    Mallatt, J., and J. Sullivan. 1998. 28S and 18S rDNA sequences support the monophyly of lampreys and hagfishes. Mol. Biol. Evol. 15:1706–1718.

    Naylor, G. J. P., and W. M. Brown. 1998. Amphioxus mitochondrial DNA, chordate phylogeny and the limits of inference based on comparisons of sequences. Syst. Biol. 47:61–76.

    Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. FastDNAml version 1.2: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10:41–48.

    Rasmussen, A.-S., A. Janke, and U. Arnason. 1998. The mitochondrial DNA molecule of the hagfish (Myxine glutinosa) and vertebrate phylogeny. J. Mol. Evol. 46:382–388.

    Spruyt, N., C. Delarbre, G. Gachelin, and V. Laudet. 1998. Complete sequence of the amphioxus (Branchiostoma lanceolatum) mitochondrial genome: relations to vertebrates. Nucleic Acids Res. 26:3279–3285.

    Stock, D. W., and G. S. Whitt. 1992. Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257:787–789.

    Swofford, D. L. 1993. PAUP. Version 3.1. Illinois Natural History Survey, Champaign.

    Wicht, H., and U. Tusch. 1998. Ontogeny of the head and nervous system of myxinoïds. Pp. 431–451 in J. M. Jørgensen, J. P. Lomholt, R. E. Weber, and H. Malte, eds. The biology of hagfishes. Chapman & Hall, London.

    Yalden, D. W. 1985. Feeding mechanisms as evidence for cyclostome monophyly. Zool. J. Linn. Soc. 84:291–300.

    Zardoya, R., A. Garrido-Pertierra, and J. M. Bautista. 1995. The complete nucleotide sequence of the mitochondrial DNA genome of the rainbow trout, Oncorhynchus mykiss. J. Mol. Evol. 41:942–951.

Accepted for publication December 10, 1999.