Actin Gene Family Evolution and the Phylogeny of Coleoid Cephalopods (Mollusca: Cephalopoda)

David B. Carlini, Kimberly S. Reece, and John E. Graves

School of Marine Science, Virginia Institute of Marine Science, The College of William and Mary

Abstract

Phylogenetic analysis conducted on a 784-bp fragment of 82 actin gene sequences of 44 coleoid cephalopod taxa, along with results obtained from genomic Southern blot analysis, confirmed the presence of at least three distinct actin loci in coleoids. Actin isoforms were characterized through phylogenetic analysis of representative cephalopod sequences from each of the three isoforms, along with translated actin cDNA sequences from a diverse array of metazoan taxa downloaded from GenBank. One of the three isoforms found in cephalopods was closely related to actin sequences expressed in the muscular tissues of other molluscs. A second isoform was most similar to cytoplasmic-specific actin amino acid sequences. The muscle type actins of molluscs were found to be distinct from those of arthropods, suggesting at least two independent derivations of muscle actins in the protostome lineage, although statistical support for this conclusion was lacking. Parsimony and maximum-likelihood analyses of two of the isoforms from which >30 orthologous coleoid sequences had been obtained (one of the cytoplasmic actins and the muscle actin) supported the monophyly of several higher-level coleoid taxa. These included the superorders Octopodiformes and Decapodiformes, the order Octopoda, the octopod suborder Incirrata, and the teuthoid suborder Myopsida. The monophyly of several taxonomic groups within the Decapodiformes was not supported, including the orders Teuthoidea and Sepioidea and the teuthoid suborder Oegopsida. Parametric bootstrap analysis conducted on the simulated cytoplasmic actin data set provided statistical support to reject the monophyly of the Sepioidea. Although parametric bootstrap analysis of the muscle actin isoform did not reject sepioid monophyly at the 5% level, the results (rejection at P = 0.068) were certainly suggestive of sepioid nonmonophyly.

Introduction

The Cephalopoda are the most complex class of molluscs. In light of their special adaptations related to bioluminescence, buoyancy, crypsis, feeding, intelligence, speed, and vision, they are generally considered to be among the most highly evolved marine invertebrates. There are more than 700 extant species of cephalopods, divided into 2 subclasses, 5 orders, 47 families, and 139 genera (Sweeney and Roper 1998). The cephalopods diverged from a monoplacophoran ancestor in the late Cambrian period (Salvini-Plawen 1980). With the exception of the Nautiloidea, all extant cephalopods are members of the subclass Coleoidea, which are distinct from the Nautiloidea in several ways, most notably the reduction and internalization or complete loss of shell (Teichert 1988). The extant coleoid cephalopods are currently divided into four orders: Sepioidea, Teuthoidea, Octopoda, and Vampyromorpha (table 1). Although the fossil record of early cephalopods is rich and demonstrates the success of the group in Paleozoic times, the mainly soft-bodied coleoid cephalopods are poorly represented. Therefore, little is known of the evolutionary history of coleoids through paleontology, and current classifications of the group are based primarily on the morphology of living representatives. Our understanding of higher-level coleoid relationships is rudimentary. A cladistic analysis of morphological character data (Young and Vecchione 1996) has helped elucidate some of the relationships within the Coleoidea, particularly that of the Vampyromorpha and the Octopoda, but relationships within the Decapodiformes remain unresolved.


Table 1 Classification of Cephalopod Taxa in this Study (Sweeney and Roper 1998)

 
To date, molecular phylogenetic studies of higher-level coleoid relationships have focused exclusively on mtDNA sequences (16S [Bonnaud, Boucher-Rodoni, and Monnerot 1994], COIII [Bonnaud, Boucher-Rodoni, and Monnerot 1997], COI [Carlini and Graves 1999]). While these studies have provided insight into relationships within the Decapodiformes, the conclusions, especially those regarding the Sepioidea, conflict. As nuclear and mitochondrial genes possess unique evolutionary histories, the phylogenetic relationships obtained from analysis of nuclear genes may differ from those obtained through analysis of mitochondrial genes. In addition, nuclear genes are often informative at different levels of phylogeny than are mitochondrial genes and could potentially provide resolution in regions for which analyses of mtDNA sequence data have not (Graybeal 1994). The 16S rRNA, COIII, and COI studies all indicated significant levels of saturation in the molecular data, potentially accounting for some of the anomalous results reported in those studies. The highly conserved actin gene family (Hightower and Meagher 1986) was selected to examine higher-level relationships within the Coleoidea in an attempt to minimize saturation and homoplasy and to offer a unique perspective of the group's phylogenetic history.

Actin is a ubiquitous protein in eukaryotic cells and plays a crucial role in muscle contraction, cell motility, cytoskeletal structure, cell division, intracellular transport, and cell differentiation (Herman 1993). Actin proteins are encoded by a multigene family in the nuclei of all animals and plants and in many protozoans examined to date. However, in yeast and some alveolates, actin is encoded by only a single gene (Cupples and Pearlman 1986; Hightower and Meagher 1986; Reece et al. 1997). Actin isoforms are encoded by a set of structurally related genes that descended by duplication and divergence from common ancestral genes (Hightower and Meagher 1986). The number of actin isoforms varies in different lineages. Mammals possess at least six different isoforms (Vandekerckhove and Weber 1978). Nine different isoforms have been characterized in teleost fishes (Venkatesh et al. 1996). The echinoderm genome contains at least eight nonallelic actin genes (Lee et al. 1984; Fang and Brandhorst 1994). Insects have been shown to have at least six actin genes (Fyrberg et al. 1980). The actin gene family of plants is much larger than that of animals, comprising 8–44 genes, depending on the specific taxa (Reece, McElroy, and Wu 1992; Moniz de Sá and Drouin 1996). The petunia (Petunia hybrida) genome contains over 100 actin genes, although most are thought to be pseudogenes (McLean et al. 1990). The potentially high number of paralogous actin genes poses a problem for the use of actin gene sequences in phylogenetic studies, for which a comparison of orthologous sequences among taxa is compulsory for meaningful results.

The actin gene family is divided into two broad categories: cytoplasmic (β) and muscle (α) type actins. Invertebrate muscle and cytoplasmic actins are generally thought to be more similar to chordate cytoplasmic actins than to chordate muscle actins (Vandekerckhove and Weber 1984). It has been suggested that the muscle actins of arthropods differ from the muscle actins of deuterostomes to such an extent that two independent derivations of muscle actins probably occurred, one within the protostome lineage and one within the deuterostome lineage (Mounier et al. 1992). Surprisingly, little is known about the diversity, types, expression, and molecular evolution of actin genes in the phylum Mollusca. To date, only two studies have attempted to determine the number of actin genes in molluscs, one in the sea hare (class Gastropoda) Aplysia californica (DesGroseillers et al. 1994), the other in the sea scallop (class Bivalvia) Placopecten magellanicus (Patwary 1996). Although the results of the Aplysia study were not entirely conclusive, the number of actin genes was estimated to be between three and five copies per haploid genome. Southern blot data on Placopecten suggested the presence of approximately 12–15 actin genes. Further analysis of actin gene evolution in molluscs is clearly warranted and would provide insight into the origin(s) of muscle type actin isoforms in the protostome lineage.

The use of actin as a phylogenetic marker has been largely restricted to analyses of actin gene evolution (see preceding references) or analysis of distantly related taxa, such as analyses of relationships between phyla (Bhattacharya and Ehlting 1995; Reece et al. 1997). The evolutionary rate of the actin gene(s) has been considered too slow to determine relationships of taxa below the phylum/division level (Mounier et al. 1992). However, categorical designation of a gene or gene family as "highly conserved" is somewhat vague and arbitrary, as there are clear differences in the evolutionary rates of genes between taxa, as well as differences in the evolutionary rates of paralogous genes (Li 1997, pp. 177–193). In addition, many of the conclusions drawn from studies addressing the evolutionary rate of actin are pertinent to the evolutionary rates of the amino acid sequences, not those of the nucleotide sequences. The synonymous substitution rate of actin genes can be quite high, in some cases up to 35 times the nonsynonymous substitution rate (Moniz de Sá and Drouin 1996). While the use of synonymous substitutions is not generally appropriate for determining relationships at deep divergences due to saturation, there are exceptions, for example, in the albumin gene and the c-myc oncogene (Graybeal 1994). Synonymous substitutions in highly conserved genes may also provide a wealth of information about lower-level relationships. This was demonstrated for the "highly conserved" elongation factor-1α gene, in which synonymous substitutions were informative for reconstruction of relationships within a moth subfamily that diverged less than 20 MYA (Cho et al. 1995).

This study presents the results from the phylogenetic analysis of a 784-bp fragment from three paralogous actin genes of 44 cephalopod taxa. The number of protein-coding actin genes present in the genomes of coleoid cephalopods was estimated through Southern blotting of total genomic DNA and through phylogenetic analysis of 82 cephalopod actin sequences. The amino acid sequences of three paralogous actin genes from each of three coleoid taxa were aligned and analyzed with 30 amino acid sequences from an array of 30 metazoan taxa. Following the demonstration that at least three paralogous actin genes were present in coleoids, the results from a more thorough analysis of two of the three paralogs from which a sufficient number of taxa were sampled are presented. The monophyly of the Sepioidea was tested for the two data sets using the parametric bootstrap technique (Huelsenbeck, Hillis, and Nielsen 1996).

Materials and Methods

Taxonomic Sampling
A portion of the actin gene(s) was sequenced for 44 cephalopod taxa representing a broad spectrum of diversity within the class (table 1). Tissue samples (fin and/or mantle tissue) from specimens were stored in either 70% ethanol (-20°C) or tissue storage buffer (0.25 M ethylenediamine tetraacetate [EDTA], 20% dimethyl sulfoxide [DMSO], saturated NaCl [pH 8.0]) (Seutin, White, and Boag 1991) until DNA extractions were performed. A modification of a protocol designed explicitly for extracting DNA from mollusc tissue (Winnepenninckx and De Wachter 1993) was used in DNA extractions (Carlini and Graves 1999).

PCR Amplification, Cloning, and Sequencing
Two sets of degenerate "universal" actin primers, designed by G. Warr (Medical University of South Carolina) and M. Wilson (Mississippi State Medical Center) for amplification of vertebrate actin genes, were used to amplify actin genes from cephalopods. Initially, a primer set (Actin 480 and Actin 483) was used to amplify a 623-bp fragment (excluding primer sequence) of the actin gene(s) corresponding to amino acids 127–333 in vertebrates. After obtaining sequence data for several taxa using the Actin 480 and Actin 483 primers, a second set of primers, Actin 481 and Actin 482, was used to amplify a larger portion (784 bp) of the actin gene. Amplification conditions were similar for both pairs of primers. An MJ Research (Watertown, Mass.) PTC-200 thermocycler was used to conduct 40 cycles of the following temperature profile: 94°C for 1 min, 45–46°C (depending on the sample) for 1 min, and 68°C for 2 min. A final extension step at 68°C for 7 min followed the 40 cycles of amplification.

PCR products were cloned into the plasmid vector pCR2.1 using the Original TA Cloning Kit (Invitrogen Corp., San Diego, Calif.). Plasmid DNA from transformant colonies was isolated and digested with EcoRI (Life Technologies) to check for the presence of the actin insert. The Thermo Sequenase fluorescent-labeled primer cycle sequencing kit with 7-deaza-dGTP (Amersham Pharmacia Biotech, Buckinghamshire, England) was used in all cycle sequencing reactions. Denatured samples were loaded onto a 4% Long Ranger acrylamide gel (FMC Bioproducts, Rockland, Maine) and run on a LI-COR model 4000L automated DNA sequencer. Both strands of plasmid DNA were fully sequenced. Depending on the type of actin sequences obtained, between two and six clones from each species-specific PCR cloning reaction were sequenced with the aim of obtaining at least two different actin isoforms from each taxon. If multiple isoforms were not obtained within a single species after six clones had been surveyed, no further attempt was made to clone an additional isoform for that species.

Genomic Southern Blot Analysis
To determine the number of actin genes in the cephalopod genome, two representative species, the epipelagic squid Ommastrephes bartramii and the primitive octopod Vampyroteuthis infernalis, were selected for Southern blot analysis. Total genomic DNA from O. bartramii and V. infernalis was independently digested with four different restriction enzymes (AvaI, EcoRI, HindIII, and PstI). Approximately 10 µg of digested DNA was loaded on a 0.8% agarose gel, subjected to electrophoresis, and transferred to a nylon membrane (Boehringer Mannheim number 1209-299) for Southern blot analysis (Southern 1975). The actin probe was constructed from the cloned actin PCR product from O. bartramii (Actin I, clone 49; see table 1). The cloned actin insert from O. bartramii clone 49 was digested from the pCR2.1 vector with EcoRI, and the actin insert was isolated from an agarose gel and purified using the Geneclean kit (Bio 101 Inc., La Jolla, Calif.). The O. bartramii actin insert was labeled by random octamer labeling with biotin using the Bioprime DNA labeling kit (Life Technologies, Gaithersburg, Md.). Membranes were incubated overnight at 55°C in standard hybridization buffer (5× SSC, 0.1% N-lauroylsarcosine, 0.02% SDS, 1% blocking reagent) with a probe concentration of 10 ng/ml. Posthybridization washes were conducted according to the Southern-Star kit protocol (Tropix Inc., Bedford, Mass.). Nonisotopic chemiluminescent detection of the biotin-labeled probe was performed following the Southern-Star kit protocol using the CDP-Star chemiluminescent substrate (Tropix Inc.).

Phylogenetic Analysis
Cephalopod DNA sequences were aligned by eye with the aid of the downloaded sequences and compiled in MacClade, version 3.0 (Maddison and Maddison 1992). Introduction of gaps into the aligned cephalopod actin sequences was unnecessary in all but one sequence, as there were no insertion/deletion events or alignment ambiguities. Introduction of two gaps into the Spirula spirula actin clone 40 sequence, which had deletions at positions 433–438 and 737–739, was necessary. Both deletions were in-frame, resulting in a loss of two and one amino acid residue from the deduced amino acid sequence, respectively. Gapped regions in the Spirula clone 40 sequence were treated as missing data.
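
That both Spirula deletions are in-frame follows directly from their lengths. The short Python check below is a minimal sketch of that arithmetic; the function name is hypothetical, and only the alignment coordinates are taken from the text.

```python
def codons_removed(start, end):
    """Return the number of whole codons lost by a deletion spanning
    alignment positions start..end (1-based, inclusive).

    Raises ValueError if the deletion length is not a multiple of three,
    i.e. if the deletion would shift the reading frame.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"deletion of {length} bp is not in-frame")
    return length // 3


# Coordinates reported for the Spirula spirula actin clone 40 sequence.
for start, end in [(433, 438), (737, 739)]:
    print(start, end, codons_removed(start, end))
```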

Maximum-parsimony (MP) analysis of the 82 aligned nucleotide sequences from cephalopods, along with two single-copy actin sequences from ciliates as outgroups, was conducted using the heuristic tree search option in PAUP* (Swofford 1998) with 100 random sequence addition replicates. Clade support was tested using the heuristic bootstrap search command (100 replicates) in PAUP*. Analysis of the entire actin data set (82 cephalopod sequences) revealed the presence of three distinct actin paralogs. The inclusive actin data set was therefore partitioned into three data sets, one for each paralog (analysis described below).

To explore the relationship among the three cephalopod actin isoforms and their relationship to other metazoan actin isoforms, an actin amino acid data set was constructed. This data set consisted of the amino acid sequences from three of seven cephalopod taxa for which all three isoforms had been sequenced (=nine terminal "taxa"), along with actin amino acid sequences from 30 metazoan taxa downloaded from GenBank. The PROTPARS program (PHYLIP, version 3.57c; Felsenstein 1995) was used to construct an MP tree from a heuristic search of the amino acid data (1,000 random addition replicates), and support for clades was determined through bootstrap analysis using the SEQBOOT program in PHYLIP (1,000 replicates). The PROTPARS program was selected for inferring an MP tree from protein sequences because that method takes into account the genetic code when assigning costs to amino acid substitutions, rather than considering all amino acid changes to have equal costs. The cost associated with any given amino acid substitution is equal to the number of nonsynonymous changes required at the codon level to accomplish the amino acid replacement (Felsenstein 1995). A parsimony search on the constrained amino acid data, where all protostome muscle actins were constrained to be monophyletic, was also conducted. The test of Templeton (1983), as implemented in PROTPARS, was used to estimate the statistical significance of the difference in the number of steps between the unconstrained most-parsimonious trees and the constrained alternative trees. Competing trees were considered significantly different when the numbers of steps were more than 1.96 standard deviations different (Felsenstein 1985).

The two largest of the three paralogous actin data sets were analyzed in detail. PAUP* (Swofford 1998) was used to generate MP trees from heuristic searches (1,000 random addition replicates) of the Actin I and Actin II data sets. The Actin I data set contained 38 coleoid cephalopod actin sequences, along with two outgroup sequences determined from the results of the analysis described in the preceding paragraph. The two outgroups used were the sea hare muscle actin (accession number X52868) and the scallop muscle actin (accession number U55046) sequences downloaded from GenBank. In analysis of the amino acid data, the sea hare and scallop sequences clustered with the cephalopod Actin I sequences, indicating homology. The Actin II data set contained 31 coleoid cephalopod actin sequences, along with one outgroup sequence from Nautilus pompilius, a cephalopod in the separate subclass Nautiloidea. Clade support was assessed through bootstrapping (1,000 replicates) in PAUP*. A second measure of clade support, the Bremer decay index (Bremer 1988), was also determined for each clade on the most-parsimonious trees of the Actin I and Actin II data sets using the software program TreeRot (Sorenson 1996). MP trees consistent with the constraints defined by TreeRot were determined through heuristic searches (100 random addition replicates each) of the data. Differences in tree scores between the unconstrained tree and each of the constrained trees (one for each node) reflected the Bremer decay values for the nodes. Sequences from the third actin paralog were not analyzed due to insufficient taxonomic sampling, as sequences were obtained from only 12 taxa.
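
The Bremer decay calculation described above reduces to one subtraction per node. The Python sketch below illustrates it; the function name, node labels, and tree lengths are hypothetical placeholders rather than values from this study, and the constrained lengths are assumed to come from searches for the shortest trees lacking each node.

```python
def bremer_decay(unconstrained_length, constrained_lengths):
    """Map each node to its decay index: the number of extra steps in the
    shortest tree that does not contain that node, relative to the
    unconstrained most-parsimonious tree."""
    decay = {}
    for node, length in constrained_lengths.items():
        if length < unconstrained_length:
            raise ValueError(
                f"constrained tree for {node} is shorter than the MP tree"
            )
        decay[node] = length - unconstrained_length
    return decay


# Hypothetical tree lengths, for illustration only.
example = bremer_decay(500, {"node_A": 503, "node_B": 512, "node_C": 500})
print(example)
```

A decay value of zero (node_C in this made-up example) would mean the node collapses in at least one equally parsimonious tree.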

To assess the level of saturation in the data, and also to compare the evolutionary rates of the Actin I and Actin II genes, the uncorrected sequence divergence was plotted against the number of inferred changes obtained from the average branch lengths on the MP trees. Since parsimony provides a minimum estimate of the amount of evolutionary change, it provides a useful lower bound to assess the extent of saturation in the data (Philippe et al. 1994). The observed number of substitutions was plotted separately for third codon positions, which are expected to reach saturation relatively quickly, and for pooled first and second codon positions. Two methods of weighted parsimony were used to analyze the Actin II data set. In the first, third-codon-position nucleotides were excluded from the analysis, as they appeared to be saturated (see Results). Since it is unlikely that every third-position character was saturated, a second means of character weighting was employed, in which the characters were weighted by their rescaled consistency indices (RCIs). RCI-weighting assigns more weight to characters which have greater consistency and less weight to homoplastic characters; in other words, those characters that are more likely to be saturated are assigned less weight (Farris 1989).
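
The uncorrected divergence used on one axis of the saturation plots can be computed separately for each codon-position class. The following Python function is a minimal sketch of that calculation, not the procedure actually used in this study; the function name, the `frame_offset` parameter, and the toy sequences are hypothetical, and by default the alignment is assumed to begin at a first codon position.

```python
def p_distance(seq_a, seq_b, codon_positions=(1, 2, 3), frame_offset=0):
    """Uncorrected proportion of differing sites between two aligned
    sequences, restricted to the given codon positions (1, 2, and/or 3).

    Sites where either sequence has a gap, missing-data symbol, or
    ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        codon_pos = (i - frame_offset) % 3 + 1
        if codon_pos not in codon_positions:
            continue
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy sequences (hypothetical): both differences fall at third positions.
seq1 = "ATGGCCAAA"
seq2 = "ATGGCTAAG"
print(p_distance(seq1, seq2, codon_positions=(3,)))
print(p_distance(seq1, seq2, codon_positions=(1, 2)))
```

Plotting such pairwise values against the corresponding path lengths on the MP tree gives the kind of comparison described above: a plateau in divergence while inferred changes keep increasing indicates saturation.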

For maximum likelihood (ML) analyses, the strategy used to test models of substitution was similar to that described in Huelsenbeck and Crandall (1997). Five models of nucleotide substitution were examined, and the statistical significance of model comparisons was determined by a hierarchy of likelihood ratio tests (LRTs), each compared with the chi-square distribution with the appropriate degrees of freedom. The five substitution models tested were those of Jukes and Cantor (1969) (JC69), Felsenstein (1981) (F81), Hasegawa, Kishino, and Yano (1985) (HKY85), HKY85 with rate heterogeneity among sites (HKY85+Γ), and the general time-reversible model with rate heterogeneity among sites (GTR+Γ). Parameters were estimated separately in heuristic searches (simple addition sequence) conducted under each of the models. In all cases, the GTR+Γ substitution model provided the best fit to the data. This model allows for unequal base frequencies, allows for six categories of base substitution rates corresponding to all six classes of reversible substitutions, and accounts for rate heterogeneity among sites. Four categories of rates, as well as the shape parameter α, were estimated in ML tree searches using the discrete model to approximate the gamma distribution (Yang 1994).
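
The hierarchy of likelihood ratio tests can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the log-likelihood values are hypothetical, the function names are invented for this example, and the free-parameter counts are the usual ones for these substitution models with branch lengths excluded. The chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df,
    built from the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2 * (lnL_alt - lnL_null) and its P value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Free substitution-model parameters (branch lengths excluded).
FREE_PARAMS = {"JC69": 0, "F81": 3, "HKY85": 4, "HKY85+G": 5, "GTR+G": 9}

# Hypothetical log-likelihoods, for illustration only.
lnl = {"JC69": -5200.0, "F81": -5180.0, "HKY85": -5100.0,
       "HKY85+G": -4950.0, "GTR+G": -4930.0}

models = list(FREE_PARAMS)
for null, alt in zip(models, models[1:]):
    df = FREE_PARAMS[alt] - FREE_PARAMS[null]
    stat, p = likelihood_ratio_test(lnl[null], lnl[alt], df)
    print(f"{null} vs {alt}: 2*dlnL = {stat:.1f}, df = {df}, P = {p:.3g}")
```

Each comparison pits a simpler model against the next more general one, so the degrees of freedom equal the difference in the number of free parameters.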

An ML tree, for which the monophyly of the Sepioidea was enforced, was generated for each actin data set assuming a GTR+Γ model of sequence evolution. Parameters were estimated from a single heuristic ML search of the data. Parametric bootstrap analysis of the Actin I and Actin II data was conducted to test for significant differences between the unconstrained and the constrained trees. The constrained ML tree was used to generate 1,000 simulated data sets for each actin gene with the computer program SeqGen, version 1.1 (Rambaut and Grassly 1997), using the GTR+Γ model parameters estimated from the constrained tree. The simulated data sets for each gene were then used to generate a null distribution of most-parsimonious tree length differences (TLDs), calculated as the difference in tree lengths under the null (Sepioidea monophyly constrained) and alternate (unconstrained) hypotheses for each of the 1,000 simulated data sets. The null distribution was obtained from MP heuristic searches (100 random addition replicates) rather than ML searches of the simulated data sets to save computational time (Hillis, Mable, and Moritz 1996). Since an ML search of a single simulated data set required several hours, it was not possible to perform ML searches of the simulated data sets (a total of 400,000 MP searches were performed to create the two null distributions). The TLD for the actual data was then compared with the null distribution to determine if the actual TLD was statistically significant. The proportion of the replicates in which the TLD calculated using the actual data was exceeded for the simulated data represented the significance level of the test.
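
The final step of the parametric bootstrap, converting the null distribution of TLDs into a significance level, can be sketched as follows in Python. The function name and the example TLD values are hypothetical. Counting ties as exceedances is an assumption made here because it is the more conservative choice; the text says only that the observed TLD was "exceeded". The search count at the end shows one reading that reproduces the stated total of 400,000: two genes, 1,000 simulated data sets each, a constrained and an unconstrained search per data set, and 100 random addition replicates per search.

```python
def parametric_bootstrap_p(observed_tld, simulated_tlds):
    """Proportion of simulated tree length differences that equal or
    exceed the observed difference (ties counted: an assumption)."""
    if not simulated_tlds:
        raise ValueError("null distribution is empty")
    exceed = sum(1 for tld in simulated_tlds if tld >= observed_tld)
    return exceed / len(simulated_tlds)


# Hypothetical null distribution of TLDs, for illustration only.
null_tlds = [0, 0, 1, 1, 1, 2, 2, 3, 4, 6]
print(parametric_bootstrap_p(4, null_tlds))

# One reading of the search count reported in the text.
genes, simulated_sets, hypotheses, addition_replicates = 2, 1000, 2, 100
total_searches = genes * simulated_sets * hypotheses * addition_replicates
print(total_searches)
```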

Analysis of the combined Actin I and Actin II data sets was restricted to the 26 taxa from which both gene sequences had been obtained. We tested for incongruence among the Actin I and Actin II data partitions using the incongruence length difference test (Mickevich and Farris 1981), implemented in PAUP* as the partition homogeneity test, after excluding invariant sites.

Results

Genomic Southern Blot Analysis
Genomic Southern blots detected two to three bands in each of the three digests of O. bartramii and V. infernalis DNA (fig. 1). The EcoRI and HindIII digests clearly revealed the presence of three copies of the actin gene in O. bartramii, while the probe hybridized to larger fragments in the PstI digest, making it difficult to discern the total number of bands (at least two). The probe did not hybridize to the AvaI digest of O. bartramii genomic DNA, possibly due to an insufficient quantity of digested genomic DNA loaded in the AvaI lane or the presence of a single AvaI site at 440 bp in the probe sequence. The probe hybridized to at least two fragments of the AvaI, EcoRI, and HindIII digests of V. infernalis genomic DNA, both of which were relatively large. The probe hybridized to three fragments in the PstI digest of V. infernalis.



Fig. 1.—Genomic Southern blot analysis of actin gene sequences in Ommastrephes bartramii (lanes 2–5) and Vampyroteuthis infernalis (lanes 6–9). Genomic DNA was digested with AvaI (lanes 2 and 6), EcoRI (lanes 3 and 7), HindIII (lanes 4 and 8), or PstI (lanes 5 and 9) and probed with a biotinylated 784-bp actin insert gel-purified from O. bartramii actin clone 49. Lane 1 contains a biotinylated λ/HindIII molecular weight marker (Stratagene, La Jolla, Calif.).

 
Phylogenetic Analysis
Parsimony analysis of 82 cephalopod actin sequences, rooted with the two single-copy ciliate sequences, yielded 144 equally parsimonious trees and revealed the presence of three distinct classes of actin sequences (fig. 2). The first isoform cloned and examined was arbitrarily designated Actin I. The second most common isoform was designated Actin II, and the third isoform discovered was designated Actin III. Bootstrap analysis strongly supported the monophyly of the Actin I and Actin III isoforms (99% and 96%, respectively), whereas support for the monophyly of the Actin II isoform was moderate (70%).



Fig. 2.—Strict consensus of 144 equally parsimonious trees generated by a heuristic search (100 random addition replicates) of the 784-bp actin fragment from all cephalopod sequences (tree length = 3,252; consistency index = 0.257; retention index = 0.671), rooted with single-copy actin sequences from two ciliates downloaded from GenBank, Tetrahymena thermophila (M139139) and Tetrahymena pyriformis (X05195). Numbers after taxon names refer to clone numbers of bacterial colonies from which the plasmids containing actin fragments were purified and sequenced. Bootstrap proportions (100 replicates) are indicated as percentages above or below nodes. The arbitrarily designated actin isoforms I, II, and III referred to throughout this study are indicated above the basal branches of the three major actin clades.

 
The sequences of all three actin isoforms (Actin I, Actin II, and Actin III) were obtained from seven cephalopod taxa. The amino acid sequences of three of these seven taxa, Chtenopteryx sicula, Sepia opipara, and V. infernalis, were analyzed with actin sequences downloaded from GenBank. Analysis of the deduced amino acids of cephalopod actins, along with other metazoan actin protein sequences, also revealed the presence of three distinct cephalopod actin isoforms (fig. 3). The Actin I gene was most closely related to the mollusc muscle type actins and exhibited the least variability (1.5% mean amino acid sequence divergence). Mollusc muscle actins were found to be distinct from arthropod muscle actins. Actins II and III exhibited comparable levels of variation (5.4% and 8.4% mean amino acid sequence divergence, respectively). Actin II clustered among the other mollusc cytoplasmic actins, and the Actin III sequences were found to be basal to all metazoan actins, but bootstrap support was lacking for both findings. A statistical test (Templeton 1983) on the 50 unconstrained trees versus the 22 trees obtained when the protostome muscle actins were constrained to be monophyletic found the difference in tree lengths, four steps, to be nonsignificant (SD = 3.47 steps; 1.96 × SD = 6.80 > 4 steps).
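
The significance threshold applied in the tree comparison above is simple arithmetic, reproduced here as a short Python sketch. The function name is hypothetical; the step difference (4) and standard deviation (3.47) are the values reported in the text.

```python
def significantly_different(step_difference, sd, z=1.96):
    """Apply the 1.96-standard-deviation criterion to a difference in
    tree length. Returns (is_significant, threshold_in_steps)."""
    threshold = z * sd
    return abs(step_difference) > threshold, threshold


# Values reported for the constrained versus unconstrained amino acid trees.
significant, threshold = significantly_different(step_difference=4, sd=3.47)
print(significant, round(threshold, 2))
```

Because four steps falls below the threshold of roughly 6.8 steps, constraining the protostome muscle actins to be monophyletic does not produce significantly longer trees.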



Fig. 3.—Strict consensus of 50 equally parsimonious trees derived from a heuristic search (1,000 random addition replicates) of the deduced amino acid sequences (261 residues) from the Actin I, Actin II, and Actin III genes of Chtenopteryx, Sepia opipara, and Vampyroteuthis, along with the actin genes from a diverse array of metazoan taxa (tree length = 366; consistency index = 0.717; retention index = 0.738). Bootstrap proportions (1,000 replicates) are indicated as percentages above nodes. Tissue specificity is indicated for the sequences for which the information was available. GenBank accession numbers: ascidian cytoplasmic (D45164); ascidian larval muscle (D10887); ascidian adult muscle (L21915); brine shrimp (#403) cytoplasmic (X52605); brine shrimp (#205) muscle (X52602); brine shrimp (#211) muscle (X52603); Caenorhabditis (#I) (J01042); cnidarian (M32364); Drosophila (#5C) cytoplasmic (K00667); Drosophila (#42A) cytoplasmic (K00670); Drosophila (#57A) muscle (K00673); Drosophila (#87E) muscle (K00674); gastropod (Z72387); eastern oyster (X75894); Pacific oyster (AF026063); pufferfish anomalous (U38962); pufferfish cardiac (U38959); pufferfish skeletal (U38850); pufferfish cytoplasmic (U37499); sea hare cytoplasmic (U01352); sea hare muscle (X52868); sea urchin (He I) cytoplasmic (U09633); sea urchin (He) muscle (U32348); sea urchin (Ht I) cytoplasmic (U12272); sea urchin (Ht) muscle (U32353); scallop muscle (U55046); sea urchin (Sp IIb) cytoplasmic (M35323); sea urchin (Sp IIIb) cytoplasmic (M35324); Tetrahymena thermophila (M139139); Tetrahymena pyriformis (X05195).

 
Chi-square tests of the Actin I and Actin II data indicated no significant departure from homogeneity of base frequencies across taxa (Actin I: χ² = 19.7, df = 111, P = 1.0; Actin II: χ² = 63.6, df = 111, P = 0.99). The Actin I gene exhibited less variability than the Actin II and Actin III genes (table 2). Since the number of parsimony-informative sites and the number of variable sites are both positively correlated with the total number of sequences, comparisons of sequence variability among the three genes must be considered with reference to the number of sequences in each data set. Although it appeared that the Actin III gene was intermediately conserved relative to the Actin I and Actin II genes, the Actin III gene was actually the least conserved, because it exhibited a comparable number of informative sites with only a fraction of the sequences (12 vs. 38 and 32). The interspecific divergences were calculated using only the seven taxa from which all three gene sequences had been obtained. Comparison of mean interspecific divergences revealed that the Actin III gene was the least conserved of the three isoforms (Actin III: 17.7% ± 5.0% > Actin II: 11.1% ± 5.7% > Actin I: 8.5% ± 2.4%). The relatively large divergences exemplified by the Actin III data, combined with the deletion of amino acids in the Spirula sequence, raise the possibility that Actin III is a pseudogene. For the Actin I, Actin II, and Actin III genes from the seven taxa, we calculated the effective number of codons (ENC; Wright 1990), a measure of codon bias that ranges from 20 for maximally biased genes to 61 for completely unbiased genes. Within each taxon, the ENC for Actin III was greater than that for both the Actin I and the Actin II genes. Furthermore, the average ENC for Actin III (52.53 ± 2.32) was greater than the averages for either Actin I (38.98 ± 2.66) or Actin II (41.47 ± 1.86). These results suggest that Actin III is a pseudogene, since the degree of codon bias in neutrally evolving pseudogenes is expected to be less than that in their functional counterparts. Furthermore, we were unable to clone Actin III from any representative of the Octopoda, a finding that might also be consistent with the idea that Actin III is a pseudogene, since the loss of Actin III in the Octopoda would not carry severe fitness consequences. However, it is quite possible that the Octopoda also have a copy of the Actin III gene but we were unable to clone it due to our sampling method (see Discussion).
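The ENC calculation referred to above can be sketched as follows. This is a minimal illustration of Wright's (1990) formula under the standard genetic code, not the program used in this study; the function name, the handling of amino acids observed fewer than twice, and the cap at 61 are assumptions made for the example.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of amino acids in each degeneracy class of the standard code.
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds: str) -> float:
    """Wright's ENC for an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    usage_by_aa = defaultdict(Counter)  # amino acid -> codon counts
    for codon in codons:
        aa = CODON_TABLE.get(codon)  # None for codons with ambiguous bases
        if aa is not None and aa != "*":
            usage_by_aa[aa][codon] += 1

    # Number of synonymous codons for each amino acid.
    degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    homozygosity = defaultdict(list)  # degeneracy class -> F values
    for aa, usage in usage_by_aa.items():
        k, n = degeneracy[aa], sum(usage.values())
        if k == 1 or n < 2:
            continue  # Met/Trp are uninformative; F requires n >= 2
        f = (n * sum((c / n) ** 2 for c in usage.values()) - 1) / (n - 1)
        if f > 0:
            homozygosity[k].append(f)

    enc = 2.0  # Met and Trp each contribute exactly one codon
    for k, size in CLASS_SIZES.items():
        if not homozygosity[k]:
            raise ValueError(f"no usable amino acids with {k} synonymous codons")
        mean_f = sum(homozygosity[k]) / len(homozygosity[k])
        enc += size / mean_f
    return min(enc, 61.0)  # assumption: cap at the number of sense codons
```

A sequence that uses a single codon for every amino acid returns 20, the lower bound quoted in the text, whereas near-uniform codon usage returns values at or near 61.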


Table 2 Comparison of Nucleotide Variability in the 784-bp Region of the Three Actin Genes Sequenced

 
Saturation plots of the Actin I data for all possible pairwise comparisons among the 38 ingroup taxa indicated that uncorrected sequence divergence increased linearly with parsimony branch lengths (fig. 4a). Surprisingly, this pattern held for third codon positions, as well as for pooled first and second codon positions, which exhibited very low levels of variation. The saturation plots for the Actin II data, however, revealed that third positions become saturated at higher divergences: a 12% observed divergence in third positions could correspond to anything from 50 to 87 steps on the parsimony tree (fig. 4b). Therefore, third codon positions were downweighted to explore the effects of saturation on parsimony analysis of the Actin II data.
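The vertical axis of such a saturation plot, the uncorrected divergence partitioned by codon position, can be computed along the following lines. This is a sketch only: the function name and the toy sequences are hypothetical, and the alignment is assumed to be in frame with the first site at codon position 1.

```python
def p_distance(seq_a: str, seq_b: str, positions=(1, 2, 3)) -> float:
    """Uncorrected proportion of differing sites at the given codon positions.

    Sequences are assumed to be aligned, in frame, and of equal length.
    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        codon_position = index % 3 + 1
        if codon_position not in positions:
            continue
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy alignment (not actin data).
taxon_1 = "ATGGCTGATGAA"
taxon_2 = "ATGGCCGACGAG"
print(p_distance(taxon_1, taxon_2, positions=(3,)))    # third positions only
print(p_distance(taxon_1, taxon_2, positions=(1, 2)))  # pooled first and second
```

Plotting these values against the corresponding path lengths on the parsimony tree for every pair of taxa gives the two point clouds compared in the text.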



Fig. 4.—Uncorrected percentage of sequence divergence plotted against the number of inferred changes obtained from the average branch lengths on the parsimony trees. The observed numbers of substitutions are plotted separately for third codon positions, which are expected to reach saturation relatively quickly, and pooled first and second codon positions. a, Actin I divergences plotted against average branch lengths from the 10 most-parsimonious trees derived from analysis of the Actin I data (see fig. 5a) indicate that sequence divergence increases linearly with parsimony branch lengths for all codon positions. b, Actin II divergences plotted against average branch lengths from the five most-parsimonious trees derived from analysis of the Actin II data (see fig. 6a) suggest that third codon positions become saturated at approximately 12% divergence

 
Parsimony analysis of the Actin I data set yielded 10 equally parsimonious trees of length 1,019 (fig. 5a). The Actin I data did not provide much resolution within the Decapodiformes; however, several of the distal nodes were supported by bootstrap analysis. As the Actin I gene exhibited little variability, Bremer support values were quite low for most nodes. The Actin I data supported the monophyly of the Octopodiformes (99% bootstrap support), the Octopoda (100%), the Incirrata (95%), the Decapodiformes (55%), the Sepiolidae (89%), the Sepiidae (56%), and the Myopsida (50%). ML analysis of the Actin I data set under the GTR+Γ model (fig. 5b) also supported the monophyly of these groups (Myopsida excepted) and various families represented by more than one taxon. The monophyly of the Sepioidea, Teuthoidea, and Oegopsida was not supported by parsimony or ML analysis of the Actin I data. Interestingly, ML analysis of the Actin I data supported a clade consisting of Chtenopteryx, Bathyteuthis, Myopsida, and Sepioidea.



Fig. 5.—a, Strict consensus of 10 most-parsimonious trees obtained in a heuristic search (1,000 random addition replicates) of the 784-bp Actin I data set (tree length = 1,019; consistency index = 0.404; retention index = 0.518). Bootstrap proportions are indicated as percentages below nodes, and Bremer support values are indicated above nodes. Higher-level taxonomic designations are indicated in boldface to the right of each terminal taxon (C = suborder Cirrata; I = suborder Incirrata; M = suborder Myopsida; O = suborder Oegopsida; S = order Sepioidea; V = order Vampyromorpha). b, Maximum-likelihood tree generated from a heuristic search of the Actin I data assuming a general time-reversible (GTR) model of substitution with site-specific rates estimated according to the gamma distribution (-ln L = 6,007.44). Branch lengths are drawn proportional to the probabilities of change occurring along each branch under the GTR model. Substitution parameters estimated in the likelihood search were as follows: πA = 0.232, πC = 0.298, πG = 0.214, πT = 0.256; A->C = 1.776, A->G = 4.840, A->T = 1.548, C->G = 0.615, C->T = 6.569, G->T = 1.000; α = 0.552; r1 = 0.043, r2 = 0.285, r3 = 0.855, r4 = 2.817

 
MP analysis of the Actin II data set yielded five equally parsimonious trees of length 1,217. The consensus tree (fig. 6a) supported the monophyly of the Octopodiformes (92% bootstrap support), the Cirrata (98%), the Incirrata (92%), the Bolitaenidae (100%), the Decapodiformes (83%), the Cycloteuthidae (96%), the Ommastrephidae (83%), the Sepiolidae (92%), the Sepiidae (100%), and the Myopsida (100%). Also supported was a close relationship between Histioteuthis and Psychroteuthis (74%). Neither method of weighted parsimony analysis changed any of the relationships among the Octopodiformes (results not shown). The Decapodiformes were found to be monophyletic, but most resolution within the Decapodiformes was lost when third positions were excluded from the analysis. The RCI-weighted data provided some resolution within the Decapodiformes, supporting the monophyly of the Myopsida, (Sepiolidae + Sepiadariidae), and families represented by more than one taxon. However, the Sepioidea, the Teuthoidea, and the Oegopsida were not supported by RCI-weighted parsimony analysis. ML analysis of the Actin II data set under the GTR+Γ model supported the monophyly of the Octopodiformes, the Octopoda, the Cirrata, the Incirrata, the Decapodiformes, the Myopsida, (Sepiolidae + Sepiadariidae), and various families represented by more than one taxon (fig. 6b). The monophyly of the Sepioidea, Teuthoidea, and Oegopsida was not supported by ML analysis of the Actin II data.
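The rescaled consistency index used for character weighting is the product of a character's consistency index and retention index. The sketch below shows that arithmetic for a single character; the function name and the convention of returning zero for characters whose retention index is undefined are assumptions for illustration, not a description of the software settings used here.

```python
def rescaled_consistency_index(min_steps: int, observed_steps: int,
                               max_steps: int) -> float:
    """RC = CI * RI for one character on a given tree.

    min_steps      fewest changes the character could require on any tree
    observed_steps changes the character requires on the tree in question
    max_steps      most changes the character could require on any tree
    """
    if not (min_steps <= observed_steps <= max_steps):
        raise ValueError("expected min_steps <= observed_steps <= max_steps")
    if observed_steps == 0 or max_steps == min_steps:
        # Constant or uninformative character: RI is undefined, so the
        # character is given zero weight here (an assumption of this sketch).
        return 0.0
    ci = min_steps / observed_steps
    ri = (max_steps - observed_steps) / (max_steps - min_steps)
    return ci * ri


# Hypothetical characters: (minimum, observed, maximum) step counts.
for character in [(1, 1, 3), (1, 2, 4), (1, 3, 3)]:
    print(character, rescaled_consistency_index(*character))
```

Characters that fit the tree without homoplasy keep full weight, and characters that change as often as they possibly could receive none, which is how this scheme reduces the influence of saturated third positions.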



Fig. 6.—a, Strict consensus of five most-parsimonious trees obtained in a heuristic search (1,000 random addition replicates) of the 784-bp Actin II data set (tree length = 1,217; consistency index = 0.418; retention index = 0.555). Bootstrap proportions are indicated as percentages below nodes, and Bremer support values are indicated above nodes. Higher-level taxonomic designations are indicated in boldface to the right of each terminal taxon (C = suborder Cirrata; I = suborder Incirrata; M = suborder Myopsida; O = suborder Oegopsida; S = order Sepioidea; V = order Vampyromorpha). b, Maximum-likelihood tree generated from a heuristic search of the Actin II data assuming a general time-reversible (GTR) model of substitution with site-specific rates estimated according to the gamma distribution (-ln L = 6,519.69). Branch lengths are drawn proportional to the probabilities of change occurring along each branch under the GTR model. Substitution parameters estimated in the likelihood search were as follows: πA = 0.249, πC = 0.261, πG = 0.219, πT = 0.271; A->C = 1.761, A->G = 5.552, A->T = 2.366, C->G = 0.903, C->T = 8.984, G->T = 1.000; α = 0.894; r1 = 0.112, r2 = 0.438, r3 = 0.977, r4 = 2.473

 
The TLDs (in number of parsimony steps) between the unconstrained and the constrained trees are compared with the null distribution of TLDs obtained by parsimony searches of the simulated data sets in figure 7. The observed TLD between the constrained and the unconstrained ML trees (in parsimony scores) was 9 steps for the Actin I data. For the simulated Actin I data (shown as filled bars), 68 of the 1,000 sampled tree lengths resulted in a difference greater than the observed difference of 9 steps. Therefore, an observed difference greater than 9 steps would be expected about 6.8% of the time if the null hypothesis were true, so the null hypothesis of sepioid monophyly could not be rejected for the Actin I gene at P < 0.05, although the observed difference approached statistical significance. The observed TLD was 14 steps for Actin II data, and none of the simulated data sets resulted in a TLD greater than 5 steps. Therefore, the observed difference of 14 steps would be expected less than 0.1% of the time if the null hypothesis were true, so the null hypothesis of sepioid monophyly was rejected at P < 0.001 for the Actin II data.
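The final step of the parametric bootstrap, converting the null distribution of TLDs into a P value, amounts to counting the simulated differences that exceed the observed one. A minimal sketch follows; the function name is hypothetical, and the simulated values are invented so that only the counts match those reported above.

```python
def parametric_bootstrap_p(observed_tld, simulated_tlds):
    """Fraction of simulated tree length differences exceeding the observed one."""
    if not simulated_tlds:
        raise ValueError("need at least one simulated data set")
    exceeding = sum(1 for tld in simulated_tlds if tld > observed_tld)
    return exceeding / len(simulated_tlds)


# Hypothetical null distribution: only the counts (68 of 1,000 above 9 steps)
# follow the text; the individual values are invented for illustration.
null_actin_1 = [10] * 68 + [3] * 932
print(parametric_bootstrap_p(9, null_actin_1))   # 0.068

# Hypothetical Actin II null in which no replicate exceeds 5 steps.
null_actin_2 = [5] * 100 + [2] * 900
print(parametric_bootstrap_p(14, null_actin_2))  # 0.0
```

A returned value of zero means only that no replicate exceeded the observed difference, so with 1,000 replicates it is reported as P < 0.001 rather than P = 0.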



Fig. 7.—Results from parametric bootstrap analysis of the Actin I and Actin II data sets. The trees derived from maximum-likelihood analysis of the Actin I and Actin II data sets constraining the monophyly of the Sepioidea were used to generate 1,000 simulated data sets (784 bp each) for each gene. The substitution parameters under the GTR+Γ model of evolution that were used to obtain the initial tree were also used to generate the simulated data sets (Actin I parameters: πA = 0.232, πC = 0.298, πG = 0.214, πT = 0.256; A->C = 1.752, A->G = 4.918, A->T = 1.493, C->G = 0.634, C->T = 6.812, G->T = 1.000; α = 0.554; r1 = 0.047, r2 = 0.295, r3 = 0.865, r4 = 2.793. Actin II parameters: πA = 0.239, πC = 0.268, πG = 0.192, πT = 0.301; A->C = 1.773, A->G = 5.504, A->T = 2.366, C->G = 0.904, C->T = 9.005, G->T = 1.000; α = 0.894; r1 = 0.047, r2 = 0.295, r3 = 0.895, r4 = 2.793). Two parsimony searches were conducted on each simulated data set; the first search was conducted under the constraint of sepioid monophyly, while the second search was unconstrained. The differences in scores between the best tree derived from the constrained and unconstrained parsimony searches of each of the 1,000 simulated data sets were recorded and graphed to obtain the expected distribution under the null model. The tree length difference (in parsimony steps) between the constrained and the unconstrained maximum-likelihood trees for the Actin I data, i.e., the observed difference, was 9 steps. For the simulated Actin I data (shown as filled bars), 68 of the 1,000 sampled tree lengths resulted in a difference greater than the observed difference of 9 steps. Therefore, an observed difference greater than 9 steps would be expected about 6.8% of the time if the null hypothesis were true, so the null hypothesis of sepioid monophyly could not be rejected for the Actin I gene at P < 0.05. The observed tree length difference (in parsimony steps) between the constrained and the unconstrained maximum-likelihood trees for the Actin II data was 14 steps. In the Actin II simulations (shown as open bars), none of the 1,000 sampled tree length differences were greater than 5 steps. Therefore, an observed difference of 14 steps would be expected less than 0.1% of the time if the null hypothesis were true, so the null hypothesis of sepioid monophyly was rejected at P < 0.001

 
A partition homogeneity test indicated that the Actin I and Actin II data partitions were significantly incongruent (P < 0.01). This result may be due to the different evolutionary rates of the two genes, or it may be due to gene conversion. Additional factors related to the concept of independent process partitions (Miyamoto and Fitch 1995) cannot be excluded without further investigation. When a test for homogeneity among data sets fails, the data should not be combined, as such an approach would violate the assumptions of analysis of the combined data sets (Bull et al. 1993). Particularly relevant to the present data is the assumption that the individual Actin I and Actin II gene trees should recover the species tree. It is possible that either or both of the individual gene trees may differ from the species tree due to lineage sorting within the Decapodiformes, such that common ancestry of alleles would extend further back in time than the speciation events (Maddison 1997). If related species were to carry different ancestral alleles due to lineage sorting, discord between gene trees and species trees would result, and the assumptions of a "total evidence" approach would be violated.

Discussion

Evolution and Phylogenetic Utility of the Actin Gene Family of Coleoid Cephalopods
Genomic Southern blot analysis suggests that there are at least three closely related actin sequences in the genomes of coleoid cephalopods, as represented by O. bartramii and V. infernalis genomic DNA (fig. 1). Neither of the two actin clones sequenced from O. bartramii contained recognition sites for EcoRI, HindIII, or PstI restriction enzymes. The Actin I isoform from O. bartramii, from which the actin probe was constructed, possessed a single AvaI site. This may have interfered with proper hybridization to O. bartramii genomic DNA digested with AvaI. None of the three actin clones sequenced from V. infernalis contained recognition sites for the AvaI, EcoRI, HindIII, or PstI restriction enzymes. We interpret the number of bands to represent the number of distinct actin loci possessed by cephalopods and hypothesize that the three actin loci are the result of two gene duplication events prior to the divergence of the Coleoidea.

It is possible that two of the bands may represent different alleles of the same locus, although phylogenetic analysis of cephalopod actin sequences supports our interpretation of the results from Southern blot analysis. The strict consensus of 144 equally parsimonious trees obtained in analysis of the 82 cephalopod actin sequences clearly demonstrates the presence of three distinct isoforms of the actin gene in coleoids (fig. 2). Each of these isoforms clearly belonged to one of the three major clades that were supported by bootstrap analysis of the nucleotide data. In addition, the high levels of nucleotide sequence divergence between the different isoforms within the same species (on the order of 15%–20%) are inconsistent with the low levels of intraspecific allelic divergences (typically <1%) observed in most protein-coding loci (Li 1997, pp. 237–242). However, sequence divergence between different actin isoforms is low enough that a heterologous probe would detect all the isoforms under the hybridization conditions used in this study.

We were unable to clone and sequence all three isoforms from every species examined in this study. This probably had little to do with the number of actin gene copies in each species and was probably due to the method we employed. In surveying a broad spectrum of taxa for which no information on actin had previously been available, it was necessary to use highly degenerate PCR primers which may not have been effective for amplifying all actin isoforms across all species. Additional work is required to determine a definitive actin gene copy number and to type all actin isoforms present in each individual species.

Phylogenetic analysis of representative amino acid sequences of the three cephalopod actin isoforms, along with known actin isoform amino acid sequences from a diverse array of metazoan taxa downloaded from GenBank, was conducted to examine the relationship between the three cephalopod actin paralogs and other metazoan actin paralogs. To maximize informativeness about the type of actin isoforms possessed by cephalopods, most of the sequences from noncephalopod taxa included in the analysis were derived from cDNA libraries so that the tissue in which the particular isoforms were expressed would be defined. The Actin I gene of cephalopods clustered with the muscle type actins of molluscs, while the cephalopod Actin II sequences were most similar to the other molluscan cytoplasmic actins. The Actin III sequences placed basal to all metazoan actins included in the parsimony analysis, although the lengths of the branches separating Actin III from the remaining metazoan actins were very short on all 50 most-parsimonious trees. Mollusc muscle type actin sequences were not most closely related to the arthropod muscle actins, suggesting independent derivations of the mollusc and arthropod muscle type actins. However, a test of the statistical significance of the difference in tree length (Templeton 1983) obtained from parsimony analysis of the unconstrained data and parsimony analysis of the constrained data (i.e., protostome muscle actins constrained to form a monophyletic group) indicated that the two sets of trees were not significantly different. Therefore, at this time, it is not possible to conclude that the ancestral protostome lineage lacked a muscle-specific actin gene, as suggested by Mounier et al. (1992). Additional muscle actin sequences from other molluscan classes, as well as from other protostome phyla, are required to resolve this issue. As an example, the unexpected placement of Caenorhabditis as sister to the deuterostome muscle actin clade illustrates the need for more extensive sampling of actin sequences from additional ecdysozoan protostomes.

Each of the three classes of actin isoforms found in this study may comprise different subclasses of isoforms. Indeed, studies have shown that distinct loci, determined through analysis of cDNA library clones, may possess identical amino acid sequences and nearly identical (>95%) nucleotide sequences (Wahlberg and Johnson 1997). Thus, although the sequences may be nearly identical, they may not be homologous. Gene conversion has been invoked as the mechanism maintaining homogeneity among separate actin loci (Crain et al. 1987; Wahlberg and Johnson 1997). If gene conversion has been a factor in the evolution of a seemingly homologous group of sequences, one would expect separate loci within a taxon to cluster together or near one another on a phylogenetic tree. Phylogenetic analysis of the entire actin data set of 82 terminal taxa (fig. 2) suggests that gene conversion is an unlikely scenario in the molecular evolution of cephalopod actin isoforms. Each of the three main clades (Actin I, Actin II, and Actin III) is distinct from the others, and each contains one and only one representative from each taxon. In contrast, if gene conversion had occurred, two or more separate loci from a single taxon would cluster within a single major clade due to the homogenizing effects of conversion.

Phylogenetic analysis of the entire actin data set revealed three distinct clades of actin genes within the Cephalopoda. It may be posited that subtle intraclade differences among paralogs could have been obscured by the large interclade differences in the comprehensive actin data set. Intraspecific comparisons of Actin I and Actin II nucleotide sequences could potentially reveal the presence of multiple actin lineages within each purported isoform. Although such comparisons would not provide rigorous proof that each isoform is itself composed of multiple gene lineages, the absence of obvious heterogeneity within lineages would be another line of evidence supporting the conclusion that each isoform represents a single, distinct actin paralog. Intraspecific actin isoform comparisons revealed that none of the 26 taxa considered departed substantially from the overall mean divergence between the Actin I and the Actin II isoforms (18.73%). However, a comparison between the mean intraspecific divergence within the Octopodiformes (21.11%) and that within the Decapodiformes (18.02%) revealed that the Octopodiformes actin genes were significantly more divergent than their decapodiform counterparts (t = 8.08, df = 6, P < 0.01). A comparison of the variation in octopodiform and decapodiform mitochondrial COI amino acid sequences also demonstrated greater divergence in the Octopodiformes (Carlini and Graves 1999), suggesting either a faster rate of molecular evolution or earlier divergences within the Octopodiformes.
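A comparison of two group means of this kind can be sketched with a pooled-variance two-sample t statistic. The text does not state which form of the t-test was applied, so the pooled-variance form, the function name, and the divergence values below are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    if pooled_var == 0:
        raise ValueError("both samples have zero variance")
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical Actin I vs. Actin II divergences (%) within single taxa;
# these numbers are invented and are not the values behind the reported test.
octopodiformes = [21.5, 20.8, 21.0]
decapodiformes = [18.1, 17.6, 18.4, 18.0]
t_statistic, degrees_of_freedom = pooled_t(octopodiformes, decapodiformes)
print(t_statistic, degrees_of_freedom)
```

The resulting statistic would then be compared with the t distribution on the returned degrees of freedom to obtain a P value.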

The Actin I gene appeared to be less saturated than might be expected based on the highly conserved nature of the amino acid sequences. While first- and second-codon-position substitutions were rare, third-codon-position substitutions tended to accumulate linearly with increasing sequence divergence (fig. 4a). The Actin II gene exhibited some saturation in third-position substitutions; however, the Actin II sequences were more variable at first and second codon positions than Actin I (fig. 4b); therefore, both genes contain substantial phylogenetic signal. Patterns of sequence divergence among taxa must be examined prior to dismissing the use of "highly conserved" sequences to infer the phylogeny of a given group (Graybeal 1994). First and second positions of both actin genes appear to conform to the first pattern of molecular evolution described in Graybeal (1994), with very few sites varying and those that vary changing very slowly. Third positions of the Actin II gene appear to exhibit pattern 2, with a greater proportion of sites free to change and a relatively rapid rate of change, resulting in saturation at higher divergences. Third positions of the Actin I gene exhibit pattern 3, with an even greater proportion of varying sites, although the rate at which they vary is less than the rate for third positions of the Actin II gene. Weighted parsimony analysis was conducted on the Actin II data in an attempt to counter the effects of saturation. The results of weighted parsimony analysis were similar to those obtained by ML analysis of the Actin II data; thus, they did not provide any significant additional insights.

Phylogenetic Relationships of the Coleoidea
The monophyly of the Decapodiformes and the Octopodiformes was supported in analyses of both actin data sets. These results agree with those obtained through analysis of morphological data (Young and Vecchione 1996). Previous molecular studies have also confirmed the monophyly of the Decapodiformes (Bonnaud, Boucher-Rodoni, and Monnerot 1994, 1997; Carlini and Graves 1999). The monophyly of the Octopodiformes was unequivocally confirmed in a single previous molecular study (Carlini and Graves 1999). Both the Actin I and the Actin II data indicated substantially less bootstrap and Bremer support for the monophyly of the Decapodiformes clade than for the monophyly of the Octopodiformes clade. A parallel pattern was reported in a cladistic analysis of 50 morphological characters (Young and Vecchione 1996).

Octopoda
Parsimony analysis of the Actin I and Actin II genes did not resolve the relationships among Octopus, Graneledone, and the bolitaenids. The results presented here support Voight's (1993) assertion that our knowledge of octopodid relations is not yet at the level required for subfamilial designations. Indeed, our classification of incirrates at the family level may be flawed in that all incirrates may be derived from octopodids. This is quite plausible, as the few morphological characters that support the monophyly of the Octopodidae could be plesiomorphic.

Sepioidea
In contrast to the results obtained from analysis of the COI gene (Carlini and Graves 1999), the sepioid families did not emerge basal to the remaining representatives of the Decapodiformes in trees derived from analyses of the actin data. The monophyly of the Sepioidea was not supported in analysis of the Actin I gene. However, when the Myopsida, Chtenopteryx, and Bathyteuthis were included, the group was found to be monophyletic, although bootstrap and Bremer support were lacking for this clade. The monophyly of the Sepioidea was also not supported in phylogenetic analyses of the Actin II gene.

Sepioloidea, a member of the family Sepiadariidae, consistently emerged basal to the Sepiolidae clade in all analyses of the Actin I and Actin II data sets. This relationship was also supported in bootstrap analyses of both data sets (85% and 71%). This result supports the relationships described first by Naef (1923), who considered the Sepiadariids ancestral within the Sepiolidae, and later by Khromov (1990), who considered the Sepiolidae, the Sepiadariidae, and the Idiosepiidae more closely related to each other than to the Sepiidae and the Spirulidae.

A statistical test of monophyly for the Actin II data set rejected the monophyly of the Sepioidea, although a test on the Actin I data could not reject the null hypothesis at α = 0.05 (fig. 7). However, the Actin I data are quite close to statistical significance (P = 0.068) and are certainly suggestive of the conclusion that the Sepioidea are not a natural group. The Actin I gene was much less variable than the Actin II gene and probably does not contain enough phylogenetic signal to reject sepioid monophyly at the 5% significance level. Although the results of such tests cannot be considered "proof" against a group's monophyly, the rejection of sepioid monophyly for the actin data sets reported here and the COI data reported elsewhere (Carlini 1998) provide compelling evidence to refute the monophyly of the group.

Our conclusions based on statistical tests of monophyly should be tempered with some caution, as several factors could potentially contribute to a false rejection of sepioid monophyly, i.e., type I error. The use of inappropriate models of DNA substitution can be a serious problem affecting the relevance of statistical tests of monophyly based on null distributions of TLDs from simulated data sets. The use of an inappropriate model of DNA substitution in generating simulated data sets results in an increased frequency of rejecting the null hypothesis when it is in fact true. However, Huelsenbeck, Hillis, and Nielsen (1996) found that the assumption of an incorrect model was a serious problem only when the rates of sequence evolution were exceptionally high. The parameter-rich GTR model of sequence evolution was used to generate the simulated data sets in the present study in order to simulate the evolution of DNA sequences as realistically as possible. Therefore, the rejection of sepioid monophyly is probably not due to the use of an inappropriate model of nucleotide substitution. Another potential source of type I error is severe base composition bias, although bias was found to be insignificant for both the Actin I and the Actin II data sets (see Results). A third potential source of type I error could have been the result of our approach to analyzing the simulated data sets, specifically in the use of MP searches for ML-simulated data sets. If the difference in length between unconstrained and constrained ML trees was greater than the difference in length between unconstrained and constrained MP trees, then the null distribution generated by searching the simulated data sets using parsimony would be biased toward smaller TLDs. If this were the case, then the simulated differences between the optimal and the constrained trees might be greater and the test difference less significant.

Teuthoidea
While the monophyly of most families represented by more than one taxon was supported in analyses of the actin data sets, few relationships among the oegopsid families were conclusively determined. Among the oegopsid families, the Chtenopterygidae and the Bathyteuthidae appeared most closely related to myopsid squids and the Sepioidea. A close relationship between Chtenopteryx and Bathyteuthis was found in all analyses of the Actin I data set and was supported by a moderate proportion (59%) of bootstrap replicates. This relationship was also found in parsimony analysis of the Actin II data, but without bootstrap support. Likelihood analysis of the Actin II data did not support a sister group relationship between the two families, although they branch off immediately adjacent to one another, basal to the myopsid/sepiid/Spirula clade. Some researchers have considered the Chtenopterygidae and the Bathyteuthidae to be related to one another and also to the myopsid squids (Naef 1923; Young 1977, 1991; Anderson 1996; Brierley, Clarke, and Thorpe 1996).

The monophyly of the oegopsid squids was not supported in any of the analyses conducted in this study. The present study, as well as previous molecular studies (Bonnaud, Boucher-Rodoni, and Monnerot 1994; Carlini and Graves 1999), suggests that the oegopsid squids represent a polyphyletic group with uncertain phylogenetic affinities. These findings are consistent with a recent conclusion that the Oegopsida are "a phylogenetic void" (Young, Vecchione, and Donovan 1998). Although relationships among a few families were supported in many of the analyses, the lack of stability in deep-level relationships renders any conclusions about phylogenetic relationships of oegopsid families premature. The lack of stability in oegopsid family relationships was observed in three ways. Both measures of clade stability employed in this study failed to lend strong support to most oegopsid clades. The results from analyses of the same data set using two different reconstruction methodologies, parsimony and likelihood, were frequently discordant with respect to deep divergences within the Oegopsida. Finally, analyses of the two different data sets were frequently discordant with respect to oegopsid relations, although this may have been due in part to the different taxonomic compositions of the data sets. Parsimony and likelihood analysis of the COI gene, which included representatives from 23 of the 25 oegopsid families, did not elucidate stable relationships among the oegopsid families, nor was the monophyly of the Oegopsida supported (Carlini 1998). Although representation of oegopsid families was limited in the 16S and COIII studies, neither study demonstrated oegopsid monophyly, nor were stable relationships among families observed (Bonnaud, Boucher-Rodoni, and Monnerot 1994, 1997).

Conclusions

The data presented in this study suggest that cephalopods possess three paralogous actin genes. One of the actin paralogs (Actin I) examined in this study is most similar to mollusc muscle actin genes, whereas another paralog (Actin II) is most similar to cytoplasmic actin genes.

With respect to higher-level relationships within the Cephalopoda, the following conclusions can be drawn from analyses of the Actin I and Actin II data sets: (1) the Coleoidea, the Octopodiformes, the Decapodiformes, and the Incirrata are monophyletic groups; (2) the Vampyromorpha and the Octopoda are sister taxa; and (3) the Sepioidea and the Oegopsida, as currently defined, are polyphyletic. The actin data also suggest that the myopsid squids may be more closely related to the Sepiidae and Spirula than to most oegopsid squids and that the Bathyteuthidae and the Chtenopterygidae appear to be more closely related to the Myopsida and the Sepioidea than to other oegopsid families.

Acknowledgements

We thank Michael Vecchione and Richard Young for reviewing this manuscript and for providing helpful advice, discussion, and criticism throughout this project. J. McDowell, N. Stokes, P. Cooper, and V. Heatwole provided valuable laboratory assistance. Tissue samples from cephalopod specimens were obtained with the generous help of R. Young, M. Vecchione, B. Seibel, Y. Sakurai, T. Stranks, A. Reid, D. Woodbury, J. Semmens, M. Parry, S. Herke, J. Bower, L. Bonnaud, and R. Boucher-Rodoni. We also thank the officers and crews of FTS HOKUSEI MARU and R/V DAVID STARR JORDAN for their assistance in oceanographic sampling. This research was supported by a National Science Foundation Doctoral Dissertation Improvement Grant (DEB-9623353), a Lerner Gray Fund for Marine Research Grant, a Western Society of Malacologists Student Research Grant in Malacology, and a VIMS Minor Research Grant awarded to D.B.C.

Footnotes

Richard Thomas, Reviewing Editor

1 Keywords: actin, Cephalopoda, gene duplication, octopods, squids, statistical test of monophyly

2 Address for correspondence and reprints: David B. Carlini, Department of Biology, University of Rochester, Rochester, New York 14627-0211. E-mail: dcarlini@fisher.biology.rochester.edu

Literature Cited

    Anderson, F. E. 1996. Preliminary cladistic analysis of relationships among loliginid squids (Cephalopoda: Myopsida) based on morphological data. Am. Malacol. Bull. 12:113–128.

    Bhattacharya, D., and J. Ehlting. 1995. Actin coding regions: gene family evolution and use as a phylogenetic marker. Arch. Protistenkd. 145:155–164.

    Bonnaud, L., R. Boucher-Rodoni, and M. Monnerot. 1994. Phylogeny of decapod cephalopods based on partial 16S rDNA nucleotide sequences. C.R. Acad. Sci. Paris 317:581–588.

    ———. 1997. Phylogeny of cephalopods inferred from mitochondrial DNA sequences. Mol. Phylogenet. Evol. 7:44–54.

    Bremer, K. 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42:795–803.

    Brierley, A. S., M. R. Clarke, and J. P. Thorpe. 1996. Ctenopteryx sicula, a bathypelagic loliginid squid? Am. Malacol. Bull. 12:137–143.

    Bull, J. J., J. P. Huelsenbeck, C. W. Cunningham, D. L. Swofford, and P. J. Waddell. 1993. Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42:384–397.

    Carlini, D. B. 1998. The phylogeny of coleoid cephalopods inferred from molecular evolutionary analyses of the cytochrome c oxidase I, muscle actin, and cytoplasmic actin genes. Ph.D. dissertation, The College of William and Mary, Williamsburg, Va.

    Carlini, D. B., and J. E. Graves. 1999. Phylogenetic analysis of cytochrome c oxidase I sequences to determine higher-level relationships within the coleoid cephalopods. Bull. Mar. Sci. 64:57–76.

    Cho, S., A. Mitchell, J. C. Regier, C. Mitter, R. W. Poole, T. P. Friedlander, and S. Zhao. 1995. A highly conserved nuclear gene for low-level phylogenetics: Elongation Factor-1α recovers morphology-based tree for Heliothine moths. Mol. Biol. Evol. 12:650–656.

    Crain, W. R., M. F. Boshar, A. D. Cooper, D. S. Durica, A. Nagy, and D. Steffen. 1987. The sequence of a sea urchin muscle actin gene suggests a gene conversion with a cytoskeletal actin gene. J. Mol. Evol. 25:37–45.

    Cupples, C. G., and R. E. Pearlman. 1986. Isolation and characterization of the actin gene from Tetrahymena thermophila. Proc. Natl. Acad. Sci. USA 83:5160–5164.

    DesGroseillers, L., D. Auclair, L. Wickham, and M. Maalouf. 1994. A novel actin cDNA is expressed in the neurons of Aplysia californica. Biochim. Biophys. Acta 1217:322–324.

    Fang, H., and B. P. Brandhorst. 1994. Evolution of actin gene families of sea urchins. J. Mol. Evol. 39:347–356.

    Farris, J. S. 1989. The retention index and the rescaled consistency index. Cladistics 5:417–419.

    Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368–376.

    ———. 1985. Confidence limits on phylogenies with a molecular clock. Syst. Zool. 34:152–161.

    ———. 1995. PHYLIP: phylogeny inference package. Version 3.57c. Distributed by the author, Department of Genetics, University of Washington, Seattle.

    Fyrberg, E. A., K. L. Kindle, N. Davidson, and A. Sodja. 1980. The actin genes of Drosophila: a dispersed multigene family. Cell 19:365–378.

    Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: a search for genes informative about deep divergences among vertebrates. Syst. Biol. 43:174–193.

    Hasegawa, M., H. Kishino, and T. Yano. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160–174.

    Herman, I. M. 1993. Actin isoforms. Curr. Opin. Cell Biol. 5:48–55.

    Hightower, R. C., and R. B. Meagher. 1986. The molecular evolution of actin. Genetics 114:315–332.

    Hillis, D. M., B. K. Mable, and C. Moritz. 1996. Applications of molecular systematics: the state of the field and a look to the future. Pp. 515–543 in D. M. Hillis, C. Moritz, and B. K. Mable, eds. Molecular systematics. Sinauer, Sunderland, Mass.

    Huelsenbeck, J. P., and K. A. Crandall. 1997. Phylogeny estimation and hypothesis testing using maximum likelihood. Annu. Rev. Ecol. Syst. 28:437–466.

    Huelsenbeck, J. P., D. M. Hillis, and R. Nielsen. 1996. A likelihood-ratio test of monophyly. Syst. Biol. 45:546–558.

    Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21–132 in H. M. Munro, ed. Mammalian protein metabolism. Academic Press, New York.

    Khromov, D. N. 1990. Cuttlefishes in the systematics and phylogeny of Cephalopoda. Zool. Zh. 69:12–20.

    Lee, J. J., R. J. Shott, S. J. I. Rose, T. L. Thomas, R. J. Britten, and E. H. Davidson. 1984. Sea urchin actin gene subtypes: gene number, linkage and evolution. J. Mol. Biol. 172:149–176.

    Li, W. H. 1997. Molecular evolution. Sinauer, Sunderland, Mass.

    Maddison, W. P. 1997. Gene trees in species trees. Syst. Biol. 46:523–536.

    Maddison, W. P., and D. R. Maddison. 1992. MacClade: analysis of phylogeny and character evolution. Sinauer, Sunderland, Mass.

    McLean, M., G. M. Gerats, W. V. Baird, and R. B. Meagher. 1990. Six actin gene subfamilies map to five chromosomes of Petunia hybrida. J. Hered. 81:341–346.

    Mickevich, M. F., and J. S. Farris. 1981. The implications of congruence in Menidia. Syst. Zool. 30:351–370.

    Miyamoto, M. M., and W. M. Fitch. 1995. Testing species phylogenies and phylogenetic methods with congruence. Syst. Biol. 44:64–76.

    Moniz de Sá, M., and G. Drouin. 1996. Phylogeny and substitution rates of angiosperm actin genes. Mol. Biol. Evol. 13:1198–1212.

    Mounier, N., M. Gouy, D. Mouchiroud, and C. Prudhomme. 1992. Insect muscle actins differ distinctly from invertebrate and vertebrate cytoplasmic actins. J. Mol. Evol. 34:406–415.

    Naef, A. 1923. Fauna and flora of the Bay of Naples. Cephalopoda (English translation). Keter Press, Jerusalem.

    Patwary, M. U. 1996. Isolation and characterization of a cDNA encoding an actin gene from sea scallop (Placopecten magellanicus). J. Shellfish Res. 15:265–270.

    Philippe, H., U. Sorhannus, A. Baroin, R. Perasso, F. Gasse, and A. Adoutte. 1994. Comparison of molecular and palaeontological data in diatoms suggests a major gap in the fossil record. J. Evol. Biol. 7:247–264.

    Rambaut, A., and N. C. Grassly. 1997. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13:235–238.

    Reece, K. S., D. McElroy, and R. Wu. 1992. Function and evolution of actins. Evol. Biol. 26:1–34.

    Reece, K. S., M. E. Siddall, E. M. Burreson, and J. E. Graves. 1997. Phylogenetic analysis of Perkinsus based on actin gene sequences. J. Parasitol. 83:417–423.

    Salvini-Plawen, L. v. 1980. A reconsideration of the systematics in the Mollusca (phylogeny and higher classification). Malacologia 19:249–278.

    Seutin, G., B. N. White, and P. T. Boag. 1991. Preservation of avian blood and tissue samples for DNA analyses. Can. J. Zool. 69:82–90.

    Sorenson, M. D. 1996. TreeRot. Museum of Zoology, University of Michigan, Ann Arbor.

    Southern, E. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503–517.

    Sweeney, M. J., and C. F. E. Roper. 1998. Classification, type localities, and type repositories of Recent Cephalopoda. Smithson. Contrib. Zool. 586:561–599.

    Swofford, D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4.0. Sinauer, Sunderland, Mass.

    Teichert, C. 1988. Main features of cephalopod evolution. Pp. 11–79 in M. R. Clarke and E. R. Trueman, eds. The Mollusca, Vol. 12. Paleontology and neontology of cephalopods. Academic Press, New York.

    Templeton, A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221–244.

    Vandekerckhove, J., and K. Weber. 1978. At least six different actins are expressed in a higher mammal: an analysis based on the amino acid sequences of the amino-terminal tryptic peptide. J. Mol. Biol. 126:783–802.

    ———. 1984. Chordate muscle actins differ distinctly from invertebrate muscle actins. J. Mol. Biol. 179:391–413.

    Venkatesh, B., B. H. Tay, G. Elgar, and S. Brenner. 1996. Isolation, characterization and evolution of nine pufferfish (Fugu rubripes) actin genes. J. Mol. Biol. 259:655–665.

    Voight, J. 1993. A cladistic reassessment of octopodid classification. Malacologia 35:343–349.

    Wahlberg, M., and M. S. Johnson. 1997. Isolation and characterization of five actin cDNAs from the cestode Diphyllobothrium dendriticum: a phylogenetic study of the multigene family. J. Mol. Evol. 44:159–168.

    Winnepenninckx, T. B., and R. De Wachter. 1993. Extraction of high molecular weight DNA from molluscs. Trends Genet. 9:407.

    Wright, F. 1990. The 'effective number of codons' used in a gene. Gene 87:23–29.

    Yang, Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39:306–314.

    Young, J. Z. 1977. Brain, behaviour, and evolution of cephalopods. Symp. Zool. Soc. Lond. 38:377–434.

    ———. 1991. Ctenopteryx the comb-fin squid is related to Loligo. Bull. Mar. Sci. 49:148–161.

    Young, R. E., and M. Vecchione. 1996. Analysis of morphology to determine primary sister taxon relationships within coleoid cephalopods. Am. Malacol. Bull. 12:91–112.

    Young, R. E., M. Vecchione, and D. T. Donovan. 1998. The evolution of coleoid cephalopods and their present biodiversity and ecology. S. Afr. J. Mar. Sci. 20:393–419.

Accepted for publication May 29, 2000.