Duplication and Concerted Evolution of the Mitochondrial Control Region in the Parrot Genus Amazona

Jessica R. Eberhard, Timothy F. Wright and Eldredge Bermingham

Smithsonian Tropical Research Institute, Balboa, Panamá
Department of Biology, University of Maryland


    Abstract
 
We report a duplication and rearrangement of the mitochondrial genome involving the control region of parrots in the genus Amazona. This rearrangement results in a gene order of cytochrome b/tRNAThr/pND6/pGlu/CR1/tRNAPro/NADH dehydrogenase 6/tRNAGlu/CR2/tRNAPhe/12s rRNA, where CR1 and CR2 refer to duplicate control regions, and pND6 and pGlu indicate presumed pseudogenes. In contrast to previous reports of duplications involving the control regions of birds, neither copy of the parrot control region shows any indications of degeneration. Rather, both copies contain many of the conserved sequence features typically found in avian control regions, including the goose hairpin, TASs, the F, C, and D boxes, conserved sequence box 1 (CSB1), and an apparent homolog to the mammalian CSB3. We conducted a phylogenetic analysis of homologous portions of the duplicate control regions from 21 individuals representing four species of Amazona (A. ochrocephala, A. autumnalis, A. farinosa, and A. amazonica) and Pionus chalcopterus. This analysis revealed that an individual's two control region copies (i.e., the paralogous copies) were typically more closely related to one another than to corresponding segments of other individuals (i.e., the orthologous copies). The average sequence divergence of the paralogous control region copies within an individual was 1.4%, versus a mean value of 4.1% between control region orthologs representing nearest phylogenetic neighbors. No differences were found between the paralogous copies in either the rate or the pattern in which the two copies accumulated base pair changes. This pattern suggests concerted evolution of the two control regions, perhaps through occasional gene conversion events. We estimated that gene conversion events occurred on average every 34,670 ± 18,400 years based on pairwise distances between the paralogous control region sequences of each individual. 
Our results add to the growing body of work indicating that under some circumstances duplicated mitochondrial control regions are retained through evolutionary time rather than degenerating and being lost, presumably due to selection for a small mitochondrial genome.


    Introduction
 
The animal mitochondrial genome is generally considered to be under selection for both small size and a conserved gene order (Rand and Harrison 1986; Quinn and Wilson 1993; Boore 1999). This view has arisen from several general characteristics of these genomes. They are typically only 15–20 kb in size and almost invariably contain the same set of 37 genes plus the control region (Boore 1999), a noncoding region involved in mtDNA replication (Clayton 1991). Mitochondrial genomes rarely contain either introns or intergenic spacers, and those that have been found are generally small (Quinn and Wilson 1993; but see McKnight and Shaffer 1997). Furthermore, while the order of these genes does vary among major animal lineages, gene order tends to be highly conserved within these lineages, and gene rearrangements are thought to be infrequent (Boore 1999). Recently, however, the advent of automated DNA sequencing has led to a rapid growth in the number of studies examining the organization and evolution of mitochondrial genomes. Several of these studies have found evidence of repeated gene rearrangements within lineages, as well as the persistence of duplicated regions in animal mitochondria (Moritz and Brown 1986, 1987; Kumazawa et al. 1996, 1998; Macey et al. 1997; Arndt and Smith 1998; Black and Roehrdanz 1998; Campbell and Barker 1999). These studies are raising new questions about the strength and nature of stabilizing selection on mitochondrial genomes.

The publication of the first complete avian mitochondrial genome revealed that the mitochondrial gene order of the chicken (Gallus gallus) differs from the arrangement prevalent in nonavian vertebrates (Desjardins and Morais 1990). In most vertebrates, the gene order near the control region is NADH dehydrogenase 6/tRNAGlu/cytochrome b/tRNAThr/tRNAPro/control region/tRNAPhe, while in the chicken, NADH dehydrogenase 6 (ND6) and tRNAGlu are found between tRNAPro and the control region (see fig. 1). While this unique gene order is shared by most avian lineages and appears to be ancestral (we shall term it the "typical" bird gene order), Mindell, Sorenson, and Dimcheff (1998) recently reported a second gene order (the "novel" gene order) that appears to have evolved independently in several distantly related avian lineages. In the novel arrangement, the control region is located between tRNAThr and tRNAPro, and a second noncoding region (NC) is found in the position typically occupied by the control region (fig. 1). This NC region varied in length in the taxa sampled by Mindell, Sorenson, and Dimcheff (1998) and in some cases resembled the control region (up to 82% sequence similarity for a section of the NC in Smithornis sharpei).



 
Fig. 1.—Schematic diagrams of gene rearrangements in birds for the area surrounding the mitochondrial control region (not drawn to scale). a, The typical gene order that was first reported for the chicken and appears to be the most common arrangement in birds. b, The novel avian gene order first reported by Mindell, Sorenson, and Dimcheff (1998). c, The gene order found in our study of mtDNA in Amazona parrots. The Amazona arrangement differs from the novel avian gene order in two respects: (1) the presence of degenerate copies of ND6 and tRNAGlu (designated pND6 and pGlu) between tRNAThr and the first control region, and (2) a second, apparently functional, control region in the same location as the degenerate control region found in the novel gene order. Locations of the primers used in this study are indicated with arrows. Thick lines indicate the location of the CR1* and CR2* fragments used in the phylogenetic analysis.

 
The typical avian gene order can be derived from the common vertebrate arrangement with a single translocation event involving ND6/tRNAGlu and cytochrome b/tRNAThr/tRNAPro (Desjardins and Morais 1990). In turn, the novel gene order can be derived from the typical avian order in a single translocation of tRNAPro/ND6/tRNAGlu and the control region (Mindell, Sorenson, and Dimcheff 1998; but see alternative scenario in Boore 1999). Similar gene rearrangements in the mitochondria of some reptiles have been explained via tandem duplication followed by deletion or gradual degeneration of duplicated genes (Moritz and Brown 1987; Macey et al. 1997). Bensch and Härlid (2000) proposed that the derived gene order in birds resulted from a tandem duplication of the region tRNAPro/ND6/tRNAGlu/CR, followed by the deletion of one copy of each of the duplicated coding genes and partial degeneration of the duplicate control region. This hypothesis was supported by a phylogenetic analysis of NC and control region sequences from Phylloscopus warblers that indicated a single duplication event in the common ancestor of the genus followed by subsequent independent evolution of the two sequences (Bensch and Härlid 2000). The length of the control region (approximately 1,100 nt) was conserved across these Phylloscopus and was similar to that found in other birds (Baker and Marshall 1997). In contrast, the NC region varied among species (171–308 nt) and could be only partially aligned to the control region (Bensch and Härlid 2000), suggesting that the NC region in Phylloscopus is nonfunctional and degenerating.
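The tandem duplication model can be made concrete by treating gene orders as lists. The sketch below is only an illustration of the bookkeeping: the helper names (`tandem_duplicate`, `apply_fates`) are hypothetical, and the per-gene fates are our reading of the arrangements described in the abstract and in figure 1, not a reconstruction of the actual mutational steps.

```python
# Gene orders as lists of labels; "CR" is the control region.
TYPICAL_AVIAN = ["cytb", "tRNAThr", "tRNAPro", "ND6", "tRNAGlu", "CR", "tRNAPhe"]


def tandem_duplicate(order, segment):
    """Return a copy of `order` with the first match of `segment` duplicated in tandem."""
    n = len(segment)
    for i in range(len(order) - n + 1):
        if order[i:i + n] == segment:
            return order[:i + n] + segment + order[i + n:]
    raise ValueError("segment not found in gene order")


def apply_fates(order, fates):
    """Apply per-position fates: None deletes a gene, a string relabels it."""
    result = []
    for index, gene in enumerate(order):
        if index not in fates:
            result.append(gene)
        elif fates[index] is not None:
            result.append(fates[index])
    return result


duplicated = tandem_duplicate(TYPICAL_AVIAN, ["tRNAPro", "ND6", "tRNAGlu", "CR"])
# Positions in `duplicated`:
# 0 cytb, 1 tRNAThr, 2 tRNAPro, 3 ND6, 4 tRNAGlu, 5 CR,
# 6 tRNAPro, 7 ND6, 8 tRNAGlu, 9 CR, 10 tRNAPhe

# Novel avian order (assumed fates): the first tRNAPro/ND6/tRNAGlu copies are
# deleted and the second control region degenerates into the NC region.
novel = apply_fates(duplicated, {2: None, 3: None, 4: None, 9: "NC"})

# Amazona order (assumed fates): the first tRNAPro is lost, the first ND6 and
# tRNAGlu persist as pseudogenes, and both control regions are retained.
amazona = apply_fates(
    duplicated, {2: None, 3: "pND6", 4: "pGlu", 5: "CR1", 9: "CR2"}
)

print("/".join(novel))
print("/".join(amazona))
```

Under these assumed fates, a single duplication followed by different patterns of loss yields both the novel avian order and the Amazona order.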

A similar pattern of gradual degeneration and loss has been observed in duplicated tRNAs of two Cucumaria sea cucumbers (Arndt and Smith 1998); the cattle tick, Boophilus microplus (Campbell and Barker 1999); and the akamata snake, Dinodon semicarinatus (Kumazawa et al. 1998). While these observations of duplication and subsequent degeneration and loss are consistent with the idea of selection for compact mitochondria, some of these same taxa appear to have fully duplicated control regions that show no signs of degeneration. In the case of the sea cucumbers, for example, the two control region sequences differ by only 3% (Arndt and Smith 1998), while in the akamata snake they are identical (Kumazawa et al. 1998). Furthermore, metastriate ticks possess two nearly identical control regions (Black and Roehrdanz 1998; Campbell and Barker 1999), and the Western rattlesnake (Crotalus viridis) and the himehabu viper (Ovophis okinavensis) have duplicate control regions that differ at only one nucleotide position (Kumazawa et al. 1996). At present, it remains uncertain why these widely divergent taxa should maintain two nondegenerate copies of a duplicated control region while the vast majority of taxa have only a single control region.

Here, we present results from a study of Amazona parrots and a Pionus parrot outgroup, showing that these taxa also exhibit the novel avian gene order first identified by Mindell, Sorenson, and Dimcheff (1998). However, in contrast to previous studies of birds, the duplicate control regions show a high degree of sequence similarity, and several conserved sequence features typically found in avian control regions are present in both control regions of these parrots. A phylogenetic analysis of homologous sequences from the two control region copies demonstrates a pattern of concerted evolution consistent with occasional gene conversion events at a frequency higher than the rate of speciation in these parrots.


    Materials and Methods
 
We obtained feather or blood samples from wild birds, live captives, and museum specimens (table 1). DNA extraction, amplification, and sequencing of samples took place in either the Department of Biology at the University of Maryland (UMD) or the Smithsonian Tropical Research Institute's Molecular Labs (STRI) using slightly different methodologies. At UMD, whole genomic DNA was extracted using QIAamp tissue extraction kits (Qiagen); at STRI, total cellular DNA was extracted by incubating the samples overnight in CTAB buffer (Murray and Thompson 1980) and Proteinase K, followed by standard phenol-chloroform extraction and dialysis. Both labs amplified mitochondrial DNA using PCR and the primers in table 2. Specific PCR cocktails and cycle profiles for different sections of the study are given below. We sequenced both strands of all products using either Big Dye cycle sequencing chemistry on an ABI 310 Genetic Analyzer (UMD) or d-rhodamine chemistry on an ABI 377 Sequencer (STRI).


 
Table 1 Samples Sequenced in This Study

 

 
Table 2 Primers Used to Characterize Parrot Control Regions

 
The presence of duplicate control regions was initially identified in Amazona ochrocephala auropalliata when we compared sequences obtained with the light-strand primer LThr paired with the heavy-strand primer CR522Rb with those obtained with the primer LGlu paired with CR522Rb. If this species had the typical bird gene order identified in figure 1 , these two sequences should have been overlapping and identical in the section of overlap. Instead, we found slight but consistent differences between the two sequences. These differences were not related to either sample origin or sequencing location, but instead appeared to be due to a duplication and rearrangement of mitochondrial genes in the control region area.

To identify the order of genes in the area surrounding the control region duplication, and to verify that neither copy was nuclear in origin, we amplified the entire segment from the middle of cytochrome b to the middle of 12S in large (1–3 kb) overlapping sections followed by selected reamplification of nested fragments as necessary for sequencing (see table 2 for primer sequences and figure 1 for primer locations). PCR was performed in 25-µl reactions with a final concentration of 1× Taq Extender 10× buffer, 0.2 mM of each dNTP, 0.6 µM of each primer, 1.5 U Taq polymerase (Sigma-Aldrich), 1.5 U Taq Extender (Stratagene), and 1 µl of template. We performed 30 PCR cycles with annealing temperatures ranging from 50°C to 55°C. All gene order PCR and sequencing was performed at UMD. We constructed consensus sequences for one individual each of Amazona ochrocephala oratrix, Amazona ochrocephala auropalliata, and Amazona farinosa for the entire segment from cytochrome b to 12S using overlapping sequences aligned in Sequencher 3.1 (Gene Codes Corporation). These sequences were aligned with those for the appropriate coding genes and tRNAs of the chicken (G. gallus) using the Clustal routine in Megalign 1.1 (DNASTAR). Sequence length limitations in Megalign forced us to align these 5.3-kb sequences in two 2.6-kb sections, which were then combined. Further alignments of the parrot sequences with conserved sequence blocks from published control region sequences were performed using Sequencher 3.1 and by eye. The parrot sequences are deposited in GenBank (accession numbers AF338819–AF338821), and the alignment of the sequences with the chicken gene sequences is available as supplementary material. Secondary structures and their thermodynamic properties were predicted with M. Zuker's DNA mfold (SantaLucia 1998) from the A. o. auropalliata sequence.

To examine patterns of evolution in the duplicated control regions, we sequenced portions of both duplicate control regions (fragments CR1* and CR2*; see fig. 1) for 21 individuals representing six subspecies of the A. ochrocephala complex, three other species of Amazona (A. autumnalis, A. farinosa, and A. amazonica), and Pionus chalcopterus, a species from a closely related genus. We amplified these segments using either LThr (for CR1*) or LGlu (for CR2*) paired with CR522Rb (fig. 1). PCR was performed in 25-µl reactions with a final concentration of 1× PCR buffer, 2 mM MgCl2, 0.2 mM of each dNTP, 0.5 µM of each primer, 1.25 U of Taq (Perkin Elmer AmpliTaq at STRI or Sigma Taq at UMD), and 1 µl template. At UMD, all sequences were amplified for 35 cycles with an annealing temperature of 54°C. The same PCR profile was used for CR1* sequences at STRI, while CR2* sequences were amplified using five cycles with 50°C annealing temperatures followed by 30 cycles at 56°C. Both strands of these products were sequenced as detailed above using the amplifying primers. For each species, at least one individual was sequenced at STRI and one was sequenced at UMD with the exception of P. chalcopterus; table 1 lists the sequencing location for each sample. We aligned 550 bp of the 42 resulting sequences using Sequencher 3.1 and used this alignment for subsequent phylogenetic analyses. These sequences are deposited in GenBank (accession numbers AF338277–AF338318).

We conducted phylogenetic analyses of the aligned CR1* and CR2* sequences using both maximum-likelihood and parsimony algorithms in PAUP, version 4.0b3 (Swofford 1999). Since the 5' end of CR1* includes two degenerate pseudogenes (see Results), only the 550 bases homologous to the CR2* sequence and belonging to the control region sensu stricto were included in further analyses unless specifically noted. Optimal parameters for maximum-likelihood searches were obtained with Modeltest, version 3.0 (Posada and Crandall 1998). Both the hierarchical likelihood ratio tests and the Akaike information criterion selected the Hasegawa-Kishino-Yano model with the following parameters: empirical base frequencies (A = 0.29, C = 0.25, G = 0.15, T = 0.31), transition/transversion ratio = 7.0, gamma distribution shape = 3.7, and proportion of invariable sites = 0.32. For maximum-parsimony searches, we set all characters to be unordered and of equal weight, with gaps treated as fifth bases and multistate characters treated as uncertainties. For both optimality criteria we obtained starting trees via stepwise addition and used the tree bisection-reconnection branch-swapping algorithm. We ran bootstrap searches of 30 replicates for maximum likelihood and 5,000 replicates for parsimony. CR1* and CR2* sequences from P. chalcopterus served as outgroups in all searches.

Estimates of the rate of gene conversion between the duplicated control regions were obtained using the Kimura two-parameter distances between the alignable portions of the CR1* and CR2* sequences for each individual. Distance to the last common ancestor was assumed to represent the conversion of one gene to another and was estimated as half the distance between the two sequences. The time since last conversion for each individual was estimated by assuming a constant mutation rate of 20%/Myr for the control region, which was found for the snow goose, Anser caerulescens, using Kimura-corrected distances (Quinn 1992). Quinn's (1992) estimate was based on Shields and Wilson's (1987) fossil-based calibration of 2%/Myr for the mitochondrial genome and the observation that sequences from the first domain of the control region of the snow goose evolve approximately 10 times as fast as the mitochondrial genome as a whole (Quinn 1992). A similar 10-fold faster substitution rate in the first domain of the control region has been observed in parrots (unpublished data), so in the absence of parrot fossil data with which to obtain a parrot-specific rate calibration, we used that of Quinn (1992). Although the mutation rate of the first domain of the control region in geese may differ from that in parrots, any error in the clock calibration would only affect the absolute time estimates, and not the relative rates of gene conversion.
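The calculation described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the function names are hypothetical, the distance is the standard Kimura two-parameter formula, and sites with gaps or ambiguity codes are simply skipped, which is an assumption on our part.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped
    (an assumption made for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def years_since_conversion(distance, rate_per_myr=0.20):
    """Half the paralog distance divided by the assumed 20%/Myr rate, in years."""
    return (distance / 2.0) / rate_per_myr * 1e6


# Rough check using the 1.4% mean within-individual divergence from the abstract.
print(round(years_since_conversion(0.014)))
```

As a rough check, a paralog distance of 1.4% gives (0.014 / 2) / 0.20 per Myr, or about 35,000 years, which is in line with the mean of 34,670 years reported in the abstract. The 1.4% figure may not be a Kimura-corrected value, so this is only an order-of-magnitude comparison.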

We compared the rates of evolution in the CR1* and CR2* sequences in two ways. First, to evaluate whether one copy of the control region evolves more quickly than the other, we separately calculated uncorrected p distances between P. chalcopterus and each of the Amazona samples for the alignable portions of the CR1* and CR2* sequences. The CR1* distances were then compared with the CR2* distances using a paired t-test to determine whether one of the fragments tended to yield greater distance estimates. A t-test was also used to compare the rates of evolution in two different portions of the CR1*. t-tests were performed with StatView, version 4.1 (Abacus Concepts).
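The first comparison can be sketched as follows. The helper names are hypothetical and the statistic is the textbook paired t; the study itself used StatView, so this is only an illustration of the computation, with each Amazona sample contributing one CR1* distance and one CR2* distance to the outgroup.

```python
import math


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return differences / len(seq1)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all paired differences are identical")
    return mean / math.sqrt(variance / n), n - 1


def compare_copies(outgroup_cr1, outgroup_cr2, ingroup):
    """`ingroup` maps a sample name to its (CR1*, CR2*) aligned sequences."""
    cr1 = [p_distance(outgroup_cr1, s1) for s1, _ in ingroup.values()]
    cr2 = [p_distance(outgroup_cr2, s2) for _, s2 in ingroup.values()]
    return paired_t(cr1, cr2)
```

A t value near zero indicates that neither copy tends to lie farther from the outgroup than the other.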

Second, to examine the pattern of variability along the two control regions, we compared the numbers of changes observed within nonoverlapping 25-nt windows along the CR1* and CR2* sequences. The CR1* and CR2* sequences were analyzed separately, and for each taxon pair, the numbers of mismatched sites were counted within 25-nt windows. The numbers of differences from all pairwise comparisons were then summed for each window. The patterns of variation in CR1* and CR2* were compared using a χ² test in StatView. We excluded window 16 from this analysis because the observed number of changes was zero for both CR1* and CR2*.
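The window analysis can be sketched as below. The function names are hypothetical, and the test shown is a Pearson chi-square on a 2 × k table of window totals, which is one plausible reading of the comparison; the exact form of the test run in StatView is not specified in the text. Windows with no changes in either copy are dropped because their expected counts are zero.

```python
from itertools import combinations


def window_mismatches(seq1, seq2, window=25):
    """Mismatch counts in consecutive nonoverlapping windows of an aligned pair."""
    counts = []
    for start in range(0, len(seq1), window):
        a = seq1[start:start + window]
        b = seq2[start:start + window]
        counts.append(sum(1 for x, y in zip(a, b) if x != y))
    return counts


def summed_window_counts(sequences, window=25):
    """Sum the per-window mismatch counts over all pairwise comparisons."""
    totals = None
    for s1, s2 in combinations(sequences, 2):
        counts = window_mismatches(s1, s2, window)
        if totals is None:
            totals = counts
        else:
            totals = [t + c for t, c in zip(totals, counts)]
    return totals


def chi_square_2xk(row1, row2):
    """Pearson chi-square statistic and df for a 2 x k table of counts.

    Columns whose total is zero are excluded before the test.
    """
    columns = [(a, b) for a, b in zip(row1, row2) if a + b > 0]
    total1 = sum(a for a, _ in columns)
    total2 = sum(b for _, b in columns)
    if total1 == 0 or total2 == 0 or len(columns) < 2:
        raise ValueError("table is too sparse for a chi-square test")
    grand = total1 + total2
    statistic = 0.0
    for a, b in columns:
        for observed, row_total in ((a, total1), (b, total2)):
            expected = row_total * (a + b) / grand
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(columns) - 1
```

In use, `summed_window_counts` would be called once on the CR1* sequences and once on the CR2* sequences, and the two resulting lists passed to `chi_square_2xk`.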


    Results and Discussion
 
The control region duplication reported here was discovered when two of us (T.F.W. and J.R.E.) independently designed light-strand primers (LThr and LGlu) to amplify a portion of the 5' end of the control region in Amazona parrots in conjunction with the primer CR522Rb (A. Cooper, personal communication), located in the conserved D-block. From the initial comparisons of our sequences, it was evident that we had sequenced different fragments, and further sequencing of the segment extending from tRNAThr to tRNAPhe showed that the sequences corresponded to two different copies of the control region. An alignment of complete sequences for this segment for A. o. oratrix, A. o. auropalliata, and A. farinosa to that of conserved sections from the chicken (see Supplementary Material) confirms that the parrot mitochondrial genome is rearranged relative to the typical gene order for birds (fig. 1). In Amazona parrots, the gene order is cytb/tRNAThr/pND6/pGlu/CR1/tRNAPro/ND6/tRNAGlu/CR2/tRNAPhe/12s, where CR1 and CR2 refer to the duplicate control regions, and pND6 and pGlu indicate presumed pseudogenes (discussed below). At present, it is unclear whether all parrots share this duplication and rearrangement, although preliminary evidence suggests this is the case for at least some species other than the Neotropical species reported here (unpublished data).

In the following sections, we describe and compare the structures of the CR1 and CR2 regions. Although the two differ somewhat in length, both contain structural elements that have been identified in avian control regions. In contrast to previous studies of mitochondrial genome rearrangements in birds, we found no signs of degeneration in either control region copy. Instead, the paralogous copies are evolving in parallel, as we show in a phylogenetic analysis of homologous portions of the two control regions. In the final section, we discuss the maintenance of duplicate control regions in parrots and in other animals and mention some possible mechanisms responsible for their concerted evolution.

Comparison of CR1 and CR2
While both CR1 and CR2 contain many of the conserved features typically identified in avian control regions, an alignment of CR1 with CR2 highlights some differences between the two regions. For any given individual, CR1 and CR2 can be aligned starting at approximately nucleotide 150 of CR1 and nucleotide 4 of CR2. From that point, a span of about 1,296 nt can be easily aligned with few, if any, indels. This alignable section encompasses all of the conserved sequence blocks, TASs, and secondary structures described below.

The first 156 nt of CR1 do not have a corresponding analog in CR2 and do not appear to belong to the control region sensu stricto. The first 137 nt at the 5' end of CR1 can be aligned to the parrot tRNAGlu and adjacent ND6 sequence with 63% sequence similarity. The chicken tRNAGlu can also be aligned with the 5' end of CR1, but the similarity is only 50%, which is lower than the 66% match between the chicken tRNAGlu and the presumably functional parrot tRNAGlu adjoining the 5' end of CR2. The parrot tRNAGlu sequence can be folded into the expected cloverleaf structure, while the tRNAGlu-like sequence from the 5' end of CR1 cannot (fig. 2). Furthermore, we found an accelerated rate of sequence evolution relative to the outgroup in the 156 nt at the 5' end of CR1* when compared with the adjacent 550 nt portion of CR1* that could be aligned to CR2* (t-test; N = 20, t = -13.567, P < 0.0001) (fig. 3). The lack of functional structure in the tRNAGlu-like sequence, coupled with the accelerated rate of change in this section, suggests that the 5' portion of CR1 contains two pseudogenes corresponding to degenerating copies of ND6 and tRNAGlu. Degenerating copies of ND6 and tRNAGlu at this location are predicted by the rearrangement model described by Bensch and Härlid (2000) for the control region duplication in Phylloscopus warblers. For clarity, we will continue to use CR1 and CR2 to refer to the sections bounded by functional tRNAs, as shown in figure 1; "control region" will be used sensu stricto, referring to the sections between pseudo-tRNAGlu and tRNAPro in CR1 and between tRNAGlu and tRNAPhe in CR2.



 
Fig. 2.—Secondary structures of (a) the presumably functional tRNAGlu gene located between ND6 and CR2 and (b) the presumably nonfunctional tRNAGlu pseudogene located between cytochrome b and CR1. Both sequences are from Amazona ochrocephala auropalliata.

 


 
Fig. 3.—Histograms indicating the number of base pair changes per 25-nt window as a percentage of changes possible in each window. Windows are numbered, with 0 corresponding to the first window in the alignable portions of CR1* and CR2*. The windows prior to window 0 correspond to the ND6 and tRNAGlu pseudogene portions of CR1*, which show higher rates of mutation than do the two control region copies

 
We found no differences between uncorrected pairwise distances for alignable CR1* sequences versus CR2* sequences (t-test; N = 20, t = 0.121, P = 0.9053), indicating that the two control regions are accumulating changes at approximately the same rate. There was no significant difference in the overall pattern of pairwise changes between sequences from CR1* versus CR2* when changes were summed over 25-nt windows (see fig. 3; χ² test; N = 19, χ² = 30.0, P = 0.052).

Structure of the Control Regions
If the boundaries of the two control regions are defined by pseudo-tRNAGlu and tRNAPro in CR1 and tRNAGlu and tRNAPhe in CR2 (fig. 4), then the duplicate control regions differ in size, with the first control region in the three species ranging in length from 1,553 to 1,713 nt, and the second ranging from 1,457 to 1,868 nt. The lengths of the longest of these control region sequences are minimum estimates because the presence of long stretches of tandem repeats in these sequences prevented complete sequencing of overlapping strands. Even the shortest of the control regions sequenced here (1,457 nt) is longer than control regions previously described for other birds, which are in turn longer than those of most vertebrates (Baker and Marshall 1997). As is the case for most avian control regions (Baker and Marshall 1997), both parrot control regions can be divided into three subsections that differ in levels of sequence variation: a highly variable domain I closest to tRNAGlu in the typical avian gene order; a central conserved domain II that contains the F, D, and C boxes; and a moderately variable domain III closest to tRNAPhe in the typical gene order.

Both control region copies in Amazona parrots show many of the features typically found in the functional control regions of birds and other vertebrates. The first domain of both paralogs (approximately positions 469–890 in fig. 4) includes several features that have been identified in the control regions of other animals. Both parrot control regions have a C-rich sequence near the 5' end, corresponding to the "goose hairpin" (Quinn and Wilson 1993), which has been found in similar locations in other bird species (Baker and Marshall 1997; Marshall and Baker 1997; Randi and Lucchini 1998; Bensch and Härlid 2000). Conserved extended termination-associated sequences (ETASs) have been identified in domain I of some mammal control regions (Sbisà et al. 1997) and are thought to indicate the 3' end of the nascent H-strand in the three-strand D-loop structure. The consensus mammalian ETAS1 and ETAS2 can be aligned to domain I of the parrot control regions with approximately 50% sequence agreement. In some partridges, the sequence corresponding to ETAS2 can form a stem-and-loop structure (Randi and Lucchini 1998); the same is true for the parrot control region sequences (fig. 4), which form hairpin loops (ΔG = -4.13 kcal/mol). Another sequence capable of forming a stem-and-loop structure is found approximately 40 nt upstream of the F box (ΔG = -3.80 kcal/mol).



 
Fig. 4.—Alignment of the CR1 region sequences from three Amazona parrots (Amazona ochrocephala auropalliata, Amazona ochrocephala oratrix, and Amazona farinosa) and conserved features of the chicken (Gallus gallus) control region. Italics indicate the CR1* sequence that was used in the phylogenetic analysis. The locations of the ND6 and tRNAGlu pseudogenes and of conserved sequence features are noted below the corresponding sequences. Bold type indicates the regions with approximately 50% similarity to mammalian ETAS1 and ETAS2 sequences (in that order), and underlining indicates the regions that form a stem in stem-and-loop structures (see text). A complete alignment spanning the 3' region of cytochrome b to the 5' region of 12s is available as a supplement to this paper; in this figure, nucleotides are numbered so that they are consistent with the complete alignment

 
In the more conserved second domain (approximately positions 891–1340 in fig. 4), the portions of the chicken sequence that correspond to the conserved F box, D box, and C box can be aligned with both parrot control regions with sequence similarity ranging from 53% to 100% (fig. 4). In domain III (approximately positions 1341–2194 in fig. 4), the chicken CSB1 sequence, a conserved sequence block first identified in mammals (Walberg and Clayton 1981), aligns to the parrot CR1 and CR2 with 85% similarity (fig. 4). Two other conserved sequence blocks, CSB2 and CSB3 (Walberg and Clayton 1981), have been found in domain III of most mammalian control regions (Sbisà et al. 1997) and some birds (e.g., Ramirez, Savoie, and Morais 1993). Of these, only the CSB3 mammal sequence can be aligned to the parrot control regions with better than 50% sequence similarity (fig. 4).

In all animals studied to date, the origin of heavy-strand replication (OH) has been found in the third domain, located immediately next to or within CSB1 (Sbisà et al. 1997). As in partridges (Randi and Lucchini 1998), a polyC sequence that resembles the mammalian OH is located 20 nt upstream of the parrot CSB1 in both parrot control regions. The chicken's bidirectional transcription promoter sequence (L'Abbé et al. 1991) does not align with any part of the parrot control regions with better than 50% similarity. Since this promoter is associated with a stem-and-loop structure, we examined the parrot control regions downstream of CSB1 for inverted repeats that could form such structures. One such sequence (ΔG = -1.93 kcal/mol) was located approximately 65 nt downstream of CSB1 (fig. 4).
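A search for inverted repeats of this kind can be sketched as a simple scan. This is not how the structures in this study were identified (mfold was used, and it also supplies the ΔG values); the function below is a hypothetical helper that reports only perfect Watson-Crick stems of a fixed length, with stem and loop sizes chosen arbitrarily for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem=6, min_loop=3, max_loop=10):
    """Return (start, stem_sequence, loop_length) for perfect inverted repeats.

    A hit means the `stem` bases starting at `start` are followed, after a
    loop of `loop_length` bases, by their exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, left, loop))
    return hits


# Toy sequence (made up for illustration): a 6-nt stem around a 4-nt loop.
print(find_inverted_repeats("AAGATCCATTTTTGGATCAA"))
```

Candidate hits from a scan like this would still need to be folded with a thermodynamic model before any stability claim could be made.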

Both parrot control regions contain one or more series of tandem repeats in the third domain downstream of the conserved features described above. These repeat motifs are short microsatellite-like sequences that are tandemly repeated up to 50 times (e.g., the 5'-TTCATTCG-3' motif in CR2 of A. o. auropalliata). Both the repeat motif and the number of these repeats differ among the three taxa examined and between the two control region copies within each taxon (table 3). Only the first seven or eight repeats (a TTTG motif) are alignable between the two control region copies of any one individual; the remaining stretch of tandem repeats does not align well. This variation in repeat number accounts for much of the observed variation in length of the control regions. Such repeat regions have been found to be highly variable in length in a range of vertebrate species (Baker and Marshall 1997; Wilkinson et al. 1997). There is also some evidence of heteroplasmy in the first control region of A. o. auropalliata, where consistent double peaks were observed in the primary sequence of the repeat section of CR1. Such heteroplasmy may be due to slippage during replication of the short tandem sequences. Heteroplasmy due to replication errors of tandemly repeated sequences is found in a number of species (Wilkinson et al. 1997).
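Counting tandem copies of a known motif is straightforward to sketch. The helper name is hypothetical, only exact back-to-back copies are counted, and the example sequence is made up; the TTCATTCG motif is the one quoted above for CR2 of A. o. auropalliata.

```python
def longest_tandem_run(seq, motif):
    """Return (start, copies) for the longest run of back-to-back exact copies of `motif`.

    Returns (-1, 0) if the motif does not occur.
    """
    seq, motif = seq.upper(), motif.upper()
    m = len(motif)
    best_start, best_copies = -1, 0
    i = 0
    while i <= len(seq) - m:
        if seq[i:i + m] == motif:
            copies = 0
            j = i
            while seq[j:j + m] == motif:
                copies += 1
                j += m
            if copies > best_copies:
                best_start, best_copies = i, copies
            i = j  # continue scanning after the end of this run
        else:
            i += 1
    return best_start, best_copies


# Toy sequence (made up): three tandem copies, a break, then one more copy.
toy = "GG" + "TTCATTCG" * 3 + "AC" + "TTCATTCG"
print(longest_tandem_run(toy, "TTCATTCG"))
```

Because only exact copies are counted, a run interrupted by a single substitution is reported as two shorter runs; imperfect repeats would need an alignment-based approach.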


Table 3 Control Region Lengths and Tandem Repeat Types and Numbers for Three Species of Amazona

 
The nonrepetitive sequence found between the tandem repeats and the tRNA also does not align to any other portion of the parrot control region or to any known chicken or parrot mtDNA genes. This unalignable sequence accounts for approximately 190 nt of CR1 and 85 nt of CR2. Within these unalignable portions, both CR1 and CR2 contain a 5'-CCCCTCCCCC-3' sequence that is reminiscent of the "goose hairpin" (fig. 4).

Phylogenetic Analysis of CR1* and CR2* Sequences
We performed phylogenetic analyses on CR1* and CR2* sequences from 21 individuals representing four species of the genus Amazona and P. chalcopterus. Both parsimony and maximum likelihood gave well-supported phylogenies that showed a similar branching pattern in which both control region copies of an individual in general were more closely related to each other than to corresponding segments of other individuals (fig. 5). The solid boxes in figure 5 indicate the three exceptions to this general pattern, all of which occur within subspecies of A. ochrocephala. In no case are sequences for a particular copy most closely related to the same segment in another species, as would be predicted by a scenario of independent evolution of the two copies following an ancient duplication event in the common ancestor of these species. The average sequence difference of the paralogous control regions (CR1* vs. CR2* within individuals) was 1.4%, versus a mean value of 4.1% between control region orthologs representing nearest phylogenetic neighbors. The CR1* and CR2* sequences show complete identity within an individual in only three cases: those of the two A. farinosa samples and that of the P. chalcopterus individual.
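
The comparison of paralogous and orthologous differences can be made concrete with a minimal Python sketch that computes the uncorrected proportion of differing sites (p-distance) between two aligned sequences. This is an assumption-laden illustration: the function name is hypothetical, gapped or ambiguous columns are simply skipped, and the distances reported here may have been calculated under a different substitution model.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between aligned sequences.

    Columns where either sequence has a gap or an ambiguity code are
    excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

A paralogous distance would be obtained by passing the CR1* and CR2* sequences of one individual, and an orthologous distance by passing the same copy from two different individuals.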



Fig. 5.—Phylogenetic reconstructions of the CR1* and CR2* segments of the duplicate control regions. Both parsimony and maximum-likelihood phylogenies demonstrate a general pattern in which sequences from both control region copies of an individual are more closely related to each other than to the corresponding segments in other individuals. The three exceptions to this pattern all occur within subspecies of Amazona ochrocephala and are indicated by solid boxes around the sample names. The dashed boxes indicate the three cases in which CR1* and CR2* copies are identical within an individual. Numbers at nodes in the phylogenies indicate bootstrap support values for that node; only values for the higher nodes are shown in the maximum-likelihood phylogeny

 
Although parsimony and maximum-likelihood analyses recover very similar bootstrap trees, they do differ somewhat regarding the branching order at the species level. The parsimony tree shows A. autumnalis as basal to a clade containing A. farinosa, A. amazonica, and the A. ochrocephala superspecies, whereas the maximum-likelihood tree has both A. autumnalis and A. farinosa as sister to the clade containing A. amazonica and A. ochrocephala. In both cases, the nodes defining these clades show the lowest levels of bootstrap support found in the trees (58 for parsimony, 53 for maximum likelihood). It is important to note that neither tree should be considered an accurate representation of species-level relationships within Amazona owing to our limited taxon sampling and the potential complications for phylogenetic reconstruction resulting from gene conversion.

The phylogenetic analysis suggests that the two control region copies are evolving in concert at the level of subspecies and above, but with some degree of independence within subspecies. One mechanism that could give rise to this pattern is sporadic gene conversion. We estimated the frequency of such events based on the sequence data from CR1* and CR2*. Averaged across all samples, the time between conversion events was estimated to be 34,670 ± 18,400 years (table 4).
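
The logic of dating conversion events from pairwise distances can be sketched in Python under a simple assumption: a conversion event resets the two copies to identity, after which they diverge from each other at a constant rate, so that elapsed time equals distance divided by that rate. The function names and the rate parameter are hypothetical and introduced only for illustration; the sketch is not claimed to reproduce the calculation or the values in table 4.

```python
from statistics import mean, stdev


def years_since_conversion(distance, divergence_rate):
    """Estimate the time since the two copies were last identical.

    distance: substitutions per site between CR1* and CR2* of one individual.
    divergence_rate: rate at which the two copies diverge from each other,
        in substitutions per site per year (twice the per-lineage rate).
        This value is an assumption supplied by the user.
    """
    if divergence_rate <= 0:
        raise ValueError("divergence_rate must be positive")
    return distance / divergence_rate


def summarize_conversion_times(distances, divergence_rate):
    """Return (mean, standard deviation) of the per-individual estimates."""
    times = [years_since_conversion(d, divergence_rate) for d in distances]
    return mean(times), stdev(times)
```

Under this assumption, individuals with identical CR1* and CR2* sequences contribute an estimate of zero, which pulls the average toward more recent conversion.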


Table 4 Estimates of Gene Conversion Rates Using Pairwise Distances Between CR1* and CR2* Sequences

 
Maintenance of the Duplicate Control Regions
This study is the first to describe an avian mitochondrial genome with a control region duplication in which both copies of the control region are being maintained in an apparently functional state. Rearrangements of the mitochondrial genome resulting in a gene order similar to that found in Amazona parrots have been reported for a range of bird species, but in each case the putative copy of the control region appears to be degenerate (Mindell, Sorenson, and Dimcheff 1998; Bensch and Härlid 2000).

The presence of nondegenerate duplicate control regions has previously been described in only a few, diverse taxa: several snakes (Kumazawa et al. 1996, 1998), metastriate ticks (Black and Roehrdanz 1998; Campbell and Barker 1999), and sea cucumbers (Arndt and Smith 1998). The concerted evolution of the two parrot control regions, as shown by the phylogenetic analysis of CR1* and CR2* sequences, is very similar to the pattern suggested by the limited phylogenetic analysis of metastriate tick control region sequences (Black and Roehrdanz 1998). In both cases, two control region copies within an individual appear to evolve independently to some degree, but convergently over the long term. The tick pattern differs somewhat from that of the duplicate control regions found in several distantly related snake species, in which the two copies are identical to each other (or differ at only one nucleotide position) (Kumazawa et al. 1996, 1998).

The concerted evolution of the two control region copies in parrots differs from the pattern observed in both metastriate ticks and snakes in that it does not involve the entire control region, but rather only those portions that are believed to be functional. This observation suggests three alternative, but not necessarily exclusive, hypotheses for the maintenance of high sequence similarity between these regions: (1) a mechanism of gene conversion involving only the convergent portions, (2) gene conversion involving the entire control region with subsequent extremely rapid evolution of nonfunctional portions, and (3) parallel selection to maintain functionality of both regions.

A hypothetical gene conversion mechanism that could be responsible for concerted evolution of most of the "alignable" portions of the two control regions involves the three-strand D-loop structure. In the D-loop, the parental H strand is displaced by a nascent H strand (Clayton 1982), which originates at the OH and extends to the TASs near the 5' end of the control region (Sbisà et al. 1997). The nascent H strand can become dissociated by brief exposure to denaturing conditions or if one of the parental strands is nicked (Clayton 1982). In a genome with duplicate control regions, the nascent H strand fragment from one D-loop could dissociate and recombine with the parent H strand of the other control region. This recombination process would tend to homogenize the section between the OH and the TASs but would not explain the observed similarity between the putative OH and the beginning of the tandem repeats. This latter section corresponds to avian domain II of the control region, which tends to be highly conserved in most bird species (Baker and Marshall 1997). The relatively high level of conservatism in this section implies that it is of functional importance, so the similarity between these portions of CR1 and CR2 could result from an independent process of stabilizing selection.

A second hypothesis is that the entire control regions are sporadically homogenized, and during the intervals between homogenization events, the 3' portions of CR1 and CR2 (which contain tandem repeats) evolve more rapidly than the 5' sections. Kumazawa et al. (1998) proposed two possible mechanisms for the concerted evolution of entire duplicate control regions in the snake. One of these models involves tandem duplication during replication, and the other involves frequent gene conversion due to crossing over of nicked strands between two control regions followed by replacement of one of the control region sequences via repair of the resulting heteroduplex DNA intermediate (Kumazawa et al. 1998). In parrots, our phylogenetic analysis suggests that these homogenization events might occur relatively infrequently, and in the interim, the portions of CR1 and CR2 containing tandem repeats could evolve rapidly, perhaps through replication slippage (Madsen, Ghivizzani, and Hauswirth 1993). Such rapid evolution of tandem repeats is widespread in the nuclear genome (Charlesworth, Sniegowski, and Stephan 1994) and is thought to account for heteroplasmy in the control regions of a range of bat species (Wilkinson and Chapman 1991; Wilkinson et al. 1997).

A possible alternative to gene conversion is selection that works in parallel on functional areas of the two control regions. One source for such selection could be provided by nuclear gene products that bind to the nascent H strand fragment during replication (Albring, Griffith, and Attardi 1977). Mutations in this nuclear product could exert parallel selection on the two copies of the control region if both functioned in this manner, and could maintain a high degree of similarity between the two copies within an individual. However, the fact that avian domain I sequences typically evolve very rapidly implies that mutations at many sites in this domain are selectively neutral. Thus, it seems unlikely that selective constraints imposed by nuclear gene products would account for the degree of concerted evolution observed in Amazona.

When duplications occur within the mitochondrial genome, one of the copies usually degenerates and eventually disappears, as expected given the apparent selection for compact size of the genome (Rand and Harrison 1986). This has been shown to occur in a number of studies that have documented tRNA duplications followed by degeneration of one of the duplicates (Arndt and Smith 1998; Kumazawa et al. 1998; Mindell, Sorenson, and Dimcheff 1998; Campbell and Barker 1999; Bensch and Härlid 2000). In previously reported examples of control region duplication in birds, the copy corresponding to the parrot CR1 is maintained as a functional control region, while the section corresponding to the parrot CR2 has degenerated (Mindell, Sorenson, and Dimcheff 1998; Bensch and Härlid 2000). In Amazona parrots, there is no evidence that either copy is degenerate or in any way nonfunctional. If only one copy were functional and gene conversion were directional, such that the functional copy always converted the nonfunctional copy, the nonfunctional copy still would be expected to accumulate changes more quickly between conversion events. However, all comparisons for these parrots show that the two copies are accumulating changes at approximately the same rate. This pattern would be expected if both of the parrot control regions were functional.

Taken together with previously published work on mitochondrial genome rearrangements and gene duplications, our results demonstrate an emergent pattern. Observations of gene duplication followed by degeneration of duplicate copies are consistent with the idea that mitochondrial genomes are under selection for compactness. Gene duplications have been documented in a number of organisms, and the copies are usually degenerate (Arndt and Smith 1998; Kumazawa et al. 1998; Mindell, Sorenson, and Dimcheff 1998; Campbell and Barker 1999; Bensch and Härlid 2000) or short-lived on an evolutionary timescale (Moritz and Brown 1986, 1987).

In some cases, however, duplicate control regions may persist without any apparent loss of functionality (Kumazawa et al. 1996; Black and Roehrdanz 1998; Campbell and Barker 1999). Along with our findings, these results suggest that in some cases, the presence of two control regions may be advantageous and thus be maintained over evolutionary time, either through stabilizing selection or through occasional gene conversion. Alternatively, duplicate control regions may persist only if the duplication event gives rise to complete, functional copies; otherwise, the incomplete duplicate will degenerate and eventually be lost.


    Supplementary Material
 
GenBank accession numbers for the parrot CR1* and CR2* sequences are AF338277–AF338318. GenBank accession numbers for the parrot cytb-12s sequences are AF338819–AF338821. An alignment of the latter sequences with the corresponding chicken sequences is available at the MBE website or from the authors.



Fig. 4 (Continued)

 

    Acknowledgements
 
We are very grateful to G. Wilkinson for his invaluable advice and support throughout this study. We thank M. González for laboratory assistance, K. Harms for his program quantifying variation in sequence windows, and G. Reeves for suggesting the possibility of recombination involving the nascent H strand. K. Harms and two anonymous reviewers provided helpful comments on the manuscript. We thank A. Cooper, M. Sorenson, L. Joseph, B. Slikas, I. Lovette, and K. Petren for primers, and J. Murphy, S. Egender, E. Enkerlin, C. de Calderón, H. de Espinoza, M. J. West-Eberhard, the Philadelphia Academy of Natural Sciences, the U.S. National Museum of Natural History, Fundación ARA, and the Belize Zoo for providing samples. Tissue collection of the A. ochrocephala ochrocephala sample on the Rio Xingu by G. R. Graves was supported by the Academia Brasileira de Ciencias through a grant from Electronorte administered by P. E. Vanzolini, and by the Smithsonian's I.E.S.P. Neotropical Lowland Research Program. This research was funded by the Biology of Small Populations Program at UMD (NSF), the Graduate Research Board at UMD, the National Geographic Society's Explorer Fund, the AMNH Chapman Fund, the American Ornithologists' Union, and the Smithsonian Institution. J.R.E. and T.F.W. contributed equally to this work.


    Footnotes
 
William R. Jeffery, Reviewing Editor

1 Present address: Macaulay Library of Natural Sounds, Cornell Laboratory of Ornithology, Ithaca, New York.

1 Keywords: Amazona, control region, mitochondrial DNA, parrots, concerted evolution, gene duplication, genomic rearrangement

2 Address for correspondence and reprints: Jessica R. Eberhard, Macaulay Library of Natural Sounds, Cornell Laboratory of Ornithology, 159 Sapsucker Woods Road, Ithaca, New York 14850. E-mail: jre24@cornell.edu


    References
 

    Albring M., J. Griffith, G. Attardi, 1977 Association of a protein structure of probable membrane derivation with HeLa cell mitochondrial DNA near its origin of replication. Proc. Natl. Acad. Sci. USA 74:1348-1352

    Arndt A., M. J. Smith, 1998 Mitochondrial gene rearrangement in the sea cucumber genus Cucumaria. Mol. Biol. Evol. 15:1009-1016

    Baker A. J., H. D. Marshall, 1997 Mitochondrial control region sequences as tools for understanding evolution. Pp. 51–83 in D. P. Mindell, ed. Avian molecular evolution and systematics. Academic Press, San Diego

    Bensch S., A. Härlid, 2000 Mitochondrial genomic rearrangements in songbirds. Mol. Biol. Evol. 17:107-113

    Black W. C. I., R. L. Roehrdanz, 1998 Mitochondrial gene order is not conserved in arthropods: prostriate and metastriate tick mitochondrial genomes. Mol. Biol. Evol. 15:1772-1785

    Boore J. L., 1999 Animal mitochondrial genomes. Nucleic Acids Res. 27:1767-1780

    Campbell N. J. H., S. C. Barker, 1999 The novel mitochondrial gene arrangement of the cattle tick, Boophilus microplus: fivefold tandem repetition of a coding region. Mol. Biol. Evol. 16:732-740

    Charlesworth B., P. Sniegowski, W. Stephan, 1994 The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220

    Clayton D. A., 1982 Replication of animal mitochondrial DNA. Cell 28:693-705

    Clayton D. A., 1991 Replication and transcription of vertebrate mitochondrial DNA. Annu. Rev. Cell Biol. 7:453-478

    Desjardins P., R. Morais, 1990 Sequence and gene organization of the chicken mitochondrial genome: a novel gene order in higher vertebrates. J. Mol. Biol. 212:599-634

    Kumazawa Y., H. Ota, M. Nishida, T. Ozawa, 1996 Gene rearrangements in snake mitochondrial genomes: highly concerted evolution of control-region-like sequences duplicated and inserted into a tRNA gene cluster. Mol. Biol. Evol. 13:1242-1254

    Kumazawa Y., H. Ota, M. Nishida, T. Ozawa, 1998 The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions. Genetics 150:313-329

    L'Abbé D., J.-F. Duhaime, B. F. Lang, R. Morais, 1991 The transcription of DNA in chicken mitochondria initiates from one major bidirectional promoter. J. Biol. Chem. 266:10844-10850

    Macey J. R., A. Larson, N. B. Ananjeva, Z. Fang, T. J. Papenfuss, 1997 Two novel gene orders and the role of light-strand replication in rearrangement of the vertebrate mitochondrial genome. Mol. Biol. Evol. 14:91-104

    McKnight M. L., H. B. Shaffer, 1997 Large, rapidly evolving intergenic spacers in the mitochondrial DNA of the salamander family Ambystomatidae (Amphibia: Caudata). Mol. Biol. Evol. 14:1167-1176

    Madsen C. S., S. C. Ghivizzani, W. M. Hauswirth, 1993 In vivo and in vitro evidence for slipped mispairing in mammalian mitochondria. Proc. Natl. Acad. Sci. USA 90:7671-7675

    Marshall H. D., A. J. Baker, 1997 Structural conservation and variation in the mitochondrial control region of fringilline finches (Fringilla sp.) and the Greenfinch (Carduelis chloris). Mol. Biol. Evol. 14:173-184

    Mindell D. P., M. D. Sorenson, D. E. Dimcheff, 1998 Multiple independent origins of mitochondrial gene order in birds. Proc. Natl. Acad. Sci. USA 95:10693-10697

    Moritz C., W. M. Brown, 1986 Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science 233:1425-1427

    Moritz C., W. M. Brown, 1987 Tandem duplications in animal mitochondrial DNAs: variation in incidence and gene content among lizards. Proc. Natl. Acad. Sci. USA 84:7183-7187

    Murray M. G., W. F. Thompson, 1980 Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 8:4321-4325

    Posada D., K. A. Crandall, 1998 Modeltest: testing the model of DNA substitution. Bioinformatics 14:817-818

    Quinn T. W., 1992 The genetic legacy of Mother Goose-phylogeographic patterns of lesser snow goose Chen caerulescens caerulescens maternal lineages. Mol. Ecol. 1:105-117

    Quinn T. W., A. C. Wilson, 1993 Sequence evolution in and around the mitochondrial control region in birds. J. Mol. Evol. 37:417-425

    Ramirez V., P. Savoie, R. Morais, 1993 Molecular characterization and evolution of a duck mitochondrial genome. J. Mol. Evol. 37:296-310

    Rand D. M., R. G. Harrison, 1986 Mitochondrial DNA transmission genetics in crickets. Genetics 114:955-970

    Randi E., V. Lucchini, 1998 Organization and evolution of the mitochondrial DNA control region in the avian genus Alectoris. J. Mol. Evol. 47:449-462

    SantaLucia J. Jr., 1998 A unified view of polymer, dumbbell and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl. Acad. Sci. USA 95:1460-1465

    Sbisà E., F. Tanzariello, A. Reyes, G. Pesole, C. Saccone, 1997 Mammalian mitochondrial D-loop region structural analysis: identification of new conserved sequences and their functional and evolutionary implications. Gene 205:125-140

    Shields G. F., A. C. Wilson, 1987 Calibration of mitochondrial DNA evolution in geese. J. Mol. Evol. 24:212-217

    Sorenson M. D., J. C. Ast, D. E. Dimcheff, T. Yuri, D. P. Mindell, 1999 Primers for a PCR-based approach to mitochondrial genome sequencing in birds and other vertebrates. Mol. Phylogenet. Evol. 12:105-114

    Swofford D. L., 1999 PAUP* Phylogenetic analysis using parsimony (*and other methods). Sinauer, Sunderland, Mass

    Walberg M. W., D. A. Clayton, 1981 Sequence and properties of the human KB cell and mouse L cell D-loop regions of mitochondrial DNA. Nucleic Acids Res. 9:5411-5421

    Wilkinson G. S., A. M. Chapman, 1991 Length and sequence variation in evening bat D-loop mtDNA. Genetics 128:607-617

    Wilkinson G. S., F. Mayer, G. Kerth, B. Petri, 1997 Evolution of repeated sequence arrays in the D-loop region of bat mitochondrial DNA. Genetics 146:1035-1048

Accepted for publication March 6, 2001.