The Evolution of an {alpha}-Esterase Pseudogene Inactivated in the Drosophila melanogaster Lineage

G. Charles de Q. Robin*{dagger}{ddagger}, R. J. Russell*, D. J. Cutler{ddagger} and J. G. Oakeshott*

*Commonwealth Scientific and Industrial Research Organisation, Canberra, Australia;
{dagger}Division of Botany and Zoology, Australian National University, Canberra, Australia; and
{ddagger}Center for Population Biology, University of California at Davis

Abstract

Previous analyses of the {alpha}-esterase cluster of Drosophila melanogaster revealed 10 active genes and the Dm{alpha}E4a-{Psi} pseudogene. Here, we reconstruct the evolution of the pseudogene from the sequences of 12 alleles from widely scattered D. melanogaster populations and single alleles from Drosophila simulans and Drosophila yakuba. All of the Dm{alpha}E4a-{Psi} alleles contain numerous inactivating mutations, suggesting that pseudogene alleles are fixed in natural populations. Several lines of evidence also suggest that Dm{alpha}E4a is now evolving without selective constraint in the D. melanogaster lineage. There are three polymorphic indels which result in frameshifts; a key nucleotide of the intron splice acceptor is polymorphic; the neutral mutation parameter is the same for replacement and silent sites; one of the nonsilent polymorphisms results in a stop codon; only 1 of the 13 replacement polymorphisms is biochemically conservative; residues that are conserved among active esterases have different states in Dm{alpha}E4a-{Psi}; and there are about half as many transitional polymorphisms as transversional ones. In contrast, the D. simulans and D. yakuba orthologs Ds{alpha}E4a and Dy{alpha}E4a do not have the inactivating mutations of Dm{alpha}E4a-{Psi} and appear to be evolving under the purifying selection typical of protein-encoding genes. For instance, there have been more substitutions in the introns than in the exons, and more in silent sites than in replacement sites. Furthermore, most of the amino acid substitutions that have occurred between Dy{alpha}E4a and Ds{alpha}E4a are located in sites that typically vary among active {alpha}-esterases rather than those that are usually conserved. We argue that the original {alpha}E4a gene had a function which it has lost since the divergence of the D. melanogaster and D. simulans lineages.

Introduction

Pseudogenes provide a baseline from which to measure various components of molecular evolution. As inactive copies of functional genes, they are thought to evolve without selective constraint, and they can therefore reflect the patterns and the rates of underlying mutational and stochastic processes. Pseudogenes are thought to be a frequent outcome of gene duplication, so they also elucidate this important mechanism of gene origination (Walsh 1995). Measurements of substitution rates in pseudogenes and of the frequencies at which pseudogenes themselves are generated, fixed, and removed within a population are perhaps most interesting in the context of how these values vary between taxa. For instance, Graur, Shuali, and Li (1989) noticed that the rate of sequence loss among pseudogenes due to deletions was seven times as fast in rodents as in humans, whereas analyses of "dead on arrival" copies of Helena elements (Petrov, Lozovskaya, and Hartl 1996; Petrov and Hartl 1998) and the swallow pseudogene (sww{Psi}; Petrov et al. 1998) suggest that deletions in Drosophila pseudogenes are on average seven times as large as those in mammalian pseudogenes and that the rate of deletions is about 2.6-fold greater.

Pseudogenes are, in fact, relatively rare in Drosophila genomes. Moreover, inter- and intraspecific sequence comparisons of some of those cases originally claimed to be pseudogenes have revealed some anomalous results more consistent with the selective constraint expected of functional genes. For example, divergence or polymorphism at nonsynonymous sites is substantially less than that at synonymous sites for Alcohol dehydrogenase–like (Adh-like) sequences in the melanogaster (Long and Langley 1993) and repleta (Sullivan et al. 1994; Begun 1997) species groups, Cecropin pseudogene–1 (Cec{Psi}1) in the melanogaster group (Ramos-Onsins and Aguade 1998), and alleles of Esterase P (EstP) in the {beta}-esterase cluster of Drosophila melanogaster (Balakirev and Ayala 1996). Only in the cases of the Larval cuticle protein pseudogene (Lcp{Psi}; Pritchard and Schaeffer 1997) and the Cec{Psi}2 genes from Drosophila simulans, Drosophila mauritiana, and Drosophila sechellia (Ramos-Onsins and Aguade 1998) do the patterns of divergence accord with neutral expectations. As it turns out, the Adh-like sequence from the melanogaster group is thought to encode a functional gene (renamed jingwei; Long and Langley 1993), and so do the Adh-like sequences from seven of the eight species examined in the repleta group (now called Finnegan; Begun 1997). Both jingwei and Finnegan have exons that were not detected in early reports. The two genes from the Cecropin cluster produce transcripts (Ramos-Onsins and Aguade 1998), as does EstP, now renamed Est7, some alleles of which also produce a catalytically active esterase (Dumancic et al. 1997). In fact, the only evidence to suggest that Est7, Cec{Psi}1, and Cec{Psi}2 are pseudogenes is that a relatively high frequency of alleles have been found with disrupted open reading frames (Balakirev and Ayala 1996; Ramos-Onsins and Aguade 1998).

In this paper, we examine polymorphism and divergence of the Dm{alpha}E4a pseudogene from the {alpha}-esterase cluster of D. melanogaster (Russell et al. 1995; Robin et al. 1996). The {alpha}-esterase cluster comprises 10 active esterase genes plus the pseudogene, dispersed over 60 kb. The esterases encoded by the cluster show 37%–66% amino acid identity, and no evidence for gene conversion or intergenic recombination has been detected. Orthologs for several of the genes, albeit not Dm{alpha}E4a-{Psi}, have been characterized in Drosophila buzzatii, the sheep blowfly Lucilia cuprina, and the housefly Musca domestica, and phylogenetic analyses and physical mapping of the clusters in these species suggest that the organization of the cluster has been fairly stable since the divergence of the Calliphoridae and the Drosophilidae (Newcomb et al. 1996; Claudianos, Russell, and Oakeshott 1999; Oakeshott et al. 1999). All of the {alpha}-esterases characterized to date except Dm{alpha}E4a-{Psi} have conserved motifs indicative of hydrolytic function, and the low divergences between some orthologs also suggest that they have conserved functions. In general, the functions of {alpha}-esterases are poorly understood, but mutant alleles of the {alpha}E7 gene have been shown to confer organophosphate insecticide resistance in L. cuprina and M. domestica (Newcomb et al. 1997; Campbell et al. 1997). While the expression of some D. melanogaster {alpha}-esterases (e.g., EST23 and EST9) in digestive tissues (Spackman et al. 1994) concurs with the idea that some may have a role in digestion or detoxification of xenobiotics, there is such diversity among paralogs in sequence, tissue and ontogenic expression pattern, substrate preferences, and inhibitor sensitivities (Oakeshott et al. 1999) that it is too early to conclude that such a role is a general feature of the {alpha}-esterase cluster.

The Dm{alpha}E4a-{Psi} gene is located within an intron of another {alpha}-esterase gene, Dm{alpha}E6. A partial cDNA clone of Dm{alpha}E6 (accession number AI389293) has the Dm{alpha}E4a-{Psi}-containing intron correctly spliced out and an intact open reading frame. Phylogenetic analyses suggest that Dm{alpha}E4a-{Psi} stems from the most recent gene duplication in the D. melanogaster {alpha}-esterase cluster. However, the silent-site divergence between Dm{alpha}E4a-{Psi} and its most closely related paralog (Dm{alpha}E4) is close to saturation, so the duplication event probably happened before the divergence of the melanogaster and willistoni species groups (30–40 MYA; Powell and DeSalle 1995). The Dm{alpha}E4a-{Psi} allele sequenced has three indels that disrupt and prematurely truncate the open reading frame, plus a noncanonical splice acceptor at intron site II, so it appears to be nonfunctional. The lack of a detectable transcript for Dm{alpha}E4a-{Psi} (unpublished data), the replacement of the catalytic histidine residue with a tyrosine, and a lower G+C content at third-position sites than other {alpha}-esterases are also consistent with this interpretation. However, the distribution of amino acid differences between Dm{alpha}E4a-{Psi} and Dm{alpha}E4 along the primary sequence is nonrandom and similar to the distribution observed among functional esterases, which suggests that the forces of purifying selection may have been acting in both the Dm{alpha}E4 and the Dm{alpha}E4a-{Psi} lineages. Furthermore, relative-rate tests using other paralogs as outgroups suggest that Dm{alpha}E4a-{Psi} is not evolving significantly faster than Dm{alpha}E4. Neither of these observations would be expected if Dm{alpha}E4a-{Psi} has been a pseudogene for most of the time since the {alpha}E4/{alpha}E4a gene duplication event. To address the hypothesis that Dm{alpha}E4a-{Psi} was a functional esterase which only recently became a pseudogene, we examine the sequences of the orthologous {alpha}E4a from D. simulans and D. yakuba and compare the sequences of 12 Dm{alpha}E4a-{Psi} alleles collected from around the world.

Materials and Methods

Drosophila DNA and Strains
Drosophila simulans AR1 genomic DNA was obtained from Dr. Jill Karotam (CSIRO Division of Entomology). Drosophila yakuba flies (14021-0261.0) were obtained from the U.S. National Drosophila Species Resource Center. Nine D. melanogaster lines that were homozygous for the third chromosome and derived from populations in Maryland (Md7B), Zimbabwe (c53, c86, c88, c171), Ecuador (Ec32, Ec100), and Beijing, China (Bei23, Bei65), were obtained from Dr. Charles Aquadro (Cornell University). Two others from Rollingstone (Rs4) and Coffs Harbour in Australia were obtained from Dr. Wendy Odgers (CSIRO Division of Entomology). (The twelfth D. melanogaster sequence was the original one from an Oregon R library described in Russell et al. [1995]; GenBank accession number U51049.)

DNA Preparation
Single D. yakuba or D. melanogaster flies were prepared for PCR as described by Gloor and Engels (1992), except that debris was precipitated after homogenization by a brief pulse in a microcentrifuge and the supernatant was then diluted 1:4 in homogenization buffer (10 mM Tris-HCl [pH 8.2], 1 mM EDTA, 25 mM NaCl, 200 µg/ml proteinase K).

PCR Amplifications
Reactions were performed in 50 µl containing 10 mM Tris-HCl [pH 8.3], 1.5 mM MgCl2, 50 mM KCl, 200 µM of each dNTP, 1 µM of each primer, and 5 U Taq polymerase (Gibco BRL) under approximately 60 µl of mineral oil. The reactions were performed in a Corbett Research Thermal Sequencer FTS-1. A series of PCRs using four primer sets (GHS: 5'-ATIACIATITTYGGNCAYAGYTCNGG-3' and WSN: 5'-CCIARIATIATIGGDATNCGRTTRCTCCA-3'; Ds1: 5'-CCCGTGGTGCAGACCACNCAYGG-3' and Ds2: 5'-GCTGAGTGTTAACCCCCATCG-3'; Ds3: 5'-CCCAAGGAATTGCTGCGGAACAGT-3' and Ds4: 5'-AAGTTCCTCGGCGATGTTNAGNAC-3'; Ds8: 5'-GTAGATTGGAAGCCAGTAACCTCGGG-3' and Md1: 5'-YTGRTCYTTIARICCIGCRTTNCCNGGNAC-3') were used to amplify {alpha}E4a sequence from D. simulans genomic DNA (fig. 1). Approximately 0.02 µg of D. simulans DNA was used as a template, and the cycling regime was 1 cycle of 97°C for 3 min, 55°C for 1 min, and 72°C for 1 min and 40 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 90 s. For D. yakuba, a nested PCR approach was used to amplify {alpha}E4a. Ds1 and Ds4 primers were used in the first PCR (fig. 1). The first PCR conditions were 1 cycle of 97°C for 3 min, 50°C for 2 min, and 72°C for 1 min; 35 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 1 min; and 72°C for 10 min. The conditions for the second PCR were the same as those for the first except that in the second round, 1 µl of the product of the first PCR was used as a template, the Ds6 and Ds7 primers were used, the initial annealing temperature was 60°C, and the subsequent annealing temperature was 62°C.
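Several of the primers listed above are degenerate. As a rough illustration of what that implies, the following Python sketch counts how many distinct oligonucleotides a degenerate primer represents, using the standard IUPAC ambiguity codes. The function name and the decision to count inosine (I) as a single position are assumptions made for this example and are not taken from the methods.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer):
    """Return the number of distinct sequences encoded by a primer.

    Assumption for this sketch: inosine ("I") is a single synthesized
    base, so it contributes a factor of 1.
    """
    total = 1
    for base in primer.upper():
        if base == "I":
            continue  # inosine: one physical base, no added degeneracy
        total *= IUPAC_COUNTS[base]
    return total


# Primer sequences as given in the text.
ghs = primer_degeneracy("ATIACIATITTYGGNCAYAGYTCNGG")
ds1 = primer_degeneracy("CCCGTGGTGCAGACCACNCAYGG")
ds2 = primer_degeneracy("GCTGAGTGTTAACCCCCATCG")
print(ghs, ds1, ds2)
```

Under these assumptions the fully specified Ds2 primer corresponds to a single sequence, whereas GHS and Ds1 each represent a pool of sequences.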



Fig. 1.—A, The arrangement of genes within the Drosophila melanogaster {alpha}-esterase gene cluster. Arrows indicate direction of transcription. Trop = tropomyosin; ubc = ubiquitin conjugating enzyme. B, The location of {alpha}E4a in the second intron of {alpha}E6. Thick black lines represent exons, and thin lines represent introns. C, Regions of {alpha}E4a amplified from Drosophila simulans, Drosophila yakuba, and D. melanogaster alleles. Amplicons are shown as lines bounded by triangles that represent primers. Sizes of amplicons and names of primers are shown beneath the lines and triangles, respectively. Note that the aberrantly amplified Ds{alpha}E4 fragment is also shown (see text). A nucleotide alignment of Ds{alpha}E4a, Dy{alpha}E4a, and Dm{alpha}E4a-{Psi} alleles can be accessed by FTP to ftp://ftp.ebi.ac.uk/pub/databases/embl/align/. The alignment number is ds40409

 
For D. melanogaster, the Coffs Harbour allele was amplified using the GHS and WSN primers in conditions identical to those described for D. simulans. For the other 10 new alleles, 1 µl of diluted fly homogenate was used as template in the PCR reactions. The Dm4a.0 (5'-GCAGGGAACACTTTTGAAGGGGC-3') and Dm4a.5 (5'-TCCCGGAATCCCTGTAANACYT-3') primers were used in these PCRs, and the cycling conditions were 1 cycle of 97°C for 30 s, 60°C for 2 min, and 72°C for 2 min; 50 cycles of 95°C for 30 s, 62°C for 30 s, and 72°C for 2 min; and 72°C for 8 min.

Cloning and Sequencing
The amplified DNA was purified using a QIAQUICK spin column (QIAGEN) following the manufacturer's instructions. PCR products from the D. simulans, D. yakuba, and D. melanogaster Coffs Harbour allele were cloned into pGEM-T or pGEM-T Easy (Promega). Single clones were sequenced using TaqFS chemistry (ABI) as recommended by the suppliers.

The other 10 D. melanogaster alleles were sequenced directly. Three PCRs were performed from each of the 10 strains, the DNA was purified using QIAQUICK spin columns, and the yields from these were determined spectrophotometrically. For each strain, equimolar amounts from each PCR were pooled and then sequenced as a precaution against errors that may have occurred during the PCR or sequencing. Ninety nanomoles of amplified DNA was sequenced using 20 pmol of either the PCR primers or specifically designed primers (not shown) and TaqFS chemistry in a total volume of 10 µl.

Analyses
The program DnaSP, version 3.14 (Rozas and Rozas 1999), was used to calculate Tajima's (1989) D, Fu and Li's (1993) D, Hudson's (1987) 4Nc, and the HKA test (Hudson, Kreitman, and Aguade 1987).

Results

{alpha}E4a Appears to Be Active in D. simulans and D. yakuba
To test the hypothesis that {alpha}E4a has recently been inactivated in the D. melanogaster lineage, a PCR strategy was used to amplify, clone, and sequence {alpha}E4a from D. simulans (hereafter called Ds{alpha}E4a). The strategy involved three rounds of primer design and amplification (fig. 1). One of the second-round PCRs (using the Ds1-Ds2 primer pair) yielded products of two sizes. Both products were cloned and sequenced; one contained 925 nt from the Ds{alpha}E4a target sequence, and the other contained 1,192 nt from a paralog, tentatively called Ds{alpha}E4, that had been amplified by the spurious binding of the Ds1 primer. The 1,192 nt obtained from Ds{alpha}E4 (GenBank accession number AF159418) corresponds to codons 18–376 of Dm{alpha}E4 and includes two introns of 58 and 54 nt. In the final round, a specific Ds{alpha}E4a primer (Ds8) was used in conjunction with a degenerate primer (Md1) designed to bind 3' of intron site II of {alpha}-esterases. The success of this PCR demonstrated that Ds{alpha}E4a, like its ortholog in D. melanogaster, is located within the second intron of {alpha}E6 (fig. 1).

In total, 1,572 of the estimated 1,624 coding nucleotides of Ds{alpha}E4a were sequenced (GenBank accession number AF159419), representing all but 17 of the 541 codons expected in the Ds{alpha}E4a open reading frame (fig. 2). The sequences homologous to introns II and III of Dm{alpha}E4a-{Psi} (84 and 178 nt, respectively) were also obtained, as were 282 nt 3' of the stop codon and 154 nt of exon II from Ds{alpha}E6. None of the inactivating mutations observed in Dm{alpha}E4a-{Psi} are present in Ds{alpha}E4a. Instead, with the exclusion of the two introns, Ds{alpha}E4a has an intact open reading frame.



Fig. 2.—An alignment of the amino acid sequences of {alpha}E4a from Drosophila simulans, Drosophila melanogaster (Oregon R), and Drosophila yakuba. Dots represent identity to the residues above. The sites are numbered according to a hypothetical open reading frame of Dm{alpha}E4a in which the mutated splice acceptor (X) and the fixed indels of 7, 17, and 1 nt ({Psi}) are ignored

 
Since Ds{alpha}E4a has no inactivating mutations, it is of interest to determine whether it is evolving in a fashion consistent with an active protein-encoding gene. To this end, a PCR strategy was used to obtain 1,290 nt (approximately 79%, corresponding to codons 44–465 of Dm{alpha}E4a-{Psi}; Robin et al. 1996) of sequence of the {alpha}E4a gene from D. yakuba (fig. 1; GenBank accession number AF159420). It, too, has an open reading frame that is only interrupted by two introns (in this case, introns of 73 and 145 nt) which are homologous to those observed in Ds{alpha}E4a and Dm{alpha}E4a-{Psi} (fig. 2). As with Ds{alpha}E4a, no disabling mutations are apparent in Dy{alpha}E4a. The orthologous relationships of the D. simulans and D. yakuba sequences to those of Dm{alpha}E4a-{Psi} are demonstrated by the phylogenetic analysis presented in figure 3.



Fig. 3.—A phylogram constructed from {alpha}E4-like amino acid sequences. Branches are drawn proportional to length, and bootstrap scores of 100% from 100 replicates are shown. Db{alpha}E4b and Db{alpha}E4c are sequences from Drosophila buzzatii (unpublished data), and the other sequences are discussed in the text

 
An alignment of the nucleotide sequences of Ds{alpha}E4a and Dy{alpha}E4a reveals no indel differences in the exons, one in intron II, and at least three in intron III. One of these may be explained by a slippage event, as there is a 10-nt direct repeat in D. simulans (TACATCTAGA) which is very similar to a nearby 9mer (TACATCT-GA). This greater constraint in exons than in introns is also observed in the substitutional differences among those nucleotides that can be aligned, with the ratio of exon : intron divergences (E/I) being 0.36 (corrected for multiple hits using the Jukes and Cantor method; table 1).
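For readers unfamiliar with the multiple-hit correction mentioned above, the Jukes and Cantor distance is d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The Python sketch below shows the calculation and an exon : intron ratio built from it. The input proportions are hypothetical placeholders chosen for illustration; they are not the values in table 1.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p
    of differing nucleotide sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_ratio(p_numerator, p_denominator):
    """Ratio of two corrected divergences, e.g. exon : intron (E/I)."""
    return jukes_cantor_distance(p_numerator) / jukes_cantor_distance(p_denominator)


# Hypothetical observed proportions (NOT taken from table 1).
p_exon_example = 0.04
p_intron_example = 0.10
example_ratio = divergence_ratio(p_exon_example, p_intron_example)
print(round(example_ratio, 2))
```

A ratio below 1 computed in this way indicates fewer corrected substitutions per site in exons than in introns.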


Table 1 Pairwise Divergences (with standard errors in parentheses) Among Various {alpha}E4a Genes

 
There is also differential constraint on the replacement versus (exon) silent sites for Ds{alpha}E4a and Dy{alpha}E4a. The ratio of replacement to silent-site divergences (R/S) between these two genes is 0.27 (table 1). Although this R/S value is substantially higher than that observed for its paralog {alpha}E4 (R/S for Ds{alpha}E4 vs. Dm{alpha}E4 is 0.12), it is still well within the range for functional Drosophila proteins. For example, R/S for Esterase 6 from the {beta}-esterase cluster is 0.28 when D. yakuba and D. melanogaster are compared (Oakeshott et al. 1995).

The distribution of variation across sites that are conserved or otherwise among the 10 functional {alpha}-esterases in D. melanogaster can also be used to assess the level of selective constraint. Sites in the alignment of the 10 active D. melanogaster {alpha}-esterases have been divided into two categories: the 82 sites that are conserved in all 10 sequences and the 520 sites that are variable. Three sites (4%) in the "conserved" category differ between Ds{alpha}E4a and Dy{alpha}E4a, whereas there are 69 "variable" sites (13%) that differ between Ds{alpha}E4a and Dy{alpha}E4a. A G-test on these values is significant (Gadj = 5.4, df = 1, P < 0.05), indicating greater constraint acting on residues usually conserved among other {alpha}-esterases. Therefore, the variation between Ds{alpha}E4a and Dy{alpha}E4a amino acid sequences is consistent with the type of purifying selection observed for a functional {alpha}-esterase.

Change in Sequence Evolution in the D. melanogaster Lineage
Comparisons of Ds{alpha}E4a or Dy{alpha}E4a with Dm{alpha}E4a-{Psi} must reflect the relaxation of constraint on the putative pseudogene, as well as the constraint on the active gene. Consistent with this, R/S and E/I values are greater in comparisons involving Dm{alpha}E4a-{Psi} than in those just comparing Ds{alpha}E4a and Dy{alpha}E4a (table 1). Comparisons involving Dm{alpha}E4a-{Psi} also have lower transition-to-transversion ratios (table 1), which is to be expected for a pseudogene given the findings of Moriyama and Powell (1996) that transversions made up 54% of noncoding polymorphisms but only 32% of coding polymorphisms among the 24 functional loci they surveyed.

Of the 96 amino acid sites that vary among the sequences of the {alpha}E4a lineages, there are 23 that have the same state in Ds{alpha}E4a and Dy{alpha}E4a but differ in Dm{alpha}E4a-{Psi}, and there are only 8 that are shared between Dm{alpha}E4a-{Psi} and Dy{alpha}E4a but differ in Ds{alpha}E4a. These values are significantly different using Tajima's (1993) 1D test ({chi}2 = 7.3, df = 1, P < 0.01), demonstrating that the rate of amino acid substitution is greater in the Dm{alpha}E4a-{Psi} lineage, as would be expected if {alpha}E4a became a pseudogene in the D. melanogaster lineage. Furthermore, only 2 of the 23 substitutions that have occurred in the Dm{alpha}E4a-{Psi} branch align to the conserved sites defined above by classifying the sites according to an independent set of active {alpha}-esterases. This distribution (2 to 21) is not significantly different from the ratio of conserved to variable sites (82 to 520), so there is no apparent functional constraint among the Dm{alpha}E4a-{Psi} amino acid changes (Gadj = 0.14, df = 1, P > 0.05).
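The relative-rate statistic quoted above can be checked with a few lines of Python. In Tajima's 1D test, the counts of sites unique to each of the two ingroup lineages (here 23 for D. melanogaster and 8 for D. simulans, with D. yakuba as the outgroup) are compared using (m1 - m2)^2 / (m1 + m2), which is treated as chi-square distributed with 1 df. The function and variable names are chosen for this sketch only.

```python
def tajima_1d_chi2(m1, m2):
    """Tajima's 1D relative-rate statistic.

    m1, m2: numbers of sites at which lineage 1 (or lineage 2) alone
    differs from the other two sequences.
    """
    if m1 + m2 == 0:
        raise ValueError("at least one informative site is required")
    return (m1 - m2) ** 2 / (m1 + m2)


# Counts reported in the text.
unique_to_melanogaster = 23
unique_to_simulans = 8

chi2 = tajima_1d_chi2(unique_to_melanogaster, unique_to_simulans)

# Standard chi-square critical value for df = 1 at the 0.01 level.
CHI2_CRIT_DF1_P01 = 6.635

print(round(chi2, 1), chi2 > CHI2_CRIT_DF1_P01)
```

The statistic evaluates to 225/31, which rounds to the reported 7.3 and exceeds the 0.01 critical value.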

Is Dm{alpha}E4a-{Psi} Evolving Neutrally?
Although the above comparisons indicate a relaxation of selection on {alpha}E4a in the D. melanogaster lineage, the only rigorous test of whether Dm{alpha}E4a-{Psi} is indeed evolving neutrally involves comparisons among D. melanogaster alleles (given that there are no species closer to D. melanogaster than D. simulans). To this end, a survey of 12 alleles sampled from various locations around the world was conducted. One of these was the original Oregon R allele (Robin et al. 1996), and another was from a fly caught near Coffs Harbour, Australia, for which a 317-bp amplicon (using primers WSN and GHS) was cloned and sequenced. In the latter case, the region amplified spanned two of the three frameshift mutations observed in the original Oregon R allele (i.e., the 17-nt deletion and the 1-nt insertion; see fig. 2). The Coffs Harbour allele has both of these inactivating mutations and an additional 13-nt deletion (of sites 171–183; see fig. 4). For the other 10 flies (Md7B, Zc53, Zc86, Zc88, Zc171, Ec32, Ec100, Bei23, Bei65, and Rs4), amplicons of approximately 1.1 kb (using primers Dm4a.0 and Dm4a.5) were obtained. These were sequenced directly in both strands (fig. 1). All 10 alleles have the three frameshift mutations originally observed in the Oregon R allele. The inactivating mutation in the splice acceptor is polymorphic, as are three further frameshifting deletions of 13, 28, and 13 nt (171–183, 675–702, and 992–1004, respectively; fig. 4; GenBank accession numbers AF159406–AF159417).



Fig. 4.—Nucleotide substitutions and deletions differing among the 12 {alpha}E4a alleles from D. melanogaster. States for the D. simulans and D. yakuba alleles at these sites are also given. Note that question marks are used to show nucleotide sites not scored in the Coffs Harbour allele, and deletions (1 = 171–183; 2 = 675–702; 3 = 992–1004) are indicated by dashes. Numbering begins at the ATG and thereafter follows the Ec32 allele (GenBank accession number AF159410). Silent and intron changes are marked "S" and "I," respectively. For amino acid (aa) replacements, the residues in both the reference Oregon R allele and the others are shown. The asterisk indicates a stop codon

 
There are 20 single-nucleotide polymorphisms: 2 in introns, 4 silent, and 14 in amino acid replacement sites (fig. 4). Overall, the average pairwise nucleotide diversity ({pi}) is 0.006 (eq. 10.6 of Nei 1987), and the neutral parameter ({theta}) is 0.006 (with a confidence interval [CI] of 0.003–0.017; Watterson 1975). These values are fairly consistent across different nucleotide classes, with {theta} being 0.007 per silent site, 0.007 per replacement site, and 0.003 per intron site. Although the value for introns is lower than that for the others, it is not a significant difference ({chi}2L = 1.1, df = 1, P > 0.05 using the test of Kreitman and Hudson 1991). The equivalence of these values suggests that there is no functional constraint on the Dm{alpha}E4a-{Psi} sequences.
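As a rough check on the magnitude of the neutral parameter, Watterson's estimator divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n - 1) and by the number of sites surveyed. The Python sketch below applies it with S = 20 from the text. The sample size of 12 and the region length of 1,100 sites are assumptions made here (the text gives "approximately 1.1 kb", and the Coffs Harbour allele covers only part of the region), so the result is indicative only and is not a reconstruction of the DnaSP calculation.

```python
def harmonic_sum(n_alleles):
    """a_n = sum of 1/i for i = 1 .. n_alleles - 1."""
    return sum(1.0 / i for i in range(1, n_alleles))


def watterson_theta_per_site(segregating_sites, n_alleles, n_sites):
    """Watterson's estimator of the neutral parameter, per site."""
    if n_alleles < 2:
        raise ValueError("at least two alleles are required")
    return segregating_sites / (harmonic_sum(n_alleles) * n_sites)


S = 20                  # single-nucleotide polymorphisms reported in the text
ASSUMED_ALLELES = 12    # assumption: all 12 alleles treated as fully sampled
ASSUMED_SITES = 1100    # assumption: "approximately 1.1 kb" taken as 1,100 sites

theta = watterson_theta_per_site(S, ASSUMED_ALLELES, ASSUMED_SITES)
print(round(theta, 4))
```

Under these assumptions the estimate is close to the reported value of 0.006 per site; using 11 alleles instead of 12 changes it only slightly.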

Also consistent with this proposition, there are about half as many transitional (ts) polymorphisms as transversional (tv) polymorphisms in Dm{alpha}E4a-{Psi} (6:14). Thus, the ts/tv ratio is substantially lower (although not significantly so; Gadj = 2.2, df = 1, P > 0.05) than that observed in the divergence data (fig. 4), and it is significantly different from that normally seen segregating among Drosophila coding regions (G = 11.1, df = 1, P < 0.01; Moriyama and Powell 1996).

One of the 14 nonsilent polymorphisms creates a stop codon, and only one of the others is biochemically conservative (where conservative substitutions are within the subsets GA, VIL, FYW, STC, DE, NQ, or KR; Genetics Computer Group 1994). Their distribution also does not follow the pattern of amino acid conservation observed in active esterases. Five occur among 82 sites classified as conserved, and nine occur among the 520 variable sites. These two proportions are equivalent (Gadj = 3.0, df = 1, P > 0.05), which is consistent with Dm{alpha}E4a-{Psi} being a pseudogene. By way of comparison, equivalent polymorphism data from a similar sample for the functional Esterase 6 enzyme of D. melanogaster show a highly significant difference, indicative of selective constraint (1 polymorphism in the conserved category and 22 in the variable category; Gadj = 28, df = 1, P < 0.001; data from Dr. W. A. Odgers, CSIRO Division of Entomology, personal communication).

The McDonald and Kreitman (1991) G-test compares the ratio of replacement to silent-site polymorphisms within species with the ratio of replacement to silent-site divergences between species. A significant test result is often interpreted as changes in the selective forces acting in current populations relative to those acting during divergence. When this test is applied to the Dm{alpha}E4a-{Psi} polymorphism data using Ds{alpha}E4a for the interspecific comparison, the test is not significant (G = 1.74, df = 1, P > 0.05). Similarly, the HKA test (Hudson, Kreitman, and Aguade 1987), which compares the intraspecific and interspecific sequence variation between two loci, is not rejected when Dm{alpha}E4a-{Psi} and Ds{alpha}E4a are compared with the Adh 5' regions of D. melanogaster and D. simulans ({chi}2 = 0.24, df = 1, P > 0.05; Kreitman and Hudson 1991). However, if the McDonald-Kreitman test is applied using Dy{alpha}E4a in the interspecific comparison, the neutral model is rejected (G = 6.8, df = 1, P < 0.01), as it is when Dm{alpha}E4a-{Psi} polymorphism is compared with the divergence between the apparently functional Ds{alpha}E4a and Dy{alpha}E4a (G = 9.2; P < 0.01). These results not only support the proposition that {alpha}E4a has become a pseudogene in the D. melanogaster lineage, but also suggest that it has been a pseudogene long enough to substantially influence the divergence of Dm{alpha}E4a-{Psi} and Ds{alpha}E4a.

The Dm{alpha}E4a-{Psi} Allelic Network
Since all of the sampled Dm{alpha}E4a-{Psi} alleles share three out of the six frameshift mutations, they coalesce (at these indel sites, at least) to an inactive ancestor that was descended from the original inactive Dm{alpha}E4a allele. The allelic network in figure 5 describes the relationship among Dm{alpha}E4a-{Psi} alleles based on the segregating sites, including indel data. In the absence of parallel or backward mutation (which are unlikely among alleles since they have diverged so little) or reticulate evolution (i.e., gene conversion or recombination, which may be expected), this network would represent the allele phylogeny. There is only one character change (that of indel 3) that occurs twice on the network, and this is most probably indicative of a reticulate event between alleles. Thus, there is a high level of linkage disequilibrium between the polymorphic sites scored in this sample. This is consistent with the position of Dm{alpha}E4a-{Psi} within cytological divisions 84D–E (Russell et al. 1995), which have been described as regions of low recombination (Aquadro, Begun, and Kindahl 1994). However, a more formal indicator of recombination rate, Hudson's (1987) estimator of 4Nc (where N is the population size and c is the recombination rate), is estimated to be 13.3 for the sequenced region (i.e., 0.0123 per adjacent site), indicating a fairly typical recombination rate (Aquadro, Begun, and Kindahl 1994).



Fig. 5.—Allelic network of Dm{alpha}E4a-{Psi} alleles. Strain names are enclosed in circles that represent haplotypes, and the coordinates of polymorphic sites are listed adjacent to crossbars, which represent the mutational steps separating haplotypes. The two places at which indel 3 (see figs. 2 and 4) occurs on the network are linked by a dashed line. Site 304 represents the inactivating mutation in the splice acceptor of intron II, and the arrow shows the direction of the mutation (i.e., from the active state to the inactive state)

 
When Was Dm{alpha}E4a-{Psi} Inactivated?
The {alpha}E4a data set is analogous to the {Psi}{alpha}3 globin pseudogene data set analyzed by Miyata and Yasunaga (1981) and Li, Gojobori, and Nei (1981), in that it contains a pseudogene sequence and two functional homologs. Following their example, it is possible to estimate when Dm{alpha}E4a-{Psi} became inactivated relative to the divergence time of D. melanogaster and D. simulans by assuming a clocklike evolution for synonymous, nonsynonymous, and "pseudogene" sites (appendix). Thus, by using the silent site and replacement site divergences for all three pairwise comparisons, it can be calculated that Dm{alpha}E4a-{Psi} has been inactivated for 0.62T, where T is the time of divergence between D. melanogaster and D. simulans. Furthermore, if we take a maximum-likelihood approach to solving these equations (appendix), we can estimate 95% CIs for our point estimates of inactivation time, the time when the D. yakuba lineage diverged, and the substitution rates of the silent and replacement sites prior to inactivation and the rate for all sites (the "pseudogene rate") afterward.

Powell and DeSalle (1995) used biogeographical evidence and the assumption of a molecular clock to estimate that the divergence of D. melanogaster and D. simulans occurred 2.5 MYA. If we use this in our maximum-likelihood model, we estimate that the silent site mutation rate of {alpha}E4a is 2.5 × 10^-8 mutations per site per year (CI = 1.9 × 10^-8, 3.7 × 10^-8), the replacement mutation rate is 6.1 × 10^-9 mutations per site per year (4.6 × 10^-9, 9.4 × 10^-9), the pseudogene rate is 1.6 × 10^-8 mutations per site per year (1.1 × 10^-8, 1.0 × 10^-6), the divergence time between D. simulans and D. yakuba is 7.3 Myr (5.1, 8.9), and the inactivation time is 1.6 Myr (0.006, 2.5).
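To make the clock assumption concrete, the following Python sketch plugs the point estimates quoted above into a simple piecewise-constant clock. The function name and the two-phase branch model are illustrative assumptions and are not taken from the appendix; the sketch only shows how a branch that is functional for part of its history and a pseudogene for the remainder would accumulate silent and replacement changes.

```python
# Point estimates quoted in the text (substitutions per site per year; years).
RATE_SILENT = 2.5e-8
RATE_REPLACEMENT = 6.1e-9
RATE_PSEUDOGENE = 1.6e-8
T_DIVERGENCE = 2.5e6   # D. melanogaster / D. simulans split
T_INACTIVE = 1.6e6     # estimated time since inactivation


def expected_branch_divergence(rate_active, rate_pseudo, t_total, t_inactive):
    """Expected substitutions per site on a branch that was functional for
    (t_total - t_inactive) years and then a pseudogene for t_inactive years.

    ASSUMPTION: a piecewise-constant molecular clock. This is an illustrative
    reading of the model, not the equations of the appendix.
    """
    if not 0 <= t_inactive <= t_total:
        raise ValueError("t_inactive must lie between 0 and t_total")
    return rate_active * (t_total - t_inactive) + rate_pseudo * t_inactive


# Branch leading to the D. melanogaster pseudogene.
mel_silent = expected_branch_divergence(
    RATE_SILENT, RATE_PSEUDOGENE, T_DIVERGENCE, T_INACTIVE)
mel_replacement = expected_branch_divergence(
    RATE_REPLACEMENT, RATE_PSEUDOGENE, T_DIVERGENCE, T_INACTIVE)

# Branch leading to the active D. simulans gene (never inactivated).
sim_silent = RATE_SILENT * T_DIVERGENCE
sim_replacement = RATE_REPLACEMENT * T_DIVERGENCE

# Fraction of the branch spent as a pseudogene.
relative_inactive = T_INACTIVE / T_DIVERGENCE

print(f"melanogaster branch: silent {mel_silent:.4f}, replacement {mel_replacement:.4f}")
print(f"simulans branch:     silent {sim_silent:.4f}, replacement {sim_replacement:.4f}")
print(f"inactive fraction of branch: {relative_inactive:.2f}")
```

Under these assumptions the replacement-to-silent ratio is higher on the pseudogene branch than on the active branch, and the inactive fraction (1.6/2.5 = 0.64) is close to the 0.62T obtained from the pairwise comparisons.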

Thus, these calculations give the unsatisfying result that the 95% confidence limits for the inactivation time encompass almost all of the time from the divergence of the D. simulans and D. melanogaster lineages until the present time. We can slightly improve on the lower (i.e., younger) bound by estimating a minimum coalescence time of the D. melanogaster {alpha}E4a-{Psi} alleles. Thus, if we take the upper bound of the estimate of the pseudogene rate and the lower bound on our earlier estimate of (0.003), then we calculate that the alleles coalesced to a pseudogene at least 10,000 years ago.

The magnitudes of the estimates for silent and replacement site mutation rates seem biologically reasonable, although the upper bound on the pseudogene rate seems relatively high, given that Sharp and Li (1989) estimated a rate of 1.6 × 10^-8 mutations per site per year for the silent sites of four quickly diverging Drosophila genes, and Pritchard and Schaeffer (1997) calculated the rate of substitution in Lcp{Psi} to be 6.8 × 10^-8 mutations per site per year.

Note that an analysis of the relative rates of deletions and substitutions also suggests that the upper bound of pseudogene inactivation is more recent than the speciation of D. melanogaster and D. simulans. Among D. melanogaster alleles, there are 3 polymorphic deletions, 14 "nonsynonymous" substitutions, and 4 "synonymous" substitutions. Thus, while {alpha}E4a was evolving "neutrally" as a pseudogene there were 4.7 nonsynonymous changes per deletion and 1.3 synonymous changes per deletion. There is only one fixed deletion in the melanogaster lineage, and since it causes a frameshift, it must have occurred at or after the point at which Dm{alpha}E4a became a pseudogene. Extrapolating from the polymorphism data, we would therefore expect (very roughly) about 4.7 nonsynonymous-site fixations and 1.3 synonymous-site fixations to have occurred since {alpha}E4a became a pseudogene. However, we calculate that there were actually about 30 replacement changes and 14 synonymous changes in the melanogaster lineage. This would suggest that approximately 25 nonsynonymous fixations and 13 synonymous-site fixations occurred before the inactivation of the pseudogene. These calculations are limited by the large inaccuracies introduced by working with such a small number of deletions, but nevertheless they support the proposition that many substitutions occurred in the melanogaster lineage before {alpha}E4a was inactivated.
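The extrapolation in the preceding paragraph is simple arithmetic and can be retraced with a short Python sketch. All counts are those given in the text; the variable names are ours, and the calculation inherits the same caveat about the very small number of deletions.

```python
# Counts among D. melanogaster alleles (polymorphism), as given in the text.
poly_deletions = 3
poly_nonsyn = 14
poly_syn = 4

# Fixed changes in the melanogaster lineage, as given in the text.
fixed_deletions = 1
lineage_replacement = 30
lineage_syn = 14

# Substitutions per deletion while the sequence evolved as a pseudogene.
nonsyn_per_deletion = poly_nonsyn / poly_deletions
syn_per_deletion = poly_syn / poly_deletions

# Fixations expected since inactivation, scaled by the single fixed deletion.
expected_nonsyn_after = nonsyn_per_deletion * fixed_deletions
expected_syn_after = syn_per_deletion * fixed_deletions

# The remainder is attributed to the period before inactivation.
nonsyn_before = lineage_replacement - expected_nonsyn_after
syn_before = lineage_syn - expected_syn_after

print(f"per deletion: {nonsyn_per_deletion:.1f} nonsynonymous, {syn_per_deletion:.1f} synonymous")
print(f"before inactivation: {nonsyn_before:.0f} nonsynonymous, {syn_before:.0f} synonymous")
```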

Discussion

Silent site divergences suggest that an ancestral {alpha}E4 gene duplicated to form {alpha}E4 and {alpha}E4a before the divergence of the D. melanogaster and D. willistoni lineages (Robin et al. 1996). Since that time, both the {alpha}E4 gene and the {alpha}E4a gene appear to have evolved, for a period at least, under the purifying selection that is typical of protein-encoding and, more specifically, esterase-encoding genes. Replacement site divergence is less than silent site divergence, exons have been more constrained than introns, and amino acid substitutions have occurred more frequently in positions aligning to the variable sites of other {alpha}-esterases. However, at some stage after the D. melanogaster lineage diverged from the D. simulans lineage, the Dm{alpha}E4a gene was inactivated, and since then, it has evolved in a neutral fashion. Consequently, interspecific comparisons involving Dm{alpha}E4a-{Psi} yield higher R/S and E/I ratios than do comparisons between the active Ds{alpha}E4a and Dy{alpha}E4a genes. Furthermore, at least seven inactivating mutations have been acquired among Dm{alpha}E4a-{Psi} alleles.

Does the apparently nonneutral evolution that preceded Dm{alpha}E4a inactivation mean that {alpha}E4a had a function? Potentially neutral molecular events such as gene conversion and unequal recombination could cause a functionless gene to evolve in such a way that replacement substitutions appeared less frequent than silent substitutions, etc., and, indeed, there are a few cases outside the Drosophila literature in which pseudogenes are purported to have been reactivated by such reticulate events (Nei 1987; Trabesinger-Ruef et al. 1996). However, there are no signs of reticulate evolution in Ds{alpha}E4a or Dy{alpha}E4a sequences (e.g., patchworks of regions more similar to paralogs than orthologs) or, more generally, between {alpha}E4 and {alpha}E4a (Robin et al. 1996). Therefore, it seems likely that selection on a functional esterase explains why there are relatively fewer replacement substitutions than silent substitutions in {alpha}E4a.

This does not necessarily mean that {alpha}E4a had its own function distinct from {alpha}E4. Perhaps active copies of both {alpha}E4 and {alpha}E4a were required to produce enough esterase for a particular function. Another possibility is that the products of the two duplicate genes form heterodimers and that selection against defective heterodimers has kept both genes active. Gottlieb and Ford (1997) proposed such a mechanism to explain the multiple independent "silencings" of a duplicate PGI gene in Clarkia wildflower species. However, these processes do not easily explain why the R/S value is twice as great in Ds{alpha}E4a-versus-Dy{alpha}E4a comparisons as it is in Dm{alpha}E4-versus-Ds{alpha}E4 comparisons. It seems that even before the {alpha}E4a gene was inactivated, it evolved with relaxed constraint relative to its paralog {alpha}E4.

Not much is known that sheds light on the nature of the functions of the active {alpha}E4/{alpha}E4a genes. Alleles of {alpha}E7, another gene in the {alpha}-esterase cluster, confer organophosphate insecticide resistance on L. cuprina and M. domestica (Newcomb et al. 1997; Claudianos, Russell, and Oakeshott 1999). Isozymes that are known to be encoded by the cluster (EST23 and EST9, which was once called EST C, in D. melanogaster and EST2 in D. buzzatii) exhibit high levels of allozyme polymorphism in a number of species (David 1982; Barker 1994), although nulls are rare (Langley et al. 1981). These isozymes are expressed in high concentrations in digestive tissues of the feeding life stages, and the D. melanogaster isozymes are also abundant in adult heads (Healy, Dumancic, and Oakeshott 1991). Therefore, it is tempting to suggest that the {alpha}E4 genes may also have roles in the digestion of dietary esters or xenobiotics. However, it is worth noting that all of the {alpha}E4 esterases have the same highly unusual residues around their catalytic sites (e.g., a histidine preceding the nucleophilic serine) that almost certainly make them functionally distinct from other esterases encoded by the {alpha}-esterase cluster (unpublished data).

A study similar to the one described here presents the sequences of 10 alleles of a larval cuticle protein pseudogene (Lcp{Psi}) and a single allele of its ortholog in D. simulans (Pritchard and Schaeffer 1997). Unlike Ds{alpha}E4a, the D. simulans ortholog of Lcp{Psi} is also a pseudogene and has at least three inactivating mutations. The evolution of this gene appears neutral in that divergence does not differ between nonsynonymous and synonymous sites. However, the divergence among Lcp{Psi} alleles is actually one of the lowest described for D. melanogaster genes ({pi} = 0.001). If this observation that pseudogenes had a lower heterozygosity than functional genes were general, it could be argued that the extra variation observed among alleles of functional genes could be due to diversifying or balancing selection. However, Lcp{Psi} differs in this respect from Dm{alpha}E4a-{Psi}, which has a relatively high level of heterozygosity (0.006). Obviously, no general conclusions can be drawn from such a small sample of pseudogenes, especially in the absence of more detailed information about the effect of selection on neighboring sites.

Comparison of Dm{alpha}E4a-{Psi} alleles reveals a high frequency of polymorphic deletions. This is consistent with the studies of Petrov, Lozovskaya, and Hartl (1996), Petrov et al. (1998), and Petrov and Hartl (1998), who have compared defective copies of the transposable element Helena within the virilis group, within the melanogaster subgroup, and from the swallow pseudogene. In the first of these studies, 18 copies of Helena from 8 different species were examined, and 11 copies had unique deletions in the size range of 1–75 bp, with an average size of 24.3 bp. This translates to 0.16 deletions per nucleotide substitution. These authors estimated that it would take 11.8 Myr for a pseudogene to lose half of its DNA. In Dm{alpha}E4a-{Psi}, there are 3 polymorphic deletions and 20 nucleotide polymorphisms, giving a deletion/nucleotide substitution ratio of 0.15, and the average size of a deletion is 18 bp. If we assume that the pseudogene substitution rate is not slower than the lower bound estimated for the silent substitution rate in Dm{alpha}E4a (i.e., 1.9 × 10^-8), then the average number of nucleotides lost per site per year would be 0.15 × 1.9 × 10^-8 × 18 = 5.1 × 10^-8. Thus, it would take about 10 Myr to lose half a gene.
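The DNA-loss calculation above can be retraced in a few lines of Python. The input values are those given in the text. The text does not state how the per-site loss rate was converted into a half-loss time, so the sketch shows two conversions as assumptions: a linear one (half of the sites divided by the rate), which gives a figure close to the quoted 10 Myr, and an exponential-decay one, which gives a somewhat longer time.

```python
import math

# Values given in the text for Dm-alpha-E4a-Psi.
polymorphic_deletions = 3
nucleotide_polymorphisms = 20
mean_deletion_bp = 18
substitution_rate = 1.9e-8  # lower bound of the silent rate, per site per year

# Deletions per nucleotide substitution.
deletion_ratio = polymorphic_deletions / nucleotide_polymorphisms

# Nucleotides lost per site per year.
loss_rate = deletion_ratio * substitution_rate * mean_deletion_bp

# ASSUMPTION 1: constant (linear) loss until half of the sites are gone.
linear_half_loss_years = 0.5 / loss_rate

# ASSUMPTION 2: loss proportional to remaining length (exponential decay).
exponential_half_loss_years = math.log(2) / loss_rate

print(f"deletion ratio: {deletion_ratio:.2f}")
print(f"loss rate: {loss_rate:.2e} nucleotides per site per year")
print(f"half-loss time, linear: {linear_half_loss_years / 1e6:.1f} Myr")
print(f"half-loss time, exponential: {exponential_half_loss_years / 1e6:.1f} Myr")
```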

Petrov et al. (1998) noticed that approximately half of the deletions observed in their data were larger than 10 bp and that many had flanking short (2–7 bp) direct duplications. They speculated that the direct duplications may be footprints of some kind of homology-based mechanism by which the deletions form. In Dm{alpha}E4a-{Psi}, all deletions are larger than 10 bp, but none have the signs of short direct duplications (albeit the 7-bp insertion that is fixed among all alleles is, in its entirety, a duplication of flanking sequence).

The overall similarity between the results presented here on Dm{alpha}E4a-{Psi} and those presented by Petrov, Lozovskaya, and Hartl (1996) and Petrov and Hartl (1998) for Helena elements and for the swallow pseudogene (Petrov et al. 1998) supports the argument that there is a general tendency among unconstrained sequences in Drosophila to have a high rate of DNA loss, regardless of where they are located or whether they are derived from transposable elements. This rate of gene loss for Drosophila is much greater than that for mammals. Graur, Shuali, and Li (1989) used 52 human and rodent processed pseudogenes to calculate that it would take 400 Myr to lose half of their DNA. The difference is partly due to the higher frequency of deletions in Drosophila (Petrov and Hartl [1998] estimate it is 2.6 times that of mammals) and partly due to the larger size of those deletions (approximately 7 times as large). The extent of the difference is such that Petrov, Lozovskaya, and Hartl (1996) have argued that the inherent deletion rate contributes to genome size evolution in Drosophila, whereas Ophir and Graur (1997) show that there is no significant correlation between the age of 156 murid and human pseudogenes and their decrease in size and conclude that the inherent deletion rate makes an insignificant contribution to genome size evolution in those organisms.

Could the relatively high rate at which unconstrained DNA is lost in Drosophila be due to selection for less DNA? Akashi (1995) has shown that selection coefficients of one in a million can influence codon bias and, given the estimates of relatively large population size for many Drosophila species, Charlesworth (1996) implies that a selective advantage for genomes slightly smaller than others could be effective in these species. In the case of Helena elements, Petrov and Hartl (1998) have argued against this proposal by pointing out that there is not a positive correlation between the age of the element and the lengths of the deletions. It is conceivable, however, that it is not the size of the deletions that is selected but rather that they occur at all, and also possibly where they occur. For instance, it may be that deletions are subject to positive selection because they prevent ectopic protein expression which would interfere with normal cellular biochemistry (Hughes and Hughes 1993). Alternatively, it is possible that the function of {alpha}E6 is compromised by an active or full-length {alpha}E4a in its second intron. With a larger and more random sample of alleles, the methods of Tajima (1989) and Fu and Li (1993) could be employed to help establish whether there is actually positive selection acting on deletions. In fact, both of these tests were used on the nucleotide variation currently available for Dm{alpha}E4a-{Psi} alleles but failed to find any evidence for selection acting on the alleles (Tajima's D = 0.23, P > 0.1; Fu and Li's D = 0.35, P > 0.1).
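For readers unfamiliar with the first of these tests, the following Python sketch computes Tajima's (1989) D from three summary statistics: the number of sampled sequences, the number of segregating sites, and the mean number of pairwise differences. The function name is ours, and the example input values are hypothetical; they are not the Dm{alpha}E4a-{Psi} data, whose summary statistics are not restated here.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from summary statistics.

    n                  -- number of sampled sequences (at least 4)
    seg_sites          -- number of segregating sites, S (at least 1)
    mean_pairwise_diff -- mean number of pairwise nucleotide differences, k
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("need n >= 4 sequences and at least one segregating site")

    # Harmonic-type sums over 1 .. n-1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Constants of the variance of (k - S / a1) under neutrality.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Hypothetical example: 10 sequences, 16 segregating sites, k = 3.888889.
d_example = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d_example:.3f}")
```

A value of D near zero, as reported for Dm{alpha}E4a-{Psi}, means that the two estimates of variation (from pairwise differences and from the number of segregating sites) agree, as expected under neutrality at constant population size. Negative values indicate an excess of rare variants, and positive values an excess of intermediate-frequency variants.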

Dm{alpha}E4a-{Psi} is one of the rare bona fide pseudogenes so far described for Drosophila. Everything suggests that it is evolving according to neutral expectations. It is, however, in a class of pseudogenes that are different from many of those described in the literature. Unlike the bulk of pseudogenes that litter the mammalian genome, there is no evidence that it was generated by reverse transcription. It is also not the typical pseudogene envisaged under Ohno's (1970) model of gene duplication. In this model (termed the "Mutation During Non-functionality" model by Hughes [1994]), a pseudogene is generated when a duplicate gene fails to happen upon a function that selection maintains. Instead, it appears that {alpha}E4a actually had a function after the gene duplication event and apparently still has one in at least two species of the melanogaster group. It is, however, no longer required in D. melanogaster. It seems possible that it is not alone in this category and that there may be a suite of previously functional genes in Drosophila for which the fixation of inactivated alleles is a very recent or contemporary event. Candidates for this category include Est7 (Balakirev and Ayala 1996), Cec{Psi}1, Cec{Psi}2 (Ramos-Onsins and Aguade 1998), GstD22 (Toung, Hsieh, and Chen-Pei 1993), GstD26 (Toung, Hsieh, and Chen-Pei 1993), PGLYM (Currie and Sullivan 1994), and Adh-{Psi} from D. mercatorum (Sullivan et al. 1994).



Fig. 6.—A phylogenetic tree showing the relationship of a gene in three species (A, B, and C), where the gene became a pseudogene in the B lineage. "Tx" represents the time intervals demarcated by the broken lines, and "Nx" represents the number of nucleotide substitutions in the branches demarcated by the shaded arrows

 
Acknowledgements

We thank Charles Langley for the discussions motivating this research, as well as subsequent ones, Chip Aquadro for providing the fly strains, and Charles Claudianos, Wendy Odgers, and Dave Rowell for their stimulating discussions. We also thank an anonymous reviewer for constructive suggestions.

Footnotes

Shozo Yokoyama, Reviewing Editor

1 Keywords: pseudogene, esterase, mutation rate, Drosophila.

2 Address for correspondence and reprints: G. Charles de Q. Robin, 3347 Storer Hall, Center for Population Biology, University of California at Davis, One Shields Avenue, Davis, California 95616-8554. E-mail: gcrobin{at}ucdavis.edu

Literature Cited

    Akashi, H. 1995. Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067–1076.

    Aquadro, C. F., D. J. Begun, and E. C. Kindahl. 1994. Selection, recombination and DNA polymorphism in Drosophila. Pp. 46–66 in B. Golding, ed. Non-neutral evolution, theories and molecular data. Chapman and Hall, London.

    Balakirev, E. S., and F. J. Ayala. 1996. Is esterase-P encoded by a cryptic pseudogene in Drosophila melanogaster? Genetics 144:1511–1518.

    Barker, J. S. F. 1994. Sequential gel electrophoretic analysis of esterase-2 in two populations of Drosophila buzzatii. Genetica 92:165–175.

    Begun, D. J. 1997. Origin and evolution of a new gene descended from alcohol dehydrogenase in Drosophila. Genetics 145:375–382.

    Campbell, P. M., J. Trott, C. Claudianos, K.-A. Smyth, R. J. Russell, and J. G. Oakeshott. 1997. Biochemistry of esterases associated with organophosphate resistance in Lucilia cuprina with comparisons to putative orthologues in other Diptera. Biochem. Genet. 35:17–40.

    Charlesworth, B. 1996. The changing size of genes. Nature 384:315–316.

    Claudianos, C., R. J. Russell, and J. G. Oakeshott. 1999. The same amino acid substitution in orthologous esterases confers organophosphate resistance on the house fly and a blowfly. Insect Biochem. Mol. Biol. 29:675–686.

    Currie, P. D., and D. T. Sullivan. 1994. Structure, expression and duplication of genes which encode phosphoglyceromutase of Drosophila melanogaster. Genetics 138:352–363.

    David, J. R. 1982. Latitudinal variability of Drosophila melanogaster: allozyme frequency divergence between European and Afrotropical populations. Biochem. Genet. 20:747–761.

    Dumancic, M. M., J. G. Oakeshott, R. J. Russell, and M. J. Healy. 1997. Functional conservation of the Drosophila melanogaster ESTP protein in drosophilids. Biochem. Genet. 35:251–271.

    Fu, Y.-X., and W.-H. Li. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

    Genetics Computer Group. 1994. GCG. Version 8.0. GCG, Madison, Wis.

    Gloor, G. B., and W. R. Engels. 1992. Single-fly preps for PCR. Drosophila Inform. Serv. 71:148.

    Gottlieb, L. D., and V. S. Ford. 1997. A recently silenced, duplicate PgiC locus in Clarkia. Mol. Biol. Evol. 14:125–132.

    Graur, D., Y. Shuali, and W.-H. Li. 1989. Deletions in processed pseudogenes accumulate faster in rodents than in humans. J. Mol. Evol. 28:279–285.

    Healy, M. J., M. M. Dumancic, and J. G. Oakeshott. 1991. Biochemical and physiological studies of soluble esterases from Drosophila melanogaster. Biochem. Genet. 29:365–388.

    Hudson, R. R. 1987. Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245–250.

    Hudson, R. R., M. Kreitman, and M. Aguade. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–159.

    Hughes, A. L. 1994. The evolution of functionally novel proteins after gene duplication. Proc. R. Soc. Lond. B Biol. Sci. 256:119–124.

    Hughes, M. K., and A. L. Hughes. 1993. Evolution of duplicate genes in a tetraploid animal, Xenopus laevis. Mol. Biol. Evol. 10:1360–1369.

    Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, New York.

    Kreitman, M., and R. R. Hudson. 1991. Inferring the evolutionary histories of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127:565–582.

    Langley, C. H., R. A. Voelker, A. J. Leigh Brown, S. Ohnishi, B. Dickinson, and E. Montgomery. 1981. Null allele frequencies at allozyme loci in natural populations of Drosophila melanogaster. Genetics 99:151–156.

    Li, W.-H., T. Gojobori, and M. Nei. 1981. Pseudogenes as a paradigm of neutral evolution. Nature 292:237–239.

    Long, M., and C. H. Langley. 1993. Natural selection and the origin of jingwei, a chimeric processed functional gene in Drosophila. Science 260:91–95.

    McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–654.

    Miyata, T., and T. Yasunaga. 1981. Rapidly evolving mouse alpha-globin related pseudogene and its evolutionary history. Proc. Natl. Acad. Sci. USA 78:450–453.

    Moriyama, E. N., and J. R. Powell. 1996. Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261–277.

    Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.

    Newcomb, R. D., P. M. Campbell, R. J. Russell, and J. G. Oakeshott. 1997. A single amino acid substitution converts a carboxylesterase to an organophosphate hydrolase and confers insecticide resistance on a blowfly. Proc. Natl. Acad. Sci. USA 94:7464–7468.

    Newcomb, R. D., P. D. East, R. J. Russell, and J. G. Oakeshott. 1996. Isolation of the {alpha} esterase genes associated with organophosphate resistance in Lucilia cuprina. Insect Biochem. Mol. Biol. 5:211–216.

    Oakeshott, J. G., T. M. Boyce, R. J. Russell, and M. J. Healy. 1995. Molecular insights into the evolution of an enzyme: esterase 6 in Drosophila. Trends Ecol. Evol. 10:103–110.

    Oakeshott, J. G., C. Claudianos, R. J. Russell, and G. C. Robin. 1999. Carboxyl/cholinesterases: a case study of the evolution of a successful multigene family. BioEssays 21:1031–1042.

    Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, Berlin.

    Ophir, R., and D. Graur. 1997. Patterns and rates of indel evolution in processed pseudogenes from humans and murids. Gene 205:191–202.

    Petrov, D. A., Y.-C. Chao, E. C. Stephenson, and D. L. Hartl. 1998. Pseudogene evolution in Drosophila suggests a high rate of DNA loss. Mol. Biol. Evol. 15:1562–1567.

    Petrov, D. A., and D. L. Hartl. 1998. High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol. Biol. Evol. 15:293–302.

    Petrov, D. A., E. R. Lozovskaya, and D. L. Hartl. 1996. High intrinsic rate of DNA loss in Drosophila. Nature 384:346–349.

    Powell, J. R., and R. DeSalle. 1995. Drosophila molecular phylogenies and their uses. Pp. 88–137 in M. K. Hecht, ed. Evolutionary biology. Vol. 28. Plenum Press, New York.

    Press, W. H. 1988. Numerical recipes in C: the art of scientific computing. Cambridge University Press, New York and Cambridge, England.

    Pritchard, J. K., and S. W. Schaeffer. 1997. Polymorphism and divergence at a Drosophila pseudogene. Genetics 147:199–208.

    Ramos-Onsins, S., and M. Aguade. 1998. Molecular evolution of the cecropin multigene family in Drosophila: functional genes vs. pseudogenes. Genetics 150:157–171.

    Robin, G. C. de Q., K. M. Medveczky, R. J. Russell, and J. G. Oakeshott. 1996. Duplication and divergence of the genes of the alpha-esterase cluster of Drosophila melanogaster. J. Mol. Evol. 43:241–252.

    Rozas, J., and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular analysis. Bioinformatics 15:174–175.

    Russell, R. J., G. C. Robin, P. Kostakos, R. D. Newcomb, T. M. Boyce, K. M. Medveczky, and J. G. Oakeshott. 1995. Molecular cloning of an {alpha} esterase gene cluster on chromosome 3R of Drosophila melanogaster. Insect Biochem. Mol. Biol. 26:235–247.

    Sharp, P. M., and W.-H. Li. 1989. On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398–402.

    Spackman, M. E., J. G. Oakeshott, K.-A. Smyth, K. M. Medveczky, and R. J. Russell. 1994. A cluster of esterase genes on chromosome 3R of Drosophila melanogaster includes homologues of esterase genes conferring insecticide resistance in Lucilia cuprina. Biochem. Genet. 32:39–62.

    Sullivan, D. T., W. T. Starmer, S. W. Curtiss, M. Menotti-Raymond, and J. Yum. 1994. Unusual molecular evolution of an Adh pseudogene in Drosophila. Mol. Biol. Evol. 11:443–458.

    Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

    ———. 1993. Simple methods of testing the molecular evolutionary clock hypothesis. Genetics 135:599–607.

    Toung, Y.-P. S., T. S. Hsieh, and D. T. Chen-Pei. 1993. The glutathione S-transferase D genes: a divergently organized, intronless gene family in Drosophila melanogaster. J. Biol. Chem. 268:9737–9746.

    Trabesinger-Ruef, N., T. Jermann, T. Zankel, B. Durrant, G. Frank, and S. A. Benner. 1996. Pseudogenes in ribonuclease evolution: a source of new biomacromolecular function? FEBS Lett. 382:319–322.

    Walsh, J. B. 1995. How often do duplicated genes evolve new functions? Genetics 139:421–428.

    Watterson, G. A. 1975. On the number of segregating sites in genetic models without recombination. Theor. Popul. Biol. 7:256–276.

Accepted for publication December 13, 1999.