Evolution of the Eukaryotic Translation Termination System: Origins of Release Factors

Yuji Inagaki and W. Ford Doolittle

Program in Evolutionary Biology, Canadian Institute for Advanced Research, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada


    Abstract
Accurate translation termination is essential for cell viability. In eukaryotes, this process is strictly maintained by two proteins, eukaryotic release factor 1 (eRF1), which recognizes all stop codons and hydrolyzes peptidyl-tRNA, and eukaryotic release factor 3 (eRF3), which is an elongation factor 1α (EF-1α) homolog that stimulates eRF1 activity. To retrace the evolution of this core system, we cloned and sequenced the eRF3 genes from Trichomonas vaginalis (Parabasalia) and Giardia lamblia (Diplomonada), which are generally thought to be "early-diverging eukaryotes," as well as those from two ciliates (Oxytricha trifallax and Euplotes aediculatus). We also determined the sequence of the eRF1 gene for G. lamblia. Surprisingly, the G. lamblia eRF3 appears to have only one domain, corresponding to EF-1α, while other eRF3s (including the T. vaginalis protein) have an additional N-terminal domain of 66–411 amino acids. Considering this novel eRF3 structure and our extensive phylogenetic analyses, we suggest that (1) the current translation termination system in eukaryotes evolved from the archaea-like version, (2) eRF3 was introduced into the system prior to the divergence of extant eukaryotes, including G. lamblia, and (3) G. lamblia might be the first eukaryotic branch among the organisms considered.


    Introduction
The translational apparatus—the ribosome and associated initiation, elongation, and termination factors—is likely the cell's most complex molecular machine. We have a good understanding of many of its functions in both bacteria and eukaryotes and a growing appreciation of the complex evolutionary relationships among several of its component molecules.

Translation termination (recognition of stop codons UAA, UAG, and UGA in the A site of ribosomes and hydrolysis of peptidyl-tRNA) is the least understood of these processes, from either a mechanistic or an evolutionary perspective. Because this process is essential, we might expect the proteins responsible (release factors) to be ancient and highly conserved. However, neither of the two principal bacterial release factors, bacterial release factor 1 (RF1) (which recognizes stop codons UAA and UAG) nor RF2 (which recognizes UAA and UGA), shows obvious similarity to the single principal release factor in eukaryotes, eukaryotic release factor 1 (eRF1) (which recognizes all three stop codons). Furthermore, although both bacteria and eukaryotes employ additional GTPase factors (RF3 and eRF3, respectively) that stimulate termination, the genes encoding these factors are not orthologous (Sogin 1997). It is remarkable that this essential system is so differently constituted in bacteria and eukaryotes.

All six completed archaeal genomes contain an archaeal RF (archaeal release factor 1; aRF1) clearly related to the principal eukaryotic release factor, eRF1 (Bult et al. 1996; Klenk et al. 1997; Smith et al. 1997; Kawarabayashi et al. 1998, 1999). Although there have been no biochemical studies on translation termination in archaea, aRF1 is presumed to bind all three stop codons, based on its sequence similarity to the eukaryotic factor (Dennis 1997). However, no eRF3 or RF3 homolog has been identified so far in any archaeon (Bult et al. 1996; Klenk et al. 1997; Smith et al. 1997; Kawarabayashi et al. 1998, 1999).

At the biochemical and genetic levels, the eukaryotic translation termination system has been studied most extensively in Saccharomyces cerevisiae. The eRF1 gene was first described as an omnipotent nonsense suppressor gene, sup45 (Breining and Piepersberg 1986). In 1994, the product of this gene (eRF1) was shown to recognize all three stop codons on ribosomes and catalyze peptidyl-tRNA hydrolysis (Frolova et al. 1994). Subsequently, the product of another nonsense suppressor gene, sup35, was found to assist eRF1. The Sup35 protein, termed eRF3, binds tightly to eRF1 and stimulates its translation termination activity in a GTP-dependent manner (Stansfield et al. 1995; Zhouravleva et al. 1995). Comparisons between eRF3 homologs from animals, fungi, and one plant revealed that this protein has an N-terminal domain of varying length and amino acid (aa) sequence and a C-terminal domain that is highly conserved, with strong similarity to elongation factor 1α (EF-1α). In yeast, the EF-1α-like C-terminal domain is necessary and sufficient for maintenance of translation fidelity and cell viability, but the N-terminal domain can be deleted (Ter-Avanesyan et al. 1993).

Interestingly, the N-terminal domain of yeast eRF3 is responsible for the propagation of a prion-like [Ψ+] phenotype, which is inherited in a non-Mendelian manner (Kochneva-Pervukhova et al. 1998). Overexpression of this domain or mutations affecting it trigger protein aggregation both in vitro and in vivo, as do mutations in mammalian PrPSc (Kochneva-Pervukhova et al. 1998). The [Ψ+] phenotype appears advantageous to yeast under stress conditions (Eaglestone, Cox, and Tuite 1999). However, no [Ψ+]-like protein aggregation is known for other eRF3 proteins, and the primary and conserved function of the N-terminal domain of eRF3 remains unclear.

Humans, mice, Caenorhabditis elegans, and yeast contain another open reading frame (ORF), called HBS1 (or eRFS), showing strong sequence similarity to EF-1α and eRF3 (Garcia-Cantalejo et al. 1994; Wallrapp et al. 1998). Like eRF3, HBS1 proteins have an additional sequence at the N-terminus of the highly conserved domain corresponding to EF-1α. Although the two molecules share unique structural features, mammalian HBS1 cannot complement a temperature-sensitive (ts) mutation in yeast eRF3 and does not interact with eRF1 (Wallrapp et al. 1998). In yeast, the HBS1 gene is not essential for cell viability, but an increased copy number of this gene can suppress the growth defect of a double mutation in the SSB1 and SSB2 genes (Nelson et al. 1992). SSB1 and SSB2, which are molecular chaperones of the HSP70 family, are believed to interact directly with nascent polypeptides on ribosomes and support their transport through the ribosome channel into the cytosol. Therefore, yeast HBS1 may escort aminoacyl-tRNA to the ribosome more efficiently than normal EF-1α and relieve "protein sticking" in the channel (Nelson et al. 1992).

We sought to retrace the evolution of the translation termination system, focusing here on the origins of eRF1 and eRF3. Considering eukaryotes other than animals, plants, and fungi, the GenBank database contains no eRF3 sequences and only two eRF1 sequences (Plasmodium falciparum, AE001492; Tetrahymena thermophila, AB026195). In this study, we amplified, cloned, and sequenced eRF3 genes from two ciliates, Euplotes aediculatus and Oxytricha trifallax, and two amitochondriate protists, Trichomonas vaginalis (Parabasalia) and Giardia lamblia (Diplomonada), as well as the G. lamblia eRF1 gene. In the public databases, we found previously unidentified HBS1 genes for Drosophila melanogaster, Schizosaccharomyces pombe, and Candida albicans. Phylogenetic analyses of these sequences allowed us to trace some of the early stages in the evolution of the eukaryotic version of the translation termination system and provided evidence for G. lamblia as one of the first-diverging lineages among eukaryotes.


    Materials and Methods
Sequencing of eRF1 and eRF3 Genes
A fragment of the G. lamblia eRF1 gene (0.3 kb) was amplified with a set of degenerate PCR primers, 45F2A (CCTAAAAAGCATGGNCGNGGNGG; N = A, C, G or T) and 45R3A (TTAATGCCGTTYTCNCCNCCRTA; R = A or G; Y = T or C). A 0.2-kb region of the EF-1α-like domain of the eRF3 gene was amplified with degenerate PCR primers, 35F2A (ACAATCCTCGACGCNCCNGGNCAYAA) and 35R2A (ACTGTTGGRTCRTGCATYTTRTT). For amplification of these genes, the thermal cycle, consisting of denaturation at 94°C for 30 s, annealing at 50°C for 1 min, and extension at 72°C for 1 min, was repeated 35 times. These fragments were cloned and sequenced on both strands.
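
The degenerate primers listed above each stand for a pool of concrete oligonucleotides. As a rough illustration (not part of the original protocol), the following Python sketch counts how many sequences each primer represents and can enumerate them. The function names `degeneracy` and `expand` are hypothetical helpers introduced here; the code table covers only the IUPAC symbols that occur in these primers.

```python
from itertools import product

# IUPAC codes used by the primers in this section (N, R, Y) plus plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

# Primer sequences copied from the text.
PRIMERS = {
    "45F2A": "CCTAAAAAGCATGGNCGNGGNGG",
    "45R3A": "TTAATGCCGTTYTCNCCNCCRTA",
    "35F2A": "ACAATCCTCGACGCNCCNGGNCAYAA",
    "35R2A": "ACTGTTGGRTCRTGCATYTTRTT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, degeneracy {degeneracy(seq)}")
```

Counting is cheap even for highly degenerate primers, whereas `expand` grows multiplicatively with each ambiguous position, so it is best reserved for short pools.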

Exact-match primers based on these partial G. lamblia sequences were used for Uneven PCR (Chen and Wu 1997). The basic strategy of this method is to amplify unknown N- and C-terminal regions with an exact-match primer and a short primer with an arbitrary sequence (arbitrary primer) that can hybridize to many sites on a template DNA. For each amplified fragment, multiple clones were completely sequenced to avoid PCR artifacts. The sequenced regions for the G. lamblia eRF1 and eRF3 genes were 2.7 and 2.0 kb, respectively. The detailed conditions of Uneven PCR were as described in the original paper (Chen and Wu 1997). Arbitrary primers used in this study were provided by J. M. Logsdon (Dalhousie University, Halifax, Nova Scotia, Canada).

The EF-1α-like domains of the O. trifallax and E. aediculatus eRF3 genes were partially amplified with degenerate primers 35F2A and 35R2A. In the macronuclei of hypotrich ciliates (such as O. trifallax and E. aediculatus), each gene is known to reside on its own small chromosome with telomeric ends (Prescott 1994). Therefore, using PCR with exact-match primers for ciliate eRF3 genes and a primer designed to bind the telomere sequence, 5'-(CCCCAAAA)n-3', we succeeded in amplifying the entire macronuclear chromosomes for the two ciliate eRF3 genes. These amplifications were carried out as follows: denaturation at 94°C for 1 min, annealing at 57°C for 30 s, and extension at 72°C for 30 s for 35 cycles. DNA sequences of these amplified fragments were confirmed by multiple clones. The macronuclear chromosomes containing the O. trifallax and E. aediculatus eRF3 genes were 3.3 and 2.6 kb, respectively. Primers for telomeric repeats were provided by D. M. Prescott (University of Colorado, Boulder).

A 0.4-kb region of the T. vaginalis eRF3 gene was obtained by PCR with primers 35F1A (GTCTTTATCGGCCAYGTNGAYGCNGG) and 35R2A. With this PCR product as a probe, a T. vaginalis genomic DNA library (provided by J. M. Logsdon and N. M. Fast, Dalhousie University) was screened, and one clone containing the entire eRF3 gene (1,764 bp) was isolated. Both DNA strands of this gene were completely sequenced by primer walking. All sequences determined in this study were deposited in GenBank under accession numbers AF198107–AF198111.

Identifying HBS1 Genes in Databases
Homologs of the yeast HBS1 gene, which have an EF-1α-like domain at their C-terminus, have been described only for C. elegans and two mammals (Nelson et al. 1992; Wallrapp et al. 1998). We found two previously unidentified HBS1 genes from D. melanogaster and S. pombe in the GenBank nonredundant (nr) database using TBLASTN (Altschul et al. 1997). The D. melanogaster homolog was found in the region 62A10–62B5 on chromosome 3L (positions 997–3009 in AC005557). We also identified several partial cDNA sequences for this gene (e.g., AA952043, AI388068, and AI455532). The S. pombe HBS1 homolog was identified in chromosome II cosmid c2G5 (AL033385; positions 18575–20447). In a search of the C. albicans database at the Stanford DNA Sequencing and Technology Center, the ORF found in contig 3–3135 (positions 1–1629) appeared to encode an HBS1 homolog. Although the N-terminus of this ORF is truncated, the entire EF-1α-like domain of HBS1 is covered by this contig. Sequence data for C. albicans were obtained from the Stanford DNA Sequencing and Technology Center website at http://www-sequence.stanford.edu/group/candida.

Phylogenetic Analyses
Eleven eRF1 and six aRF1 sequences ({eRF1 + aRF1}) were aligned by CLUSTAL W (Higgins and Sharp 1988) and subsequently edited by eye. An unambiguous alignment (358 positions) was used for reconstruction of the eRF1 phylogeny. The best protein maximum-likelihood (ML) tree was chosen among 1,000 trees constructed by quick-add-OTU (heuristic) search using the JTT-F substitution model, implemented in PROTML, version 2.2 (Adachi and Hasegawa 1996). For further tests of the eRF1 phylogeny, we performed bootstrap analyses using PROTML. One hundred data sets were generated from the original alignment using SEQBOOT (Felsenstein 1993). For each data set, we searched for the best ML tree among 100 trees constructed by heuristic search with JTT-F. The bootstrap scores were generated from a consensus of the best ML trees obtained from the 100 resampled data sets using CONSENSE (Felsenstein 1993).
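
The resampling step described above draws alignment columns with replacement to build each pseudo-replicate. The Python sketch below illustrates that general idea only; it is not SEQBOOT itself, and the function names and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling columns with replacement.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all sequences must have the same aligned length")
    # Draw as many column indices as the alignment has positions.
    columns = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in columns) for n in names}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of resampled alignments (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment, invented purely for illustration.
toy = {"taxonA": "MKVLT", "taxonB": "MKILT", "taxonC": "MRVLS"}
replicates = bootstrap_replicates(toy, n_replicates=100, seed=1)
```

Each replicate keeps whole columns intact, so the same site is sampled across all taxa; a tree is then inferred from every replicate and the trees are summarized by a consensus method.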

The EF-1α-like domains of eRF3 and HBS1 were manually added to an EF-1α alignment provided by S. L. Baldauf (University of York, U.K.) and edited by eye. The resultant unambiguous alignment (372 positions) includes 16 eRF3, 7 HBS1, 16 eukaryotic EF-1α (eEF-1α), and 9 archaeal EF-1α (aEF-1α) sequences. The best ML tree and resampling estimated log likelihood (RELL) values were obtained from 1,000 trees constructed by heuristic search with JTT-F inferred from this master alignment.
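
RELL avoids re-optimizing the likelihood for every bootstrap replicate: the per-site log-likelihoods of each candidate tree are resampled instead, and the tree with the highest resampled total wins that replicate. The sketch below illustrates this general idea under simplifying assumptions; it is not PROTML's implementation, the function name `rell_proportions` is hypothetical, and the per-site values in `toy_lnl` are invented.

```python
import random


def rell_proportions(site_lnl, n_replicates=1000, seed=0):
    """Estimate the fraction of resamplings in which each tree scores best.

    `site_lnl` maps a tree label to its list of per-site log-likelihoods;
    all lists must refer to the same sites in the same order.
    """
    rng = random.Random(seed)
    trees = list(site_lnl)
    n_sites = len(site_lnl[trees[0]])
    wins = dict.fromkeys(trees, 0)
    for _ in range(n_replicates):
        # Resample site indices with replacement, once per replicate.
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = {t: sum(site_lnl[t][s] for s in sites) for t in trees}
        wins[max(trees, key=totals.get)] += 1
    return {t: wins[t] / n_replicates for t in trees}


# Invented per-site log-likelihoods for two hypothetical candidate trees.
toy_lnl = {
    "tree_1": [-3.2, -1.1, -4.0, -2.7, -3.9, -1.8],
    "tree_2": [-3.0, -1.4, -4.1, -2.5, -4.3, -1.6],
}
proportions = rell_proportions(toy_lnl, n_replicates=500, seed=1)
```

The same site indices are applied to every tree within a replicate, which is what makes the comparison between trees paired rather than independent.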

It is well documented that the choice of sequences used for an outgroup can significantly affect phylogenetic inferences (e.g., Roger et al. 1999). For the eRF3 phylogeny, eEF-1α, HBS1, and aEF-1α are available as potential outgroups. From the master alignment including 48 sequences, we generated data sets including 16 eRF3 sequences and 3 outgroup sequences: human (P04720), Porphyra purpurea (U08844), and G. lamblia (D14342) eEF-1α ({eRF3 + eEF-1α}); human (U87791), C. elegans (Z81098), and yeast (Z28309) HBS1 ({eRF3 + HBS1}); or Methanococcus jannaschii (U67486), Archaeoglobus fulgidus (AE001039), and Desulfurococcus mobilis (X73582) aEF-1α ({eRF3 + aEF-1α}). For each of the three data sets, the best ML tree was selected from a heuristic search, and bootstrap analyses were performed using resampled data sets.

In PROTML (Adachi and Hasegawa 1996), all sites are treated as freely changeable with the same evolutionary rate, although in reality sites evolve at various rates. Therefore, some portion of constant sites in our data sets might be invariable and potentially violate phylogenetic assumptions made by PROTML. In fact, a previous study found that constant (or invariable) site removal from several data sets significantly reduced statistical support for early divergence of G. lamblia and T. vaginalis (Hirt et al. 1999). To examine the effect of constant-site removal on our analyses, we removed these sites from our data sets, {eRF3 + eEF-1α}, {eRF3 + HBS1}, and {eRF1 + aRF1}, and performed bootstrap analyses again.
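
Constant-site removal as described above amounts to dropping every alignment column in which all sequences share the same residue. A minimal Python sketch follows; the function name and toy alignment are hypothetical, and gap characters are treated like any other symbol, which is an assumption rather than something the text specifies.

```python
def remove_constant_sites(alignment):
    """Drop columns where every sequence has the same character.

    Returns the trimmed alignment and the number of columns removed.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    # Keep a column only if it shows at least two different characters.
    keep = [
        i for i in range(length)
        if len({alignment[n][i] for n in names}) > 1
    ]
    trimmed = {n: "".join(alignment[n][i] for i in keep) for n in names}
    return trimmed, length - len(keep)


# Toy alignment, invented for illustration: columns 1 and 4 are constant.
toy_alignment = {"taxonA": "MKVLT", "taxonB": "MKILT", "taxonC": "MRVLS"}
trimmed, n_removed = remove_constant_sites(toy_alignment)
```

Note that a constant column is only a candidate invariable site: it may simply not have changed among the sampled taxa, which is why the counts of constant and estimated invariable sites can differ.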

For the calibration of site-by-site evolutionary rates in eRF1 and eRF3, we calculated a discrete γ distribution (one invariable and eight rate categories) over the neighbor-joining (NJ) tree with the JTT model using PUZZLE, version 4.0 (Strimmer and von Haeseler 1996), for each of the former data sets. Based on the ML distance matrices that were calibrated by γ distributions, the NJ trees were constructed by NEIGHBOR (Felsenstein 1993). For bootstrap analyses, 500 resampled data sets were generated by SEQBOOT (Felsenstein 1993), and the ML distance matrices were subsequently calculated by PUZZLE (Strimmer and von Haeseler 1996) and PUZZLEBOOT, version 1.02 (A. J. Roger and M. E. Holder, http://members.tripod.de/korbi/puzzle/), using the proportion of invariable sites and the shape parameter obtained from the original data set. Using these matrices, bootstrap scores were obtained by NEIGHBOR and CONSENSE (Felsenstein 1993).
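
The rate model described above (one invariable category plus eight γ rate categories) is commonly written as a mixture over site rates. The expression below is the standard textbook form of such a model, given as a sketch; the symbols $p_{\mathrm{inv}}$, $r_k$, and $\alpha$ are notation introduced here for illustration and are not taken from the PUZZLE documentation.

```latex
L_i \;=\; p_{\mathrm{inv}}\, L_i(r = 0)
      \;+\; \bigl(1 - p_{\mathrm{inv}}\bigr) \sum_{k=1}^{8} \frac{1}{8}\, L_i(r_k),
\qquad r \sim \mathrm{Gamma}(\alpha, \alpha), \quad \mathrm{E}[r] = 1
```

Here $L_i$ is the likelihood of site $i$, $p_{\mathrm{inv}}$ is the proportion of invariable sites, and $r_k$ is the representative rate of the $k$-th of eight equally probable categories of a γ distribution with shape parameter $\alpha$. The term $L_i(r = 0)$ is nonzero only for constant sites, which is why estimated invariable sites form a subset of the constant sites.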


    Results
Sequencing and Database Analyses
We determined the entire sequence of the G. lamblia eRF1 gene. This gene lacks introns and is predicted to encode a protein of 457 aa. The top BLASTX match against the GenBank nr database is with the Arabidopsis thaliana eRF1 (P35614), with an expected value (e-value) of 10^-131. The G. lamblia eRF1 has an acidic C-terminal tail homologous to the yeast eRF1 C-terminal region, which has been shown to be critical for eRF3 binding (Ito, Ebihara, and Nakamura 1998) (fig. 1A). Within 58 bp downstream of the stop codon of the eRF1 gene, the C-terminus of a recA-homologous gene was found (data not shown). We also found a putative ORF of 145 aa in the 5' upstream region (107 bp away), but no similarity to sequences in the database was detected (data not shown).



Fig. 1. A, Comparison of the C-terminal regions of eRF1 and aRF1. Identical (black) and highly conserved (gray) positions among 11 eRF1 and 6 aRF1 sequences are highlighted. Sequences of the C-terminal tail of eRF1 are not aligned. Acidic residues in the C-terminal tail of eRF1 (aspartic acid, D; glutamic acid, E) and their amide counterparts (asparagine, N; glutamine, Q) are highlighted as bold letters. B, Boundary between the N- and C-terminal domains of eRF3 and HBS1. The vertical line shows the predicted boundary of the N- and C-terminal domains (Stansfield and Tuite 1994). Sequences of the N-terminal domain are not aligned. Identical (black) and highly conserved (gray) positions among 16 eRF3 and 7 HBS1 sequences are highlighted. Predicted N-terminal domain lengths are indicated on the right of each row. The N-terminal domain of C. albicans HBS1 is partial and its length is given in parentheses.

 
In the 2.0-kb region sequenced for the G. lamblia eRF3 gene, two putative ORFs, of 153 and 466 aa, were found, separated by a 66-bp spacer region. The 466-aa ORF matches the Nicotiana tabacum eRF3 (L38828), with an e-value of 4 × 10^-87, in a BLASTX survey. The shorter ORF showed no similarity to sequences in the database. The G. lamblia eRF3 gene appeared to lack introns. Surprisingly, this gene encodes only the region corresponding to the EF-1α-like domain of other eRF3 proteins: its product would lack the N-terminal domain of previously described eRF3 proteins (fig. 1B).

The genes for O. trifallax, E. aediculatus, and T. vaginalis eRF3 should encode 937-, 806-, and 588-aa proteins, respectively. No evidence for introns was found in these genes. The top BLASTX matches of the O. trifallax, E. aediculatus, and T. vaginalis sequences were to the N. tabacum eRF3 (L38828), with e-values of 10^-120, 10^-115, and 10^-101, respectively. Like all eRF3 genes except that of G. lamblia, the O. trifallax, E. aediculatus, and T. vaginalis genes appeared to encode N-terminal domains (of 412, 302, and 155 aa, respectively) that are not found in EF-1α (fig. 1B). While the N-terminal domains of the two ciliate eRF3s have no apparent repetitive sequence, we found 13 "PEPK" repeats in the T. vaginalis protein (data not shown). Compared with the yeast eRF3, unique C-terminal extensions of 87 and 67 aa were found in the O. trifallax and E. aediculatus proteins (data not shown).

We identified the D. melanogaster and S. pombe homologs of yeast HBS1 in the GenBank databases. The top BLASTX matches of the D. melanogaster and S. pombe sequences were human HBS1 (AAD00645), with e-values of 10^-175 and 10^-90, respectively. The D. melanogaster HBS1 (671 aa) has an N-terminal domain of 240 aa, as well as the EF-1α-like domain (fig. 1B). The D. melanogaster gene seems to contain no introns. The S. pombe gene should encode a 592-aa protein, with an N-terminal domain of 170 aa. Comparison with the 5' end sequence of its cDNA (AU013219) reveals that there are two short introns (54 and 43 bp) in this gene (data not shown).

A partial gene sequence for HBS1 was also found in the C. albicans sequence database at the Stanford DNA Sequencing and Technology Center. The top BLASTX match of the C. albicans sequence is the S. cerevisiae HBS1 (P32769), with an e-value of 3 × 10^-81. Although the 5' end of this gene is truncated, the partial gene encodes 105 aa of an N-terminal domain and the entire EF-1α-like domain (422 aa) (fig. 1B).

Phylogenetic Analyses
Bootstrap analysis for the {eRF1 + aRF1} data set statistically supported the initial divergence of G. lamblia and defined clades for A. thaliana and animal sequences (bootstrap scores 88, 99, and 76, respectively) in the inferred best ML tree (lnL = -6,900.46) (fig. 2A). The removal of 37 constant sites did not significantly affect the support for the three branches mentioned above (fig. 2A). We noticed that the number of invariable sites estimated by PUZZLE (Strimmer and von Haeseler 1996) was only 10, which is significantly different from the number of constant sites. Nevertheless, the bootstrap scores in the absence of 10 invariable sites did not drastically change (fig. 2A). In the NJ tree based on the ML distance matrix, G. lamblia was inferred as the basal eukaryote (data not shown), although the bootstrap score for this node was only 30 (fig. 2A). Support for the animal clade also dropped (bootstrap score 43), and the only node supported by a bootstrap score over 50 was for the A. thaliana clade (bootstrap score 88) (fig. 2A).



Fig. 2. The eRF1 and eRF3 phylogenies. A, The phylogenetic tree inferred from 11 eRF1 and 3 aRF1 sequences. The best maximum-likelihood (ML) tree (lnL = -6,900.46) was selected from 1,000 trees constructed by a heuristic search using the JTT-F model, implemented in PROTML, version 2.2 (Adachi and Hasegawa 1996). Support values over 50 are indicated. For the position of Giardia lamblia and for the animal and Arabidopsis thaliana clades, bootstrap scores inferred from the data set in the absence of constant sites (CSR) or invariable sites (ISR) and with the neighbor-joining (NJ) method calibrated by a discrete γ distribution (NJ-γ) are shown. B, The phylogenetic tree inferred from the data set including 16 eRF3 and 3 eukaryotic EF-1α (eEF-1α) sequences ({eRF3 + eEF-1α}). The best ML tree (lnL = -9,044.10) from 1,000 trees constructed by heuristic search using JTT-F is shown. Support values over 50 are indicated. For the position of G. lamblia and the animals-fungi clade, bootstrap scores obtained using all sites (Full), in the absence of constant sites (CSR), and with the NJ method calibrated by a discrete γ distribution (NJ-γ) are listed for three outgroups (eEF-1α, HBS1, and archaeal EF-1α). For the {eRF3 + archaeal EF-1α} data set, only bootstrap analysis using all sites was performed.

 
The best ML tree inferred from the {eRF3 + eEF-1α} data set (lnL = -9,044.10) is shown in figure 2B. The tree inferred from the {eRF3 + HBS1} data set (lnL = -9,692.87) has the same topology (data not shown). The bootstrap analyses with the two data sets supported the branching of G. lamblia at the base of the eRF3 clade (bootstrap scores 74 and 72). In the absence of 47 and 42 constant sites in the {eRF3 + eEF-1α} and {eRF3 + HBS1} data sets, this protist was the only taxon suggested as the first branch with a bootstrap score over 50. The bootstrap scores for this node decreased from 74 to 62 for the {eRF3 + eEF-1α} data set and from 73 to 71 for the {eRF3 + HBS1} data set (fig. 2B). Furthermore, the NJ trees based on the ML distance matrices for the two data sets suggest G. lamblia as the first eukaryotic branch, with bootstrap scores of 72 and 74, respectively (fig. 2B).

The best ML tree based on the {eRF3 + aEF-1α} data set (lnL = -9,125.38) had a different topology from the two trees mentioned above (data not shown); the first eRF3 branch was T. vaginalis, with G. lamblia being connected to ciliates. However, the bootstrap analysis for this data set poorly supported the basal separation of T. vaginalis (bootstrap score 45) and the node for the G. lamblia–ciliates clade (bootstrap score 35). We did not perform bootstrap analysis in the absence of constant sites or with calibration of rate heterogeneity across sites for this data set.

Regardless of the outgroup or method used for the reconstruction of eRF3 phylogeny, the clades for animals, fungi, and ciliates were highly supported (partially listed in fig. 2B), but the phylogenetic positions for ciliates and T. vaginalis were unresolved. Interestingly, N. tabacum was most closely related to the animals-fungi clade in the ML distance trees using the {eRF3 + eEF-1α} and {eRF3 + HBS1} data sets (bootstrap scores 60 and 52).

The best ML tree (lnL = -20,460.3) inferred from the data set including 48 sequences confirmed monophyly of eRF3, HBS1, eEF-1α, and aEF-1α sequences with RELL values at or near 100 (fig. 3). The three HBS1 sequences newly identified in this study formed a robust clade with the previously reported HBS1 orthologs. Consistent with the previous report (Wallrapp et al. 1998), the eRF3 and HBS1 clades form a robust sister relationship (RELL value 100).



Fig. 3. The phylogenetic tree inferred from 16 eRF3, 7 HBS1, 16 eukaryotic EF-1α, and 9 archaeal EF-1α sequences. The best maximum-likelihood tree (lnL = -20,460.3) was selected from 1,000 trees constructed by a heuristic search using the JTT-F model in PROTML, version 2.2 (Adachi and Hasegawa 1996). Resampling estimated log likelihood values are indicated except for the interior nodes of the two EF-1α clades. EF-1α sequences used for this analysis were human (P04720), Caenorhabditis elegans (U40935), Schizosaccharomyces pombe (D82571), Podospora anserina (X74799), Dictyostelium discoideum (X55972), Physarum polycephalum (AF016243), Trypanosoma brucei (U10562), Euglena gracilis (X16890), Arabidopsis thaliana (X14631), Nicotiana tabacum (D63396), Porphyra purpurea (U08844), Tetrahymena pyriformis (D11083), Plasmodium knowlesi (AJ224153), Trichomonas vaginalis (AF058282), Giardia lamblia (D14342), Hexamita inflata (U37081), Sulfolobus solfataricus (X70701), Sulfolobus acidocaldarius (X52382), Desulfurococcus mobilis (X73582), Pyrococcus horikoshii (AP000006), Thermococcus celer (X52383), Archaeoglobus fulgidus (AE001039), Halobacterium maris-mortui (P48863), Methanobacterium thermoautotrophicum (AE000877), and Methanococcus jannaschii (U67486). The GenBank accession numbers for the eRF3 and HBS1 sequences used in this analysis are listed in figure 1.

 

    Discussion
The Position of G. lamblia
Phylogenetic analyses based on small-subunit ribosomal RNA (SSU rRNA) initially suggested that certain anaerobic protist lineages (among them, those containing G. lamblia and T. vaginalis) diverged early from the rest of the eukaryotes (the "crown eukaryotes") (Sogin 1997). This result was consistent with the notion that their lack of mitochondria was a primitive condition. However, mitochondrial-like cpn60 homologs have been described from G. lamblia and T. vaginalis (Roger, Clark, and Doolittle 1996; Roger et al. 1998). The most parsimonious explanation for these observations is that the two protists once harbored a mitochondrial endosymbiont (or a closely related α-proteobacterium) and transferred the cpn60 genes to their nuclear genomes. Furthermore, it has been claimed that the divergence of G. lamblia and/or T. vaginalis prior to that of the crown eukaryotes in some phylogenetic trees may be artifactual, a consequence of "long-branch attraction" (Embley and Hirt 1998; Philippe and Adoutte 1998). The results of careful analyses accounting for rate heterogeneity seem to confirm the claim (e.g., Hirt et al. 1999; Stiller and Hall 1999).

Nevertheless, it seems reasonable to suppose that some eukaryotic lineages diverged before the crown radiation. One might hope to identify them by their possession of "more primitive," or at least differently composed, versions of molecular machinery than are known to be conserved among crown eukaryotes. For instance, early-diverging eukaryotic lineages might have only one or two tubulins (an outgroup to α, β, and/or γ) or exhibit a translation termination system more like that of archaea, with no eRF3 and an eRF1 lacking a C-terminal tail for eRF3 binding (fig. 1A and below). Our analyses show that the EF-1α gene duplication which gave rise to eRF3 is found in those protist lineages generally thought to be the deepest, such as G. lamblia and T. vaginalis. Unless some as-yet-uncharacterized protist group claims that honor, we must conclude that this gene duplication occurred at the "base of the eukaryotes."

However, we provide two lines of evidence for G. lamblia being the most basal branch examined so far. First, there is reasonable bootstrap support for its branching earlier than T. vaginalis in eRF3 trees rooted by eEF-1α and HBS1 (fig. 2B). The support for this node is not drastically reduced by removal of constant sites (fig. 2B). Our results are in contrast to the recent report that the support for G. lamblia and T. vaginalis as branching prior to the crown radiation in the trees of the largest subunit of RNA polymerase II or EF-2 greatly depends on constant (or invariable) sites in the data set that potentially disturb phylogenetic inferences using PROTML (Hirt et al. 1999). In our case, even when the rate heterogeneity across sites in these data sets was compensated for (using a discrete γ distribution), the basal position of G. lamblia was statistically supported (fig. 2B). We have no evidence that the position of G. lamblia deduced from the eRF3 phylogenies is artifactual, resulting from long-branch attraction.

The bootstrap scores obtained from the eRF1 phylogeny inferred from ML distance analysis (the NJ method considering a γ distribution) supported only the A. thaliana clade, while, in contrast, the earliest branching of G. lamblia and the animal clade were robust in the ML analyses (fig. 2A). The eRF3 phylogeny rooted with aEF-1α suggested no strongly supported basal taxon (see Results). Therefore, these trees did not provide further support for or against the G. lamblia position that was inferred from the former analyses.

Second, our data from the structure of the G. lamblia eRF3 gene, which differs from all other eRF3 genes in lacking an N-terminal domain (fig. 1B), are most consistent with the basal divergence of this protist. If this is a primitive feature (i.e., the ancestral eRF3 lacked the N-terminal domain), then parsimony would dictate that G. lamblia is deeper than T. vaginalis (or any other eukaryote which has this additional domain). Further exploration of informative molecular characters (e.g., indels, presence/absence of molecular machineries) is essential to determine the relative branching points of G. lamblia and T. vaginalis among eukaryotes.
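
The parsimony argument above can be made concrete by counting the minimum number of gains or losses of the N-terminal domain on alternative topologies. The Python sketch below applies Fitch parsimony to a single binary character; it is an illustration only, and it assumes that EF-1α (which lacks the domain) is an appropriate outgroup state, which is exactly the assumption the text flags as uncertain. The tree shapes, taxon labels, and function name are hypothetical simplifications.

```python
def fitch(tree, states):
    """Fitch parsimony for one character on a rooted binary tree.

    `tree` is a leaf name (str) or a 2-tuple of subtrees.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# 1 = N-terminal domain present, 0 = absent (EF-1alpha state is an assumption).
states = {
    "EF-1alpha": 0,
    "G. lamblia": 0,
    "T. vaginalis": 1,
    "other eRF3": 1,
}

# Simplified alternative rootings of the eRF3 clade.
giardia_first = ("EF-1alpha", ("G. lamblia", ("T. vaginalis", "other eRF3")))
trichomonas_first = ("EF-1alpha", ("T. vaginalis", ("G. lamblia", "other eRF3")))

for label, tree in [("G. lamblia basal", giardia_first),
                    ("T. vaginalis basal", trichomonas_first)]:
    print(label, "requires", fitch(tree, states)[1], "change(s)")
```

Under these assumptions the topology with G. lamblia at the base needs a single gain of the domain, whereas placing T. vaginalis at the base needs two events. If the ancestral eRF3 already carried the domain, the comparison no longer favors either topology.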

At this point, we cannot, unfortunately, be certain of the ancestral structure of eRF3. The available HBS1 genes all show an N-terminal domain not found for EF-1α, and our phylogenetic analysis indicates a close relationship between the two gene families (fig. 3). If HBS1 and eRF3 are products of a gene duplication occurring after the acquisition of this domain, the parsimony argument does not hold. However, there is no obvious sequence similarity between the N-terminal (non–EF-1α) portions of HBS1 and eRF3. The branches for eRF3 and HBS1 are extremely long (fig. 3), and thus the sister relationship of HBS1 and eRF3 may be a long-branch attraction artifact. More HBS1 data (particularly from protists) are very much needed to examine this possibility.

Evolution of the Translation Termination System
The ribosome and associated proteins arguably comprise the cell's most complex (and possibly oldest) macromolecular machine. Several of the constituent molecules responsible for its conserved and essential functions are, not surprisingly, conserved among bacteria, archaea, and eukaryotes. There are, however, substantial differences in function and molecular composition within and between domains, and homology of components has not always been easy to demonstrate (Kyrpides and Woese 1998a, 1998b).

Translation termination is probably the least well understood facet of the translation process in either functional or evolutionary terms. Figure 4 presents a summary of the possible relationships between release factors based on our results and those of others (Keeling and Doolittle 1995; Ito et al. 1996). The principal release factors (bacterial RF1/RF2, archaeal/eukaryotic RF1) are probably all homologous to each other. Although we failed to detect any similarity between RF1/RF2 and eRF1 with sensitive local alignment programs such as TBLASTN and PSI-BLAST (Altschul et al. 1997), Ito et al. (1996) have presented convincing alignments of the bacterial and eukaryotic sequences and suggested further that both are homologous to EF-G. Similarity between aRF1 and eRF1 is much more apparent (fig. 2A). Archaeal translation termination, like most archaeal replication, transcription, and translation functions, is probably most like the corresponding process in eukaryotes. It also seems simplest to imagine that bacterial RF1 and RF2 are products of a bacterial domain–specific duplication (fig. 4, numeral 3). However, it is formally possible that the use of a single factor to recognize all three stop codons is a derived condition for eukaryotes (and archaea).



Fig. 4.—A scheme for the evolution of elongation and release factors in Bacteria, Archaea, and Eukarya. Numeral 1 indicates the gene duplication of EF-Tu/1{alpha} and EF-G/2 in the last universal common ancestor. Numeral 2 suggests the emergence of ancestral release factors from EF-G based on the hypothesis of Ito et al. (1996). The ancestral molecule of the release factor in bacteria probably duplicated to produce RF1 and RF2 (3). RF3 arose from a duplication of the EF-G gene in the bacterial lineage (4). Numeral 5 indicates a gene duplication of EF-1{alpha} gene in the ancestral eukaryote. Corresponding to the emergence of eRF3, an acidic amino acid tail might have been added to the C-terminus of eRF1 (5'). We suggest that G. lamblia might have diverged (6) prior to the acquisition of the eRF3-specific N-terminal domain (*) and the subsequent divergence of eukaryotes (7)

 
In bacteria and eukaryotes, the principal release factors function in conjunction with a GTPase (RF3 or eRF3, respectively). These appear to be of independent origin, with the bacterial factor being a homolog of EF-G and the eukaryotic protein being a specific relative of EF-Tu (fig. 4, numerals 4 and 5). (These elongation factors are themselves, of course, the products of an even more ancient gene duplication [fig. 4, numeral 1].) It is reasonable to suppose that a similar GTPase plays a similar role in archaea. However, aRF1 lacks the C-terminal tail found on all eRF1 proteins and known in yeast to be necessary for eRF3 binding (fig. 1A). Furthermore, six sequenced archaeal genomes contain no obvious homologs to either RF3 or eRF3 and no translation-related GTPase other than EF-1{alpha} or EF-2 (Bult et al. 1996; Klenk et al. 1997; Smith et al. 1997; Kawarabayashi et al. 1998, 1999). Therefore, it is possible that aRF1 works alone. In vitro termination assays show that eRF1 alone is capable of hydrolyzing peptidyl-tRNA in the absence of eRF3 and GTP (Frolova et al. 1994; Zhouravleva et al. 1995). Only when eRF3 is added does GTP hydrolysis (catalyzed by an eRF1-eRF3-GTP-ribosome complex) accompany peptidyl-tRNA hydrolysis (Frolova et al. 1996). Similarly, in vivo studies show that an S. pombe eRF1 mutant protein lacking the ability to interact with eRF3 can complement ts mutations in yeast eRF1 (Ito, Ebihara, and Nakamura 1998).

Our results most directly address the evolution of the translation termination system in eukaryotes after their divergence from prokaryotes. All examined eukaryotes, including T. vaginalis and G. lamblia, have eRF3. All described eRF1 proteins, including that of G. lamblia, have the C-terminal tail necessary for interaction with eRF3. Until eukaryotes branching before the divergence of G. lamblia are identified and shown to lack these features, it seems reasonable to suppose that the duplication of eEF-1{alpha} which produced eRF3 and the addition of the eRF3-binding domain to eRF1 occurred at the base of the eukaryotes (fig. 4, numerals 5 and 5'). It is possible, although not proven here, that G. lamblia diverged prior to the acquisition by eRF3 of its N-terminal ("prion") domain (fig. 4, numeral 6 and asterisk).


    Acknowledgements
 
We thank J. M. Archibald for the genomic DNA samples of G. lamblia, M. Muller for the genomic DNA sample of T. vaginalis, D. M. Prescott for the macronuclear DNA samples of O. trifallax and E. aediculatus, and A. Stoltzfus for mol2con.pl. We also thank A. J. Roger for helping with ML distance analyses and members of the Doolittle lab for helpful discussion and critical review of this manuscript. Sequencing of C. albicans was accomplished with the support of the NIDR and the Burroughs Wellcome Fund. Y.I. was supported by a research fellowship from the Japanese Society for the Promotion of Science for Young Scientists, abroad. This work was supported by grant ML4465 from the Canadian Medical Research Council to W.F.D.


    Footnotes
 
Masami Hasegawa, Reviewing Editor

1 Abbreviations: aRF1, archaeal release factor 1; eEF-1{alpha}, eukaryotic EF-1{alpha}; aEF-1{alpha}, archaeal EF-1{alpha}; eRF1, eukaryotic release factor 1; eRF3, eukaryotic release factor 3; ML, maximum likelihood; PCR, polymerase chain reaction; RF1, bacterial release factor 1; RF2, bacterial release factor 2; RF3, bacterial release factor 3; ts, temperature-sensitive.

2 Keywords: translation termination; eukaryotic release factor; EF-1{alpha}; HBS1; gene duplication; Giardia.

3 Address for correspondence and reprints: Yuji Inagaki, Program in Evolutionary Biology, Canadian Institute for Advanced Research, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4H7. E-mail: yinagai{at}is.dal.ca


    literature cited
 

    Adachi, J., and M. Hasegawa. 1996. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput. Sci. Monogr. 28:1–150.

    Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

    Breining, P., and W. Piepersberg. 1986. Yeast omnipotent supressor SUP1 (SUP45): nucleotide sequence of the wildtype and a mutant gene. Nucleic Acids Res. 14:5187–5197.

    Bult, C. J., O. White, G. J. Olsen et al. (40 co-authors). 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058–1073.

    Chen, X., and R. Wu. 1997. Direct amplification of unknown genes and fragments by Uneven polymerase chain reaction. Gene 185:195–199.

    Dennis, P. P. 1997. Ancient ciphers: translation in Archaea. Cell 89:1007–1010.

    Eaglestone, S. S., B. S. Cox, and M. F. Tuite. 1999. Translation termination efficiency can be regulated in Saccharomyces cerevisiae by environmental stress through a prion-mediated mechanism. EMBO J. 18:1974–1981.

    Embley, T. M., and R. P. Hirt. 1998. Early branching eukaryotes? Curr. Opin. Genet. Dev. 8:624–629.

    Felsenstein, J. 1993. PHYLIP (phylogeny inference package). Distributed by the author, Department of Genetics, University of Washington, Seattle.

    Frolova, L., X. Le Goff, H. H. Rasmussen et al. (12 co-authors). 1994. A highly conserved eukaryotic protein family possessing properties of polypeptide chain release factor. Nature 372:701–703.

    Frolova, L., X. Le Goff, G. Zhouravleva, E. Davydova, M. Philippe, and L. Kisselev. 1996. Eukaryotic polypeptide chain release factor eRF3 is an eRF1- and ribosome-dependent guanosine triphosphatase. RNA 2:334–341.

    Garcia-Cantalejo, J., V. Baladron, P. F. Esteban, M. A. Santos, G. Bou, M. A. Remacha, J. L. Revuelta, J. P. Ballesta, A. Jimenez, and F. Del Rey. 1994. The complete sequence of an 18,002 bp segment of Saccharomyces cerevisiae chromosome XI contains the HBS1, MRP-L20 and PRP16 genes, and six new open reading frames. Yeast 10:231–245.

    Higgins, D. G., and P. M. Sharp. 1988. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73:237–244.

    Hirt, R. P., J. M. Logsdon Jr., B. Healy, M. W. Dorey, W. F. Doolittle, and T. M. Embley. 1999. Microsporidia are related to Fungi: evidence from the largest subunit of RNA polymerase II and other proteins. Proc. Natl. Acad. Sci. USA 96:580–585.

    Ito, K., K. Ebihara, and Y. Nakamura. 1998. The stretch of C-terminal acidic amino acids of translational release factor eRF1 is a primary binding site for eRF3 of fission yeast. RNA 4:958–972.

    Ito, K., K. Ebihara, M. Uno, and Y. Nakamura. 1996. Conserved motifs in prokaryotic and eukaryotic polypeptide release factors: tRNA-protein mimicry hypothesis. Proc. Natl. Acad. Sci. USA 93:5443–5448.

    Kawarabayashi, Y., Y. Hino, H. Horikawa et al. (25 co-authors). 1999. Complete genome sequence of an aerobic hyper-thermophilic crenarchaeon, Aeropyrum pernix K1. DNA Res. 6:83–101, 145–152.

    Kawarabayashi, Y., M. Sawada, H. Horikawa et al. (30 co-authors). 1998. Complete sequence and gene organization of the genome of a hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3. DNA Res. 5:55–76.

    Keeling, P. J., and W. F. Doolittle. 1995. Archaea: narrowing the gap between prokaryotes and eukaryotes. Proc. Natl. Acad. Sci. USA 92:5761–5764.

    Klenk, H. P., R. A. Clayton, J. F. Tomb et al. (51 co-authors). 1997. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390:364–370.

    Kochneva-Pervukhova, N. V., S. V. Paushkin, V. V. Kushnirov, B. S. Cox, M. F. Tuite, and M. D. Ter-Avanesyan. 1998. Mechanism of inhibition of Psi+ prion determinant propagation by a mutation of the N-terminus of the yeast Sup35 protein. EMBO J. 17:5805–5810.

    Kyrpides, N. C., and C. R. Woese. 1998a. Archaeal translation initiation revisited: the initiation factor 2 and eukaryotic initiation factor 2B alpha-beta-delta subunit families. Proc. Natl. Acad. Sci. USA 95:3726–3730.

    ———. 1998b. Universally conserved translation initiation factors. Proc. Natl. Acad. Sci. USA 95:224–228.

    Nelson, R. J., T. Ziegelhoffer, C. Nicolet, M. Werner-Washburne, and E. A. Craig. 1992. The translation machinery and 70 kd heat shock protein cooperate in protein synthesis. Cell 71:97–105.

    Philippe, H., and A. Adoutte. 1998. The molecular phylogeny of Eukaryota: solid facts and uncertainties. Pp. 25–56 in G. H. Coombs, K. Vickerman, M. A. Sleigh, and A. Warren, eds. Evolutionary relationship among protozoa. Chapman and Hall, London.

    Prescott, D. M. 1994. The DNA of ciliated protozoa. Microbiol. Rev. 68:233–267.

    Roger, A. J., C. G. Clark, and W. F. Doolittle. 1996. A possible mitochondrial gene in the early-branching amitochondriate protist Trichomonas vaginalis. Proc. Natl. Acad. Sci. USA 93:14618–14622.

    Roger, A. J., O. Sandblom, W. F. Doolittle, and H. Philippe. 1999. An evaluation of elongation factor 1 alpha as a phylogenetic marker for eukaryotes. Mol. Biol. Evol. 16:218–233.

    Roger, A. J., S. G. Svard, J. Tovar, C. G. Clark, M. W. Smith, F. D. Gillin, and M. L. Sogin. 1998. A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria. Proc. Natl. Acad. Sci. USA 95:229–234.

    Smith, D. R., L. A. Doucette-Stamm, C. Deloughery et al. (37 co-authors). 1997. Complete genome sequence of Methanobacterium thermoautotrophicum deltaH: functional analysis and comparative genomics. J. Bacteriol. 179:7135–7155.

    Sogin, M. L. 1997. History assignment: when was the mitochondrion founded? Curr. Opin. Genet. Dev. 7:792–799.

    Stansfield, I., K. M. Jones, V. V. Kushnirov, A. R. Dagkesamanskaya, A. I. Poznyakovski, S. V. Paushkin, C. R. Nierras, B. S. Cox, M. D. Ter-Avanesyan, and M. F. Tuite. 1995. The products of the SUP45 (eRF1) and SUP35 genes interact to mediate translation termination in Saccharomyces cerevisiae. EMBO J. 14:4365–4373.

    Stansfield, I., and M. F. Tuite. 1994. Polypeptide chain termination in Saccharomyces cerevisiae. Curr. Genet. 25:385–395.

    Stiller, J. W., and B. D. Hall. 1999. Long-branch attraction and the rDNA model of early eukaryotic evolution. Mol. Biol. Evol. 16:1270–1270.

    Strimmer, K., and A. von Haeseler. 1996. Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13:964–969.

    Ter-Avanesyan, M. D., V. V. Kushnirov, A. R. Dagkesamanskaya, S. A. Didichenko, Y. O. Chernoff, S. G. Inge-Vechtomov, and V. N. Smirnov. 1993. Deletion analysis of the SUP35 gene of the yeast Saccharomyces cerevisiae reveals two non-overlapping functional regions in the encoded protein. Mol. Microbiol. 7:683–692.

    Wallrapp, C., S. B. Verrier, G. Zhouravleva, H. Philippe, M. Philippe, T. M. Gress, and O. Jean-Jean. 1998. The product of the mammalian orthologue of the Saccharomyces cerevisiae HBS1 gene is phylogenetically related to eukaryotic release factor 3 (eRF3) but does not carry eRF3-like activity. FEBS Lett. 440:387–392.

    Zhouravleva, G., L. Frolova, X. Le Goff, R. Le Guellec, S. Inge-Vechtomov, L. Kisselev, and M. Philippe. 1995. Termination of translation in eukaryotes is governed by two interacting polypeptide chain release factors, eRF1 and eRF3. EMBO J. 14:4065–4072.

Accepted for publication February 7, 2000.