(Received for publication, March 19, 1996, and in revised form, December 12, 1996)
From the Department of Biology, University of Toledo, Toledo, Ohio 43606-3390 and the § Department of Biochemistry, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106-4935
Initiation of translation in eukaryotes is mediated by a set of initiation factors. Mammalian initiation factor 3 is composed of at least 8 subunits, with the largest being about 180 kDa in size. Here we report the cloning of the p180 subunit of human eukaryotic translation initiation factor (eIF) 3. The amino acid sequence deduced from the cDNA agrees with the sequences of CNBr fragments of eIF-3, confirming the identity of the clone. The 1382 amino acid open reading frame contains a high percentage of charged residues (48%) and an unusual repetitive domain near the carboxyl terminus composed of 25 repeats of 10 amino acids each. Data base searches identified related sequences found in members of the plant and fungal kingdoms as well as in other mammals and the nematode Caenorhabditis elegans. These sequences share significant identity with the human clone and probably represent the homologues of the p180 subunit in these organisms. This is the first report identifying the sequence of the large subunit of eIF-3.
Eukaryotic translation initiation factor (eIF)1 3 is the largest of the protein synthesis initiation factors, with a size of about 650 kDa. eIF-3 purified from rabbit reticulocyte lysate consists of at least eight individual polypeptide chains (1, 2). Originally, eIF-3 was identified as a factor that binds to the 40 S ribosomal subunit and thereby prevents the association of the 40 and 60 S subunits with one another. This results in a pool of 40 S subunits, which are then able to participate in the initiation process.
However, eIF-3 has also been implicated in a number of additional roles. One example is the association of eIF-3 with eIF-4F, an interaction sufficiently stable that 0.5 M KCl is required to separate the two initiation factors (3). eIF-3 interacts with a number of initiation factors in addition to eIF-4F. This suggests that eIF-3 may be the major protein that aligns the factors so that the mRNA is correctly positioned for initial binding to the 40 S subunit and the subsequent identification of the initiating AUG.
Of the many associations of eIF-3 with other translational components,
most appear to be via the p180 subunit. The ability of eIF-3 to bind to
and stabilize the ternary complex (eIF-2·GTP·Met-tRNAi) is dependent on the p180 subunit, since preparations that have been
depleted of p180 fail to promote the formation of the ternary complex
(4). The p180 subunit of eIF-3 has also been shown to interact with
eIF-4B using "Far Western" blotting.2
As noted above, eIF-3 interacts with eIF-4F. The site of interaction between eIF-3 and eIF-4F has been mapped to the middle region (amino acids
480-886) of the γ subunit of eIF-4F (5), but the binding site in
eIF-3 has not yet been identified. The limited biochemical evidence
indicates that the p180 subunit is important for all of these
interactions and processes.
Many mammalian factors can replace their yeast counterparts in vivo (1), suggesting extensive homology between the factors. For eIF-3, the only detailed comparison that can be made is between mammalian and wheat germ eIF-3, and, in this instance, the extensive similarity seen between the translation factors appears to break down (1). The differences in pI and apparent molecular weight have made it difficult to compare eIF-3 from different sources or infer functions of the subunits.
At present, much of what is known about eukaryotic protein synthesis has been learned from fractionated, cell-free systems. With the availability of cloned genes and cDNAs, it has been possible to begin to manipulate the translation factors either by altering the levels of expression or by site-directed mutagenesis. To date, cDNA clones have been obtained for all the known initiation factors except eIF-3 and eIF-6. Just recently, cDNA clones have been identified for two subunits of the yeast Saccharomyces cerevisiae eIF-3 (6-9). As an initial effort to better understand the role and regulation of eIF-3, we have obtained full-length cDNA clones for the p180 subunit of human eIF-3. Data base searches revealed sequences from S. cerevisiae, the nematode Caenorhabditis elegans, and the plant Nicotiana tabacum with significant similarity to the human cDNA. The sequences likely represent the corresponding subunits of eIF-3 in these organisms.3
Hybridoma 116, a cell fusion
between a rat spleen cell and mouse myeloma Sp2/0-Ag14 (10), was
isolated while searching for monoclonal antibodies (mAb) against
cadherin-related proteins. The immunoglobulin secreted by this
hybridoma is called mAb 116. Acrylamide and bisacrylamide were from
Bio-Rad; molecular biology grade urea and DNA size markers were from
Life Technologies, Inc. Size markers and other electrophoresis reagents
were from Sigma. [α-32P]dCTP and [α-35S]dATP were purchased from Amersham Corp.
Tran35S-label™ was from ICN Biochemicals. The human liver,
human keratinocyte, and HL-60 cell (11) cDNA libraries in λgt11
were from Clontech (Palo Alto, CA). Fetal calf serum was purchased from
HyClone Laboratories (Logan, UT). Other tissue culture reagents were
from Sigma. Components of bacteriological media were
purchased from Difco. Fluorescein-conjugated anti-rat IgG was from
Organon-Teknika (Durham, NC).
Normal human keratinocytes were isolated and
cultured as described (12). JAR choriocarcinoma cells (13) and A-431
cervical carcinoma cells (14) were cultured in Dulbecco's modified
Eagle's medium supplemented with penicillin/streptomycin and 10%
fetal calf serum. For metabolic labeling, JAR cells were starved in methionine-free medium for 4 h and labeled for 2 h with 1 mCi of Tran35S-label/3 ml of medium. Cells were extracted in 10 mM Tris acetate, pH 8.0, 0.5% Nonidet P-40, 1 mM CaCl2; the resulting homogenate was spun for
30 min at 15,000 × g; and the supernatant was used for
immunoprecipitation or SDS-gel electrophoresis. For immunofluorescence, cells were grown on glass coverslips, fixed in 4% paraformaldehyde, and permeabilized with methanol at −20 °C. Incubation in hybridoma supernatant was followed by incubation in fluorescein-conjugated anti-rat IgG. The cells were viewed in a Zeiss Axiophot microscope equipped with epifluorescence and photographed on Kodak T-MAX 3200 film.
Screening λgt11 libraries with mAb
116 was performed with slight modifications (15) to established
procedures (16). Lysogens were prepared and induced as described (16).
Induced lysogens were lysed by repeated freeze/thaw, extracted with
Nonidet P-40, and debris removed by centrifugation. Aliquots of
clarified extract were passed through an anti-β-galactosidase column.
Bound material was eluted with 50 mM diethylamine, dialyzed
against phosphate-buffered saline, and injected into a rabbit using
standard techniques (17). Screening with 32P-labeled probes
was performed as described (18). Phage clones were precipitated from
plate lysates with ammonium sulfate (19). Subcloning of inserts into
pUC18/19 (20), exonuclease III deletions, and sequencing were performed
as described (21). Sequence comparisons were made using the facilities
at the National Center for Biotechnology Information and the European
Molecular Biology Laboratory (GenBank release 92). Sequence alignments
were made using the PILEUP and GAP programs of the University of
Wisconsin Genetics Computer Group. Consensus sequences were identified
using the PROSITE data base (22) available on the EMBL
server.4 Molecular weight and isoelectric point
calculations were performed using the ExPASy molecular biology server
of the Geneva University Hospital and the University of
Geneva.5
Proteins were resolved on SDS-polyacrylamide
gels (23). For immunodetection, proteins resolved on SDS gels were
transferred to nitrocellulose (24). Blots were blocked with bovine
serum albumin and then incubated sequentially with primary and
alkaline-phosphatase-conjugated secondary antibodies. Positive bands
were detected with nitro blue tetrazolium and
5-bromo-4-chloro-3-indolyl phosphate. Immunoprecipitations were
performed as described (25). Protein molecular size markers and their
designated molecular masses were: myosin, 205 kDa; β-galactosidase, 116 kDa; phosphorylase b, 97 kDa; bovine serum albumin, 68 kDa; ovalbumin, 45 kDa; and carbonic anhydrase, 30 kDa.
eIF-3 was purified from the 0.5 M KCl salt wash of rabbit reticulocyte polysomes as described previously (3). Briefly, the steps involved in purification were: batch chromatography on phosphocellulose (150-450 mM KCl, Whatman P-11 phosphocellulose), sucrose density-gradient centrifugation in 500 mM KCl, gradient elution from DEAE-cellulose, and gradient elution from phosphocellulose. eIF-3 activity was monitored using the hemoglobin synthesis assay and protein purity was monitored using SDS-gel electrophoresis of column fractions.
Amino Acid Sequencing of eIF-3. Preparations of eIF-3 were adjusted to 70% formic acid, and a 100-fold excess of CNBr was added to the solution. After an overnight incubation in the dark at room temperature, additional CNBr was added and the incubation continued for a total of 24 h. The sample was then diluted with 10 volumes of water and evaporated to dryness. The sample was rehydrated and dried twice more. Following the third lyophilization, the sample was dissolved in SDS sample buffer with heating at 90 °C for 10 min. Peptide fragments were then resolved by SDS-gel electrophoresis in an 18% gel. Peptide fragments were transferred to a polyvinylidene difluoride membrane (Immobilon™, Millipore Corp., Bedford, MA), stained with Coomassie Blue, and destained with 7% acetic acid, 5% methanol. After extensive destaining to remove both unbound Coomassie Blue and any Tris and glycine, stained bands were cut out and subjected to protein sequencing. Amino acid sequencing was performed with an Applied Biosystems model 477A microsequencer with on-line phenylthiohydantoin analysis in the Molecular Biology Core Laboratory at Case Western Reserve University.
mAb 116 was isolated while attempting to produce antibodies
against cadherin-related proteins. A protein fraction containing E-cadherin and its associated proteins was used as the antigen. The mAb
was used for immunofluorescence of confluent monolayers of JAR
epithelial cells (Fig. 1), where it recognized a
cytoplasmic antigen. mAb 116 also recognized a cytoplasmic antigen in
other normal and transformed cells (data not shown). In each case,
there was a conspicuous lack of signal both in the nucleus and at the periphery of the cells. Immunoblots of proteins separated by SDS-PAGE showed that mAb 116 recognized a protein of about 160 kDa in cell extracts (Fig. 2, lane 3). The faster
migrating bands are presumed degradation products of the 160-kDa band
(see also Fig. 5). Immunoprecipitation using extracts of
35S-labeled cells revealed only a diffuse signal in the
high molecular mass portion of the gel. However, several distinct bands
whose migration on SDS gels suggested sizes of approximately 116, 60, 47, 37, and 35 kDa (Fig. 2, lane 2) were
co-immunoprecipitated, suggesting that the 160-kDa antigen was part of
a protein complex.
In order to further characterize the antigen, a human liver λgt11
library was screened with mAb 116. Two positive clones were found in
the approximately 2 × 10^6 plaques that were screened.
Lysogens of both clones were prepared in Escherichia coli
Y1089 (16). mAb 116 recognized fusion proteins of about 170 kDa in
extracts of the lysogens; the synthesis of these proteins was induced
with isopropyl-1-thio-β-D-galactopyranoside (data not
shown). The β-galactosidase fusion protein was prepared from one of
the clones by affinity chromatography on an anti-β-galactosidase column, and rabbit antiserum was raised against the fusion protein. This antiserum reacted with a single protein in immunoblots of extracts
of tissue culture cells; this protein co-migrated with the antigen
recognized by mAb 116 (data not shown). These data strongly suggested
that the cDNA clones encoded the antigen recognized by mAb 116.
The inserts in the two antibody-positive clones were removed from
purified bacteriophage λgt11 DNA and subcloned into pUC19. Sequencing
the ends of the clones revealed that the clones overlapped (Fig.
3). Data base searches showed that the two sequences
were unique. Additional clones were isolated from hybridization screens of several cDNA libraries. The longest clone (4.3 kb, Fig. 3) was
found in an HL-60 cell library. The 2.7- and 1.5-kb EcoRI fragments from this clone were completely sequenced on both strands, using a combination of exonuclease III deletions and restriction fragments. When the 5′ and 3′ ends of the insert in this clone were
sequenced using λgt11 primers, it was seen that the 5′ and 3′ EcoRI sites were authentic cDNA sites; thus, the clone
was not full-length on either end. An EcoRI fragment from an
independent HL-60 clone extended to a polyadenylation consensus
sequence (Fig. 3), but no clone was identified that extended further in
the 5′ direction. A randomly primed human keratinocyte library was then screened with a probe from the 5′ end of the 2.7-kb EcoRI
fragment. This screen resulted in the isolation of a number of clones
that extended further 5′. The longest of these was chosen for
sequencing.
The resulting composite cDNA sequence (Fig.
4A) is 5256 nucleotides long. The first ATG
in the sequence is predicted to be the start codon. It is preceded by
an in-frame stop codon and inaugurates a 1382-amino acid open reading
frame (ORF). This ATG is preceded by a purine in the −3 position, so
it has some of the characteristics of an optimal start codon (26). The
5′-untranslated region (UTR) in this clone is thus predicted to be 113 nucleotides in length. The predicted stop codon is quickly followed by
additional stop codons in all three reading frames. The resulting
3′-UTR is about 1 kb in length. Six ATTTA sequences that can be
associated with rapid turnover of mRNA (27, 28) are present in this
region, as are two potential polyadenylation signals (AATAAA; Ref. 29). The clone that extended furthest in the 3′ direction terminated soon
after the second potential polyadenylation signal. The deduced amino
acid sequence (Fig. 4B) represents a protein of 166 kDa and
pI of 6.4 that is highly charged; 664 residues (48%) are acidic (Asp
and Glu) or basic (Lys and Arg). There is an unusual repetitive domain
near its carboxyl terminus (amino acids 925-1172, shown in
lowercase letters in Fig. 4B). This is addressed
further under "Discussion."
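The composition and motif statistics quoted above (percentage of charged residues, number of ATTTA and AATAAA elements) are simple counts over the sequence. The following is a minimal sketch of how such counts could be reproduced, not the procedure used by the authors; the function names and the toy sequences are hypothetical, and the p180 sequence itself is not included here.

```python
from collections import Counter

ACIDIC = set("DE")  # Asp, Glu
BASIC = set("KR")   # Lys, Arg


def charged_fraction(protein: str) -> float:
    """Fraction of residues that are Asp, Glu, Lys, or Arg."""
    counts = Counter(protein.upper())
    charged = sum(counts[aa] for aa in ACIDIC | BASIC)
    return charged / len(protein)


def count_motif(dna: str, motif: str) -> int:
    """Count occurrences of a motif in a nucleotide string, allowing overlaps."""
    dna, motif = dna.upper(), motif.upper()
    return sum(
        1
        for i in range(len(dna) - len(motif) + 1)
        if dna[i:i + len(motif)] == motif
    )


# Toy inputs for illustration only (assumptions, not the p180 cDNA).
toy_protein = "MDRDRERKAL"
toy_utr = "GGATTTATTTACCAATAAAGG"

print(charged_fraction(toy_protein))    # 0.7
print(count_motif(toy_utr, "ATTTA"))    # 2 (the two copies overlap)
print(count_motif(toy_utr, "AATAAA"))   # 1

# Arithmetic check of the figure quoted in the text: 664 of 1382 residues.
print(round(664 / 1382 * 100))          # 48
```

Overlapping matches are counted deliberately, since adjacent ATTTA elements in AU-rich regions frequently share a nucleotide.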
The complex identified by mAb 116 had a composition similar to that
reported for mammalian eIF-3 (1). When a purified preparation of rabbit
reticulocyte eIF-3 (see "Materials and Methods") was probed with
the antiserum raised against the β-galactosidase fusion protein, the
serum recognized the p180 band in the eIF-3 preparation. This
reticulocyte protein co-migrated with the band recognized by the
antiserum in extracts of human cultured cells (Fig.
5).
In order to confirm that the cDNA encoded the p180 subunit of eIF-3, rabbit eIF-3 was subjected to CNBr fragmentation. The fragments were resolved by SDS-gel electrophoresis, and amino-terminal sequences were determined. Three of the fragments produced relevant amino acid sequences (Table I). Where positive identifications were made, 56 out of 56 matched the amino acid sequence deduced from the cDNA. Five residues were consistent with, but not proof of, the residues observed in the deduced amino acid sequence. If peptides 1 and 2 are placed in the deduced sequence, methionine precedes the first amino acid of each peptide, consistent with the specificity of CNBr cleavage. One inferred difference was observed when the third peptide was placed in the deduced amino acid sequence. In the human sequence, the residue just before lysine 824 is leucine, not methionine. It is possible that amino acid 823 in rabbit p180 is methionine, but this has not been determined. These data confirm the identity of the cDNA clone as encoding the large subunit of human initiation factor 3.
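The methionine argument above rests on the specificity of CNBr, which cleaves on the carboxyl side of methionine, so the residue preceding each fragment in the parent protein should be Met. The sketch below illustrates that check under idealized assumptions (complete cleavage at every Met, no side reactions); the function names and the toy sequence are hypothetical and are not taken from the p180 sequence.

```python
def cnbr_fragments(protein: str) -> list[tuple[int, str]]:
    """Idealized CNBr digest: cut after every Met.

    Returns (1-based start position, fragment) pairs.
    """
    fragments = []
    start = 0
    for i, aa in enumerate(protein.upper()):
        if aa == "M":
            fragments.append((start + 1, protein[start:i + 1]))
            start = i + 1
    if start < len(protein):
        # Carboxyl-terminal fragment that does not end in Met.
        fragments.append((start + 1, protein[start:]))
    return fragments


def preceded_by_met(protein: str, peptide: str) -> bool:
    """True if the first occurrence of peptide is directly preceded by Met."""
    pos = protein.find(peptide)
    return pos > 0 and protein[pos - 1] == "M"


# Toy sequence for illustration only (an assumption, not p180).
toy = "AKMDRYLMGS"
print(cnbr_fragments(toy))          # [(1, 'AKM'), (4, 'DRYLM'), (9, 'GS')]
print(preceded_by_met(toy, "DRY"))  # True
print(preceded_by_met(toy, "KMD"))  # False
```

Applied to a case like the third peptide, where the human sequence has leucine rather than methionine at the preceding position, `preceded_by_met` would return False, which is why the text treats that position as a possible human/rabbit difference.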
When the composite human cDNA sequence was compared to the data bases, a mouse sequence called centrosomin was identified (Ref. 30; accession nos. X17373 and X84651 for centrosomin A and B, respectively). As discussed below, centrosomin A and B probably represent partial cDNA clones for the p180 subunit of mouse eIF-3. In addition to centrosomin, 30 human, 2 mouse, and 1 rat expressed sequence tags (EST) were identified that were almost identical to portions of the sequence presented in Fig. 4A. Sequences from the nematode C. elegans, the dicotyledonous plant N. tabacum (tobacco), and the yeast S. cerevisiae also showed short stretches of homology with the human sequence. All of these homologies are discussed further below.
When the protein sequence derived from the human cDNA was used to
search the protein data bases, three sequences in addition to the
deduced sequence of centrosomin were identified. One was an ORF found
in a cosmid prepared from S. cerevisiae.6 No introns were predicted to
interrupt this open reading frame. Another ORF was found in a cosmid
prepared from C. elegans.7 In this
case, the computer algorithm used to identify putative introns and
exons identified 6 introns. Support for the position of one of these
predicted introns is found in the partial sequence of a C. elegans cDNA clone.8 The deduced amino
acid sequence of the C. elegans protein presented in Fig.
6 is derived from the conceptual splicing of the exons. The third ORF was from a cDNA clone isolated from N. tabacum.9 The S. cerevisiae,
C. elegans, and N. tabacum open reading frames are 964, 1076, and 958 residues, respectively. A comparison of all the
deduced amino acid sequences is presented in Fig. 6.
A monoclonal antibody was isolated that recognized a cytoplasmic protein of approximately 160 kDa. The antigen co-immunoprecipitated with several other proteins, suggesting it resided in a protein complex. The composition of the complex was similar to that reported for eIF-3. This identification was confirmed by cross-reactivity of the anti-fusion protein serum with the p180 subunit of rabbit eIF-3 and by aligning the sequence of peptides derived from purified rabbit eIF-3 with that of the sequence deduced from cDNA clones. Thus, we have isolated cDNA clones encoding the largest subunit of human eIF-3.
Several items are consistent with the human cDNA sequence being full-length. The predicted start codon is preceded by an in-frame stop codon. The ORF is 1382 amino acids and encodes a protein with a calculated molecular mass of 166 kDa that is, within experimental error, the same as the relative molecular mass of the p180 subunit suggested by SDS gels. Finally, the human sequence can be aligned with related sequences from a yeast, a nematode, and a plant starting with virtually the first residues in the ORFs (Fig. 6).
When the deduced amino acid sequence was run against the PROSITE data base (22), sites for protein kinases A and C, casein kinase II, and tyrosine kinases were identified. Although it is known that p180 is a phosphoprotein, no studies have determined either the site(s) phosphorylated or the kinases and phosphatases involved.
As mentioned earlier, the human sequence has similarity to a mouse clone called centrosomin (30). The extent of the similarity is shown in Fig. 4B, where the centrosomin B nucleotide sequence has been translated in varying reading frames in order to maximize the similarity with the deduced human protein. Twenty of the ESTs identified in the nucleotide homology search include sequences corresponding to the coding region of eIF-3. These ESTs provide support for the nucleotide sequence reported in Fig. 4A; they also support the reading frame changes introduced into the centrosomin sequence corresponding to amino acids 421 and 468, as well as the inclusion of glycine 921 (see Fig. 4B). The other changes introduced into the centrosomin sequence are not covered by any of the ESTs.
The polypeptide deduced from the centrosomin nucleotide sequence is almost identical to the human sequence, making it likely that centrosomin represents a partial cDNA clone for mouse p180. The high degree of sequence identity between human, rabbit, and (what is likely) mouse p180 is characteristic of the mammalian protein synthesis translation factors, which often show greater than 98% identity (31).
The repetitive domain of the deduced amino acid sequence of p180 is
shown in detail in Fig. 7. Of the 248 residues included in the domain, 144 (58%) are charged. All 68 of the basic residues are
arginine; 64 of the 76 acidic residues are aspartate. The sequence
deduced from the centrosomin clone also spans this region (Fig.
4B). However, it was necessary to introduce gaps into the centrosomin sequence in order to maximize the alignment of identical residues, as well as to maintain the spacing patterns of those repeats
that are not exactly 10 residues in length. This suggests there is
variability in the number of repeats in vertebrates. The sequences from
more distant organisms do not contain a repetitive domain (Fig. 6). The
existence of a repeating element in the p180 subunit is reminiscent of
the 10 copies of the pseudo repeat DRYR in eIF-4B (32). Since this
portion of eIF-4B interacts with the p180 subunit of
eIF-3,2 the repeated elements in both proteins may be
responsible for their interaction. However, it should be noted that the
repeated elements are lacking in both yeast eIF-4B (33, 34) and the large subunit of yeast eIF-3 (Fig. 6).
The biochemical data available for mammalian and plant eIF-3 can be compared to characteristics inferred from the cDNA clones. The predicted molecular mass values roughly correspond to determinations by SDS-gel electrophoresis, although in both cases the SDS estimates are larger (166 versus 180 kDa for mammalian and 111 versus 116 kDa for plant). Although not as well characterized, it appears the SDS molecular mass of the large subunit of yeast eIF-3 (130 kDa; Refs. 6, 7, and 9) also exceeds that deduced from the nucleotide sequence (110 kDa). These discrepancies may reflect the high percentage of charged residues in the proteins. The isoelectric points calculated for the human (6.4) and tobacco (9.4) proteins are consistent with those determined for the rabbit (6.7) and wheat (greater than 8) proteins. The biochemical data reveal heterogeneity in the sizes of the subunits comprising eIF-3 and suggest that eIF-3 of varying compositions may be found when the factor is isolated from other organisms. The ORFs found in the tobacco and nematode sequences are consistent with heterogeneity in the size of the large subunit of eIF-3 (Fig. 6).
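The size of the discrepancy between sequence-derived and SDS-gel molecular masses can be made explicit with a short calculation. The sketch below uses only the three pairs of values quoted in the paragraph above; the dictionary layout and the function name are illustrative assumptions.

```python
# (mass predicted from sequence, apparent mass on SDS gels), both in kDa,
# as quoted in the text for the large subunit of eIF-3.
masses = {
    "mammalian": (166, 180),
    "plant": (111, 116),
    "yeast": (110, 130),
}


def percent_excess(predicted: float, apparent: float) -> float:
    """Percentage by which the SDS-gel estimate exceeds the predicted mass."""
    return (apparent - predicted) / predicted * 100


for source, (predicted, apparent) in masses.items():
    print(f"{source}: {percent_excess(predicted, apparent):.1f}%")
# mammalian: 8.4%
# plant: 4.5%
# yeast: 18.2%
```

In all three cases the gel estimate is the larger value, by roughly 5 to 18%.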
Fig. 6 shows a comparison between all four deduced protein sequences as well as the consensus; Table II shows the results of pairwise comparisons of the sequences. Compared to other translation factors, the identity between the yeast and human eIF-3 subunits is somewhat lower (31). In all four sequences, the NH2-terminal regions show more similarity than the COOH-terminal regions. The dissimilarities become more pronounced when the repetitive domain of the human sequence is reached.
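Pairwise comparisons of this kind reduce to counting identical residues across an alignment. The following is a minimal sketch of one common convention, in which positions with a gap in either sequence are excluded from the denominator; this convention is an assumption and may differ from the one used by the GAP program or in Table II, and the toy alignment is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment for illustration only (not taken from Fig. 6).
print(round(percent_identity("MKT-LLV", "MRTALIV"), 1))  # 66.7
```

Because the result depends on how gaps are counted, identity values computed this way are only comparable when the same convention is applied to every pair.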
The data in Figs. 4 and 6 together with the protein studies on rabbit eIF-3 suggest that the mammalian p180 subunits are unusually large when compared to the subunits of other organisms. It has been reported that yeast eIF-3 will replace HeLa eIF-3 in the methionyl-puromycin assay (6), again suggesting a general conservation of the pathway for initiation of protein synthesis (35). However, this is surprising given the size difference, relatively low sequence identity, and lack of a repetitive domain in the yeast large subunit. It may be that the function(s) of the repetitive domain found in the mammalian subunit is provided by a different subunit in other organisms. In this context, it is noteworthy that wheat germ eIF-3 is reported to contain more subunits than mammalian eIF-3 (1). However, yeast eIF-3 appears to contain the same number of subunits, with each being 10-20% smaller than its mammalian counterpart (7, 8). The availability of genes for subunits of eIF-3, combined with the power of yeast genetics, makes it likely that, in the near future, more will be known of the function of the high molecular weight complex eIF-3.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U78311.
We thank Rebecca Castle for technical assistance and Dr. Margaret Wheelock for helpful discussion and assistance throughout the project.
As this manuscript was under review, unidentified sequences of a human cDNA clone (accession no. D50929, submitted by N. Nomura) and a partial human genomic clone (accession no. U58047, submitted by J. K. Scholler) were placed in the data bases. These sequences correspond to the large subunit of eIF-3. In addition, the unidentified sequence of a mouse cDNA clone (accession no. U14172, submitted by R. Fisher) corresponding to the large subunit of eIF-3 has been placed in the data bases.