(Received for publication, January 13, 1997)
From the Department of Biochemistry, Dartmouth Medical School, Hanover, New Hampshire 03755 and the § Harvard Microchemistry Facility, Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138
A 60-kDa protein that undergoes rapid tyrosine phosphorylation in response to insulin and then binds phosphatidylinositol 3-kinase has been previously described in adipocytes and hepatoma cells. We have isolated this protein, referred to as pp60, from rat adipocytes, obtained the sequences of tryptic peptides, and cloned its cDNA. The predicted amino acid sequence of pp60 reveals that it contains an N-terminal pleckstrin homology domain, followed by a phosphotyrosine binding domain, followed by a group of likely tyrosine phosphorylation sites, four of which are in the YXXM motif that binds to the SH2 domains of phosphatidylinositol 3-kinase. The overall architecture of pp60 is thus the same as that of insulin receptor substrates 1 and 2 (IRS-1 and IRS-2), and furthermore both the pleckstrin homology and phosphotyrosine binding domains are highly homologous (about 50% identical amino acids) to these domains in both IRS-1 and IRS-2. Thus, pp60 is a new member of the IRS family, which we have designated IRS-3.
The insulin receptor is a tyrosine kinase that is activated upon insulin binding. Signaling from this receptor proceeds primarily by its tyrosine phosphorylation of substrate proteins, which then act as docking proteins for one or more SH2 domain-containing proteins. Docking of these proteins in turn activates specific signal transduction pathways. The substrate docking proteins that have been molecularly characterized to date are the closely related IRS-1 and IRS-2 as well as SHC (reviewed in Refs. 1 and 2). One protein of this type that hitherto has not been cloned is a 60-kDa protein described in rat adipocytes and rat hepatoma cells. This 60-kDa protein, referred to as pp60, is rapidly tyrosine phosphorylated in response to insulin and in this form associates with the 85-kDa regulatory subunit of PI 3-kinase (3, 4). The interaction between these two proteins involves the binding of Tyr(P) to either or both of the SH2 domains of the 85-kDa subunit, because each SH2 domain by itself binds the Tyr(P) form of pp60 (3, 4). In the present study, we have isolated pp60 from insulin-treated rat adipocytes and then cloned its cDNA. The predicted amino acid sequence shows that pp60 is a new member of the IRS family.
GST itself and the GST fusion protein with the N-terminal SH2 domain of the 85-kDa subunit of PI 3-kinase (GST-NSH2) were prepared as described (5). These were covalently attached via sulfhydryl groups on the GST (the SH2 domain itself has no sulfhydryl groups) to activated thiopropyl Sepharose 6B beads (Sigma). The beads with coupled GST (1.3 mg/ml) or GST-NSH2 (1.8 mg/ml) were placed in columns, washed with 10 volumes of 5 mM dithiothreitol in 120 mM Tris-HCl, 1 mM EDTA, pH 6.8, to cleave the remaining 2-pyridyl disulfide, and then washed with 30 volumes of 5 mM N-ethylmaleimide in this buffer to block the free sulfhydryl groups. Protein not covalently bound to the beads was released with 25 volumes of 4% SDS, 2 mM N-ethylmaleimide, 100 mM Tris-HCl, 1 mM EDTA, 10% glycerol, pH 6.8. Finally, the columns were washed extensively with 150 mM NaCl, 10 mM sodium phosphate, pH 7.4. The SDS treatment released only about 10% of protein, and subsequently the GST-NSH2 exhibited full activity in binding pp60.
Isolation of pp60 and Sequencing of Peptides. Rat adipocytes were prepared and treated with insulin as described (3). The cells were lysed in hot SDS buffer, the lysate was diluted with a buffer containing nonionic detergent, and particulate matter was removed by centrifugation and filtration, exactly as described in Ref. 3, with the exception that the nonionic detergent was nonylethylene glycol dodecyl ether (Thesit™ from Boehringer Mannheim) rather than octylethylene glycol dodecyl ether. The cell extract (350 ml) from the adipocytes of 150 rats was passed at 0.14 ml/min through a 1.5-ml column of immobilized GST and then through a 0.2-ml column of immobilized GST-NSH2. Once the extract was applied, the GST column was disconnected, and a 0.2-ml portion of it was treated exactly as the GST-NSH2 column to serve as the control. The columns were washed with 20 ml of 1% Thesit in 20 mM Tris-HCl, 150 mM NaCl, 1 mM sodium vanadate, pH 7.4, with protease inhibitors (2 µg/ml aprotinin, 2 µM leupeptin, 0.2 nM pepstatin A) and then with 20 ml of 0.1% Thesit in the same buffer. The beads from each column (about 0.2 ml) were transferred to low protein-binding microfuge tubes, and a hole was pierced in the bottom of each using a 26 gauge needle. Adherent liquid was removed by centrifuging briefly with each tube inside a second tube. Bound proteins were then eluted from the beads in an SDS buffer (4% SDS, 1 mM EDTA, 1 mM sodium vanadate, 10% glycerol, 100 mM Tris-HCl, pH 6.8, with the protease inhibitors given above) by the same method. Beads were eluted successively with two 90-µl portions of SDS buffer, followed by two 180-µl portions. The eluates are referred to in order of elution as P (combined 90-µl eluates), P1, and P2 from the GST-NSH2 column and, similarly, G, G1, and G2 from the GST column. To estimate the yield of pp60, samples containing the original extract, the depleted extract, and the eluate fractions were quantitatively immunoblotted for Tyr(P) as described (3). Approximately 90% of the purified pp60 was in fraction P, with most of the remainder in fraction P1.
Eluate fractions P and G were each separated on single lanes of a 5-12% acrylamide gradient gel. The pp60 in the lane with fraction P was detected by copper staining for protein (Bio-Rad); this area along with the corresponding area from the lane with G was excised. After in-gel S-carboxyamidomethylation, the bands were subjected to in-gel tryptic digestion as described in Ref. 6 without the addition of 0.02% Tween. The resulting peptide mixture was separated by microbore high performance liquid chromatography using a Zorbax C18 1.0 mm by 150-mm reverse-phase column on a Hewlett-Packard 1090 HPLC/1040 diode array detector. Optimum fractions were chosen based on differential UV absorbance at 205, 277, and 292 nm, and the sequences of eight peptides unique to the P fraction were determined by automated Edman degradation on an Applied Biosystems 494A or 477A sequencer. The average initial amino acid yield for the peptides sequenced was 820 ± 310 fmol. Strategies for peak selection, reverse-phase separation, and Edman microsequencing have been previously described (7). Complementary peptide sequence information was obtained on 10% of the digest mixture by collisionally induced dissociation using microcapillary HPLC electrospray ionization/tandem mass spectrometry on a Finnigan TSQ7000 triple quadrupole mass spectrometer (8).
pp60 cDNA. Total RNA was obtained from rat adipocytes using the Trizol reagent (Life Technologies), and mRNA was subsequently purified from it using the Fast-Track kit (Invitrogen). The adipocytes of 24 rats yielded approximately 4 µg of twice purified mRNA. An oligo(dT) primed cDNA library of the Marathon Ready™ type was prepared for us from this mRNA by Clontech. Tryptic peptide g (see Fig. 1B) served as the basis for the design of a mixed sense oligonucleotide containing deoxyinosine (I) at positions of high degeneracy (5′-TTYYTICCIGGICCIYTITAYTAYGARTT-3′, where Y is T or C and R is A or G). 3′ RACE was performed with the Marathon Ready cDNA using this primer (20 µM) and the AP1 primer (2 µM) of the Marathon Ready kit, according to the manufacturer's instructions. A major 700-bp product was obtained that was reamplified and then gel purified. After filling the 5′ and 3′ ends with Klenow DNA polymerase, the piece was digested with NotI (a site introduced during cDNA synthesis) and cloned into NotI/EcoRV digested pBluescript II (SK) (Stratagene). The insert was sequenced (nt 1409-1969, see Fig. 2) and was found to encode tryptic peptide h.
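The complexity of the mixed primer described above can be checked with a short calculation. The following is a minimal sketch, assuming that each deoxyinosine is counted as a single, nondegenerate residue and that Y and R each stand for two bases, as stated in the text. The function names and the comparison with a fully degenerate (N) position are illustrative assumptions, not part of the original method.

```python
from math import prod

# Number of alternative bases represented by each symbol in the primer.
# Deoxyinosine (I) is counted once because it is a single residue.
CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "I": 1, "Y": 2, "R": 2}

# Sense primer as given in the text (5' to 3').
PRIMER = "TTYYTICCIGGICCIYTITAYTAYGARTT"


def degeneracy(primer: str, inosine_choices: int = 1) -> int:
    """Return the number of distinct oligonucleotides in a mixed primer.

    inosine_choices=1 treats I as one residue; inosine_choices=4 shows
    the hypothetical degeneracy had a four-base mixture been used instead.
    """
    table = dict(CHOICES, I=inosine_choices)
    return prod(table[base] for base in primer)


print(len(PRIMER))             # 29 nucleotides
print(PRIMER.count("I"))       # 5 deoxyinosine positions
print(degeneracy(PRIMER))      # 64-fold degenerate
print(degeneracy(PRIMER, 4))   # 65536-fold without deoxyinosine
```

Under these assumptions, substituting deoxyinosine at the five most degenerate positions keeps the mixture at 64 species rather than 65,536.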
The 5′ end of the cDNA was obtained by 5′ RACE with the Marathon-Ready cDNA and a combination of the AP1 primer and an antisense primer derived from the 3′ RACE product (nt 1565-1589). Two major PCR products of approximately 1600 and 1800 bp were generated in the initial amplification and gel purified as a mixture. Reamplification of the mixture with nested primers (AP2 of the Marathon-Ready kit and an upstream antisense primer (nt 1531-1556)) again generated a mixture of two PCR products. This mixture of PCR products was directly sequenced from its 3′ end and found to be identical from nt 619 to 1530. Upstream of nt 619 the sequence was a mixture, indicating that the two PCR products diverged at this point. To obtain sequence upstream of nt 619, the mixture of PCR products was subcloned into pBluescript II (SK) and parts of the inserts from some clones were sequenced. Primers based upon this sequence were then used to sequence directly the 5′ end of the mixture of the two PCR products. These gave a single sequence at the most 5′ end (nt 1-618) and a mixture of sequences downstream of nt 618; this indicated the presence of an intervening sequence of about 200 bp between nt 618 and 619 in the larger PCR product. To confirm the cDNA sequence, overlapping PCR fragments were generated from the Marathon Ready cDNA with appropriate primers (encompassing nt 97-559, 359-689, 640-1265, 1211-1589, and 1333-1925). As expected, in the case of amplification of the nt 359-689 fragment, a second fragment of 170 bp larger size was also obtained. Each of the PCR products was gel purified, and both strands were directly sequenced. The 170-bp sequence that was present in only some of the cDNA molecules is probably an unspliced intron, because its very 5′ and 3′ sequences are those for splice junctions (GT and AG, respectively) and because it contains an in-frame stop codon. DNA sequencing was performed on the Applied Biosystems 373 DNA sequencing system using the Perkin-Elmer DNA sequencing kit; data were analyzed with the Applied Biosystems software. Homology searches were performed with the BLAST program (25).
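Whether the five confirmation fragments listed above tile the cDNA without gaps can be verified by merging their coordinates. The sketch below is only an illustration of that bookkeeping; the function names are hypothetical, and the coordinates are the inclusive nucleotide ranges given in the text.

```python
def merge_intervals(intervals):
    """Merge inclusive (start, end) ranges that overlap or abut."""
    merged = []
    for start, end in sorted(intervals):
        # A range starting at or before last_end + 1 leaves no gap.
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def overlaps(intervals):
    """Return the overlap length (nt) between consecutive sorted ranges."""
    ordered = sorted(intervals)
    return [prev[1] - cur[0] + 1 for prev, cur in zip(ordered, ordered[1:])]


# Confirmation fragments (nt coordinates from the text).
FRAGMENTS = [(97, 559), (359, 689), (640, 1265), (1211, 1589), (1333, 1925)]

print(merge_intervals(FRAGMENTS))  # [(97, 1925)]
print(overlaps(FRAGMENTS))         # [201, 50, 55, 257]
```

The fragments merge into a single span, nt 97-1925, with every neighboring pair sharing at least 50 nt.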
The method for the purification of pp60 was based on our previous finding that pp60 is efficiently adsorbed from extracts of insulin-treated adipocytes by the N-terminal SH2 domain of the 85-kDa subunit of PI 3-kinase as a GST fusion protein (3). An extract of insulin-treated adipocytes from 150 rats was passed sequentially through a column of immobilized GST alone and then through a column containing the GST-NSH2 fusion protein. After the adsorption step, the columns were separated, each was washed, and adsorbed proteins were eluted with SDS. Fig. 1A (lanes 1-4) shows the eluted Tyr(P) proteins as detected by anti-Tyr(P) immunoblotting. The major Tyr(P) proteins had mobilities corresponding to those expected for pp60 and IRS-1. Smaller amounts of Tyr(P) proteins at approximately 97 and 145 kDa were also present. The 97-kDa protein is most likely the β subunit of the insulin receptor, which is known to bind to the N-terminal SH2 domain of PI 3-kinase (9); the identity of the 145-kDa protein is unknown. Binding to the GST-NSH2 column was specific; no Tyr(P) proteins were present in the eluate from the GST column (compare lane 1 with 3). Protein staining with colloidal gold showed that two major proteins were eluted specifically from the GST-NSH2 column (Fig. 1A, lanes 5-8); these co-migrated with the Tyr(P) forms of pp60 and IRS-1. From quantitative immunoblotting of the adipocyte lysate and the SDS eluate fractions of the column for Tyr(P) (data not shown), we determined that approximately 30% of the Tyr(P) form of pp60 was recovered in the purification. In addition, from these data and those in Fig. 1A, we estimate that approximately 500 ng (8 pmol) of pp60 were isolated from the adipocytes of 150 rats.
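The conversion between the mass and molar estimates of the yield can be reproduced with a one-line calculation, because nanograms divided by kilodaltons gives picomoles directly. The sketch below assumes a molecular mass of about 60 kDa, the apparent size on SDS gels; the per-rat and lysate figures it prints are derived here for illustration from the 150 rats and the approximately 30% recovery given in the text, and are not values reported by the authors.

```python
def ng_to_pmol(mass_ng: float, mol_mass_kda: float) -> float:
    """Convert a protein mass in ng to pmol.

    1 kDa = 1000 g/mol, so ng / kDa = 1e-9 g / (1e3 g/mol) = 1e-12 mol.
    """
    return mass_ng / mol_mass_kda


ISOLATED_NG = 500.0   # approximate yield stated in the text
APPARENT_KDA = 60.0   # assumed: apparent size of pp60 on SDS gels
RATS = 150
RECOVERY = 0.30       # approximate fraction of Tyr(P) pp60 recovered

pmol = ng_to_pmol(ISOLATED_NG, APPARENT_KDA)
print(f"{pmol:.1f} pmol isolated")                            # about 8 pmol
print(f"{ISOLATED_NG / RATS:.1f} ng per rat")                 # derived figure
print(f"{ISOLATED_NG / RECOVERY:.0f} ng Tyr(P) pp60 in the lysate")  # derived
```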
To obtain peptides from pp60, the bulk of the SDS eluate from the GST-NSH2 column (about 90%, with the remainder used for the analyses described above) was run in a single lane on a gradient gel, and the gel slice containing pp60 was treated with trypsin. Tryptic peptides were isolated by HPLC, and the sequences of eight peptides were determined (Fig. 1B). A search of the data base using the BLAST program revealed no significant matches with sequences in known proteins.
cDNA Encoding pp60. Initially a PCR product encoding the 3′ end of pp60 was generated in a 3′ RACE reaction using a degenerate primer based upon the sequence of peptide g and a Marathon-Ready cDNA library from rat adipocytes. Subsequently, the 5′ end of the pp60 cDNA was obtained by a 5′ RACE procedure. The nucleotide sequence and predicted amino acid sequence of pp60 are presented in Fig. 2. An open reading frame extending from nt 189 to 1673 encodes a 494-amino acid polypeptide that contains all eight of the sequences found for the pp60 tryptic peptides. It is virtually certain that the ATG codon at nt 189-191 initiates translation, because upstream there is no in-frame ATG codon between it and an in-frame stop codon at nt 123-125, because the next downstream ATG (nt 615-617) is beyond the PH domain (see below), and because the sequence just upstream of nt 189-191 conforms to the rat Kozak consensus sequence for translation initiation (10). The predicted molecular mass of pp60 is 55.3 kDa, a value that is smaller than the size of approximately 60 kDa for the Tyr(P) form estimated by SDS gel electrophoresis. The explanation for this difference most likely is an aberrantly low mobility on electrophoresis, which is frequently the case for phosphorylated proteins.
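The reading-frame arguments above reduce to simple coordinate arithmetic. The following sketch assumes that the stated range nt 189-1673 includes the termination codon, which is an inference from the fact that 1485 nt correspond to 495 codons whereas the polypeptide has 494 residues. The function names are illustrative.

```python
ORF_START, ORF_END = 189, 1673  # inclusive nt coordinates from the text


def orf_summary(start: int, end: int):
    """Return (length in nt, codons, residues), assuming the last codon is a stop."""
    length = end - start + 1
    if length % 3:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length // 3
    return length, codons, codons - 1


def in_frame(codon_start: int, ref: int = ORF_START) -> bool:
    """True if a codon starting at codon_start shares the frame of ref."""
    return (codon_start - ref) % 3 == 0


def residue_number(codon_start: int, ref: int = ORF_START) -> int:
    """Residue encoded by an in-frame codon, counting the initiator Met as 1."""
    return (codon_start - ref) // 3 + 1


print(orf_summary(ORF_START, ORF_END))  # (1485, 495, 494)
print(in_frame(123))                    # True: upstream stop codon at nt 123-125
print(in_frame(615))                    # True: next downstream ATG at nt 615-617
print(residue_number(615))              # 143, beyond the PH domain (residues 32-131)
```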
Several types of evidence establish that the predicted protein is the 60-kDa protein that undergoes tyrosine phosphorylation in response to insulin. First, as noted above, the isolation of the cloned protein was based on a known binding property of pp60. Second, as described below, the structure of the protein is that expected for a substrate of the insulin receptor; the predicted sequence contains, as expected, several potential PI 3-kinase binding motifs. Third, we have prepared affinity purified rabbit antibodies against the C-terminal peptide (14 amino acids) of the predicted protein and shown that these react with pp60. The tyrosine phosphorylated form of pp60 was isolated from a lysate of insulin-treated adipocytes by adsorption with GST-NSH2 or with antibodies against Tyr(P), as described in Ref. 3. Immunoblotting of each adsorbate with the antibodies against the C terminus detected only a 60-kDa protein. When this experiment was performed with a lysate of basal adipocytes, no protein was detected (data not shown).
Domains and Tyr(P) Motifs in pp60. The amino acid sequence of pp60 was compared with the protein data base using the BLASTP program and also was examined for potential sites of tyrosine phosphorylation. This revealed that pp60 contains in the following order from its N terminus: a PH domain that is highly homologous to the PH domain in IRS-1 and IRS-2, a PTB domain that is highly homologous to the PTB domain in IRS-1 and IRS-2, and, distributed over the C-terminal third of the protein, a number of likely tyrosine phosphorylation sites in motifs that can bind SH2 domain-containing proteins (Fig. 3 and see below). The architecture of pp60 is strikingly similar to that of IRS-1 and IRS-2. Although the latter two proteins are larger (1231 and 1321 amino acids, respectively), each contains an N-terminal PH domain, followed by a PTB domain, followed by a group of tyrosine phosphorylation sites at which a variety of SH2-domain signaling proteins dock (2, 11). Thus, pp60 is a new member of the IRS family, and henceforth we refer to it as IRS-3.
The PH domain of IRS-3 consists of 100 amino acids (residues 32-131) and exhibits 50 and 45% identity with this domain in IRS-1 and IRS-2, respectively (Fig. 3B). This high degree of homology is notable, because the sequences of PH domains generally show a great deal of variation (12), and it suggests that there is a common function for the PH domain in the three IRSs. In this regard, the PH domain of IRS-1 is necessary for its efficient in vivo tyrosine phosphorylation by the insulin receptor, although it does not appear to interact directly with the receptor (13-15).
The PTB domain in IRS-3 consists of 115 amino acids (residues 160-274) and exhibits 48 and 53% identity with this domain in IRS-1 and IRS-2, respectively (Fig. 3B). The IRS PTB domain was originally identified as a region of approximately 160 amino acids that is highly homologous in IRS-1 and IRS-2 and that binds to the tyrosine phosphorylated insulin receptor and to phosphopeptides mimicking the sequence surrounding Tyr960 in the receptor (11, 16). More recently, the minimal PTB domain in IRS-1 has been delineated both functionally by deletion analysis and structurally by x-ray crystallography and has been found to be somewhat smaller, extending over 105 amino acids (residues 161-265 in human IRS-1, corresponding to residues 156-260 in rat IRS-1) (17). The region of homology between IRS-1 (residues 157-255) and IRS-3 corresponds almost exactly with this minimal PTB domain; the immediately flanking sequences in IRS-1 and IRS-2 show little homology with IRS-3. The crystal structure of the IRS-1 PTB domain complexed with a 9-residue Tyr(P) peptide similar to the sequence surrounding Tyr960 in the insulin receptor has been determined (17). Remarkably, 14 of the 19 amino acids in IRS-1 that interact with the bound peptide (Fig. 6 of Ref. 17) are identical in IRS-3, including the two arginines whose guanidinium groups contact the phosphate of the Tyr(P) residue; the remaining five differences are conservative substitutions. This suggests that IRS-3 will also be found to bind via its PTB domain to the activated insulin receptor by association with the segment containing Tyr(P)960.
Outside of the PH and PTB domains there are no regions of extended homology between IRS-3 and IRS-1/2. Although IRS-1 and IRS-2 contain a region just downstream of the PTB domain referred to as the SAIN domain, which participates in the interaction with the insulin receptor, and IRS-2 also contains a domain even further downstream (residues 591-733) that also interacts with the receptor (15, 18-20), neither of these is present in IRS-3.
Several of the potential tyrosine phosphorylation sites in IRS-3 lie within motifs that conform to the established recognition specificities of SH2 domains (21, 22). Most notably, there are four YXXM motifs (Tyr343, Tyr352, Tyr362, and Tyr392); this is the motif to which each SH2 domain of the PI 3-kinase 85-kDa subunit binds. Given the strong association of the Tyr(P) form of IRS-3 with both SH2 domains, one or more of these sites is almost certainly phosphorylated in vivo. The occurrence of a linear array of four YXXM motifs suggests that tandem motifs are phosphorylated and then bind simultaneously to the two SH2 domains on the 85-kDa subunit; such a bidentate interaction has been shown to result in very high affinity binding (23). Among the other potential tyrosine phosphorylation sites of IRS-3, there is one (Tyr321) that would be expected to bind to the SH2 domain of Grb2, the adaptor for SOS (the GDP-releasing factor for Ras), and another (Tyr466) that could bind to the N-terminal SH2 domain of either the Tyr(P) phosphatase SHP2 or phospholipase Cγ. It remains to be determined whether these or other SH2 domain proteins are associated with the Tyr(P) form of IRS-3. Because the Tyr(P) forms of IRS-1 and IRS-2 function as docking/effector proteins for PI 3-kinase, Grb2, and SHP2 (2), the similarity of IRS-3 with IRS-1/2 extends to at least one and probably several interactions with SH2 domain proteins.
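A search for YXXM motifs of the kind described above can be expressed as a pattern match over the one-letter amino acid sequence. The sketch below is a generic illustration: the test sequence is a made-up toy string, not the IRS-3 sequence, and the function name is hypothetical. A lookahead is used so that overlapping motifs are not missed. The spacing calculation uses only the four tyrosine positions given in the text.

```python
import re

# Y, any two residues, then M; the lookahead permits overlapping matches.
YXXM = re.compile(r"(?=(Y[A-Z]{2}M))")


def find_yxxm(sequence: str):
    """Return (1-based tyrosine position, motif) for every YXXM in sequence."""
    return [(m.start() + 1, m.group(1)) for m in YXXM.finditer(sequence)]


# Hypothetical toy sequence for demonstration only (not IRS-3).
TOY = "GSYMPMSAEYADMRLYYVMQ"
print(find_yxxm(TOY))  # [(3, 'YMPM'), (10, 'YADM'), (16, 'YYVM')]

# Spacing between the four YXXM tyrosines reported for IRS-3.
SITES = [343, 352, 362, 392]
print([b - a for a, b in zip(SITES, SITES[1:])])  # [9, 10, 30]
```

The first three reported sites lie within 20 residues of one another, which is consistent with the tandem arrangement discussed in the text.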
The rapid tyrosine phosphorylation of IRS-3 in response to insulin and its identification as a member of the IRS family strongly indicate that it is a substrate for the insulin receptor. However, this remains to be demonstrated. Besides the insulin receptor, a variety of other receptors, including the related receptor for insulin-like growth factor I, signal through tyrosine phosphorylation of IRS-1/2 (1, 2). Thus, IRS-3 may also participate in signal transduction from other receptors. A major challenge now is to elucidate the role that each IRS plays in insulin action.
The nucleotide sequence reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number U93880.
We thank Susanna Keller for guidance in recombinant DNA methods and critical reading of the manuscript, Nicholas Morris for preparation of the mRNA, Renee Robinson, John Neveu, and Terri Addona for expertise in the HPLC, peptide sequencing, and mass spectrometry, respectively, and Mary Harrington for expert secretarial assistance.