(Received for publication, June 19, 1995; and in revised form, September 28, 1995)
We have shown previously that the active form of avian sarcoma virus integrase (ASV IN) is a multimer. In this report we investigate IN multimerization properties by a variety of methods that include size exclusion chromatography, chemical cross-linking, and protein overlay assays. We show that removal of the nonconserved C-terminal region of IN results in a reduced capacity for multimerization, whereas deletion of the first 38 amino acids has little effect on the oligomeric state. Binding of a full-length IN fusion protein to various IN fragments indicates that sequences in both the catalytic core (residues 50-207) and a C-terminal region (residues 201-240) contribute to IN self-association. We also observe that the isolated C-terminal fragment (residues 201-286) is capable of self-association. Finally, a single amino acid substitution in the core domain (S85G) produces a severe defect in multimerization. We conclude from these analyses that both the catalytic core and a region in the nonconserved C terminus are involved in ASV integrase multimerization. These results enhance our understanding of integrase self-association determinants and define a major role of the C-terminal region of ASV integrase in this process.
The integration of viral DNA into the genome of the host cell is
a unique and vital step in the normal life cycle of retroviruses (for
recent reviews, see (1) and (2)). The retroviral
integrase (IN) is both necessary and sufficient for the
integration of a linear DNA with viral ends into a target DNA in
vitro (3, 4). Our knowledge of retroviral
integrase structure and function continues to be refined by
identification of domains of the enzyme that contribute to the various
functions required during integration (Fig. 1). These include DNA binding, catalytic activities, and the multimerization required for the coordinated joining of
processed viral ends to the site of integration in host DNA. Several
investigations have mapped a nonspecific DNA binding domain to
C-terminal portions of both ASV and HIV-1
IN(5, 6, 7, 8, 9) .
Catalytic functions have been localized to the central core domain,
which is resistant to limited proteolysis(10) , and contains a
conserved triad of acidic residues [D,D(35)E] that is
presumed to bind the divalent cations that are required for catalytic
activity(5, 11) . The isolated catalytic domain is
competent to perform a concerted cleavage-ligation activity (12, 13, 14) and contributes to the
recognition of conserved CA residues present at the 3′ ends of
retroviral long terminal repeats and other transposable
elements(15) . However, it cannot perform the viral DNA end
processing and joining reactions required for insertion of viral DNA
into target DNA sequences. The crystal structures of the catalytic
cores of both HIV-1 IN and ASV IN have been solved recently (16, 17) .
Figure 1:
Domain structure of retroviral
integrases. The figure shows the relative positions of the major
domains along the linear sequence of retroviral integrases. The scale
of amino acid numbering is indicated at the top (ASV IN
= 286 amino acids; HIV-1 IN = 288 amino acids). The
catalytic core is an evolutionarily conserved region among retroviral
INs and certain transposases; the acidic residues (D,D(35)E) presumed
to bind the divalent cations required for activity have been positioned
to scale. Another conserved motif, the HHCC region, is located in the
N-terminal region. It contains appropriately spaced histidines and
cysteines characteristic of several Zn²⁺-binding
domains. The amino acid sequence in the C-terminal region (amino acids
beyond 200) is not highly conserved among retroviral INs. It has been
shown to contain determinants for nonspecific DNA binding and, in
addition to the core region, likely plays a role in substrate binding
during catalysis. The locations of the two self-association regions for
ASV IN identified in this report are also indicated. The five ASV IN
truncated proteins used in this report are diagrammed below, with the shaded box indicating the portion of IN encoded. The construct
name includes the end points of amino acids contained in each
polypeptide.
Many enzymes that catalyze DNA recombination require the formation of multimeric protein-DNA complexes. Detailed structural information is available for some of these(18, 19, 20, 21, 22) . However, important questions concerning the stoichiometry of IN protein and DNA substrates in the nucleoprotein complex that is competent for integration remain unanswered. In addition, the role of protein multimerization in substrate binding and catalysis is yet to be clearly defined.
Both biochemical and genetic studies indicate that
multimerization is a functionally important property of retroviral
integrases. ASV IN, purified from viral particles, and bacterially
expressed HIV-1 IN were observed to migrate in glycerol gradients at a
position consistent with a dimer molecular
weight(23, 24) . Gel filtration studies also suggest
that purified HIV-1 IN forms dimers(25) . We have used
sedimentation and kinetic studies to demonstrate that purified ASV IN,
which exists in a reversible mass-action equilibrium (monomer ⇌ dimer ⇌
tetramer), must multimerize to perform its catalytic
function(26) . Similar sedimentation analyses have been
performed with HIV-1 IN(27) . The coordinated action of Moloney
murine leukemia virus IN on both ends of viral DNA was demonstrated by
mutagenesis studies and analysis of intermediates produced in
vivo(28) . Our laboratory has recently obtained similar
results with ASV IN in vitro, using DNA substrates that link
two viral DNA end sequences(29) . Multimerization has also been
inferred from the enzymatic complementation of two defective HIV-1 IN
mutants when incubated together(30, 31) . Finally, a
yeast transcriptional reporter system was used to analyze HIV-1 IN
homomeric interactions and to determine a minimal domain required for
self-association(32) .
The work presented here includes physical analyses of the multimeric state of purified ASV IN and various IN fragments using size exclusion chromatography (SEC) and chemical cross-linking techniques. To further delineate the regions of this enzyme that contribute to self-association, we have also employed a modification of a protein overlay blot technique which tests directly for binding between two potential associating proteins. In addition to demonstrating a role for the conserved catalytic core domain in ASV IN dimerization, our results uncover an important determinant for multimerization located in the less conserved C-terminal domain.
The IN(201-286)st clone, which expressed the IN fragment 201-286, included a streptavidin epitope tag at the C terminus. It was constructed in three steps as follows. 1) A DNA duplex fragment encoding a kinase/strep-tag sequence (35) of 15 amino acids (Arg-Arg-Ala-Ser-Val-Ser-Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly) with compatible ends was inserted into BamHI/HindIII-digested pSE380 (36). 2) A polymerase chain reaction fragment encoding IN amino acids 201-286 (using primers with flanking NcoI and BamHI sites) was inserted into the pSE380 strep-tag derivative. 3) The IN-strep-tag encoding DNA fragment was purified and ligated to NcoI/HindIII-digested expression vector pET-20b (Novagen). The polypeptide expressed from this construct includes a leader sequence (pelB) from the pET-20b vector, which is intended for periplasmic targeting of overexpressed proteins. However, the most effective purification of IN(201-286)st was from whole cell lysates. The apparent molecular weight from SDS gels indicates that this leader is cleaved by a signal peptidase. The IN sequences of all clones were confirmed by sequencing, and fusion proteins of expected molecular weight were synthesized in all cases.
For
labeling purposes, glutathione beads with bound fusion protein were
washed with kinase buffer (20 mM Tris-Cl, pH 7.5, 0.1 M NaCl, 12 mM MgCl₂, 10 mM
dithiothreitol). Then, 10 to 20 units of protein kinase (catalytic
subunit from bovine heart, Sigma) and 330 µCi of
[γ-³²P]ATP were added, and the mixture was
incubated with agitation for 30-60 min at 4 °C. After
quenching, the glutathione resin was washed 4 times with NETN buffer
prior to elution of the labeled fusion protein as described above.
The SEC analyses presented do not attempt to account for the dissociation kinetics of IN multimers. The observation of discrete monomer and dimer peaks in some chromatograms (for example, Fig. 4, traces 4 and 5) indicates that the dissociation rate of dimer to monomer must be slow relative to the column run time. The intent of the experiments presented here was to compare the behavior of different polypeptides and not to establish absolute quantitative constants for IN self-association. Sedimentation equilibrium analysis is more suited to the quantitative determination of association constants, and the results of such studies (in progress) will be reported separately.
Figure 4:
Size exclusion chromatography of
full-length IN and truncated IN proteins. Size exclusion chromatography
was performed as described under ``Materials and Methods.''
The chromatograms display the absorbance at 220 nm as a function of
elution time. The trace numbers noted in the text are found to the left of each chromatogram. For reference, the elution
positions of 2 (of a total of 7) globular molecular mass standards are
indicated with dotted vertical lines. The loading
concentration of each polypeptide was approximately 30 µM. Lightly shaded traces indicate proteins that are competent for
multimerization; hashed traces indicate proteins partially
defective in multimerization, and darkly shaded traces indicate proteins most defective in multimerization. We note that
full-length IN elutes with an apparent molecular mass that can be best
assigned as a dimer (57.8 kDa), yet at a time slightly later than
expected for an ideal globular dimer. This behavior is similar to that
observed with HIV-1 IN(25) . The slightly retarded elution time
most likely reflects nonspecific interaction of IN with the column
matrix. This has been observed with other SEC media used for ASV IN
(data not shown) and was observed for the catalytic core of HIV-1 IN,
where inclusion of CHAPS in the running buffer decreased elution
time(43) .
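As an illustration of how an apparent molecular mass can be assigned from an elution time, the following minimal Python sketch fits a log-linear calibration through globular standards. It assumes that log10(mass) varies linearly with elution time over the working range of the column. The standard masses and elution times are hypothetical placeholders, not the seven standards used for Fig. 4, and the function names are introduced here for illustration only.

```python
import math

# Hypothetical calibration standards: (elution time in min, mass in kDa).
# These values are placeholders and are NOT the standards used in Fig. 4.
HYPOTHETICAL_STANDARDS = [(10.0, 100.0), (15.0, 31.6), (20.0, 10.0)]


def fit_calibration(standards):
    """Least-squares fit of log10(mass_kDa) = slope * time + intercept."""
    times = [t for t, _ in standards]
    log_masses = [math.log10(m) for _, m in standards]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log_masses) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_masses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return slope, intercept


def apparent_mass_kda(elution_time, slope, intercept):
    """Apparent mass of an ideal globular protein eluting at elution_time."""
    return 10 ** (slope * elution_time + intercept)


if __name__ == "__main__":
    slope, intercept = fit_calibration(HYPOTHETICAL_STANDARDS)
    print(apparent_mass_kda(12.0, slope, intercept))
```

A protein that interacts nonspecifically with the column matrix elutes later than its true mass predicts, so a calibration of this kind would underestimate its mass, in line with the slightly retarded elution noted for full-length IN.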
The experiment in Fig. 6 uses GST-fusion proteins for both probe and targets. GST has been shown to be a dimer in the active state (37). Although we included excess unlabeled, nonfused GST as competitor, this assay may detect some GST-GST interactions between probe and targets. The level of probe binding to GST alone is indicated and should be considered the background for each probe. Note that the binding of the gIN(1-286) probe to the full-length target is greater than 20-fold above binding to GST alone, demonstrating an adequate signal-to-noise ratio under our assay conditions. Experiments were repeated with several of the constructs using nonfused protein targets (data not shown), and similar results were obtained.
Figure 6: Mapping determinants of IN self-association. The left part of the figure shows the set of truncated proteins generated to test for probe binding. The dark bars indicate the portions of IN included in each GST-fusion, and the corresponding name is listed in the first column of the table to the right. Dotted lines delineate the catalytic core and C-terminal regions proposed to be involved in ASV IN self-association. The variable shading of the C-terminal self-association region (here and in Fig. 1) indicates the relative contribution to multimerization conferred by the sequences within this region. The last two columns of the table present the normalized data for binding of two different probes, full-length IN fusion, gIN(1-286), and gIN(201-236), to each target protein. ND indicates not done.
Figure 2:
Chemical cross-linking of IN subunits. A, titration of the cross-linking reagent DSP. Reactions
containing 7 µM IN (wild type, nonfused) and various
concentrations of DSP were incubated for 30 min at 22 °C. They were
then quenched by addition of a molar excess of glycine and an equal
volume of SDS sample buffer without a reducing agent (except in lane 8 where 2-mercaptoethanol was included). Samples were
analyzed by 12% SDS-PAGE and silver staining. The positions of the
molecular mass markers are noted to the left; positions of IN
multimers are indicated to the right (1-IN =
IN monomer, 2-IN = IN dimer, 4-IN = IN
tetramer). The concentrations of DSP in lanes 2, 3, 4, 5, 6, and 7 were 2
µM, 20 µM, 0.2 mM, 1 mM, 2
mM, and 20 mM, respectively. Lane 8 contained 2 mM DSP. The presence of trace amounts of
dimer in the lane with no cross-linker is due to fortuitous disulfide
bonds formed between native IN monomers that are not reduced, since the
gel sample buffer could not contain reducing reagents with this
cross-linker. Protein in a secondary band (estimated 28 kDa) below
full-length IN monomer (32 kDa) is derived from full-length IN, since
it is recognized in Western blots with monoclonal antibodies directed
against ASV IN (data not shown). Cross-linking of this fragment can
account for some of the broadness of the dimer band seen in this figure
and Fig. 3. B, cross-linking with the reagent EDC.
Reactions containing 10 µM IN, 650 mM NaCl, and
increasing concentrations of EDC were incubated for 30 min at 22
°C. Labeling is as in A. The concentrations of EDC in lanes 1, 2, 3, 4, and 5 were 8 µM, 40 µM, 0.2 mM, 1
mM, and 5 mM, respectively. C, chemical
cross-linking with lower concentrations of IN. Concentrations of IN in lanes 2, 3, 4, and 5 were 7
µM, 3.5 µM, 70 nM, and 350
nM, respectively. The assignment of the band labeled 4-IN (tetramer) is more apparent in lower percentage gels. Reaction
mixtures contained 0.5 M NaCl and 0.2 mM BS³ and were incubated for 15 min at 22 °C. No cross-linker was
added in lane 1. Labeling is as in A.
Figure 3:
Comparison of multimerization by
full-length IN and truncated IN proteins. A, comparison of
full-length IN and IN(1-207). Lanes 2-5 contained
3.5 µM full-length IN (wild type), and lanes 6-9 contained 4.8 µM IN(1-207). Concentrations of
BS³ were 20 µM in lanes 3 and 7, 0.2 mM in lanes 4 and 8, and 2
mM in lanes 5 and 9. Incubations were for 15
min at 22 °C. Labeling is as described in Fig. 2. B, comparison of truncated IN proteins with full-length IN for
formation of covalent multimers using BS³. Reactions and
labeling of the gel are as described in A. Protein
concentration in each case is 5 µM, and the absence and
presence of cross-linker is indicated above each lane. In the absence
of cross-linker, a minor amount of nonreduced dimer for
IN(39-286) and IN(1-207) persists despite the inclusion of
reducing agent in the loading buffer.
Other reagents with different chemistries (e.g. glutaraldehyde and dimethyl 3,3′-dithiobispropionimidate) were also examined for their ability to covalently cross-link IN multimers. Multimeric complexes similar in composition and amount to those observed with DSP and EDC were detected with these reagents (data not shown), suggesting that a variety of reactive residues (basic, acidic, and others) must be present at or near the interface between monomer subunits.
Experiments in which
the IN concentration was decreased while the cross-linker concentration
was held constant showed that dimers are present at IN concentrations
as low as 70 nM (Fig. 2C). These data also
reveal a concentration dependence for tetramer formation (Fig. 2C, compare lanes 2 and 3).
These results are consistent with previous estimates of a Kd (monomer-dimer) in the 1-5 µM range (26).
The
cross-linker BS³ was used to determine the ability of these
three truncated forms of IN to form covalently linked dimers in
solution (Fig. 3). The results showed significantly less
cross-linked dimer with IN(1-207) (Fig. 3A, lanes 7-9), compared to full-length IN (Fig. 3A, lanes 3-5). At a concentration
of 2 mM BS³, at which all full-length IN was
covalently linked in multimeric forms, most of IN(1-207) remained
monomeric. We conclude that IN(1-207) is deficient in
multimerization. Analysis of IN(52-207), which also lacks amino
acids from the N terminus, showed a similar deficiency (Fig. 3B, compare lane 2 with lane
8). In contrast, deletion of residues 5-38 from the
N-terminal segment alone did not cause reduction in multimerization (Fig. 3B, compare lane 2 and lane 4).
No heterodimerization was observed in mixtures of full-length and IN(1-207) polypeptides in this assay (data not shown). It is possible that under these experimental conditions, exchange of monomer subunits proceeds too slowly for heterodimers to form. However, this could also reflect the fact that the affinity of an IN(1-207) subunit for full-length IN is significantly lower than that of two full-length IN monomers for each other.
In the course of site-directed mutagenesis studies, our laboratory has prepared a number of altered ASV IN proteins that contain single amino acid substitutions in residues that are highly conserved in retroviruses and certain other transposable elements (11). A number of these proteins (D64E, T66A, F126A, D121E, L163A, H9N, K206A, R227A) were examined by SEC (data not shown). Only one of the altered proteins examined, S85G, exhibited a significant difference when compared with wild type protein. As illustrated in Fig. 4 (trace 2), S85G eluted exclusively as a monomer.
The results of these SEC analyses
are in general agreement with those of the chemical cross-linking
studies. Both sets of data indicate that the catalytic core domain of
ASV IN can dimerize, but with reduced efficiency compared to the
full-length protein. Addition of the C-terminal region appears to
restore full multimerization capability. We conclude that
self-association determinants are located in both the core and
C-terminal regions of ASV IN. The inability of the S85G mutant to
dimerize suggests that substitutions in this residue alter the
catalytic core structure, or the way in which the core interacts with
the C-terminal domain. Analysis of the crystal structure of ASV
IN(52-207) reveals that the side chain of this residue
participates in a network of hydrogen bonds in a tight turn between two
β-strands (17).
Figure 5: Protein-protein association detected by labeled protein overlay: comparison of full-length IN and IN(1-207). Nonfused, full-length IN(1-286) and IN(1-207) were blotted, renatured, and tested for binding to the labeled probe of full-length IN fused to GST, gIN(1-286), as described under ``Materials and Methods.'' Bovine serum albumin (BSA) and other molecular mass standards (MW) were included as controls. The binding of the labeled probe to some of these molecular mass standard proteins is lower than the background level of binding to the membrane alone. A shows a Coomassie-stained gel (12.5%) loaded identically to the gel blot in B. The positions of the molecular mass markers are noted at the left.
The C-terminal truncation protein showed reduced binding (Fig. 5B, lane 4), even though equimolar amounts of this polypeptide were used in the assay (Fig. 5A, compare lanes 3 and 4). The relative amount of probe bound was quantitated by radioanalytic imaging and normalized to the amount of protein present on the filter. These calculations indicated that the C-terminal truncation protein bound to the probe with 5- to 10-fold lower efficiency than the full-length IN. Since these results were consistent with those from our physical assays of IN multimerization, we used this method to screen a series of nested N-terminal truncated IN proteins. For ease and uniformity of purification, these truncated proteins were constructed as GST-fusion proteins (see Fig. 6 and ``Materials and Methods''). The fusion proteins were expressed in E. coli, affinity-purified, and used as targets in the protein overlay assay. The relative binding capacity of these proteins was tested with two probes, full-length gIN(1-286) and gIN(201-236), and quantitated as described above. Fig. 6 provides a map of the IN deletion proteins tested and a summary of the results of this quantitation expressed as percent of probe bound to a full-length IN target protein.
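The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the quantitation procedure actually used: the function name is hypothetical, the signal and protein amounts are in arbitrary units, and the example values are invented solely to show the arithmetic.

```python
def percent_of_full_length(signal, protein, ref_signal, ref_protein):
    """Probe bound per unit of target protein, as percent of the reference.

    signal, ref_signal   -- probe signal on the target and on the
                            full-length reference band (arbitrary units)
    protein, ref_protein -- amount of each protein on the filter
                            (same arbitrary units for both)
    """
    if protein <= 0 or ref_protein <= 0 or ref_signal <= 0:
        raise ValueError("protein amounts and reference signal must be positive")
    specific = signal / protein              # signal per unit target protein
    ref_specific = ref_signal / ref_protein  # same for the full-length target
    return 100.0 * specific / ref_specific


if __name__ == "__main__":
    # Invented values: a target giving 150 units of signal against 1000 for
    # the full-length reference, with equal protein loaded in both lanes.
    print(percent_of_full_length(150.0, 1.0, 1000.0, 1.0))
```

Binding to the GST-alone control, normalized in the same way, would give the background level against which each target is judged.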
Deletion of the N-terminal HHCC domain did not significantly change the binding with full-length probe (compare the gIN(1-286) and gIN(60-286) targets), consistent with results from SEC and cross-linking with the IN(39-286) protein. Further N-terminal truncation, which removed part [gIN(120-286)] or all [gIN(156-286)] of the D,D(35)E region, reduced binding to an intermediate level, 40-60%. Continued truncation did not change binding significantly until the deletion extended into the C-terminal domain which includes amino acids downstream of residue 201. After that, binding of the probe continued to decrease [gIN(223-286)] until the deletion included amino acid 239 [gIN(240-286)], when it reached a background level, equivalent to binding to the GST alone control.
These results delineate two regions critical to IN self-association. The first corresponds to the central catalytic D,D(35)E domain, and the second lies in a C-terminal domain including, but perhaps not limited to, amino acids 201-240 of ASV IN. Results with targets which are truncated from the C terminus were consistent with this designation. The target protein gIN(1-236) bound the full-length probe with approximately 70% efficiency, whereas the efficiency with IN(1-207) was only 26%.
The association properties of the C-terminal domain were further investigated by probing the same set of target proteins with the gIN(201-236) fusion protein. The results with this probe revealed a pattern distinct from that of the full-length probe. All of the N-terminal truncation proteins up to and including gIN(223-286) bound the gIN(201-236) probe with approximately 70-80% the efficiency of the full-length target (Fig. 6, last column). The C-terminal truncation IN(1-236) bound this probe as well as or better than the full-length IN, but no binding above background was observed to the IN(1-207) target.
It is possible that this method could detect both quaternary interactions between IN monomers and tertiary interactions that reflect the folding of domains within an IN monomer polypeptide. However, results with the gIN(201-236) probe, which showed equivalent binding to all targets that contained residues 201-240, make it unlikely that this reaction is simply mimicking tertiary interactions. They suggest, instead, that this C-terminal peptide is capable of specific association with the homologous region in another target polypeptide.
Figure 7:
Self-association of the C-terminal IN
fragment IN(201-286)st. SEC was performed using the C-terminal
fragment IN(201-286)st as in Fig. 4. The chromatogram
shows two peaks: the first eluted in the void volume of the column, and
the second (labeled IN(201-286)) eluted at a position consistent
with a multimer of the IN(201-286)st fragment. The multimer peak
was confirmed to contain the IN(201-286)st fragment by SDS-PAGE
analysis of the fractions. The 260:280 nm absorbance ratio of the void
peak was consistent with the presence of both nucleic acid and protein
components in this fraction, which could account for the observed
aggregation. The inset shows results from a chemical
cross-linking experiment with the IN(201-286)st fragment and
cross-linker BS³ under the conditions described in Fig. 3B. The positions of covalently linked dimers,
trimers, and tetramers are noted.
Results obtained with these nonequilibrium methods are
consistent with independent estimates of dissociation constants
determined from sedimentation equilibrium experiments. Whereas
full-length ASV IN has a Kd (monomer-dimer) of
1-5 µM (26), ASV IN(52-207) has a Kd
(monomer-dimer) in excess of 500
µM. The HIV-1 catalytic core (residues
50-212) has been reported to have a stronger association than
that of the analogous ASV fragment, and dimers have been observed with
chemical cross-linking, SEC, and sedimentation analysis of the HIV-1 IN
core fragment(27, 30, 43) .
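To make the consequence of these dissociation constants concrete, the following Python sketch computes the fraction of protein present as dimer for a simple two-state monomer-dimer equilibrium, 2M ⇌ D with Kd = [M]²/[D]. It is an idealized model that ignores tetramer formation and the dilution that occurs during chromatography, and the function names are assumptions introduced here. The 30 µM total concentration is the SEC loading concentration given in the legend to Fig. 4.

```python
import math


def monomer_concentration(total_um, kd_um):
    """Free monomer concentration (µM) for 2M <-> D with Kd = [M]^2 / [D].

    Mass balance in monomer units: total = [M] + 2[D] = [M] + 2[M]^2 / Kd,
    which rearranges to 2[M]^2 + Kd[M] - Kd*total = 0 (positive root taken).
    """
    if total_um <= 0 or kd_um <= 0:
        raise ValueError("concentrations must be positive")
    return (-kd_um + math.sqrt(kd_um ** 2 + 8.0 * kd_um * total_um)) / 4.0


def fraction_in_dimer(total_um, kd_um):
    """Fraction of all polypeptide chains that are in the dimeric form."""
    return 1.0 - monomer_concentration(total_um, kd_um) / total_um


if __name__ == "__main__":
    total = 30.0  # µM, SEC loading concentration (Fig. 4 legend)
    for kd in (1.0, 5.0, 500.0):  # µM; 500 is only a lower bound for the core
        print(kd, round(fraction_in_dimer(total, kd), 3))
```

Under these assumptions, a Kd of 5 µM places about 75% of the chains in dimers at 30 µM total protein, whereas a Kd of 500 µM or more places less than 10% in dimers, which agrees qualitatively with the SEC and cross-linking results.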
Both the cross-linking and SEC methods record the behavior of the majority of molecules in the protein preparations, whereas multimerization inferred from enzymatic complementation (30, 31) or a transcriptional reporter system (32) could reflect the activity of a small fraction of protein present. In addition, the latter assays include DNA substrates which could facilitate the formation of multimers of IN. We have performed cross-linking experiments in the presence of various DNA substrates and have failed to detect enhancement of ASV IN multimerization (data not shown). However, this could reflect the adverse effects of the high salt conditions required to keep the protein soluble. We note also that enzymatic complementation cannot identify regions of IN that contribute to self-association if they do not include the minimal region necessary for catalytic activity. The analyses reported here do not require overlap with catalytic regions and represent the first evidence that important determinants of self-association reside in the C-terminal region of a retroviral integrase.
Since multimerization is required for IN function, inhibitors of self-association may be of potential use in antiviral therapy. The protein overlay method may be particularly suited for the identification of peptide inhibitors that interfere with this property and presumably viral integration. There is precedent for such an inhibitor strategy with the retroviral protease(44, 45) .
It is still unclear whether retroviral integrase functions as a dimer or a tetramer. Formation of a tetramer might require interactions across two separate protomer interfaces, one for dimerization and a second for the association of two dimers into a tetramer. Currently, it is not possible to conclude that the ASV IN C-terminal domain is involved in either of the postulated interfaces. However, it is noteworthy that C-terminal truncated proteins, IN(52-207) and IN(1-207), do not form tetramers in chemical cross-linking experiments as do full-length IN, IN(39-286), and IN(201-286) (Fig. 2 and Fig. 6). Whether this is due to the absence of the C-terminal domain remains to be investigated.
The topology of their folding places the retroviral integrases in a family of nucleases that includes RNase H, the RuvC resolvase, and the MuA transposase (46). Despite sharing a similar fold, and probably similar reaction chemistries, these diverse nucleases differ in substrate specificity, multimeric structure, and the requirement for coordination of cleavages performed. For example, RNase H is known to act as a monomer, whether as an independent domain or in the context of HIV-1 reverse transcriptase (47), and, correspondingly, its function does not require coordination of multiple cleavages. In contrast, RuvC is known to act as a dimer and performs two DNA cleavages to resolve a Holliday junction, but is not involved in joining of DNA strands (48). It is apparent from comparison of their crystal structures that the RuvC dimeric interface (49) differs from that of the IN core structures. A primary challenge for the future will be to determine how aspects of protein sequence and structure give rise to the specific quaternary interactions that allow each of these proteins to perform its specialized function. Further study of ASV IN self-association should help to identify the relevant distinguishing features of integrase structure.