From the Department of Biochemistry and Molecular
Biophysics and ** Howard Hughes Medical Institute, Columbia University,
New York, New York 10032, the § Division of Human
Retrovirology, Dana-Farber Cancer Institute, Boston, Massachusetts
02115, the ¶ Department of Pediatrics, Tulane University School of
Medicine, New Orleans, Louisiana 70112, and
SmithKline Beecham
Pharmaceuticals, King of Prussia, Pennsylvania 19406
ABSTRACT |
---|
The extensive glycosylation and conformational
mobility of gp120, the envelope glycoprotein of type 1 human
immunodeficiency virus (HIV-1), pose formidable barriers for
crystallization. To surmount these difficulties, we used probability
analysis to determine the most effective crystallization approach and
derive equations which show that a strategy, which we term variational
crystallization, substantially enhances the overall probability of
crystallization for gp120. Variational crystallization focuses on
protein modification as opposed to crystallization screening. Multiple
variants of gp120 were analyzed with an iterative cycle involving a
limited set of crystallization conditions and biochemical feedback on protease sensitivity, glycosylation status, and monoclonal antibody binding. Sources of likely conformational heterogeneity such as N-linked carbohydrates, flexible or mobile N and C termini,
and variable internal loops were reduced or eliminated, and ligands such as CD4 and antigen-binding fragments (Fabs) of monoclonal antibodies were used to restrict conformational mobility as well as to
alter the crystallization surface. Through successive cycles of
manipulation involving 18 different variants, we succeeded in growing
six different types of gp120 crystals. One of these, a ternary complex
composed of gp120, its receptor CD4, and the Fab of the human
neutralizing monoclonal antibody 17b, diffracts to a minimum Bragg
spacing of at least 2.2 Å and is suitable for structural analysis.
INTRODUCTION
In conventional crystallizations of biological macromolecules, the
protein or other macromolecular subject is treated as a fixed entity to
be tested in a multitude of crystallization conditions. Despite
advances such as sophisticated screening procedures (1, 2) and
crystallization robots (3, 4), this approach often fails for components
from complex biological systems. One of these, the subject of this
study, is the HIV-1 exterior
envelope glycoprotein, gp120. In such cases, success may follow if the
protein itself is varied. There are, however, many options in this vein
and it is not clear how they might be prioritized. By way of background
for this study, we first consider various options for the
crystallization of conformationally complex macromolecules and then
describe the characteristics of gp120.
For the more difficult crystallization challenges, which can be defined
as those for which conventional screening fails, one typically tries to
vary or modify the protein while maintaining biologically important
properties. Meaningful results are obtained since the integrity of
internal structure and functional properties can often tolerate
variation at the molecular surface where lattice contacts are made. The
probability for success in crystallization is enhanced because flexible
or heterogeneous surface features may be removed or because of the
fortuitous introduction of lattice interactions. A prescient example
that pre-dates the powerful methods of modern molecular biology was
Kendrew's (5) screening of myoglobins from many different organisms
until he found one, from sperm whale, that crystallized well. Indeed,
human myoglobin requires a Lys to Arg substitution in order to produce
crystals suitable for structural analysis (6). Conversely, crambin
forms exceptionally well ordered crystals despite being a mixture of two isoforms with sequence variation at internal residues (7).
There are many notable examples of variation or modification in the
crystallization of macromolecules. Systematic variation in the species
of origin, as pioneered with myoglobin (5), was also instrumental in
the crystallization of the transcription initiating TATA-binding
protein (8). Proteolysis is often used to define crystallizable
fragments, following the early examples from enzymatic digestions of
antibodies that produced crystallizable fragments (reviewed in Ref. 9)
and the bromelain release of hemagglutinin from the influenza virus
membrane (10). Variation of recombinant constructs, often inspired by
proteolytic definition, is now commonplace with the widespread use of
molecular biology tools. Systematic variation in the length of DNA
oligomers has proved essential in the structural studies of
protein-nucleic acid complexes. The work of Jordan and Pabo (11) on λ
repressor sets the example for transcription factors, and the principle extends to other complexes such as the nucleosome (12). The use of
protein ligands to stabilize another protein of interest for
crystallization has also been effective as in the study of actin
through its complex with DNase I (13) and more generally through
complexes with antigen-binding Fab fragments of antibodies (reviewed in
Ref. 14). The principle that the detergent-solubilized lipid interface
of membrane proteins is generally unavailable for lattice contacts has
led to the concept that crystallizability will be enhanced if the
non-variable surface area is increased, and this was demonstrated in
practice in the crystallization of a bacterial cytochrome
oxidase in complex with an antibody Fv fragment (15). Similarly,
the anticipated conformational and compositional heterogeneity in
carbohydrate moieties of glycoconjugates is expected to interfere with
crystallization, and deglycosylation has proved essential for heavily
glycosylated proteins such as human chorionic gonadotropin (16).
HIV-1 induces acquired immunodeficiency syndrome in humans (17, 18).
The gp120 glycoprotein helps to mediate virus entry into cells through
sequential recognition of two cellular receptors, the surface
glycoprotein CD4 (19, 20) and a chemokine receptor (primarily CXCR4 or
CCR5, depending on viral strain) (21-26). These high affinity
interactions are attractive targets for mimetic drug design. Although
the structure of the gp120-binding domain of CD4 and the identity of
residues critical to its interaction with gp120 have been known for
several years (27, 28), this has not been sufficient for design of
potent antagonists (29-31). As the major virus-specific antigen
accessible to neutralizing antibodies, knowledge of the gp120 structure
could also impact considerably on vaccine design. Despite this interest
and considerable effort for several years with pure soluble protein,
available in quantities as a by-product in part from vaccine trials,
gp120 has resisted crystallographic analysis.
The mature gp120 glycoproteins of different HIV-1 strains typically
have 470-490 amino acid residues (32). Extensive N-linked glycosylation at 20-25 sites accounts for roughly half of the gp120
mass (32, 33). Sequences from many different viral isolates show that
gp120 has five variable regions (V1-V5) interspersed between relatively
conserved regions (C1-C5) (32, 34) and nine conserved disulfide bridges
(33). Except for limited N- and C-terminal cleavage, proteolytic
digestion does not reveal a subdomain structure. Indeed, even after
extensive proteolytic cleavage, the unreduced protein runs near its
native molecular weight on
SDS-PAGE (footnote 2).
The gp120 glycoprotein likely exhibits conformational flexibility. Some
of the variable regions, the V2 and V3 loops in particular, are known
to be exposed on the surface of the native protein and probably assume
multiple conformations. The potential of gp120 to undergo
conformational change is also evidenced by shedding, the CD4-induced
dissociation of gp120 from the surface of the virus, by ligand-induced
variations in monoclonal antibody binding (35, 36), and by complex
CD4-gp120 binding kinetics (37). These changes may be related to the
functional role of gp120 in virus entry.
The extensive glycosylation and conformational heterogeneity of gp120
suggested that merely screening the protein through ever more exotic
crystallization conditions would not produce well diffracting crystals.
We have analyzed the effectiveness of optimizing different
crystallization factors given the specific characteristics of gp120.
This led us to a strategy employing radical modification of the protein
surface, primarily to reduce heterogeneity but also to create new
potential lattice contacts. We derive equations which show that this
strategy, which we term variational crystallization, substantially
enhances the overall probability of crystallization for gp120. An
iterative process, involving both biochemical and molecular biological
techniques, was used to detect and remove chemical and conformational
heterogeneity. In addition, protein ligands, namely CD4 and the Fab
fragments of several monoclonal antibodies, were used to restrict
conformational mobility. Progressive trials of 18 different gp120
crystallization variants yielded six different crystals, at least one
of which is suitable for structural analysis. This paradigm of
crystallization, with a focus on protein modification rather than on
crystallization screening, may aid in the structural analysis of other
conformationally complex proteins.
THEORY
Much of the crystallization literature is anecdotal, reflective
perhaps of the diverse nature of proteins. Systematic quantitative studies have necessarily focused on robust, well characterized systems
(38). If a particular protein fails to crystallize, one is faced with a
bewildering array of options based on experience with other often quite
different proteins. In the absence of a comprehensive crystallization
theory it is difficult to know how to proceed. Here, we devise an
approximate theoretical underpinning for such decisions based on the
ratio comparing crystallization probabilities before
($p_i$) and after ($p_f$)
a modifying procedure. We define the enhancement in crystallization
probability as $\varepsilon = p_f/p_i - 1$, whereby
$\varepsilon = 0$ for no change and can reach a maximum,
$\varepsilon_{max} = 1/p_i - 1$, that depends on the
inverse of the initial probability.
In evaluating different crystallization strategies, one important
consideration is effectiveness. Many factors affect crystallization, and a suitable crystallization approach depends on identifying and
dealing with those that are most limiting. For example, if a protein
were only 30% pure, the crystallization probability associated with
such protein purity would be low and a purification strategy would be
key; if a protein were 98% pure, further purification would most
likely have little impact on the overall probability of
crystallization. Factors that might be expected to affect the crystallization of gp120 are listed in Table I, along with estimates of the effect of
optimizing each factor given the specific characteristics of gp120.
Factors affecting the crystallization of gp120
Although identification of limiting crystallization factors can establish rough guidelines as to the appropriateness of a particular crystallization strategy, a better way to evaluate effectiveness (or perhaps to judge the progress of a specific crystallization effort) is by quantitative assessment of the enhancement in crystallization probability. For example, if 80% of all crystallizable proteins crystallize from a core set of 50 conditions (2), a strategy that involves screening ever larger arrays of crystallization conditions could at most enhance the probability of crystallization by only 25% over that for the first 50 conditions; further screening would yield increasingly diminishing returns. With this screening example, the quantitative enhancement of probability is straightforward to calculate, but it is not immediately apparent for the strategy of variational crystallization, which focuses on protein modification. Here we consider two kinds of such modifications: those designed to reduce heterogeneity and those related to expanding the number of crystallization candidates.
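The 25% ceiling in the screening example can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative and do not come from the paper, and the enhancement is taken to be the ratio of final to initial crystallization probability minus one.

```python
def enhancement(p_initial: float, p_final: float) -> float:
    """Fractional gain in crystallization probability: p_final / p_initial - 1."""
    if not 0.0 < p_initial <= 1.0 or not 0.0 <= p_final <= 1.0:
        raise ValueError("p_initial must lie in (0, 1] and p_final in [0, 1]")
    return p_final / p_initial - 1.0


def max_enhancement(p_initial: float) -> float:
    """Upper bound on the gain, reached when the final probability is 1."""
    return enhancement(p_initial, 1.0)


# Screening example from the text: 80% of crystallizable proteins
# crystallize from a core set of 50 conditions.
print(f"{max_enhancement(0.80):.0%}")  # ceiling on the gain from further screening
```

The same function shows why a low starting probability leaves far more room for improvement: the ceiling grows as the inverse of the initial probability.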
Enhancement of Surface Homogeneity-- Crystalline order is explicitly dependent on lattice homogeneity. Reducing heterogeneity can be thought of as increasing the proportion of surface area available for formation of lattice contacts, increasing the probability of crystallization. The probability that a single lattice contact between two molecules may form is in part related to the fraction of surface area that is homogeneous on one molecule multiplied by the fraction homogeneous on the other,
![]() |
(Eq. 1) |
![]() |
(Eq. 2) |
Given a reduction in surface heterogeneity, what is the change in crystallization probability? Surface area is correlated with molecular mass (M) by the power law: surface area = $6.3 \times M^{0.73}$, which on average predicts surface area to within 4% for monomeric proteins (40). The fraction of homogeneous surface can thus be approximated as a ratio of molecular masses of the total and of the homogeneous portion of the protein,
![]() |
(Eq. 3) |
![]() |
(Eq. 4) |
![]() |
(Eq. 5) |
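As a rough numerical illustration of the power law quoted above, the sketch below estimates surface area from molecular mass and the share of surface attributable to the homogeneous portion of a protein. It assumes that M is in daltons and the area in Å², and that the homogeneous portion follows the same power law as the whole molecule. The function names and the example masses (a 120-kDa glycoprotein with roughly half its mass as carbohydrate) are illustrative only, and the code does not reproduce Equations 3-5.

```python
def surface_area(mass_da: float) -> float:
    """Approximate protein surface area from the power law 6.3 * M**0.73."""
    return 6.3 * mass_da ** 0.73


def homogeneous_fraction(mass_homogeneous_da: float, mass_total_da: float) -> float:
    """Share of surface assigned to the homogeneous portion (assumed model)."""
    if not 0.0 < mass_homogeneous_da <= mass_total_da:
        raise ValueError("need 0 < homogeneous mass <= total mass")
    return surface_area(mass_homogeneous_da) / surface_area(mass_total_da)


# Hypothetical case: about half of a 120-kDa glycoprotein is heterogeneous sugar.
fraction = homogeneous_fraction(60_000.0, 120_000.0)
print(f"homogeneous surface fraction: {fraction:.2f}")
```

Because the exponent is below one, removing half of the mass removes less than half of the estimated surface.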
Another variant of Equation 4 can be used to estimate the impact of
adding a ligand of fixed structure to a molecule that contains
heterogeneous portions. This expands the surface available for lattice
contacts and effectively dilutes the heterogeneous component. It may be
an approach of choice when the heterogeneity is essentially
unremovable, such as at the lipid interface of detergent-solubilized
membrane proteins. One faces the difficulty of estimating the extent of
heterogeneity to use Equation 4, but this might be done by summing
residual variable components or by topographical estimates for a
membrane protein. (For example, for a sphere embedded symmetrically in
a membrane of thickness h, $1 - H = \mathrm{area(heterogeneous)}/\mathrm{area(total)} = h/[6M\bar{v}/(\pi N_o)]^{1/3}$, where M is molecular mass,
$\bar{v}$ is partial specific volume, and $N_o$ is Avogadro's
number. Thereby, $1 - H = 0.62$ for h = 30 Å and M = 50 kDa.) Then the enhancement
($\varepsilon_a$) in probability on addition of a fixed component becomes,
![]() |
(Eq. 6) |
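The sphere-in-membrane estimate given in the parenthetical example above can be verified numerically. The band of a sphere cut by two parallel planes a distance h apart has area π·d·h, so its share of the total area π·d² is h/d, where d is the sphere diameter. The sketch assumes a partial specific volume of 0.73 cm³/g, a typical protein value that is not stated in the text; the function name is illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def heterogeneous_fraction(thickness_angstrom: float, mass_da: float,
                           partial_specific_volume: float = 0.73) -> float:
    """Share of a sphere's surface lying inside a membrane band of given thickness.

    partial_specific_volume is in cm^3/g; the default is an assumed typical value.
    """
    volume_cm3 = mass_da * partial_specific_volume / AVOGADRO
    volume_a3 = volume_cm3 * 1e24  # 1 cm^3 equals 1e24 cubic angstroms
    diameter = (6.0 * volume_a3 / math.pi) ** (1.0 / 3.0)
    # The band formula holds only while the membrane is thinner than the sphere.
    return min(1.0, thickness_angstrom / diameter)


print(round(heterogeneous_fraction(30.0, 50_000.0), 2))  # 0.62, as quoted in the text
```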
The accuracy of the quantification is only as good as the
approximations, and several of the approximations used here call for
further scrutiny. The approximation of molecular mass for surface area
was used for the initial protein prior to heterogeneity removal. This
is probably an underestimate since the completely heterogeneous
portions of the protein would not be expected to fold as compactly as
the homogeneous portions. In addition, the approximation that
0, tends to underestimate the
deleterious influence of heterogeneity on crystallization. Both of
these assumptions show an underestimation, but the equations still
should predict the correct general trend. For some assumptions,
however, the effect is more subtle. For example, the equations were
generated assuming one molecule per asymmetric unit. If one considered
a tight complex of molecules, the same equations would hold as long as
the complex did not have internal symmetry (complexes with internal
symmetry show a different average contact number). Finally the category
of heterogeneity is quite broad, and there are some situations, such as
with segmental flexibility where these equations may be invalid. For
example, in the case of two rigid domains connected by a flexible
linker, one would have to consider the possibility that one domain
could be fixed relative to the other with a single appropriate contact.
Increase of Molecular Variants-- Another aspect of variational crystallization, the use of multiple variants of the same protein, also increases the probability of crystal formation. In this case, the overall probability of crystallization is exponentially related to the number of variants. Assuming independence of variants (a reasonable assumption with different protein ligands; not as valid with minor changes) with n variants and a probability of crystallization for each variant of $p_i$, the overall probability $p_T$ is,
![]() |
(Eq. 7) |
The enhancement in overall probability for successful crystallization
from a set of n variants can then be calculated relative to
the probability for a single variant. If we assume that the probability
for crystallization of this individual variant, i, is
typified by the average for all variants, $p_i \approx p_{ave}$, the enhancement factor is,
![]() |
(Eq. 8) |
![]() |
(Eq. 9) |
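Under the independence assumption stated above, a standard probability result gives the chance that at least one of n variants crystallizes as one minus the product of the individual failure probabilities. The sketch below applies that textbook result and the enhancement relative to a single average variant. It is offered as an illustration consistent with the description of Equations 7-9, not as a transcription of them; the function names and the per-variant probability of 0.05 are hypothetical.

```python
import math
from typing import Iterable


def overall_probability(probabilities: Iterable[float]) -> float:
    """P(at least one variant crystallizes), assuming independent variants."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def variant_enhancement(p_average: float, n_variants: int) -> float:
    """Gain over a single variant when every variant has probability p_average."""
    p_total = overall_probability([p_average] * n_variants)
    return p_total / p_average - 1.0


# Hypothetical per-variant probability; 18 variants were tried in this study.
for n in (1, 6, 18):
    print(n, round(overall_probability([0.05] * n), 3),
          round(variant_enhancement(0.05, n), 2))
```

For small per-variant probabilities the gain approaches n - 1, which is why adding variants pays off most when each individual variant is unlikely to crystallize.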
EXPERIMENTAL PROCEDURES |
---|
Constructs of gp120-- The various recombinant gp120 glycoproteins used for crystallization trials were produced in stable Drosophila Schneider 2 lines under the control of an inducible promoter as described previously (41) (Table II). Genetic constructs containing various deletions and substitutions were made during the course of dissecting the gp120 domain structure. The procedures for making these constructs and the biological properties of the corresponding protein products are described elsewhere (see references in Table II).
Protein Production and Purification-- The N-terminal two domains of CD4 (D1D2), residues 1-183, were produced in Chinese hamster ovary cells and purified as described previously (27). Human monoclonal antibodies 17b, A32, C11, and F105 (derived from HIV-infected individuals) (42, 43) and mouse monoclonal antibodies L71 and 178.1 (44, 45) were purified by protein-A affinity chromatography. Secreted gp120 from Drosophila cells was purified by affinity chromatography with the F105 antibody covalently coupled to Sepharose. Following extensive washing with phosphate-buffered saline containing 0.5 M NaCl, gp120 protein was eluted with 0.1 M glycine, pH 2.8, followed by immediate neutralization with Tris buffer.
Protease Digestion-- Fab fragments were produced by papain digestion of monoclonal antibodies. Briefly, the antibody was reduced in 100 mM dithiothreitol, 100 mM NaCl, 50 mM Tris, pH 8.0, for 1 h at 37 °C, and dialyzed (4 °C), first in phosphate-buffered saline to reduce the dithiothreitol concentration to ~1 mM, then in alkylating solution (phosphate-buffered saline with 2 mM iodoacetamide, pH 7.5, 48 h), and subsequently in phosphate-buffered saline without iodoacetamide. The reduced and alkylated antibody was concentrated to at least 2 mg/ml and digested with papain using a commercial protocol (Pierce). An additional gel filtration chromatographic step on a Superdex S-200 column (Pharmacia, fast protein liquid chromatography) was added to ensure oligomeric homogeneity.
The gp120 proteins were subjected to digestion with papain, elastase, and subtilisin (Boehringer-Mannheim) to assay for proteolytic susceptibility. In these assays, the gp120 concentration was kept constant and the protease diluted serially (3.3 ×) from a ratio of 1:10 to 1:1000. The digestion mixture was incubated for 1 h at 37 °C and quenched by addition of 1% SDS (1:10 ratio) with immediate heating in boiling water for 2 min. Digestion products were analyzed with SDS-PAGE with and without dithiothreitol reduction.
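The serial dilution described above can be laid out explicitly. The text gives the dilution factor (3.3×) and the end points (1:10 and 1:1000) but not the number of steps, so this sketch assumes that dilution continues until the 1:1000 ratio is reached or passed; the function name is illustrative.

```python
def dilution_series(start: float, stop: float, factor: float) -> list[float]:
    """Protease:gp120 ratio denominators, multiplying by factor until stop is reached."""
    if start <= 0 or factor <= 1:
        raise ValueError("start must be positive and factor greater than 1")
    ratios = [start]
    while ratios[-1] < stop:
        ratios.append(ratios[-1] * factor)
    return ratios


series = dilution_series(10.0, 1000.0, 3.3)
print(", ".join(f"1:{r:.0f}" for r in series))
```

A factor of 3.3 is close to half a decade (the square root of 10 is about 3.16), so two steps span roughly one order of magnitude in protease concentration.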
Carboxypeptidase Y digestion was used to analyze the C terminus of gp120. A 1:10 ratio of carboxypeptidase Y (Boehringer-Mannheim) to gp120 was incubated for 1 h at 37 °C, pH 7.0. Even though digestion could not be easily seen by SDS-PAGE, the C terminus of gp120, HXBc2 strain, contains a number of positively charged amino acids, and the extent of the reaction could be monitored by native-PAGE.
Deglycosylation-- Drosophila produced gp120 proteins were deglycosylated enzymatically. Briefly, 0.5 mg/ml gp120 was incubated with various deglycosylating enzymes (singly or in combination) in 0.5 M NaCl, 100 mM sodium acetate, pH 5.7, for 10 h at 37 °C. Endoglycosidase D was used at a concentration of 0.1 unit/ml, endoglycosidase F at 0.25 unit/ml, endoglycosidase H at 0.25 unit/ml, and glycopeptidase F at 0.1 unit/ml (all from Boehringer-Mannheim). For crystallization variants involving the CD4·gp120 complex, the addition of D1D2 (which lacks carbohydrate) to the deglycosylation mixture was found to enhance gp120 solubility. The deglycosylation reactions were monitored by following the reduction in molecular weight on SDS-PAGE. Deglycosylation was nearly complete within 30 min and plateaued after 3 h. The extent of deglycosylation was judged by matrix-assisted laser desorption-mass spectroscopy, carbohydrate analysis, affinity for concanavalin-A, and mobility and bandwidth on SDS-PAGE. Protein aggregation was assayed by native-PAGE, dynamic light scattering, and gel filtration chromatography.
Monoclonal Antibody Binding Assay--
The various gp120
glycoproteins were assessed for recognition by a variety of monoclonal
antibodies directed against both linear and discontinuous gp120
epitopes by either immunoprecipitation (46) or by enzyme-linked
immunosorbent assay (47). The enzyme-linked immunosorbent assay was
performed with both fully glycosylated and deglycosylated ΔV1/2ΔV3
glycoproteins immobilized on enzyme-linked immunosorbent assay plates
using a capture antibody specific for the gp120 C terminus, 6205 (International Enzymes) (47).
Binary and Ternary Complex Purification--
To ensure proper
stoichiometry and oligomeric homogeneity, all complexes were purified
by gel filtration chromatography on a Superdex S-200 column (Pharmacia,
fast protein liquid chromatography). This column exhibited good
resolution with routine separation of samples that differed by only
30% in molecular weight. Individual components were first purified
separately to ascertain their monomeric status. Components were then
combined to form complexes, which were repurified on the same column. A
buffer of 0.35 M NaCl, 5 mM Tris/Cl, pH 7.0, 0.02% NaN3 was used throughout. Peak fractions were
concentrated over Centricon-30 (Amicon) to a final protein concentration of ~10 mg/ml and either aliquoted and stored at −80 °C or used directly for crystallization.
Crystallization-- The vapor-diffusion hanging-droplet technique was used for all crystallizations. Small volumes, 0.5 µl of protein solution + 0.5 µl of reservoir solution, were used for most crystallizations, screenings, and final optimizations.
Screening-- The Crystal Screen I (Hampton Research) was used, augmented by approximately 20 conditions which tested high protein concentrations (vapor diffusion concentration of the protein at various pH values) as well as mixtures of organic additives (2-5% 2-methyl-2,4-pentanediol, PEG 400, or PEG 4000) combined with high ionic strength (2-4 M NaCl, (NH4)2SO4, or Na/KPO4) at pH 5.5-9.5. For each gp120 crystallization variant, a subset of 12 different conditions was analyzed in depth to establish the approximate precipitation point of the protein for a variety of different precipitants. The factorial solutions were then individually adjusted to target the observed precipitation point and a full screen of ~70 conditions was set up at 20 °C. After at least 1 week of constant daily observation, screening solutions were recalibrated to account for the observed 20 °C precipitation point and another full screen at 4 °C was set up. If no crystals were observed, the Crystal Screen II (Hampton Research) was set up at 20 °C.
Optimization--
In addition to the standard single variable
optimization of crystallization conditions, a factorial-like procedure
was used to determine if small amounts of different additives increased crystal quality. Type E crystals were grown from the following conditions: protein (Δ82ΔV1/2*ΔV3ΔC5 gp120, two-domain CD4
(D1D2), Fab 17b purified as a ternary complex on the Superdex S-200); droplet (0.5 µl of protein solution consisting of ~10 mg/ml protein in gel filtration buffer + 0.4 µl droplet mixture containing 0.1 M sodium citrate, 0.02 M NaHepes, 10%
isopropyl alcohol, 10.5% PEG 5000 monomethylether (Fluka), 0.0075%
SeaPrep-agarose (FMC BioProducts), pH 6.4); reservoir (0.35 M NaCl, 0.1 M sodium citrate, 0.02 M Hepes, 10% isopropyl alcohol, 10.5% PEG 5000 monomethylether, pH 6.4). The droplet mixture was kept at 37 °C to
ensure the agarose solubility, and the crystallization setup at room
temperature. Clumps of crystals appeared within 2 weeks of incubation
at 20 °C and grew for several months to maximal size.
X-ray Diffraction Characterization--
All data were collected
at beamline X4A of the National Synchrotron Light Source, Brookhaven
National Laboratory. The type E crystals were cross-linked with the
vapor diffusion technique of Lusty (48) by placing a crystallization
bridge (Hampton Research) with a 25-µl sitting droplet of 1%
glutaraldehyde (Sigma) in the reservoir of a standard hanging-droplet
vapor diffusion crystallization setup for 1 h at room temperature.
The cross-linked crystal was washed with stabilizer (reservoir solution
with only 50 mM NaCl) containing 10% ethylene glycol.
After approximately 24 h, the external liquid surrounding the
crystal was replaced with paratone-N (Exxon), the crystal mounted in an
ethylene loop (Hampton Research) (49), and flash-cooled in the nitrogen
stream of a cryostat (details are provided in Ref. 50). Oscillation data
were processed with DENZO (51) and scaled with SCALEPACK (51).
RESULTS AND DISCUSSION |
---|
To address the many problems associated with the crystallization of HIV-1 gp120, we exploited the mutability of the macromolecular surface using tactics that involved protein modification and conformational restriction (Table III). Several of these tactics contain novel features and are detailed here.
Variant Constructs of the gp120 Protein-- Variants of gp120 were developed through an iterative cycle which strove to eliminate heterogeneity. The cycle involved recombinant production of gp120 variants, deglycosylation, and then assessment of heterogeneity and flexibility by examinations of glycosylation status, monoclonal antibody binding, and protease sensitivity, leading to the design of new constructs. For example, protease digestion monitored by PAGE indicated susceptibility at the C terminus, and a form with 15-20 residues removed by carboxypeptidase Y retained CD4 binding activity. A homogeneous product was difficult to make by this method, and primer-based polymerase chain reaction mutagenesis and recombinant expression were used to generate a homogeneous gp120 derivative with a 19-residue C-terminal deletion. At the N terminus, sequencing of the initial constructs showed the expected signal cleavage at +31, with four additional amino acids, Gly-Ala-Arg-Ser, added from the signal peptide (a consequence of different processing of the cloning vector signal peptide with gp120). Protease digestion gave a product at +40, indicating flexibility in the N terminus. Progressive genetic truncation and biochemical analysis identified +83 as a variant that was recognized by conformation-dependent gp120 ligands, whereas +94 exhibited some conformational disruption (46). Thus much of the apparently flexible region at the N terminus of gp120 could be removed without disrupting the global conformation of the protein.
To further reduce flexibility, variable loops, V1, V2, and V3, were
deleted and replaced with shorter segments, as reported earlier (52,
53). Little effect was found on CD4 binding activity (47, 53). Three
constructs were made which contained deletions of the V1, V2, and V3
loops (Table II). In the ΔV1/2ΔV3 construct, the entire base and
stem of the variable loops V1, V2, and V3 were excised. In the
ΔV1/2*ΔV3 protein, the conserved stem of the V1/V2 stem-loop
structure was retained, restoring the CD4-induced antibody epitopes in
the presence of soluble CD4. In the ΔV1/2*ΔV3* protein, the base of
the V3 loop was retained as well, fully restoring CD4-induced antibody
epitopes, even in the absence of soluble CD4.
Deglycosylated Forms of gp120--
The asparagine-linked
carbohydrate on the gp120 glycoprotein produced in
Drosophila cells was analyzed. Dionex chromatography showed
that the carbohydrate on this protein consisted of
(N-acetylglucosamine)$_2$(fucose)$_F$(mannose)$_M$, with F = 0 or 1 and
M = 3 to 9 (footnote 3).
Deglycosylation with enzymes such as glycopeptidase F (or
endoglycosidase F at pH 5.0), which cleave the glycosidic linkage and
convert the N-linked asparagine into an aspartic acid,
resulted in gp120 aggregation, although it remained soluble. Cleavage
of the β1-4 bonds in the chitobiose core with endoglycosidases D or
H, leaving only a single N-acetylglucosamine residue and,
potentially, an α1-6 fucose attached to any of the glycosylated
asparagine residues, appeared to leave the protein intact as judged by
a panel of conformationally sensitive monoclonal antibodies (47).
Digestion of full-length constructs with endoglycosidase H, which has
specificity for oligosaccharides with 5-9 mannose residues, removed
roughly 60% of the carbohydrate, and addition of endoglycosidase D,
which cleaves oligosaccharides with 3 or 4 mannose residues, removed up
to 90% of the carbohydrate. For the variable loop-deleted constructs,
all mannose residues were removed with the endoglycosidase D/H
combination as judged by carbohydrate analysis and by the inability of
concanavalin A to bind to the deglycosylated protein. Mass spectroscopy
of the deglycosylated Δ82ΔV1/2*ΔV3ΔC5 gp120 showed a molecular
mass of 39,000 ± 50 Da, consistent with a mass of 35.4 kDa for
the protein (based on the DNA sequence) and 3.6 kDa for the remaining carbohydrate. Carbohydrate analysis showed only fucose and
N-acetylglucosamine sugars to be present, in a ratio of
1:3.05 ± 0.02, respectively. These results suggest that, of the
18 potential asparagine glycosylation sites in the
Δ82ΔV1/2*ΔV3ΔC5 gp120, five are unused, nine are modified
with N-acetylglucosamine, and four with
N-acetylglucosamine (α1-6)-fucose.
Complexes with gp120 Ligands-- Protein ligands, CD4, and the Fab fragments of monoclonal antibodies, were used in an attempt to reduce mobility in the overall surface of the protein and, hence, in the potential crystal lattice. This was complicated by the internal mobility of these ligands: CD4 has a flexible juncture between the second and third extracellular domains (54), and Fabs have a conformationally mobile "elbow bend" between their variable and constant domains (55). For CD4, we used a construct containing the N-terminal two domains (1-182), for which we had previous success in structure determination (27). Fabs of the monoclonal antibodies were screened individually, even though combinations of Fabs were possible.
Initial trials with the Fab 178.1, which recognizes a linear epitope in
V3 of both free and CD4 bound gp120 (44), gave only crystalline
precipitates at best. We also tested the Fab of the anti-CD4 antibody
L71, which recognizes the CDR3-like loop in domain D1 (45), but had
difficulties preparing ternary complexes, probably due to a
destabilization of the CD4-gp120 interaction. Subsequently, we focused
on gp120-directed antibodies with discontinuous epitopes, which were
more likely to recognize conformationally rigid portions of gp120.
Complexes of gp120 proteins with Fabs of C11, which recognizes an
epitope spanning C1 and C5 (42), and F105, whose epitope lies within
C2, C3, C4, and C5 (overlapping the CD4 binding site) (43) gave only
poor crystals (Table IV). We had greater
success with 17b, which not only recognizes a discontinuous epitope but
discriminates between different conformational states of gp120 (36).
The Fab of 17b did not bind the initial gp120 constructs, requiring the
restoration of the stem of the V1/V2 loop (constructs ΔV1/2*ΔV3 or
ΔV1/2*ΔV3*).
Crystallization-- We screened 18 different combinations of gp120 variants and ligands (Table IV), using a limited factorial based crystallization screen. Factorial screening was originally devised as a method for deducing the essential crystallization factors from combinations of different conditions (1). The empirical observation, however, that most crystallizable macromolecules are able to crystallize from a limited set of common conditions, has validated an entirely different process: crystallization screening with a small but diverse collection of fixed conditions (2). A high probability of success has been reported with as few as 6 different conditions at 4 different concentrations (56), and commercial kits are available with 50-100 conditions (Hampton Research).
In conjunction with the limited crystallization screen, small volume droplets were used, typically 0.5 µl of protein per crystallization trial. With small volumes, 1-2 mg of protein was sufficient to evaluate each gp120 crystallization variant. Smaller volumes were also more efficient at nucleation than larger droplets, perhaps due to higher surface tension effects which may result in a greater range of precipitant concentrations for each droplet to sample. Indeed, droplets that were "spread-out" also showed enhanced nucleation. This explanation may also account for the well known observation that crystals frequently nucleate from the edges of crystallization droplets.
The initial crystallization screens produced six different types of crystals (Fig. 1, Table V). For crystal types A-D, extensive optimization was unable to produce single crystals large enough to be characterized. For crystal types E and F, single crystals of needle morphology could be grown. The growth of single crystals of type E, however, required the addition of agarose, which was identified during optimization by the additive screening process. Trials with a variety of agaroses found that SeaPrep, with a gelling point near room temperature, gave the best results. Despite considerable effort, further crystallization optimization failed to produce large single crystals, and the best typical crystals were rods with a cross-section of only 30 × 40 µm. A closely related crystallization variant, which retained 10 additional amino acids in the stem of the V3 loop, failed to crystallize (Table IV).
Characteristics of gp120 Crystals-- Single crystals of type E and F were analyzed for diffraction in capillary mounts. Only type E crystals showed diffraction. The needle axis of type E crystals proved to coincide with the a axis, and the rhombohedral cross-section perpendicular to the needle axis proved to be bounded by faces of the form (0 1 1). These could be distinguished from type F crystals, where the cross-section was hexagonal. Gel electrophoresis of type E crystals demonstrated that they contained all the elements of the ternary complex: gp120, D1D2, and Fab 17b (Fig. 2).
We were unable to flash-cool the type E crystals with standard cryoprotectants. Satisfactory results were found with a procedure that (i) fortified the crystals with vapor-diffusion glutaraldehyde cross-linking (48), (ii) permeated the crystals with 10% ethylene glycol, and (iii) used an immiscible oil, Paratone-N, to replace the external solution around the crystals prior to flash-cooling (50). Cryopreserved crystals diffracted to Bragg spacings of better than 2 Å, although the diffraction was anisotropic, with higher mosaicity along the 88 Å b-axis.
Type E crystals were orthorhombic, space group P222₁, with
unit cell parameters a = 71.25 Å, b = 88.11 Å, and c = 196.44 Å
(α = β = γ = 90°).
Solvent content analysis yielded a value of 58% for one
ternary complex in the crystallization asymmetric unit (assuming
partial specific volumes of 0.73 for protein and 0.65 for carbohydrate
and the observed total molecular mass of 108.3 kDa for the complex, of
which 3.6 kDa is carbohydrate). Diffraction data have been collected to
a limit of 2.2-Å spacings (Table VI).
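The solvent content quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes four asymmetric units per unit cell (the number of general positions in P222₁), takes the partial specific volumes in cm³/g, and uses function and variable names that are hypothetical rather than taken from the original analysis.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
A3_PER_CM3 = 1e24          # cubic angstroms per cubic centimeter


def solvent_fraction(a, b, c, n_asu, components):
    """Estimate the solvent fraction of an orthorhombic crystal.

    a, b, c     -- unit cell edges in angstroms (all angles 90 degrees)
    n_asu       -- asymmetric units per unit cell (assumed 4 for P222_1)
    components  -- list of (mass in Da, partial specific volume in cm^3/g)
    """
    asu_volume = a * b * c / n_asu  # A^3 available to one complex
    # Volume occupied by one complex: sum(mass * v_bar) converted to A^3.
    occupied = sum(m * v for m, v in components) * A3_PER_CM3 / AVOGADRO
    return 1.0 - occupied / asu_volume


total_mass = 108.3e3   # Da, whole ternary complex (from the text)
carbohydrate = 3.6e3   # Da, carbohydrate portion (from the text)

fraction = solvent_fraction(
    71.25, 88.11, 196.44, 4,
    [(total_mass - carbohydrate, 0.73), (carbohydrate, 0.65)],
)
print(f"Estimated solvent content: {fraction:.0%}")
```

Under these assumptions the estimate agrees with the 58% figure given in the text.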
Conclusions-- Our success with gp120 demonstrates the power of variational crystallization. We have derived equations that quantify the effect of this strategy on the overall probability of crystallization and have calculated the corresponding probability enhancements for several of the biochemical and molecular biological manipulations employed in this study. As can be seen (Table III), the probability of crystallization can be strongly influenced by reducing molecular surface heterogeneity. The influence of using multiple variants is more difficult to quantify since it depends on the individual probability of crystallization for each variant. Nonetheless, our theoretical analysis shows that the effect of multiple variants is greatest for proteins least likely to crystallize.
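The last point can be illustrated with a deliberately simplified model. The sketch below is not the derivation used in this paper: it assumes that every variant crystallizes independently with the same single-variant probability p, so that the chance of at least one success among n variants is 1 - (1 - p)^n. The function names are hypothetical.

```python
def overall_probability(p_single: float, n_variants: int) -> float:
    """Chance that at least one of n independent variants crystallizes."""
    return 1.0 - (1.0 - p_single) ** n_variants


def enhancement(p_single: float, n_variants: int) -> float:
    """Factor by which screening n variants improves on a single variant."""
    return overall_probability(p_single, n_variants) / p_single


# Compare proteins that are hard and easy to crystallize, each with 18 variants.
for p in (0.01, 0.1, 0.5, 0.9):
    print(
        f"p = {p:<4}  P(18 variants) = {overall_probability(p, 18):.3f}  "
        f"gain = {enhancement(p, 18):.1f}x"
    )
```

In this toy model the gain approaches n when p is small and falls toward 1 as p approaches 1, which is the qualitative behavior described above.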
While the variational approach with gp120 did involve extensive effort, this was primarily a consequence of the difficulty in producing the gp120 glycoprotein, which involved expression levels of only a few mg of gp120 per liter of eukaryotic cell culture. While future advances in molecular biology will no doubt make such projects less arduous, if proteins are expressed bacterially, present day recombinant techniques coupled to affinity or "tag" purifications make the generation of variants straightforward. A recent example, involving the generation of 11 different variants in the crystallization of an ionotropic glutamate receptor (57), required only a 6-month effort.4
The resistance of gp120 to crystallization may be related in part to its functional role in eluding the immune system; the mechanisms evolved to prevent the formation of specific immune system:gp120 contacts might also thwart formation of the homogeneous gp120:gp120 contacts needed for crystallization. Perhaps relevant to this, the protein modifications that most greatly reduced heterogeneity (and thus enhanced the crystallization probability), removal of carbohydrate and substitution of the variable loops (Table III), have been shown to enhance the generation or binding of neutralizing antibodies (58, 59).
It is difficult to evaluate the predictions of the crystallization algorithms derived here in a statistically significant manner. The failure of proteins to crystallize is rarely reported in the literature, and our own results comprise too small a sample to be statistically meaningful. Nonetheless, we note that for gp120 the algorithms predict that crystals are most probable with deglycosylation, variable loop removal, and addition of an ordered protein ligand. Consistent with prediction, for the 6 crystallization variants that did have all of these modifications, three (or 50%) produced crystals, whereas for the 12 variants that did not have these modifications, no crystals (0%) were grown. In addition, theory predicts that well ordered crystals are most probable when the overall probability of crystallization is highest; Table IV shows that the crystallization variant that produced the only well ordered crystals appeared to have the greatest probability of crystallization, producing three different crystal forms whereas the best of the other variants only produced one form each.
The crystallization literature is replete with examples of protein manipulation, from proteolytic digestion, to variation in solvating detergent, to screening of DNA oligonucleotides (38). What distinguishes our efforts is the derivation of a theoretical foundation, which allows the probabilistic assessment of the most effective crystallization approach. Because of the conformational complexity of gp120, we focused on surface modification, to eliminate heterogeneity and to present new crystallization variants, coupled to a limited screen of crystallization conditions. The types of crystallization problems embodied in gp120 (Table III) are not so different from many of the typical problems facing present day crystallographers; from both a theoretical and a practical perspective, the strategy of probability analysis coupled to variational crystallization may be broadly applicable.
Subsequent to the submission of this manuscript, the structure
determination of type E crystals was reported (63).
ACKNOWLEDGEMENTS |
---|
We thank Mary Ann Gawinowicz and Andrew Pound for N-terminal sequencing and carbohydrate analysis, Craig Ogata for beamline assistance, and past and present members of the Hendrickson group, especially Arno Pähler for his maxim, "The most important variable in a protein crystallization is the protein itself." We thank the Biopharmaceuticals Division of SmithKline for contributions to the expression, production, and purification of gp120 and CD4 proteins, particularly M. Strohsacker and D. Kokolis. X-ray diffraction data were collected at beamline X4A, National Synchrotron Light Source, Brookhaven National Laboratory.
FOOTNOTES |
---|
* This work was supported by grants from the Aaron Diamond Foundation, the National Institutes of Health, the G. Harold and Leila Mathers Foundation, the Friends 10, the late William F. McCarty-Cooper, and Douglas and Judith Krupp. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The abbreviations used are: HIV, human immunodeficiency virus; PAGE, polyacrylamide gel electrophoresis; PEG, polyethylene glycol; Fab, antigen binding fragment of an antibody.
2 P. D. Kwong, unpublished data.
3 J. S. Culp, unpublished data.
4 E. Gouaux, personal communication.
REFERENCES |
---|