(Received for publication, July 9, 1995; and in revised form, August 11, 1995)
Major histocompatibility complex (MHC) class I molecules are
cell-surface glycoproteins that bind peptides and present them to T
cells. The formation of a peptide-MHC complex is the initial step in
specific, T cell-mediated immune responses. Unlike the ligands in other
receptor-ligand systems, however, peptides are essential for a stable
conformation of the MHC proteins. To investigate the contribution of
every amino acid of octapeptides to the stability and antigenic
integrity of MHC proteins, complex octapeptide libraries with one
defined amino acid and mixtures of 19 amino acids in the remaining
seven positions were synthesized and tested for their capacity to
stabilize the conformation of the mouse MHC class I molecule
H-2Kb. Peptide transporter-deficient RMA-S cells were
employed in this study. Amino acid preferences found for the eight
sequence positions reveal constitutional, volumetric, and steric
constraints that govern peptide selection by MHC molecules. The pattern
of amino acid preferences indicates that the peptides behave as
integral parts of the MHC proteins and follow rules established for the
interrelationship of primary sequence and the conformation and
stability of proteins in general.
T cell-mediated immunity is centered around molecules encoded by
highly polymorphic genes of the major histocompatibility complex (MHC). These molecules bind fragments of protein antigens to form
complexes that are the ligands for the specific antigen receptor of T
cells (Barber and Parham, 1993). MHC class I (MHC-I) molecules are
composed of two subunits, MHC-encoded heavy chains and light chains
(β2-microglobulin), plus peptides of mostly 8 or 9 amino
acids. The peptide-binding site of MHC-I molecules formed by the heavy
chains is a groove framed by two α-helices that are positioned on
top of a β-pleated sheet (Bjorkman et al., 1987a; Fremont et al., 1992). The groove is lined with some of the most
polymorphic amino acids of the MHC molecules (Bjorkman et
al., 1987b). The peptides adopt extended conformations. In the
case of MHC-I molecules, their orientation inside the groove is defined
by conserved MHC side chains that compensate for the carboxyl- and
amino-terminal charges (Fremont et al., 1992; Madden et
al., 1993). The peptide-binding domain is supported from beneath
the β-plate by the β2-microglobulin. The proximal
domain of the heavy chain and β2-microglobulin, in
contrast to the peptide-binding region, fold like immunoglobulin
domains.
At the physiological temperature of 37 °C, MHC-I molecules are stable only when proper peptides are incorporated into their structures (Fahnestock et al., 1992). Naturally occurring MHC-associated peptides have been analyzed by pool sequencing (Falk et al., 1991) and high performance liquid chromatography combined with mass spectrometry (Jardetzky et al., 1991; Hunt et al., 1992) and were found to be mostly octa- or nonapeptides that conform to MHC allele-specific sequence motifs. Production and incorporation of peptides into the MHC-I structure inside cells involve complicated processing machineries. Proteasomes are believed to generate suitable peptides by proteolytic degradation of protein precursors, ABC transporters to deliver the peptides to the site of MHC biosynthesis in the endoplasmic reticulum and molecular chaperonins to assist in the assembly of the trimeric complex (Germain and Margulies, 1993). Cells with a genetic defect in the genes coding for the peptide transporter proteins have drastically reduced surface MHC-I levels. MHC-I expression can be restored by the addition of synthetic peptides that exhibit the epitope sequence motif for the tested MHC-I allomorph (Townsend et al., 1989). Also, incubation at 26 °C results in increased cell-surface expression of MHC-I proteins, which then appear to be free of peptides (Ljunggren et al., 1990). These "empty" MHC-I molecules can be loaded externally with synthetic peptides and are thereby stabilized (Schumacher et al., 1990). Unloaded MHC-I molecules denature when the cells are cultured at 37 °C.
The first specific step
toward T cell-mediated immune responses is the creation of stable MHC-I
protein conformations by the incorporation of peptides into the MHC
molecules. This implies a close relationship between peptide sequence
and MHC molecule. The dominant allele-specific sequence motifs found
for peptides eluted from MHC molecules give an indication of this
relationship (Falk et al., 1991). In this study, we analyzed
the contribution of every amino acid in every sequence position of
octapeptides to the stability of MHC-I molecule structures. We employed
complex peptide libraries (Dooley and Houghten, 1993; Jung and
Beck-Sickinger, 1992) in which the effects of the individual amino
acids in the different positions of octapeptides were specified by
combining single defined with seven randomized sequence positions
(Udaka et al., 1995). The synthesis of these
libraries either with one defined amino acid or with a premixed set of
19 amino acids (all common proteinogenic amino acids except for
cysteine) in the different positions of octapeptides was optimized to
approach equimolar representation of the individual peptides in the
libraries. The qualities of the preparations were confirmed by
electrospray mass spectrometry, amino acid analysis, and pool
sequencing. The completely random X8 peptide
library (mixtures of 19 amino acids in all eight positions) and 152
OX7 sublibraries (one defined and seven randomized
positions) were employed in MHC-I stabilization assays to elucidate the
rules for amino acid preferences in peptides bound by the mouse MHC-I
molecule H-2Kb.
The levels of cell-surface
expression were analyzed by flow cytometry using a FACScan apparatus
(Becton Dickinson, Heidelberg, Germany). Samples were gated according
to forward and side scatter, and fluorescence data were
collected with a logarithmic mode setting. Data were transferred to a
personal computer using the FAST488 system (JTES BioTec, Freienwill,
Germany), transformed to linear fluorescence values, and averaged to
obtain mean fluorescence intensities (MFI) with the help of the MFI
software (E. Martz, University of Massachusetts, Amherst, MA). The
titration curves were compared at the inflection points. The peptide
concentrations required for half-maximal H-2Kb expression
were calculated by employing formalisms of the occupancy concepts
(Moyle et al., 1978) and linear regressions over plots with
logit(p) = ln[p/(1 - p)] versus log[peptide], where
p = (MFI_obs - MFI_min)/(MFI_max - MFI_min).
Results are expressed as log[stabilization index], with the
stabilization index (SI) being the concentration required to achieve
50% maximal effect with X8 divided by the
corresponding concentration needed in the case of the indicated test
peptides (Udaka et al., 1995). All peptide libraries were
tested in duplicate and in two to three independent experiments.
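The logit regression and the stabilization index described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names, the use of base-10 logarithms for log[peptide] and log[SI], and the way the baseline and plateau fluorescence are supplied (`mfi_min`, `mfi_max`) are all assumptions made for the example.

```python
import math


def half_maximal_concentration(concs, mfis, mfi_min, mfi_max):
    """Estimate the peptide concentration giving a half-maximal effect.

    Fits logit(p) = ln[p / (1 - p)] against log10(concentration) by
    ordinary least squares and returns the concentration at which the
    fitted line crosses logit(p) = 0, i.e. p = 0.5.
    mfi_min and mfi_max (baseline and plateau fluorescence) are assumed
    to be known from the titration.
    """
    xs, ys = [], []
    for conc, mfi in zip(concs, mfis):
        p = (mfi - mfi_min) / (mfi_max - mfi_min)
        if 0.0 < p < 1.0:  # the logit is undefined at p = 0 and p = 1
            xs.append(math.log10(conc))
            ys.append(math.log(p / (1.0 - p)))
    if len(xs) < 2:
        raise ValueError("need at least two points with 0 < p < 1")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("concentrations must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    if slope == 0.0:
        raise ValueError("flat titration curve; no half-maximal point")

    # logit(p) = 0 at log10(conc) = -intercept / slope
    return 10.0 ** (-intercept / slope)


def log_stabilization_index(ec50_reference, ec50_test):
    """log10 of SI, where SI = EC50(reference library) / EC50(test library)."""
    return math.log10(ec50_reference / ec50_test)
```

With this definition a test library that needs a tenfold lower concentration than the reference library has log[SI] = 1, and one that needs a tenfold higher concentration has log[SI] = -1.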
The X8 peptide library and all 152 OX7 sublibraries contain peptides
that bind to and stabilize H-2Kb, as indicated by an increased number of
conformationally intact MHC-I molecules detectable with monoclonal
antibody B8.24.3 (Udaka et al., 1995). With all the peptide
libraries, the same maximal level of H-2Kb expression was
obtained, and this level was identical to the levels achieved with
defined H-2Kb-binding cytotoxic T lymphocyte epitopes like
SIINFEKL or RGYVYQGL (data not shown; see also Udaka et al. (1995)). The dose-response curves, however, were shifted with
respect to the X8 curve. Depending on whether a
particular side chain has a positive or negative effect on binding of
the peptides and on the stability of the resulting peptide-MHC
complexes, lower or higher concentrations of the OX7
sublibraries were required. Stabilization assays done with a
panel of defined peptides demonstrated that detection of H-2Kb
by monoclonal antibody B8.24.3 is not influenced by the
particular sequence of the peptide bound (data not shown). The peptide
concentrations required for a half-maximal stabilization effect varied
slightly between experiments. These variations were caused by the
complexity of the assay system. Differences in the fidelity of the
cells and variations in MHC expression are the most likely sources. To
compensate for these variations and to allow direct comparison of
results from different experiments, stabilization indices were
calculated as the reciprocal ratios of the concentrations of test
library peptides required for half-maximal H-2Kb expression
and the corresponding concentration of X8 library
peptides tested in the same experiment. These stabilization indices are
expressed in logarithmic form and are shown in Fig. 1 (a-h) for the sequence positions P1 through
P8, respectively, in the order of decreasing stabilization efficiency
of the amino acids.
Figure 1: Binding of OX7 peptide libraries to H-2Kb. MHC
class I molecules stabilized by peptide libraries were detected with
the conformation-dependent monoclonal antibody B8.24.3. The results are
shown as logarithms of the stabilization indices calculated for the 152
OX7 peptide sublibraries and are given separately
in the order of decreasing stabilization power for the eight sequence
positions. a, position 1; b, position 2; c,
position 3; d, position 4; e, position 5; f,
position 6; g, position 7; h, position 8.
Hydrophobicity values were taken from Roseman (1988). The side chain
volumes were calculated from the volumes of amino acids given by
Zamyatnin (1972) by subtracting the volume of
glycine.
The average of the absolute values of
log[SI] for all 19 OX7 sublibraries of
one sequence position gives a measure of the tolerance to amino acid
variations in this position (Fig. 2). A position that exhibits
absolute tolerance to amino acid variations would accept all amino
acids equally well. This means that the concentrations required for
half-maximal MHC-I expression would be the same for all 19
OX7 peptide sublibraries and for the completely
random X8 peptide library, which includes all
peptides of the OX7 sublibraries. The SI values
would be 1, their logarithms 0, and the average of the absolute values
of log[SI] also 0. Tolerance to amino acid variations can be
restricted because the properties of particular side chains are
favorable or because the properties of other side chains are
unfavorable. The dose-response curves for the corresponding
OX7 peptide libraries would be shifted with respect
to the X8 curve to lower or higher peptide
concentrations. The resulting SI values would deviate from 1, and their
logarithms would be positive for preferred side chains with
MHC-stabilizing effects and negative for destabilizing side chains. We
used the average of the absolute values of log[SI] to
quantitate these deviations from the theoretical situations of complete
tolerance independent of the direction of the biases.
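The tolerance measure defined in this paragraph is a simple mean of absolute values and can be written out as a short Python sketch. The function name and the choice of base-10 logarithms are assumptions made for illustration; the input is the list of SI values of the 19 sublibraries for one sequence position.

```python
import math


def mean_absolute_log_si(si_values):
    """Average of |log10(SI)| over the sublibraries of one sequence position.

    A value of 0 corresponds to complete tolerance (all SI values equal 1);
    larger values indicate stronger deviations from the random reference
    library, regardless of whether side chains stabilize or destabilize.
    """
    if not si_values:
        raise ValueError("at least one SI value is required")
    return sum(abs(math.log10(si)) for si in si_values) / len(si_values)
```

Because absolute values are averaged, one tenfold stabilizing and one tenfold destabilizing side chain do not cancel out: both contribute 1 to the sum.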
Figure 2: Tolerance to amino acid variations in the
eight sequence positions of H-2Kb-binding octapeptides. The
averages of the absolute values of log[SI] are
shown.
No absolutely
tolerant position was found for H-2Kb-binding peptides. The
analysis reveals three types of sequence positions differing in the
degree of tolerance to structural variations. Positions 4, 7, and 6 are
relatively tolerant, with tolerance decreasing in the indicated order.
Positions 1-3 are significantly more restricted. Positions 5 and
8 are the most restricted positions. These different attributes of the
eight sequence positions are also obvious from the SI panels in Fig. 1 (a-h). Moreover, these panels reveal four
general features of the sequence positions of these octapeptides.
First, all amino acids are permitted in all sequence positions. Second,
the vast majority of the amino acids are destabilizing. The only
exception to this tendency was found for position 7, where about half
of the amino acids are stabilizing. Third, the more restricted
positions (1-3, 5, and 8) are characterized by pronounced
destabilizing effects. Fourth, strongly stabilizing amino acids were
found for positions 5, 7, and 8. Position 4 and, similarly, position 6
show neither strongly stabilizing nor strongly destabilizing effects.
The amino acid selectivity exemplified with these stabilization
measurements indicates preferences for particular side chains in
different sequence positions of octapeptides and suggests that physical
constraints (constitutional, volumetric, and steric) control peptide
selection by H-2Kb. To illustrate the influence of
constitutional constraints, hydrophobicity indices for the amino acids
as compiled by Roseman (1988) were included in Fig. 1. There is a
general preference for hydrophobic amino acids. The dominance of amino
acids with hydrophobic side chains is unequivocal for positions 1, 3,
5, and 8. Conversely, neutral or positively charged hydrophilic side
chains are preferred in position 7. Positions 2, 4, and 6 allow
hydrophobic as well as hydrophilic amino acids and appear to be less
constrained than other positions.
The influence of volumetric constraints is also easily detected. Side chain volumes were calculated from the volumes of amino acids taken from Zamyatnin (1972) by subtracting the volume of glycine. The resulting values are also included in Fig. 1. Volumetric constraints are obvious for position 3 and less pronounced for positions 4, 5, and 8, where amino acids with large side chains are preferred. The remaining positions show no such restrictions.
Structure, stability, and functional capacity of proteins are
determined by the amino acid sequences of their polypeptide chains. In
this respect, MHC molecules are exceptional as their conformational
integrity is dependent on foreign peptides of mostly 8 or 9 amino acids
that need to be incorporated into their protein structure. These
peptides are derived from diverse sources and can vary from one MHC
molecule to another. Thus, MHC-I proteins are heterotrimers of a
monomorphic light chain (β2-microglobulin), a
polymorphic heavy chain with a high degree of sequence diversity within
a species, and an extremely variable peptide. Antigen presentation by
MHC molecules can be regarded as a protein-chemical duty dictated by the
necessity to form a stable protein structure. Selection of suitable
peptides by these molecules is therefore expected to be determined by
rules that also govern the interrelationship of primary structure and
protein conformation. Hydrophobic side chains of amino acids in
proteins are the major constituents of protein cores and are crucial
for stable conformation (Kauzmann, 1959; Tanford, 1962; Baldwin, 1986;
Privalov and Gill, 1988; Murphy et al., 1990), but they are
not necessarily buried inside the molecules (Richards, 1977; Miller et al., 1987). A large fraction of such amino acids are found
at protein surfaces. Nevertheless, a strict preference for hydrophobic
side chains in defined sequence positions of homologous monomeric
proteins strongly indicates that these side chains are buried (Rose et al., 1985). Equally relevant for a stable conformation is
the exclusion of hydrophilic side chains from the protein core.
Hydrophilic amino acids are mostly found at protein surfaces. Buried
hydrophilic side chains require partners for hydrogen bonds or salt
bridges in precisely defined positions in order to be tolerable inside
protein structures (Baker and Hubbard, 1984). As a consequence, little
variability is allowed for such positions (Lesk and Chothia, 1980).
With the peptide libraries used in this study, all possible
octapeptides were tested for their contribution to a stable MHC
conformation. The combination of single defined sequence positions with
randomized positions allowed specification of the effect of every amino
acid in such heterogeneous mixtures. The restricted tolerance to amino
acid variations found for all sequence positions of
H-2Kb-binding octapeptides indicates the influence of
physical constraints on peptide selection by MHC molecules (Bowie et al., 1990). Strong preferences for hydrophobic side chains
in positions 1, 3, 5, and 8 are indicative of constitutional
constraints. Apparently, a stable MHC conformation is achieved more
readily when the side chains of the amino acids in these positions are
buried inside the MHC molecule and thereby provide a large interaction
area with MHC residues (Murphy et al., 1990). The strong
preference for hydrophilic side chains in position 7 also reveals the
influence of constitutional constraints. Side chains of amino acids in
position 7 are likely to be exposed at the surface of the molecule.
Volumetric in addition to constitutional restrictions guide the
amino acid preference for position 3 and, less pronounced, for
positions 5 and 8. Positions 5 and 8 appear more restricted, with the
former requiring aromatic and the latter aliphatic side chains. These
two positions have been classified as anchor positions for the small
number of different amino acids found by pool sequencing of peptides
extracted from isolated H-2Kb molecules (Falk et
al., 1991). Various observations including results from pool
sequencing and stabilization assays with analogues of known epitopes
point to position 3 as a secondary anchor (Falk et al., 1991;
Jameson and Bevan, 1992). Results from crystal structure analyses of
H-2Kb are in agreement with these interpretations (Fremont et al., 1992). Pockets within the peptide-binding site were
identified, which are deeper for peptide side chains in positions 5 and
8 and shallower for those in position 3. The pocket that holds the
secondary anchor in H-2Kb is deeper in other MHC molecules such as
HLA-A2.1, where it accommodates one of the two dominant anchor residues (Madden et al., 1993). The volumetric flexibility of position 1 as
compared with the other three positions that strongly prefer
hydrophobic side chains implies a higher degree of freedom for the
orientation of the amino acid side chains. Serine is the only
hydrophilic amino acid in position 1 that has the capacity to stabilize
H-2Kb. The hydroxy group of this amino acid, like the
hydroxy group of threonine, can form hydrogen bonds back to the main
chain of the peptide. Serine and threonine are occasionally found in
core regions of proteins (Lim and Sauer, 1989). Serine in position 1
could be buried or exposed depending on influences from other amino
acids in the peptides. Based on the analysis of crystal structures of
various proteins, Bordo and Argos (1991) have suggested permissive amino
acid substitutions that would have minimal impact on neighboring side
chains. Their classification correlates well with our stabilization
indices.
Steric constraints, in contrast to constitutional and
volumetric constraints, result from the conformation of the entire
protein or peptide and therefore are not as easily detected. However,
the strong preference for hydrophilic side chains in position 7 seems
to indicate such steric conditions. The ultimate amino acid is fixed in
its position through, first, the side chain buried deeply inside the F
pocket of the peptide-binding groove and pointing away from the surface
of the peptide-MHC complex and, second, the α-carboxy group that is
bound by conserved side chains of the MHC molecule (Fremont et
al., 1992; Madden et al., 1993). These structural
conditions in the context of the extended conformation of MHC-bound
peptides leave little freedom for the side chain orientation of the
penultimate amino acid (Ramachandran and Sasisekharan, 1968). These
side chains are bound to point out of the groove. Therefore,
hydrophilic amino acids are preferred. Although there is no clear
preference for the particular constitution or size of amino acids
accepted in position 2, the strongly destabilizing effect of many amino
acids indicates an anchoring function for this position. Positions 2,
4, and 6 are the least constrained, allowing a high degree of freedom
for the choice of amino acids. Their side chains could be partially or
completely exposed at the surface of the molecule, depending on the
sequence context in the individual peptides.
The above
interpretation of the stabilization measurements can be summarized to
describe the structure of an ideal H-2Kb-binding
octapeptide. Positions 1, 3, 5, and 8 should be occupied by an amino
acid with a hydrophobic side chain that would be buried inside the
groove. For positions 5 and 8, aromatic and aliphatic residues are
preferred, respectively. Position 7 should harbor an amino acid with a
hydrophilic side chain that would be exposed. Positions 2, 4, and 6
could accommodate different amino acids, and their side chains could be
partially buried. In this model, positions 8 and 5 followed by
positions 1-3 and finally 7 would contribute most to the
stability of H-2Kb. Residues in positions 1-3, 5, and
8 should, if at all, only indirectly affect recognition of the complex
by T cells (Chen et al., 1993; Falk et al., 1994). On
the other hand, side chains in position 7 followed by positions 4, 6,
and potentially 2 would contribute most to the interaction of the
peptide-MHC complex with the complementary T cell receptor.
The
structure of natural cytotoxic T lymphocyte epitopes may, however,
deviate from this ideal epitope (Fremont et al., 1992; Madden et al., 1993). In addition to the role of the side chains of
the ligands, interactions between MHC-I side chains and the main chain
of the peptides have been found to contribute substantially to the
binding (Matsumura et al., 1992). Also, the interactions of
the invariant terminal amino and carboxy groups of the peptides with
conserved MHC residues bear a major share of the stabilization energy
(Bouvier and Wiley, 1994). Moreover, a proper distribution of anchoring
side chains has been shown to compensate for the destabilizing
influences of single amino acids (Saito et al., 1993; Udaka et al., 1995). Nevertheless, it is expected that the more a
peptide deviates from the basic structure described above, the weaker
is its capacity to stabilize the H-2Kb conformation.
The stabilization indices presented in this report should help to predict the relative efficiency of peptides for binding to and for stabilizing MHC molecules. However, the capacity of every single amino acid side chain to contribute to a stable MHC conformation is influenced by other amino acids in the sequence. Mutual dependence of the contribution of the amino acids in epitope sequences is expected from the length of the peptides and from the extent of surface contact between peptide and MHC molecule (Fremont et al., 1992). This interdependence precludes precise prediction of the efficiency of peptides for binding to MHC-I molecules (Udaka et al., 1995; Horovitz, 1987; Jencks, 1981). Nevertheless, an increasing number of amino acids with high stabilization indices will result in increased binding efficiencies. Thus, SI values should provide useful guidelines for heuristic approaches to epitope identification. The development of algorithms for the assessment of peptide efficiency for binding to MHC molecules is in progress. With the help of corresponding query systems that draw on SI values, protein sequence databases can be scanned to identify potential epitopes. However, an exact prediction of T cell epitopes will not be possible for three reasons. First, the impact of interdependence of the amino acids in MHC-binding peptides has not yet been elucidated. Second, peptide binding to MHC molecules occurs in a situation that is at best described as a steady state. Consequently, even weakly bound peptides can be presented and give rise to T cell responses if they are generated at sufficiently high rates. Third, selection of T cell epitopes is dependent on the T cell repertoire and thereby influenced by processes that select self-MHC-restricted and self-tolerant T cells.
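One way such a heuristic query could look is sketched below in Python. It is not the algorithm under development mentioned above. The sketch assumes that the contributions of the eight positions are simply additive, which the interdependence discussed in this paragraph rules out as an exact model, and the numbers in `PLACEHOLDER_LOG_SI` are invented placeholders rather than measured stabilization indices. Amino acids missing from the table are treated as neutral (log[SI] = 0), which is a further assumption.

```python
# Placeholder table: one dict per sequence position (P1..P8) mapping an
# amino acid to a log[SI] value. These numbers are NOT data from this
# study; real use would require the measured values for all 152 sublibraries.
PLACEHOLDER_LOG_SI = [
    {},                      # P1
    {},                      # P2
    {},                      # P3
    {},                      # P4
    {"F": 1.0, "Y": 1.0},    # P5 (hypothetical aromatic anchor values)
    {},                      # P6
    {"K": 0.5},              # P7 (hypothetical value)
    {"L": 1.0, "D": -1.0},   # P8 (hypothetical values)
]


def score_octapeptide(peptide, log_si_table):
    """Sum the position-specific log[SI] values of an octapeptide.

    Additivity is an assumption of this sketch; residues without an
    entry in the table contribute 0.
    """
    if len(peptide) != 8 or len(log_si_table) != 8:
        raise ValueError("peptide and table must both cover 8 positions")
    return sum(log_si_table[i].get(aa, 0.0) for i, aa in enumerate(peptide))


def scan_protein(sequence, log_si_table, top_n=5):
    """Score every overlapping octapeptide of a protein sequence.

    Returns up to top_n tuples (start_index, peptide, score), sorted by
    decreasing score; start_index is 0-based.
    """
    hits = []
    for start in range(len(sequence) - 7):
        peptide = sequence[start:start + 8]
        hits.append((start, peptide, score_octapeptide(peptide, log_si_table)))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits[:top_n]
```

A ranking produced this way can only suggest candidates for experimental testing; for the reasons listed above it does not predict which peptides are actually presented or recognized by T cells.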
The rules that govern amino acid preferences for peptides bound by MHC molecules are reminiscent of rules for the structural basis of protein integrity and stability. Peptide-MHC complexes appear to be convenient model molecules for studies of the relationship of protein sequence and protein structure. The possibilities of multiple peptide synthesis and generation of defined peptide-MHC complexes would allow many questions in this field to be very efficiently addressed.